POSTER PAPER International Journal of Recent Trends in Engineering, Vol. 1, No. 2, May 2009 

Design of Genetically Evolved Artificial Neural Network Using Enhanced Genetic Algorithm

M. NirmalaDevi¹, N. Mohankumar¹, M. Karthick¹, *Nikhil Jayan¹, R. Nithya¹, S. Shobana¹, M. Shyam Sundar¹ and S. Arumugam²

¹ VLSI Design Research Group, Department of Electronics & Communication Engineering, School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, Tamil Nadu, India. Ph: +91-422-2656422. [email protected], [email protected], [email protected], *[email protected], [email protected], [email protected], [email protected]

² Nandha Institutions, Erode, Tamil Nadu, India


Abstract — This paper deals with designing an Artificial Neural Network (ANN) whose weights are genetically evolved using the proposed Enhanced Genetic Algorithm (EGA), thereby obtaining an optimal weight set. The performance is analysed by fitness-function-based ranking. The ability to learn may depend on many factors, such as the number of neurons in the hidden layer, the number of training input patterns and the type of activation function used. By varying each parameter, the performance of the proposed EGA is compared with normal NN training. The results show that the proposed algorithm is better in terms of error convergence.

Index Terms — ANN, EGA, GAs

I. INTRODUCTION

Neural networks (NNs) have emerged as a successful tool in the fields of pattern recognition, system identification, prediction, signal processing, etc. The design of an NN has two distinct steps:
• choosing a proper network architecture [1], and
• adjusting the parameters of the network so as to minimize a certain fit criterion.
Feed-forward neural networks are the most preferred architecture in the field of ANN. Weight adaptation and structure evolution of an NN can be done using any gradient-based technique. The advantages of gradient-based techniques are their efficient implementations and their ability to fine-tune. Their drawback is that the objective function can only be decreased locally; thus a gradient-based approach either trains slowly or cannot find a near-global minimum [2] in a complex search space. Instead of gradient-based learning, one may apply other optimization methods such as Genetic Algorithms (GAs).

II. TRAINING METHOD

The use of Evolutionary Algorithms (EAs) to aid Artificial Neural Network (ANN) learning [3]-[5] has been a popular approach to address the shortcomings of Back Propagation (BP). The typical approach uses a population of gradient-learning ANNs, undergoing weight adaptation through BP training and structure evolution through an EA [6]. At the same time, there are approaches that rely solely on an EA for both structure evolution and weight adaptation of the ANN. Based on these observations, and in relation to the proposed algorithm, current approaches can be classified into two major types: "non-invasive" approaches, where EA selection is used but fitness evaluation requires BP or other gradient training, and "invasive" approaches, where the system uses only an EA for both. An ANN can be trained using various training algorithms, each of which generates a particular weight set for the desired output; the trained network can then be simulated to identify other patterns or any set of inputs. There are two types of learning: i) supervised learning, where the network is trained with knowledge of the output, and ii) unsupervised learning, where the network is trained without knowledge of the output. Here, the network is created for the 4-bit odd or even parity problem and is trained by supervised learning; the training set is sketched below.
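As a concrete illustration of the supervised setting just described, the following sketch builds the 16-pattern, 4-bit parity training set in which every input pattern is paired with its known target. The use of NumPy and the helper name parity_dataset are illustrative assumptions, not part of the original paper.

    import numpy as np

    def parity_dataset(n_bits=4, odd=True):
        """Build the 2**n_bits input patterns and their parity targets.

        Supervised learning: each input pattern is paired with the known target
        (1 if the number of ones is odd, 0 otherwise, for odd parity).
        """
        patterns = np.array([[int(b) for b in format(i, f"0{n_bits}b")]
                             for i in range(2 ** n_bits)])
        ones = patterns.sum(axis=1)
        targets = (ones % 2 == 1) if odd else (ones % 2 == 0)
        return patterns.astype(float), targets.astype(float).reshape(-1, 1)

    if __name__ == "__main__":
        X, T = parity_dataset(4, odd=True)
        print(X.shape, T.ravel())   # (16, 4) and the 16 parity targets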

III. STEPS INVOLVED

1) Get the number of neurons in every layer; this determines the network complexity. It can be user defined, depending on the requirement of the user [7].
2) The maximum number of epochs is specified to prevent the network from being trained for an infinite period.
3) The goal (error target) to be met is also specified.
4) The activation function for the hidden neurons can also be given by the user. The activation functions used here are tan-sigmoid, log-sigmoid and pure linear. The mean square error is calculated from the difference between the target and the obtained output; when the network is fully trained the error becomes zero [8] (a short forward-pass and error sketch follows this list).
5) The weight set is chosen for optimum conditions of the network, such as the number of neurons in the hidden layer, the number of epochs and the goal. The weight set thus obtained can be further optimised, and the number of neurons in the hidden layer reduced, using GA.
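The sketch below shows how such a configured network computes its output and mean square error. The activation-function names mirror the MATLAB-style tansig, logsig and purelin functions mentioned above; the layer sizes and helper names are illustrative assumptions.

    import numpy as np

    # MATLAB-style activation functions referred to in the text.
    def tansig(x):  return np.tanh(x)
    def logsig(x):  return 1.0 / (1.0 + np.exp(-x))
    def purelin(x): return x

    def forward(X, W_ih, b_h, W_ho, b_o, hidden_act=tansig, out_act=purelin):
        """One forward pass of a single-hidden-layer feed-forward network."""
        H = hidden_act(X @ W_ih + b_h)      # hidden-layer outputs
        return out_act(H @ W_ho + b_o)      # network outputs

    def mse(T, O):
        """Mean square error between target T and network output O."""
        return float(np.mean((T - O) ** 2))

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.uniform(0, 1, (16, 4))                            # 16 patterns, 4 inputs
        T = rng.uniform(0, 1, (16, 1))                            # dummy targets
        W_ih, b_h = rng.normal(size=(4, 4)), rng.normal(size=4)   # 4-4-1 layout
        W_ho, b_o = rng.normal(size=(4, 1)), rng.normal(size=1)
        print("MSE:", mse(T, forward(X, W_ih, b_h, W_ho, b_o)))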



IV. NEED FOR GA

GAs are global search algorithms based on the mechanics of natural genetics and natural selection. Natural selection guarantees that chromosomes with the best fitness will propagate in future populations. Unlike other methods, a GA requires no knowledge or gradient information about the search space [9]. The main feature of a GA is its capability to exploit accumulating information about an initially unknown search space in order to bias subsequent search towards feasible regions. With this property, GAs are particularly useful for searching large-scale, complex and poorly understood search spaces, where classical search techniques are often inappropriate [10].
The process involved in a GA is as follows: a population set is randomly generated, and individuals are selected for reproduction, which involves crossover and mutation. The offspring produced by crossover and mutation replace certain individuals in the population set. A GA supports generic implementation of its major operations: crossover, mutation, selection and replacement. A GA manipulates chromosomes, which collectively comprise the population set; the basic units of the individuals (chromosomes) are called genes. Reproduction takes place by means of selection, recombination and mutation, and results in a transformation of the population set. In selection, two chromosomes, called parents, are chosen from the population based on the fitness function to undergo crossover and mutation. Crossover is the exchange of one gene or a set of genes between parents, and the newly generated chromosomes are called offspring. Mutation is the process of changing the genes of a chromosome. The offspring thus generated add new gene values, thereby leading to a better population set [11]. A generic outline of this loop is sketched below.
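A minimal outline of the generic GA loop just described (selection, crossover, mutation, replacement). The operators shown are deliberately simple placeholders and are not the EGA operators proposed in this paper.

    import random

    def generic_ga(fitness, init, pop_size=20, generations=100,
                   crossover_rate=0.8, mutation_rate=0.1):
        """Generic GA loop: selection -> crossover -> mutation -> replacement."""
        population = [init() for _ in range(pop_size)]
        for _ in range(generations):
            # Selection: pick the two fittest individuals as parents.
            parents = sorted(population, key=fitness, reverse=True)[:2]
            # Crossover: single-point exchange of genes between the parents.
            if random.random() < crossover_rate:
                point = random.randrange(1, len(parents[0]))
                child = parents[0][:point] + parents[1][point:]
            else:
                child = parents[0][:]
            # Mutation: perturb each gene with a small probability.
            child = [g + random.gauss(0, 0.1) if random.random() < mutation_rate else g
                     for g in child]
            # Replacement: the child replaces the least fit individual.
            population.sort(key=fitness)
            population[0] = child
        return max(population, key=fitness)

    if __name__ == "__main__":
        # Toy usage: maximise -(sum of squares), i.e. drive the genes towards zero.
        best = generic_ga(lambda c: -sum(g * g for g in c),
                          lambda: [random.uniform(-1, 1) for _ in range(8)])
        print(best)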

V. PROPOSED TECHNIQUE

A different technique for the genetic operations is proposed in this paper. The selection, crossover and mutation of the proposed Enhanced Genetic Algorithm are explained as follows. For selection, the error function is chosen as the fitness function and the two individuals with the least error are chosen as parents; the sum of the accuracy error Qacc and the normalised mean square error Qnmse is chosen as the error function. The two selected parents undergo crossover and four offspring are produced. The potential offspring spread over the entire domain: two offspring result in searching around the centre region of the domain (one is the average of the parents, the other is the sum of the average of the parents and the average of the lower and upper limits), and the other two offspring are moved to be near the domain boundary (one near the lower limit and the other near the upper limit). The offspring with the least error is chosen for mutation. Three new offspring are then generated by the mutation operation: the first offspring (nos1) is created by adding a random number to one randomly selected gene (uniform mutation); the second offspring (nos2) is created by adding random numbers to more than one gene; the third offspring (nos3) is created by adding random numbers to all the genes. The second and third mutations allow multiple genes to be changed, so the search domain is larger than that formed by changing a single gene, and the genes have a larger space for improvement when the fitness values are small [12]-[13]. A sketch of the four-offspring crossover is given below.
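A rough sketch of the four-offspring crossover just described, under the stated reading that two offspring search the centre of the domain and two are pushed towards the bounds. The exact arithmetic used to move offspring "near" the limits is not spelled out in the text, so the delta-based form below is an assumption.

    import numpy as np

    def ega_crossover(p1, p2, lower, upper, delta=0.1):
        """Produce the four EGA crossover offspring described in the text (sketch).

        os1: average of the parents (centre of the domain).
        os2: average of the parents plus the average of the lower and upper limits.
        os3 / os4: offspring moved near the lower / upper limit (assumed form).
        """
        p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
        lower, upper = np.asarray(lower, float), np.asarray(upper, float)
        centre = (p1 + p2) / 2.0
        os1 = centre
        os2 = np.clip(centre + (lower + upper) / 2.0, lower, upper)
        os3 = lower + delta * (centre - lower)    # near the lower limit (assumption)
        os4 = upper - delta * (upper - centre)    # near the upper limit (assumption)
        return os1, os2, os3, os4

    if __name__ == "__main__":
        offspring = ega_crossover([0.2, -0.5, 1.0], [0.6, 0.5, -1.0],
                                  lower=[-2, -2, -2], upper=[2, 2, 2])
        for i, o in enumerate(offspring, 1):
            print(f"os{i}:", o)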

A. The Enhanced Genetic Algorithm

Step 1: Let a population set P with individuals pi be randomly generated, P = {p1, p2, ..., pi, ..., pz}.
Step 2: Get from the user the probability of acceptance pa, the allowed difference m between the weight sets of the t-th and (t-1)-th iterations, and the allowed difference m1 between the errors of consecutive iterations.
Step 3: The fitness function is the error function, given by Qfit(pi) = a*Qacc + b*Qnmse, where a and b are user-defined constants, Qacc is the accuracy error (expressed as a percentage) and Qnmse = sum over all patterns and outputs of (Ti - Oi)^2, with Ti the target and Oi the network output.
Step 4: The fitness Qfit(pi) is evaluated for each individual pi in the population P.
Step 5: The two individuals with the highest fitness values are chosen from the entire population P.
Step 6: Crossover is performed on the selected individuals as described in Section V, producing four offspring.

Step 7: Fitness is evaluated for these individuals and the individual with the highest fitness value (least error), say oscfit, is taken.


Step 8: To this oscfit, mutation is performed as follows.

Three new offspring nos1, nos2 and nos3 (j = 1, 2, 3) are generated by the mutation operation: each is formed by adding randomly generated numbers to genes of oscfit selected by mask bits that can only take the value 0 or 1, so that one, several or all genes are perturbed.
Step 9: If the generated random number g < pa (user defined), then nos1 replaces the individual with the least fitness value in the population: the population set before crossover and mutation, P = {p1, p2, ..., pi, pleastfitval, ..., pz}, becomes P = {p1, p2, ..., pi, nos1, ..., pz}. Else, if g > pa, the individual with the least fitness value is replaced by nos2, giving P = {p1, p2, ..., pi, nos2, ..., pz}. Else, if g = pa, the individual with the highest fitness value is replaced by nos3, giving P = {p1, p2, ..., pi, nos3, ..., pz}.
Step 10: The fitness Qfit is evaluated for each individual in the new population.
Step 11: The network is run after these updates, giving the output Oi. The stopping criteria are:
• The difference between Pt of the t-th iteration and Pt-1 of the (t-1)-th iteration is determined; if the difference is less than m for 100 consecutive iterations, then stop.
• The difference between the errors for 100 consecutive iterations is determined; if the difference is less than m1, then stop.
• When the defined number of iterations has been reached, execution is stopped.
A sketch of this mutation and replacement step is given below.
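A sketch of the three-offspring mutation of Step 8 and the pa-based replacement of Step 9. The size and distribution of the random perturbations, and the handling of the g = pa case, are illustrative assumptions.

    import random

    def ega_mutation(oscfit, sigma=0.1):
        """Step 8 (sketch): three offspring that perturb one, several or all genes."""
        n = len(oscfit)                                            # assumes at least two genes
        nos1 = list(oscfit)
        nos1[random.randrange(n)] += random.gauss(0, sigma)        # single-gene (uniform) mutation
        idx = set(random.sample(range(n), random.randint(2, n)))   # more than one gene
        nos2 = [g + random.gauss(0, sigma) if i in idx else g for i, g in enumerate(oscfit)]
        nos3 = [g + random.gauss(0, sigma) for g in oscfit]        # every gene perturbed
        return nos1, nos2, nos3

    def replace_individual(population, fitness, nos1, nos2, nos3, pa):
        """Step 9 (sketch): replacement driven by a random number g and the acceptance probability pa."""
        g = random.random()
        if g < pa:
            victim, child = min(population, key=fitness), nos1     # least fit replaced by nos1
        elif g > pa:
            victim, child = min(population, key=fitness), nos2     # least fit replaced by nos2
        else:
            victim, child = max(population, key=fitness), nos3     # fittest replaced by nos3
        population[population.index(victim)] = child
        return population

    if __name__ == "__main__":
        pop = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(10)]
        fit = lambda c: -sum(x * x for x in c)                     # toy fitness: higher is better
        n1, n2, n3 = ega_mutation(max(pop, key=fit))
        replace_individual(pop, fit, n1, n2, n3, pa=0.7)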

B. Contributions of EGA

The following are the contributions of EGA:
1. In a simple GA, crossover normally yields two offspring and mutation two offspring; in EGA, crossover yields four offspring and mutation three, which gives a larger search area and better error convergence. The plot in Section VI shows the improvement in fitness.
2. The error function is used as the fitness function, so the user can obtain weights of the desired accuracy and training error by means of the user-defined constants a and b (a small sketch of this fitness function follows the list).
3. Fitness is evaluated before crossover in a simple GA, but in EGA it is also evaluated after crossover.
4. In EGA, the offspring obtained during mutation come from both local and wide search areas.
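Item 2 above refers to the fitness function Qfit = a*Qacc + b*Qnmse defined in Step 3; the sketch below shows one way to compute it. The thresholding used for the accuracy term is an assumption, since the paper only gives the expression for Qacc in truncated form.

    import numpy as np

    def q_nmse(targets, outputs):
        """Squared-error term: sum over patterns and outputs of (Ti - Oi)**2."""
        t, o = np.asarray(targets, float), np.asarray(outputs, float)
        return float(np.sum((t - o) ** 2))

    def q_acc(targets, outputs, threshold=0.5):
        """Accuracy error, assumed here to be 100 * (1 - fraction of correct patterns)."""
        predicted = (np.asarray(outputs, float) >= threshold).astype(int)
        correct = float(np.mean(predicted == np.asarray(targets).astype(int)))
        return 100.0 * (1.0 - correct)

    def q_fit(targets, outputs, a=1.0, b=1.0):
        """Qfit = a*Qacc + b*Qnmse with user-defined constants a and b."""
        return a * q_acc(targets, outputs) + b * q_nmse(targets, outputs)

    if __name__ == "__main__":
        print("Qfit =", q_fit([0, 1, 1, 0], [0.1, 0.8, 0.4, 0.2], a=1.0, b=0.5))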

VI. RESULTS AND COMPARISON

A. Output of 4-bit (even or odd) parity

The network created for the 4-bit odd or even parity is shown below. The architecture is 4-4-1: it consists of 4 hidden neurons with tan-sigmoid as the activation function and 1 output neuron with pure linear as the activation function.

Fig. 1: 4-4-1 neural network for 4-bit parity (input layer, hidden layer, output layer).

The weight set for the above architecture is generated using two different methods. The first method uses only the NN tool for architecture evolution and weight learning; the second method uses EGA for the same.

i. Optimised weight set generation using NN only

The performance of the 4-4-1 feed-forward neural network with the tan-sigmoid activation function in the hidden layer and the pure linear activation function in the output layer is presented. The network is trained with the number of neurons in the hidden layer set to 4, the maximum number of epochs set to 100 and the performance limit set to 1; even parity is chosen as the type of parity. The goal is met in 204 epochs. The biases and weights generated for the hidden and output neurons of this network are shown in Table 1 and Table 2. A rough equivalent of this baseline training setup is sketched below.
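The paper's baseline uses an NN toolbox; the following is a rough NumPy stand-in for a 4-4-1 network with a tan-sigmoid hidden layer and a pure linear output trained by plain gradient descent on the mean square error. The learning rate, initialisation and stopping goal are assumptions.

    import numpy as np

    def train_441(X, T, epochs=100, lr=0.05, goal=1e-3, seed=0):
        """Gradient-descent training of a 4-4-1 net (tan-sigmoid hidden, pure linear output)."""
        rng = np.random.default_rng(seed)
        W1, b1 = rng.normal(scale=0.5, size=(4, 4)), np.zeros(4)
        W2, b2 = rng.normal(scale=0.5, size=(4, 1)), np.zeros(1)
        mse = float("inf")
        for _ in range(epochs):
            H = np.tanh(X @ W1 + b1)                  # hidden layer (tan-sigmoid)
            O = H @ W2 + b2                           # output layer (pure linear)
            err = O - T
            mse = float(np.mean(err ** 2))
            if mse <= goal:                           # stop once the goal is met
                break
            dO = 2 * err / len(X)                     # gradient of the mean square error
            dW2, db2 = H.T @ dO, dO.sum(axis=0)
            dH = (dO @ W2.T) * (1 - H ** 2)
            dW1, db1 = X.T @ dH, dH.sum(axis=0)
            W1 -= lr * dW1; b1 -= lr * db1
            W2 -= lr * dW2; b2 -= lr * db2
        return (W1, b1, W2, b2), mse

    if __name__ == "__main__":
        X = np.array([[int(b) for b in format(i, "04b")] for i in range(16)], float)
        T = (X.sum(axis=1) % 2).reshape(-1, 1)        # odd-parity targets for the 16 patterns
        _, final_mse = train_441(X, T, epochs=500)
        print("final MSE:", final_mse)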

Table 1: weight set for the input to hidden layer (rows: inputs 1-4; columns: hidden neurons 1-4)

Input 1:  -1.7303  -0.5556   1.8395   3.7526
Input 2:  -0.5949   2.8912  -6.2608   1.9826
Input 3:   5.1023   5.0886  -9.8268   4.2920
Input 4:  -0.2358  -4.8105  -1.8472  -1.4168
Bias for hidden neuron:  2.2237  -0.4830  -0.1997   3.1644


The generated weights are then fed to the network and its performance is evaluated by calculating the error.

Table 2: weight set for the hidden layer to the output layer (rows: hidden neurons 1-4; column: output neuron 1)

Hidden 1:   1.1465
Hidden 2:  -0.3861
Hidden 3:   0.2523
Hidden 4:   0.6188

ii. Optimised weight set generation using EGA only

The same application is run using the EGA. The optimal weight set obtained using EGA gives the final best fit for the application; it is generated by the proposed algorithm and is shown in Table 3 and Table 4. The weight set thus obtained is fed to the network with the number of neurons in the hidden layer chosen as 10, the maximum number of training epochs as 50 and the goal to be met as 2e-19; the type of parity chosen is odd, and log-sigmoid is chosen for the hidden layer. The mean square error of the network before and after training is shown in Table 5.

Table 3: weight set for the input to hidden layer (rows: inputs 1-4; columns: hidden neurons 1-4)

Input 1:   0.7000   0.4269   1.4548  -0.5102
Input 2:  -0.0067  -0.5255   0.7177   1.0884
Input 3:  -1.5479   2.3696   0.5198  -0.1215
Input 4:  -0.3902  -0.0958   0.4105  -0.8443

Table 4: weight set for the hidden to output layer (rows: hidden neurons 1-4; column: output neuron 1)

Hidden 1:  -1.8884
Hidden 2:  -0.1080
Hidden 3:  -1.3161
Hidden 4:  -0.6726
Output neuron bias:  1.4298

Table 5: comparison of error due to NN and EGA; the error after EGA training is listed for each input pattern

Input   Target   Error after EGA training
0000      0        0.0000
0001      1        0.0000
0010      1        0.0000
0011      0        0.0000
0100      1        0.0000
0101      0        0.0000
0110      0        0.0000
0111      1        0.0000
1000      1        0.0000
1001      0        0.0000
1010      0        0.0000
1011      1        0.0000
1100      0        0.0000
1101      1        0.0000
1110      1        0.0000
1111      0        0.0000

Table 5 shows the performance comparison of the network trained with NN and with EGA, using the tan-sigmoid activation function in the hidden layer. It is seen in Fig. 2 that the error reduces to a much greater extent with EGA than with NN, along with a reduction in the number of epochs and an increased performance value. It is therefore preferable to use EGA instead of plain NN training for the best fitness.


Fig. 2: Performance of NN and GA for 4-bit parity.


B. Performance of TCLX 9-4-2

i. Error using NN

The number of neurons in the hidden layer is entered as 4, the number of training epochs is given as 100 and the goal to be met is 1e-19. The transfer function for the hidden layer can be any one of the following: 1. tan-sigmoid, 2. pure linear, 3. log-sigmoid; tan-sigmoid is chosen. The training data and the assumed encoding are sketched below.
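The four 9-bit input patterns used in this experiment (111010010, 111100111, 100100111, 101010101) read, row by row, as 3x3 binary images of the letters T, C, L and X, which appears to be what "TCLX" refers to; the two-bit class coding below matches the targets listed in Table 11, while the helper names are illustrative assumptions.

    import numpy as np

    # 3x3 binary images of the letters T, C, L and X, flattened row by row
    # into the 9-bit input patterns quoted in the paper.
    TCLX_PATTERNS = {
        "T": "111010010",
        "C": "111100111",
        "L": "100100111",
        "X": "101010101",
    }

    # Two output neurons give a 2-bit class code per letter (as in Table 11).
    TCLX_TARGETS = {"T": (0, 0), "C": (0, 1), "L": (1, 0), "X": (1, 1)}

    def tclx_dataset():
        """Return the 4x9 input matrix and 4x2 target matrix for the TCLX task."""
        X = np.array([[int(b) for b in bits] for bits in TCLX_PATTERNS.values()], float)
        T = np.array([TCLX_TARGETS[k] for k in TCLX_PATTERNS], float)
        return X, T

    if __name__ == "__main__":
        X, T = tclx_dataset()
        print(X.shape, T.shape)   # (4, 9) and (4, 2) for the 9-4-2 network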

Table 6: weight set for the input to hidden layer (9 inputs x 4 hidden neurons, plus the bias for the hidden layer)

Table 7: weights from the hidden layer to the output neurons (4 hidden neurons x 2 output neurons, plus the output neuron bias)

Table 6 and Table 7 show the weights and the biases generated for the hidden and output neurons of the TCLX 9-4-2 network with the inputs mentioned above. The performance of the network is evaluated by calculating the error between the actual and the target values; the error values before and after training are compared in Table 8.

Table 8: comparison of error using NN and EGA for TCLX (inputs 111010010, 111100111, 100100111 and 101010101, with targets (0,0), (0,1), (1,0) and (1,1) respectively)

It is found that the learning is incomplete, since an error remains even after training.

ii. Error Using EGA

The weights for the same 9-4-2 network for TCLX pattern recognition are obtained using the proposed EGA. The weights and biases for the hidden and output neurons are shown in Table 9 and Table 10.

The weight set thus obtained is fed to the network and the error is calculated before and after training. Table 11 shows the error values before and after training. It is found that the error of the network is reduced completely by using the proposed EGA.

Table 9: weight set for the input to hidden layer (EGA; 9 inputs x 4 hidden neurons, plus the bias for the hidden layer)

Table 10: weight set for the hidden layer to the output layer (EGA; 4 hidden neurons x 2 output neurons, plus the output neuron bias)



Table 11: comparison of error using NN and EGA (two output neurons per input pattern)

Inputs      | Targets | Actual           | Error-EGA before training | Error-EGA after training
111010010   | 0, 0    | -0.3221, 1.9172  | -0.3221, 1.9172           | 0.00, 0.00
111100111   | 0, 1    | -1.9717, 0.4060  | -1.9717, -0.5940          | 0.00, 0.00
100100111   | 1, 0    | -0.8893, 0.0063  | -1.8893, -0.0633          | 0.00, 0.00
101010101   | 1, 1    | -1.0588, 0.3067  | -2.0588, -1.3067          | 0.00, 0.00


Fig. 3 shows the performance comparison of the TCLX 9-4-2 network; it is seen that the error is reduced far more with EGA than with NN, along with a reduction in the number of epochs and an increased fitness value. It is therefore preferable to use EGA instead of NN for the best fitness.

Fig. 3: Performance of NN and GA for TCLX pattern recognition.

VII. CONCLUSION

Artificial Neural Networks can be used for a wide range of applications. An optimal weight set can be obtained by combining GA with NN so that the accuracy of the output obtained with respect to the target is high. The proposed method is being investigated in other domains and the results will be reported in the future. The proposed algorithm shows an increased possibility of inheriting important parental traits in the offspring, traits which could be missed when an ordinary GA is used. A problem that could be faced while implementing the algorithm is that it could be time consuming when the population size is large; this will be addressed in future work by attempting some novel methods.

REFERENCES

[1] J. Liu and D. Liang, "A Survey of FPGA-Based Hardware Implementation of ANNs," Proc. IEEE Conf., 2005, pp. 915-918.
[2] S. Himavathi, D. Anita, and A. Muthuramalingam, "Feedforward Neural Network Implementation in FPGA Using Layer Multiplexing for Effective Resource Utilization," IEEE Trans. Neural Networks, May 2007.
[3] S. Grossberg, Ed., Neural Networks and Natural Intelligence. Cambridge, MA: MIT Press, 1988.
[4] D. Rumelhart and J. McClelland, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge, MA: MIT Press, 1986.
[5] J. M. Zurada, Introduction to Artificial Neural Systems. St. Paul, MN: West, 1992.
[6] N. Nikolaev and H. Iba, "Learning polynomial feedforward neural networks by genetic programming and backpropagation," IEEE Trans. Neural Netw., vol. 14, no. 2, pp. 337-350, Mar. 2003.
[6] Y. Maeda and M. Wakamura, "Simultaneous Perturbation Learning Rule for Recurrent Neural Networks and Its FPGA Implementation," IEEE Trans. Neural Networks, Nov. 2005.
[7] Y. Maeda and T. Tada, "FPGA Implementation of a Pulse Density Neural Network with Learning Ability Using Simultaneous Perturbation," IEEE Trans. Neural Networks, vol. 14, no. 3, pp. 688-695, 2003.
[8] D. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley, 1989.
[9] X. Yao, "A review of evolutionary artificial neural networks," Int. J. Intell. Syst., vol. 8, no. 4, pp. 539-567, 1993.
[10] P. P. Palmes, T. Hayasaka, and S. Usui, "Mutation-Based Genetic Neural Network," IEEE Trans. Neural Networks.
[11] C. R. Reeves and J. E. Rowe, Genetic Algorithms: Principles and Perspectives, a Guide to GA Theory.
[12] F. H. F. Leung, H. K. Lam, S. H. Ling, and P. K. S. Tam, "Tuning of the Structure and Parameters of a Neural Network Using an Improved Genetic Algorithm," IEEE Trans. Neural Networks, vol. 14, no. 1, Jan. 2003.
[13] L. M. Reyneri, "Implementation Issues of Neuro-Fuzzy Hardware: Going Toward HW/SW Codesign," IEEE Trans. Neural Networks, vol. 14, no. 1, 2003.
