Implementation of Digital Circuits Using Neuro-Swarm Based on FPGA

1 Assist. Prof. Dr. Hanan A. R. Akkar, 2 M. Sc. Student Firas R. Mahdi
1, 2 Department of Electrical and Electronic Engineering, University of Technology, Baghdad, Iraq
Email: [email protected], [email protected]
doi: 10.4156/ijact.vol2.issue2.6
Abstract
This paper presents a fully parallel hardware realization of an Artificial Neural Network (ANN) that depends on the efficient implementation of a single neuron. The reconfigurable computing architecture of the Field Programmable Gate Array (FPGA) is well suited to the hardware realization of ANNs. Numerous ANN implementations have been reported in the literature, each trying to reduce the hardware circuitry of the network. The fully parallel NN hardware architecture constructed here reduces the neuron hardware through two main parts: the first covers network training using modified MATLAB Particle Swarm Optimization (PSO) tools that exploit the advantages of PSO; the second covers the hardware implementation of the trained network using the schematic entry design tools of the Xilinx high-performance Virtex FPGA.
Keywords: Artificial Neural Network (ANN), Particle Swarm Optimization (PSO), Field Programmable Gate Array (FPGA).
1. Introduction

1.1 Artificial Neural Network (ANN)

An Artificial Neuron (AN) is a model of a biological neuron. Each AN receives signals from the environment or from other ANs, gathers them, applies an activation function to their sum and, when fired, transmits a signal to all connected ANs. Input signals are inhibited or excited through positive or negative numerical weights associated with each connection to the AN; the firing of the AN and the strength of the outgoing signal are controlled by a function referred to as the activation function. The AN collects all incoming signals and computes a net input signal as a function of the respective weights. The net input is fed to the activation function, which calculates the output signal of the AN. An ANN is a layered network of ANs, typically consisting of input, hidden and output layers, where the ANs in one layer are connected fully or partially to the ANs in the next layer [1].
Ni(x1, …, xm) = ai( Σ(j=1..m) wji xj + bi )   … (1)
where xj is the j-th input signal, wji the connection weight, ai the activation function and bi the bias. There are several methods of training an ANN; Backpropagation is by far the most common. In this work, PSO is adopted as the training algorithm best suited to our application.
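For concreteness, equation 1 can be read as the following minimal sketch of a single neuron (Python is used purely for illustration; the names are ours, not part of any library):

```python
import numpy as np

def neuron(x, w, b, activation):
    """Single artificial neuron of equation 1: the weighted sum of the
    inputs plus a bias, passed through the activation function."""
    net = np.dot(w, x) + b
    return activation(net)

# With the hard-limit (step) activation used later in this paper:
hardlim = lambda net: 1 if net >= 0 else 0
print(neuron(np.array([1, 0, 1]), np.array([1, -2, 3]), -1, hardlim))  # -> 1
```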
1.2 Particle Swarm Optimization (PSO)

Particle swarm optimization is a population-based evolutionary optimization technique developed by J. Kennedy and R. Eberhart in 1995, motivated by the social behavior of bird flocking and fish schooling [2]. PSO is a random search algorithm that simulates a natural evolutionary process and performs well on some difficult optimization problems. The basic concept of PSO comes from a large number of birds flying randomly while looking for food together. Each bird is an individual, called a particle. As the birds look for food, the particles fly through a multidimensional search space looking for the optimal solution. The particles form a family rather than isolated individuals: each can remember its own flying experience and share its
companions' flying experience [3]. The basic PSO model consists of a swarm of particles moving in a D-dimensional search space; the direction and distance of each particle in this hyperdimensional space are determined by its fitness and velocity. In general, the fitness is primarily related to the optimization objective, and the velocity is updated according to an update rule [4]. In PSO, the population starts with a random initialization of individuals in the search space and then repeats the social behavior of the particles in the swarm until the best possible result is achieved by iterative searching. At each iterative step the velocity (position change) is updated and the particle is moved towards a new position. The best position previously visited by the i-th particle is denoted its personal best position (pbest), while the position of the best individual of the whole swarm is denoted the global best position (gbest). In other words, the particle swarm optimization idea consists of, at each time step, changing the velocity and location of each particle towards its pbest and gbest locations according to equations 2 and 3:
Vid = W*Vid + C1*rand1*(Pid - Xid) + C2*rand2*(Pgd - Xid)   … (2)
Xid = Xid + Vid   … (3)

where W is the inertia weight, which provides a balance between global and local exploration, C1 and C2 are two constants called learning factors [5], and rand1 and rand2 are two independent random numbers uniformly distributed in the range [0, 1]. In equation 2, the first part represents the inertia of the previous velocity; the second part is the cognition part, which represents the particle's private thinking; the third part is the social part, which represents cooperation among the particles [6]. During the past few years PSO has been applied successfully in many applications [7-9], and several papers discuss how to apply PSO to training NNs and its advantages [10-13]. For the purpose of NN learning, the empirical error, referred to as the objective function to be minimized (ideally to 0) by the optimization method, is given by:

M.S.E = (1/(n*m)) * Σ(p=1..n) Σ(k=1..m) (Tpk - Ypk)^2   … (4)
where n is the number of training patterns, m is the number of outputs, T is the target value and Y is the actual network output [1].
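Equations 2-4 translate directly into code. The sketch below is our own illustration, assuming the swarm is stored as NumPy arrays with one row per particle; W, C1 and C2 take the values used later in Section 4:

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_step(X, V, pbest, gbest, W=0.6, C1=1.7, C2=1.7):
    """One PSO iteration. X, V, pbest have shape (particles, dims);
    gbest has shape (dims,)."""
    r1 = rng.random(X.shape)   # rand1, uniform in [0, 1]
    r2 = rng.random(X.shape)   # rand2, uniform in [0, 1]
    V = W * V + C1 * r1 * (pbest - X) + C2 * r2 * (gbest - X)  # eq. (2)
    X = X + V                                                  # eq. (3)
    return X, V

def mse(T, Y):
    """Empirical error of equation 4 over n patterns and m outputs."""
    n, m = T.shape
    return np.sum((T - Y) ** 2) / (n * m)
```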
1.3 Field Programmable Gate Array (FPGA)

An FPGA is a specially designed IC that is often used for prototyping. Each FPGA has three main parts. The Configurable Logic Block (CLB) is the most significant: it provides the physical support for the program downloaded onto the FPGA. The Input Output Block (IOB) provides the inputs and outputs of the FPGA and makes communication with the outside of the chip possible. The Programmable Interconnect (PI) connects the different parts of the FPGA and allows them to communicate with each other [14]. FPGAs are an excellent technology for implementing NN hardware, and executing a NN on an FPGA is a relatively easy process. To reduce the design circuitry, training is performed off line, independently of the FPGA; once training is completed and the correct network weights are obtained, these weights are hard coded on the FPGA. The accuracy with which the weights can be coded depends on the number of bits available to represent them. Parallelism and dynamic adaptation are two computational characteristics typically associated with ANNs, and FPGA-based reconfigurable computing architectures are well suited to implementing ANNs, as one can exploit concurrency and rapidly reconfigure the device to adapt the weights and topologies of an ANN [15]. FPGA realization of an ANN with a large number of neurons is still not an easy task, because the ANN algorithm is rich in multiplication operations that are relatively expensive to realize. Work reported in this area includes new multiplication algorithms for ANNs, NNs with constraints that achieve higher processing speed at lower cost, and multichip realizations [16-18].
2. Theory Description
There has been a significant increase in research and development on applying evolutionary computation techniques to evolve one or more aspects of ANNs. These evolutionary techniques have usually been used to develop the NN structure or the network learning algorithm [19]. A learning scheme is proposed here in which the optimal network weights are determined and adjusted by PSO. PSO has its own advantages and disadvantages over other computational algorithms; advantages such as its probabilistic mechanism and multiple starting points allow PSO to avoid getting stuck in local optima [20]. The property of PSO most exploited in this paper, however, is that it is derivative-free: we can train feedforward NNs using PSO as the learning algorithm with only the hard-limit activation function (shown in Figure 1) in all network layers. Since the hard-limit activation function outputs either one or zero, this property is very helpful in simplifying the network multiplication process.
Figure 1. Hard-limit transfer function.

Numerous studies have further explored the power of PSO as a training algorithm. These studies have shown that NNs trained by PSO give more accurate results than those produced by other training algorithms, although PSO is slower than algorithms such as Backpropagation. For the purpose of implementing digital logic circuits as neural networks, we modified the MATLAB PSO tools to suit our application. The modified PSO tools deliver exactly the weights needed for the network; these weights are integer numbers only, which makes it possible to execute the multiplication process using only AND gates.
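The modified MATLAB PSO tools themselves are not listed in this paper; the sketch below conveys the idea in Python under our own assumptions: each particle encodes one flat vector of candidate weights and biases, positions are rounded to integers before evaluation, fitness is the mean square error of equation 4 over the full truth table, and the hard-limit network requires no derivatives at any point:

```python
import numpy as np

rng = np.random.default_rng(1)
hardlim = lambda net: (net >= 0).astype(int)

def forward(flat, shapes, x):
    """Evaluate a feedforward hard-limit network whose weights and biases
    are packed into one flat vector, layer by layer."""
    out, i = x, 0
    for rows, cols in shapes:
        W = flat[i:i + rows * cols].reshape(rows, cols); i += rows * cols
        b = flat[i:i + rows]; i += rows
        out = hardlim(W @ out + b)
    return out

def train_pso(shapes, X, T, particles=1000, iters=200, wmax=3,
              W=0.6, C1=1.7, C2=1.7):
    """PSO search over integer weights in [-wmax, wmax]; rounding the
    continuous positions to integers is our assumption about how the
    integer-weight constraint is enforced."""
    dim = sum(r * c + r for r, c in shapes)
    pos = rng.uniform(-wmax, wmax, (particles, dim))
    vel = np.zeros_like(pos)

    def cost(p):
        w = np.clip(np.round(p), -wmax, wmax)   # integer-weight constraint
        Y = np.array([forward(w, shapes, x) for x in X])
        return np.mean((T - Y) ** 2)            # equation (4)

    pbest, pcost = pos.copy(), np.array([cost(p) for p in pos])
    gbest = pbest[pcost.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = W * vel + C1 * r1 * (pbest - pos) + C2 * r2 * (gbest - pos)
        pos = pos + vel
        c = np.array([cost(p) for p in pos])
        better = c < pcost
        pbest[better], pcost[better] = pos[better], c[better]
        gbest = pbest[pcost.argmin()].copy()
        if pcost.min() == 0:                    # error minimized to zero
            break
    return np.clip(np.round(gbest), -wmax, wmax), pcost.min()
```

For the full adder of Section 4.1, for example, shapes = [(3, 3), (2, 3)], X holds the eight input triples and T the corresponding Sum and Carry targets.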
3. The Proposed Design of the PSO Neuron

The neuron is the main building block of any ANN. Each neuron can have many inputs, but it has only one output value. Any digital design of the neuron must aim to minimize the neuron hardware circuitry. Since our application is to construct digital circuits using NNs, the inputs to the network are restricted to the two values one and zero, while the weights are restricted to integer values for ease of representation and to reduce complexity. From these two facts, the multiplication process needs nothing more than AND gates: each input is multiplied with each bit of the weight using an AND gate. Suppose we have a single-input neuron with weights restricted to the integer range [-3, 3]; each weight is then represented using only two magnitude bits plus a single sign bit, so three AND gates are sufficient for the multiplication, where 0 stands for a positive sign and 1 stands for a negative sign, as shown in Figure 2.
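A bit-level sketch of this scheme (our own illustration of the circuit shown in Figure 2 below): multiplying a 1-bit input by a sign-magnitude weight needs one AND gate per magnitude bit plus one for the sign:

```python
def and_gate(a, b):
    return a & b

def multiply(x, w_sign, w_bits):
    """Multiply a 1-bit input x by a sign-magnitude weight. w_bits is the
    magnitude, MSB first, e.g. (1, 1) for 3; w_sign is 0 for positive and
    1 for negative, as in the paper. One AND gate per bit: the product is
    the weight itself when x = 1 and zero when x = 0."""
    product_bits = tuple(and_gate(x, b) for b in w_bits)
    product_sign = and_gate(x, w_sign)
    return product_sign, product_bits

# Example: x = 1 multiplied by the weight -3 (sign 1, magnitude bits (1, 1)):
print(multiply(1, 1, (1, 1)))   # -> (1, (1, 1)), i.e. -3
print(multiply(0, 1, (1, 1)))   # -> (0, (0, 0)), i.e. 0
```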
Figure 2. Two-bit with sign multiplication digital circuit.

This structure is repeated for each input. The products produced by the multiplication process are added or subtracted according to the weight signs using a specially designed adder/subtractor with sign, as illustrated in Figure 3.
Figure 3. Two-bit adder/subtractor circuit.

As stated above, the hard-limit transfer function is applied in all network layers; therefore the output of the neuron is 1 if the net (the final sum of the neuron's input-weight products) is greater than or equal to 0, and 0 if the net is less than 0. The neuron structure is shown in Figure 4. The output of the neuron is simply the inverted sign bit of the final net, as shown in Figure 5.
Figure 4. Two-input neuron with two-bit weight circuit.
Figure 5. Two-input neuron with two-bit weight circuit.
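Figures 2 to 5 can be summarized behaviourally as follows (a sketch under our reading of the circuits; a bias, if present, can be treated as a weight on a constant-1 input):

```python
def neuron_output(x_bits, weights):
    """Behavioural model of the neuron in Figures 4 and 5: each 1-bit
    input gates its integer weight through AND gates, the signed products
    are accumulated by the adder/subtractor chain, and the hard-limit
    output is the inverted sign bit of the final net."""
    net = sum(w if x else 0 for x, w in zip(x_bits, weights))
    sign_bit = 1 if net < 0 else 0   # 1 encodes a negative net, as in Figure 2
    return 1 - sign_bit              # hardlim: output 1 when net >= 0

print(neuron_output((1, 1), (-3, 2)))  # net = -1, sign 1, output 0
print(neuron_output((1, 0), (3, -2)))  # net =  3, sign 0, output 1
```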
4. The Proposed Design of the Digital Circuits

In this paper we implement PSO-trained ANNs that realize two digital circuits: the first is a one-bit full adder and the second is a 4-bit Arithmetic Logic Unit (ALU).
4.1 Full Adder

To design the 1-bit full adder we must first obtain the right integer weights using the modified MATLAB PSO tools. The parameters used for this purpose were: W = 0.6, C1 = C2 = 1.7, number of particles 1000; 3 neurons in the input layer, 3 neurons in the hidden layer and 2 neurons in the output layer; weights restricted to integers in [-3, 3]. Figure 6 shows the particle dynamics during the training process, Figure 7 shows the Full Adder gbest against epochs until the error is minimized to zero, Figure 8 shows the ANN Full Adder circuit, and finally Figure 9 shows the logic simulator timing diagram for the ANN Full Adder circuit. The error goal was reached, with successful termination after 18 iterations, giving the weights:

W{1,1} = [1 0 -3; -3 -3 1; -1 -2 -2]   B{1} = [0; 2; 2]
W{2,1} = [-2 -1 2; 0 0 -3]             B{2} = [0; 2]
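These weights can be checked exhaustively. The following sketch confirms that, with the first output neuron taken as the Sum and the second as the Carry-out, all eight input patterns reproduce the full-adder truth table:

```python
import numpy as np
from itertools import product

hardlim = lambda net: (net >= 0).astype(int)

# Weights and biases quoted above for the PSO-trained full adder
W1 = np.array([[1, 0, -3], [-3, -3, 1], [-1, -2, -2]]); B1 = np.array([0, 2, 2])
W2 = np.array([[-2, -1, 2], [0, 0, -3]]);               B2 = np.array([0, 2])

for a, b, cin in product((0, 1), repeat=3):
    x = np.array([a, b, cin])
    h = hardlim(W1 @ x + B1)        # hidden layer: 3 hard-limit neurons
    s, cout = hardlim(W2 @ h + B2)  # output layer: Sum and Carry-out
    assert s == (a ^ b ^ cin) and cout == int(a + b + cin >= 2)
print("all 8 input patterns reproduce the full-adder truth table")
```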
Figure 6. Particle dynamics.
Figure 7. Gbest vs. epochs for the Full Adder circuit.
Figure 8. The ANN Full Adder circuit.
Figure 9. Logic simulator timing diagram for the ANN Full Adder circuit.
4.2 4-Bit ALU

The Arithmetic Logic Unit (ALU) considered here is a 4-bit high-speed parallel ALU, controlled by the four Function Select inputs (S0-S3) and the Mode Control input (M). It can perform all 16 possible logic operations or 16 different arithmetic operations on active-HIGH or active-LOW operands; the function table lists these operations. When the Mode Control input (M) is HIGH, all internal carries are inhibited and the device performs logic operations on the individual bits. When the Mode Control input is LOW, the carries are enabled and the device performs arithmetic operations on the two 4-bit words [21].

Table 1. ALU operation [21].
Since the 14 inputs give a large data set of 16,384 (2^14) training patterns, we divide the network into 5 parts, each with at most 8 inputs and its own outputs, to simplify the design. Figure 10 shows the first ALU part, Figure 11 shows gbest against epochs until the error is minimized to zero, and Figure 12 shows the part 1 trained neural network digital circuit. The training parameters were W = 0.6, C1 = C2 = 1.7, number of particles 1000.
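A short sketch of the data-set preparation (the grouping of inputs per part is hypothetical here; the paper's exact partition is given by the circuits in Figures 10 to 24):

```python
from itertools import product

# The ALU has 14 inputs: two 4-bit operands A and B, four function-select
# lines S0-S3, the mode bit M and the carry-in, giving 2**14 patterns.
inputs = list(product((0, 1), repeat=14))
print(len(inputs))  # 16384

def subnetwork_patterns(relevant):
    """Distinct input patterns seen by a sub-network that depends only on
    the input positions in `relevant` (an assumed grouping, for
    illustration); at most 8 inputs means at most 2**8 = 256 patterns."""
    return sorted({tuple(row[i] for i in relevant) for row in inputs})

print(len(subnetwork_patterns(range(6))))  # 64 patterns for a 6-input part
```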
Figure 10. Part 1 ALU digital circuit.
The network has 6 neurons in the input layer and 2 neurons in the output layer, with weights restricted to integers in [-7, 7]. The error goal was reached, with successful termination after 19 iterations, giving:

W{1,1} = [-4 -3 0 0 -1 -2; -5 -7 2 -2 1 0]   B{1} = [7; 5]
Figure 11. Gbest vs. epochs for part 1 of the ALU.
Figure 12. Part 1 trained neural network ALU.

Figure 13 shows the second part of the ALU digital circuit, Figure 14 shows gbest against epochs for the part 2 NN training until zero error was reached, and Figure 15 shows the part 2 trained neural network digital circuit.
Figure 13. Part 2 ALU digital circuit.
Number of particles 10,000; 4 neurons in the input layer, 3 neurons in the hidden layer and 1 neuron in the output layer; weights restricted to integers in [-3, 3]. The error goal was reached, with successful termination after 31 iterations, giving:

W{1,1} = [2 2 1 -1; 2 1 -1 0; -1 1 2 -1]   B{1} = [-3; -1; -2]
W{2,1} = [2 -2 -2]                         B{2} = 1
Figure 14. Gbest vs. epochs for part 2 of the ALU.
Figure 15. Part 2 trained NN ALU.

Figure 16 shows the part 3 ALU digital circuit, Figure 17 shows gbest vs. epochs for the part 3 training, and Figure 18 shows the part 3 trained NN ALU digital circuit.
Figure 16. Part 3 ALU.

Number of particles 10,000; 6 neurons in the input layer, 3 neurons in the hidden layer and 1 neuron in the output layer; weights restricted to integers in [-7, 7]. The error goal was reached, with successful termination after 55 iterations, giving:

W{1,1} = [4 0 7 -4 1 -4; -6 -1 -3 -3 -1 7; 2 1 -5 5 1 -3]   B{1} = [-2; 4; -2]
W{2,1} = [-5 -5 -3]                                         B{2} = 7
Figure 17. Gbest vs. epochs for part 3 of the ALU.
Figure 18. Part 3 trained NN ALU.
Figure 19 shows the part 4 ALU digital circuit, Figure 20 shows gbest vs. epochs for the part 4 NN training, and Figure 21 shows the part 4 trained NN ALU digital circuit.
Figure 19. Part 4 ALU.

Number of particles 10,000; 8 inputs, 3 hidden neurons and 1 output neuron; weights restricted to integers in [-31, 31]. The error goal was reached, with successful termination after 64 iterations.
Figure 20. Gbest vs. epochs for part 4 of the ALU.
Figure 21. Part 4 trained NN ALU.

Figure 22 shows the part 5 ALU digital circuit, Figure 23 shows gbest vs. epochs for part 5, and Figure 24 shows the part 5 trained NN ALU digital circuit.
Figure 22. Part 5 ALU.
Number of particles 1000; 8 inputs and 1 output neuron; weights restricted to integers in [-31, 31]. The error goal was reached, with successful termination after 27 iterations. Figure 25 shows the trained 4-bit ALU NN.
Figure 23. Gbest vs. epochs for part 5 of the ALU.
Figure 24. Part 5 trained NN ALU.
Figure 25. 4-bit ALU.
Figure 26. Logic simulator timing diagram for the trained NN 4-bit ALU, with three random output readings.
5. Conclusions

This paper has presented the implementation on FPGA of ANNs trained by the PSO learning algorithm. The proposed learning algorithm is very useful for reducing the neuron circuitry, as it reduces the multiplication process to AND gates only. The PSO learning algorithm outperformed other learning algorithms in decreasing the number of neurons needed to minimize the mean square error to zero, i.e. to reach 100% accuracy. The drawback of the PSO training algorithm is its slowness compared with other training algorithms, especially for large numbers of particles; this is to be expected, because PSO is a multiple-starting-point algorithm, unlike algorithms such as Backpropagation. Even this slowness is not a big issue, however, because, as stated before, the training of the network is performed entirely outside the FPGA and only once; after the network has been trained, the weights are hard coded as the constants 1 and 0, i.e. VCC and ground.
6. References

[1] A. P. Engelbrecht, Computational Intelligence: An Introduction, John Wiley & Sons Ltd, 2007.
[2] J. Kennedy and R. Eberhart, "Particle Swarm Optimization", IEEE Int. Conf. on Neural Networks, 1995, pp. 1942-1948.
[3] L. Wang, X. Wang, J. Fu and L. Zhen, "A Novel Probability Binary Particle Swarm Optimization Algorithm and its Application", Journal of Software, Vol. 3, No. 9, 2008.
[4] X. F. Xie, W. J. Zhang and Z. L. Yang, "Adaptive Particle Swarm Optimization on Individual Level", International Conference on Signal Processing (ICSP), IEEE 0-7803-7488-6, 2002, pp. 1215-1218.
[5] R. Eberhart and Y. Shi, "Particle Swarm Optimization: Developments, Applications and Resources", IEEE Int. Conf. on Evolutionary Computation, 2001, pp. 81-86.
[6] J. Kennedy, "The Particle Swarm: Social Adaptation of Knowledge", IEEE Int. Conf. on Evolutionary Computation, 1997, pp. 303-308.
[7] V. G. Gudise and G. K. Venayagamoorthy, "FPGA Placement and Routing Using Particle Swarm Optimization", IEEE Computer Society Annual Symposium on VLSI Emerging Trends in VLSI Systems Design, 0-7695-2097-9/04, 2004.
[8] G. Kendall and Y. Su, "A Particle Swarm Optimization Approach in the Construction of Optimal Risky Portfolios", Proceedings of the 23rd IASTED International Multi-Conference on Artificial Intelligence and Applications, 2005.
[9] K. Chandramouli and E. Izquierdo, "Image Classification using Chaotic Particle Swarm Optimization", ICIP, IEEE 1-4244-0481, 2006.
[10] V. G. Gudise and G. K. Venayagamoorthy, "Comparison of Particle Swarm Optimization and Backpropagation as Training Algorithms for Neural Networks", IEEE Symposium on Swarm Intelligence, Indianapolis, IN, USA, 2003, pp. 110-117.
[11] F. van den Bergh and A. Engelbrecht, "Cooperative Learning in Neural Networks using Particle Swarm Optimization", SACJ/SART, 2000.
[12] W. Zha and G. K. Venayagamoorthy, "Neural Networks Based Non-Uniform Scalar Quantizer Design with Particle Swarm Optimization", IEEE 0-7803-8916, 2005.
[13] J. R. Zhang, J. Zhang, T. M. Lok and M. Lyu, "A Hybrid Particle Swarm Optimization Back-Propagation Algorithm for Feedforward Neural Network Training", Applied Mathematics and Computation 185, 2007, pp. 1026-1037.
[14] R. Raeisi and A. Kabir, "Implementation of Artificial Neural Network on FPGA", American Society for Engineering Education, Indiana and North Central Joint Section Conference, 2006.
[15] A. Muthuramalingam, S. Himavathi and E. Srinivasan, "Neural Network Implementation Using FPGA: Issues and Application", International Journal of Information Technology, Vol. 4, No. 2, pp. 86-92.
[16] R. H. Turner and R. F. Woods, "Highly Efficient Limited Range Multipliers for LUT-Based FPGA Architectures", IEEE Transactions on Very Large Scale Integration Systems, Vol. 15, No. 10, 2004, pp. 1113-1117.
[17] M. Marchesi, G. Orlandi, F. Piazza and A. Uncini, "Fast Neural Networks Without Multipliers", IEEE Transactions on Neural Networks, Vol. 4, No. 1, 1993.
[18] B. Noory and V. Groza, "A Reconfigurable Approach to Hardware Implementation of Neural Networks", Canadian Conference on Electrical and Computer Engineering, IEEE, 2003, pp. 1861-1863.
[19] G. K. Venayagamoorthy and R. G. Harley, "Swarm Intelligence for Transmission System Control", IEEE 1-4244-1298-6, 2007.
[20] R. Mendes, P. Cortez, M. Rocha and J. Neves, "Particle Swarm for Feedforward Neural Network Training", IEEE 0-7803-7278-6, 2002, pp. 1895-1899.
[21] Fairchild Semiconductor Corporation DS00982, "DM74LS181 4-Bit Arithmetic Logic Unit", 2000.