Efficient Numeral VG-RAM Pattern Recognition Using Manhattan Distance Calculation and Minimization Algorithm
Ivan A. Hashim
Jafar W. Abdul Sadah
Thamir R. Saeed
Electrical Eng. Dept., University of Technology, Baghdad, Iraq
Communications Eng. Dept., University of Baghdad, Baghdad, Iraq
Electrical Eng. Dept., University of Technology, Baghdad, Iraq
[email protected], [email protected], [email protected]
FPGA System Design and Pattern Recognition Group
Abstract
Pattern recognition is one of the important tools in the automation industry, and many techniques have been used to achieve it. One of these techniques is the Virtual Generalizing Random Access Memory (VG-RAM), whose weakness appears when the input is not binary. To overcome this weakness, the Manhattan distance is used in this paper instead of the Hamming distance. In addition, a reduction in classification time is achieved using a minimization algorithm. The combination of these two methods takes 0.03 sec. to classify 283 input sets, compared with 5.913 and 0.551 sec. using the MLP and SVM methods respectively. The number of training sets has been reduced from 300 to 32 as the similarity measure is reduced from 1 to 0.3. In addition, the number of occupied slices in the FPGA implementation was reduced from 1557 to 976, with the probability of correct classification falling from 98.6% to 96.4%.

Keywords: Pattern Recognition, VG-RAM, Manhattan Distance, FPGA
1.0 Introduction
The automation industry has been an important research topic for many years now, and pattern recognition is one of its efficient tools [1]. Many techniques can be used for pattern recognition; one of the efficient ones is the Virtual Generalizing Random Access Memory (VG-RAM). In this technique, artificial neurons based on VG-RAM are addressed by Boolean inputs and produce Boolean outputs [2]. The output is stored as knowledge in the RAM inside the nodes of the network, not in the network connections. The RAM is used as a lookup table when the input is binary: at each neuron's synapses, the network's input bits are collected and used as the address of the RAM, and the stored value pointed to by this address represents the neuron's output, as in figure (1) [3]. The storage process constitutes the training phase, while the comparison between the incoming input and the entries in the lookup table occurs in the testing phase [4,5].
Figure (1) Weightless neural network: the Hamming distance (HD) between the testing-phase input and each input-output pair stored in the training phase is computed; the entry with the minimum Hamming distance is flagged (MHD = 1, others OT = 0) and its output is gated to the neuron output.

This technique can be considered an efficient tool for pattern recognition, and it operates as a one-shot process, which means fast and easy implementation [4,6]. However, the memory required for storing the input-output pairs grows with the number of these pairs. The aim of the current work is to reduce this memory requirement as well as the recognition time.
2.0 Review of Existing Techniques
Many works have been presented in this field, concerned with three issues: reduction of the memory size, as in [7]; multi-label problems, as in [4,8]; and the recognition time [1,6,9,10]. The authors of [7] attempted to reduce the memory size by using a compression technique, gaining a fast response time with a small reduction in performance. The author of [1] used a saccadic eye system, and his calculations showed efficient detection with few samples in the training phase. In [6], a VG-RAM Weightless Neural Network (WNN) was used as a predictor, and the results showed that the VG-RAM is faster than the ARNN. Multi-label problems have been solved using VG-RAM in [4,8,9].
3.0 VG-RAM WNN Operation Description
The operation of a VG-RAM WNN comprises two phases, training and testing. The storage of input-output pairs represents the learning stage of the training phase, while an associative search occurs in the testing phase. The search is based on the Hamming distance between the incoming data and all stored data, so as to select the associated output of the stored input with minimum distance. When more than one pair has the same minimum distance, the selection is made randomly [3,8]. The disadvantage of the VG-RAM WNN appears when many training data for large multi-class problems are used: the testing time becomes high and the memory size increases dramatically [7].

Table (1) VG-RAM WNN neuron lookup table

             Entry #1   Entry #2   Entry #3   Entry #4   Testing input
  X1            0          1          1          1            1
  X2            0          1          0          1            0
  X3            1          0          0          0            1
  HD            1          2          1          2
  Output        I          V          A          N          A or I
Figure (1) represents a VG-RAM WNN neuron circuit, and Table (1) is its lookup table, with (X1, X2, and X3) as synapses. From the lookup table, it is clear that there are four input-output pairs, which are stored during the training phase. In the testing phase, two important processes are performed: the first is the calculation of the Hamming distance (HD) between the input vector and each of the stored inputs; the second is finding the minimum of these Hamming distances. In case of more than one minimum HD, the selection is random. Table (1) shows this case, where the input vector has the same minimum HD with the two entries #1 and #3; the output is then selected randomly between them [3,4,6,8,9,10,11].
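To make this testing phase concrete, the following minimal Python sketch reproduces the example of Table (1); the function names and the list encoding of the lookup table are ours, not part of the original design.

```python
import random

# Stored input-output pairs from Table (1): (X1, X2, X3) -> output.
lookup_table = [
    ((0, 0, 1), "I"),  # entry #1
    ((1, 1, 0), "V"),  # entry #2
    ((1, 0, 0), "A"),  # entry #3
    ((1, 1, 0), "N"),  # entry #4
]

def hamming_distance(a, b):
    """Number of positions at which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def vgram_recall(test_input):
    """Return the output of a minimum-HD entry; ties are broken randomly."""
    distances = [hamming_distance(test_input, entry) for entry, _ in lookup_table]
    best = min(distances)
    ties = [out for (entry, out), d in zip(lookup_table, distances) if d == best]
    return random.choice(ties)

print(vgram_recall((1, 0, 1)))  # entries #1 and #3 both have HD = 1, so "I" or "A"
```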
4.0 Definition of the Problem
In VG-RAM, the input and the stored input-output pairs are binary vectors. So, how should VG-RAM be represented if the elements of the vectors are numeral? If the suggestion is to represent each element value in binary form, what will happen? To explain, consider for example two one-element vectors, (8) and (0). The binary forms of these two vectors are (1000) and (0000). Now, take the input vector (7), which is numerically close to the first vector; converted to binary form it becomes (0111). The Hamming distance between the input and the first vector is 4, and between the input and the second vector is 3. Thus, the nearest vector by Hamming distance is the second one, even though the input is numerically nearest to the first. Therefore, the minimum Hamming distance is not effective in this case. To make VG-RAM deal with numeral values, the matching treatment must replace the Hamming distance with the Manhattan distance when determining the matching of input-output pairs in the recall phase.
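A few lines of Python show the pitfall numerically; the helper names and the 4-bit width are assumptions made for illustration.

```python
def to_bits(value, width=4):
    """Fixed-width binary form of a numeral element, most significant bit first."""
    return tuple((value >> i) & 1 for i in reversed(range(width)))

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

v1, v2, query = 8, 0, 7
print(hamming(to_bits(query), to_bits(v1)))  # 4: Hamming calls the first vector "far"
print(hamming(to_bits(query), to_bits(v2)))  # 3: and the second vector "nearer"
print(abs(query - v1), abs(query - v2))      # 1 and 7: numerically the opposite holds
```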
5.0 The Proposed Numeral VG-RAM Weightless Neural Network
The Manhattan distance is the distance between two points in Manhattan space. In Cartesian coordinates, the Manhattan distance $d(p,q)$ between a point $p = (p_1, p_2)$ and a point $q = (q_1, q_2)$ is the sum of the absolute differences of their coordinates:

$$d(p,q) = |p_1 - q_1| + |p_2 - q_2| \qquad (1)$$

In one dimension, the Manhattan distance $d(x,y)$ between two points $x$ and $y$ on the real line is the absolute value of their numerical difference:

$$d(x,y) = |x - y| \qquad (2)$$

Therefore, the distance $D(U,V)$ between two vectors $U$ and $V$, each consisting of a set of numeric elements, can be determined by summing the Manhattan distances of these elements:

$$D(U,V) = \sum_{i=1}^{n} |u_i - v_i| \qquad (3)$$

where $n$ is the number of elements in each vector.
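Equation (3) translates directly into code. This short sketch is ours; the two nine-element example vectors are illustrative values in the 1-10 attribute range used later.

```python
def manhattan_distance(u, v):
    """Equation (3): sum of absolute differences of corresponding elements."""
    assert len(u) == len(v), "both vectors must have the same number n of elements"
    return sum(abs(ui - vi) for ui, vi in zip(u, v))

print(manhattan_distance([5, 1, 1, 1, 2, 1, 3, 1, 1],
                         [5, 4, 4, 5, 7, 10, 3, 2, 1]))  # 25
```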
If the same example is taken, vectors (8) and (0) with input vector (7), the distances between the input and the two vectors according to the Manhattan distance are 1 and 7 respectively. Thus, the minimum distance appears between the input vector and the first vector, which is correct. Hence, the minimum Manhattan distance is an efficient method for obtaining the nearest pairs in the Numeral VG-RAM (NVG-RAM). Furthermore, this method is easy to implement in hardware using an FPGA.
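Putting the pieces together, a minimal sketch of the proposed NVG-RAM recall phase replaces the Hamming distance of Section 3 with the Manhattan distance; the random tie-break of the original VG-RAM is kept. The function names are ours.

```python
import random

def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def nvgram_recall(query, training_pairs):
    """Return the class of a stored pair with minimum Manhattan distance
    to the query; ties are broken randomly as in the binary VG-RAM."""
    distances = [manhattan(query, x) for x, _ in training_pairs]
    best = min(distances)
    return random.choice([c for (x, c), d in zip(training_pairs, distances) if d == best])

pairs = [((8,), "first"), ((0,), "second")]
print(nvgram_recall((7,), pairs))  # "first": Manhattan distance 1 beats 7
```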
5.1 Database
The breast cancer database is used for measuring the performance of the proposed NVG-RAM. This database was obtained from the University of Wisconsin Hospitals, Madison, Wisconsin, USA, from Dr. William H. Wolberg [12]. The database consists of 683 sets of instance data (vectors); each set contains ten attributes (elements). The first nine attributes represent the instance: clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, and mitoses. Each of these nine elements is an integer in the range from 1 to 10. The tenth element represents the instance class, which is one of two possible classes: benign or malignant.
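A loader for this database might look as follows. It assumes the standard UCI file breast-cancer-wisconsin.data, in which each row holds a sample ID, the nine attributes, and a class code (2 = benign, 4 = malignant), with '?' marking missing values; skipping incomplete rows leaves the 683 complete sets used here.

```python
import csv

def load_wisconsin(path="breast-cancer-wisconsin.data"):
    """Return a list of (attributes, class) pairs from the UCI file [12]."""
    sets = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if "?" in row:            # skip instances with missing attribute values
                continue
            attributes = [int(a) for a in row[1:10]]   # the nine 1-10 elements
            label = "benign" if int(row[10]) == 2 else "malignant"
            sets.append((attributes, label))
    return sets
```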
5.2 Performance Measurement of the NVG-RAM
For performance measurement of the proposed NVG-RAM classifier, the database was divided into two groups: 400 sets for the training phase and 283 sets for the testing phase. Two popular classifiers are compared with the proposed classifier's accuracy: the first is a Multi-Layer Perceptron neural network (MLP) with two hidden layers, and the second is a multiclass Support Vector Machine (SVM). These classifiers and the proposed one were trained with various numbers of training sets in order to measure and compare accuracy under small learning sets. The simulated performance of the three classifiers is shown in figure (2), with the number of training sets taken from 50 to 400 in steps of 50. It is clear that the proposed classifier has a higher probability of correct classification (Pcc) than the other classifiers over the whole range of learning sets; therefore, the NVG-RAM has efficient performance and high accuracy. Moreover, the simulation time was measured for the classification of 283 input sets, and the new NVG-RAM takes less time than the others: 0.03 seconds, against 5.913 seconds for the MLP and 0.551 seconds for the SVM. Hence, the NVG-RAM is fast, consuming little time in the recall phase.
Figure (2) Performance of the three classifiers: probability of correct classification versus the number of training sets (50 to 400) for the NVG-RAM, MLP, and SVM
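The two figures of merit used throughout, the probability of correct classification and the recall time, can be measured with a few lines. The split sizes follow the text above, while the helper names are ours; any classifier callable as classify(query, train_sets), such as the nvgram_recall sketch earlier, fits this harness.

```python
import time

def measure(classify, train_sets, test_sets):
    """Return (Pcc in percent, total recall time in seconds) for a classifier
    called as classify(query, train_sets)."""
    start = time.perf_counter()
    correct = sum(classify(x, train_sets) == c for x, c in test_sets)
    elapsed = time.perf_counter() - start
    return 100.0 * correct / len(test_sets), elapsed

# data = load_wisconsin(); train, test = data[:400], data[400:683]
# pcc, t = measure(nvgram_recall, train, test)
```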
6.0 Proposed Algorithm for Minimizing the Number of Training Sets
The size of the memory used to store the input-output pairs in the NVG-RAM is determined by the number of training sets, because all training pairs are stored in this memory. Moreover, the speed of this neural network is limited by the number of comparisons, which also depends on the number of training sets. Thus, to reduce the memory size and increase the speed of the network, the number of learning sets must be minimized while maintaining a high level of performance accuracy. To this end, the proposed algorithm reduces the number of training sets based on the similarity measure (sm) between corresponding element pairs of sets in the same class. If the sm between any two sets in the same class is greater than or equal to the desired value, these two sets are replaced by one set equal to the mean of their corresponding element pairs. The replacement is done under one important condition: the resulting set must not be classified as a class other than its own. The following definitions are made (a sketch of the procedure is given after this list):
sm = similarity measure
C = number of the first class
I = number of the first vector in class C
J = number of the next vector in C
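The flowchart of figure (3) can be sketched as below. The paper does not give a closed form for sm, so this sketch assumes a similarity that equals 1 for identical sets and decreases with the summed element differences; the keep-class check reuses a classifier such as the nvgram_recall sketch above. All names here are illustrative, not the authors' own.

```python
def similarity(u, v, max_diff=9):
    """Assumed similarity measure: 1 for identical sets, decreasing as
    corresponding elements (range 1-10) diverge."""
    return 1.0 - sum(abs(a - b) for a, b in zip(u, v)) / (len(u) * max_diff)

def minimize_training_sets(pairs, sm_threshold, classify):
    """Merge two same-class sets whose similarity >= sm_threshold into their
    element-wise mean, unless the merged set would be classified as another class."""
    pairs = list(pairs)
    merged = True
    while merged:
        merged = False
        for i in range(len(pairs)):
            for j in range(i + 1, len(pairs)):
                (u, cu), (v, cv) = pairs[i], pairs[j]
                if cu != cv or similarity(u, v) < sm_threshold:
                    continue
                mean = [(a + b) / 2 for a, b in zip(u, v)]
                rest = [p for k, p in enumerate(pairs) if k not in (i, j)]
                if rest and classify(mean, rest) != cu:
                    continue  # merging would flip the class; keep both sets
                pairs = rest + [(mean, cu)]
                merged = True
                break
            if merged:
                break
    return pairs
```

Sweeping sm_threshold from 1 down to 0.05 and recording the length of the returned list reproduces, under these assumptions, the kind of curve shown in figure (4).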
Figure (3) shows the flowchart of the proposed algorithm's steps. The number of training sets is reduced depending on the value of the similarity measure. As shown in figure (4), the number of sets is minimized from 400 to 300 for sm = 1, which means there are many identical sets in the overall training set. The number remains the same as sm is changed from 1 down to 0.85, which indicates there is no further similarity at these values. However, when sm is reduced below 0.85, the number of sets falls, reaching only six sets at sm = 0.05.
Figure (5) shows the performance of the NVG-RAM over a wide range of similarity measure values. As is clear from this figure, the probability of correct classification (Pcc) remains constant although sm is reduced from 1 to 0.6 and the number of training sets is minimized to 208 vectors. Even as the value of sm steps down further, the decrease in Pcc is not substantial. At sm = 0.05, the value of Pcc is 91.87%, which is reasonable given that the number of sets in this case is only 6.
Figure (3) Proposed algorithm flowchart
Figure (4) Relation between the number of training sets and the similarity measure
Figure (5) Performance (Pcc) of the NVG-RAM against the similarity measure

The time required to classify one input set changes according to the similarity value, since the number of comparisons falls with the number of input-output pairs; this is clear from figure (6). The required time is longest when sm is 1, where the number of sets is 300, while the classifier takes the least time when sm = 0.05, because the number of input-output pairs is then only 6. Hence, the proposed algorithm increases the classification speed. The measured time depends on the speed of the computer's CPU; in a hardware implementation, the classification time would be shorter and its variation clearer.
Figure (6) Classification (recall-phase) time, in units of 10^-4 second, against the similarity measure

The performance of the NVG-RAM when trained with various numbers of training sets is shown in figure (7). This performance is taken in two cases: in the first, the number of training sets is decreased according to the proposed minimization algorithm, while in the second, the number of training sets is reduced randomly. As shown in the figure, the Pcc in the case of the suggested minimization algorithm is higher than the other. This figure leads to the conclusion that the minimization algorithm produces better performance under a low number of training sets than random reduction.
Figure (7) Variation of Pcc against the number of training sets, with sets reduced randomly and with sets reduced by the minimization algorithm

The probability of correct classification (Pcc) is the important parameter for measuring the preference among different algorithms. Therefore, the Pcc of the three algorithms, NVG-RAM, MLP, and SVM, is presented in figure (8). From this figure, the advantage of the NVG-RAM over the others is about 2% over the range of the similarity measure.
Figure (8) Pcc against sm for the three algorithms, NVG-RAM, MLP, and SVM
Then, to verify that the combination of our proposed algorithms (NVG-RAM and the minimization algorithm) is better than the others, the proposed minimization technique was also applied to the other algorithms. Figure (9) shows the probability of correct classification of the various classification algorithms (SVM, MLP, and the proposed NVG-RAM) for several numbers of training sets, with and without the minimization technique. From this figure, the advantage of our proposed algorithm is clear both with and without the minimization technique, but with minimization it is better. Another important observation is that the behavior of the algorithms in figure (8) and figure (9) is nearly the same, because the training sets are affected by the same value of the similarity measure.
Figure (9) Variation of Pcc against the number of training sets for the various algorithms (continuous lines: with the minimization algorithm; dashed lines: without the minimization algorithm)

The above results represent a theoretical analysis, and this analysis proved the advantage of the proposed combined algorithms (NVG-RAM and minimization). The proposed minimization approach is also aimed at the hardware side; it was implemented using the Spartan-3A/3AN FPGA Starter Kit Board [13].
Table (2) shows the results of using the similarity measure with the NVG-RAM, together with the number of training sets and the related number of clock cycles. An important point is clear in this table: as sm is reduced, the number of training sets is also reduced, but it must be noted that the usefulness of this reduction depends on the nature of the training classes. The reduction in the number of occupied slices, achieved by the minimization algorithm, is also clear.

Table (2) Number of occupied slices related to the similarity measure

  Similarity   No. of         No. of   Number of         Total number of
  measure      training sets  clocks   occupied slices   4-input LUTs
  1.0          300            300      1557              2796
  0.9          300            300      1557              2796
  0.8          293            293      1537              2760
  0.7          276            276      1506              2722
  0.6          208            208      1400              2514
  0.5          113            113      1360              2515
  0.4          73             73       1136              2092
  0.3          32             32       976               1774
  0.2          18             18       859               1597
  0.1          6              6        753               1390
7.0 Conclusion
Two proposed algorithms have been presented in this paper. The first can be considered an improvement to the conventional VG-RAM, concentrating on dealing with numeral data by using the Manhattan distance instead of the Hamming distance; this modification increased the probability of correct classification by 2%. The second, the proposed minimization algorithm, was important for the implementation size and the recall and training time: it reduced the required memory size by 48%, with a corresponding reduction in recall time. In this context, a comparison was made between the proposed algorithm and two conventional algorithms (SVM and MLP); the proposed algorithm outperformed the others by nearly 2.8%.
References
[1] Alberto F. De Souza, Cayo Fontana, Filipe Mutz, Tiago Alves de Oliveira, Mariella Berger, Avelino Forechi, Jorcy de Oliveira Neto, Edilson de Aguiar, and Claudine Badue, "Traffic Sign Detection with VG-RAM Weightless Neural Networks", IEEE International Joint Conference on Neural Networks (IJCNN), Dallas, 4-9 Aug. 2013.
[2] F.M.G. França, M. De Gregorio, P.M.V. Lima, and W.R. de Oliveira, "Advances in Weightless Neural Systems", ESANN 2014 Proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges (Belgium), 23-25 April 2014.
[3] Alberto F. De Souza, Claudine Badue, Felipe Pedroni, Elias Oliveira, Stiven Schwanz Dias, Hallysson Oliveira, and Soterio Ferreira de Souza, "Face Recognition with VG-RAM Weightless Neural Networks", Springer Berlin Heidelberg, 18th International Conference, Prague, Czech Republic, September 3-6, 2008, Proceedings, Part I, pp. 951-960, DOI 10.1007/978-3-540-87536-9_97, 2008.
[4] Alberto F. De Souza, Felipe Pedroni, Elias Oliveira, Patrick M. Ciarelli, Wallace Favoreto Henrique, Lucas Veronese, and Claudine Badue, "Automated multi-label text categorization with VG-RAM weightless neural networks", Elsevier, Neurocomputing 72, pp. 2209-2217, 2009.
[5] Mitsuki Kimtura, Atsuhiro Takasu, and Jun Adachi, "FPI: A Novel Indexing Method Using Frequent Patterns for Approximate String Searches", EDBT/ICDT '13, ACM 978-1-4503-1599-9/13/03, Genoa, Italy, March 18-22, 2013.
[6] Alberto Ferreira De Souza, Fabio Daros Freitas, and Andre Gustavo Coelho de Almeida, IEEE Workshop on High Performance Computational Finance (WHPCF), pp. 1-8, 2010.
[7] E. De Aguiar, L. Veronese, M. Berger, A.F. De Souza, C. Badue, and T. Oliveira-Santos, "Compressing VG-RAM WNN memory for lightweight applications", IEEE International Joint Conference on Neural Networks (IJCNN), Beijing, 6-11 July 2014.
[8] Claudine Badue, Felipe Pedroni, and Alberto F. De Souza, "Multi-Label Text Categorization using VG-RAM Weightless Neural Networks", 10th Brazilian Symposium on Neural Networks, pp. 105-110, DOI: 10.1109/SBRN.2008.29, 2008.
[9] Alberto F. De Souza and Claudine Badue, "Improving VG-RAM WNN Multi-label Text Categorization via Label Correlation", Eighth International Conference on Intelligent Systems Design and Applications, Volume 1, pp. 437-442, DOI: 10.1109/ISDA.2008.298, 2008.
[10] Mariella Berger, Avelino Forechi, Alberto F. De Souza, Jorcy de Oliveira Neto, Lucas Veronese, Victor Neves, and Claudine Badue, "Traffic Sign Recognition with VG-RAM Weightless Neural Networks", 12th International Conference on Intelligent Systems Design and Applications (ISDA), pp. 315-319, DOI: 10.1109/ISDA.2012.6416557, 2012.
[11] Lucas de Paula Veronese, Lauro José Lyrio Junior, Filipe Wall Mutz, Jorcy de Oliveira Neto, Vitor Barbirato Azevedo, Mariella Berger, Alberto Ferreira De Souza, and Claudine Badue, "Stereo Matching with VG-RAM Weightless Neural Networks", IEEE 12th International Conference on Intelligent Systems Design and Applications (ISDA), pp. 309-314, DOI: 10.1109/ISDA.2012.6416556, 2012.
[12] M. Lichman, "UCI Machine Learning Repository", University of California, Irvine, School of Information and Computer Sciences, http://archive.ics.uci.edu/ml, 2013.
[13] Spartan-3A/3AN FPGA Starter Kit Board web page, http://www.xilinx.com/s3astarter and http://www.xilinx.com/s3anstarter.