
Object detection in images using artificial neural network and improved binary gravitational search algorithm

Farzaneh Azadi Pourghahestani, Esmat Rashedi
Department of Electrical Engineering, Graduate University of Advanced Technology, Kerman, Iran
[email protected], [email protected]

Abstract: In this paper, an artificial neural network (ANN) and the improved binary gravitational search algorithm (IBGSA) are utilized to detect objects in images. The watershed algorithm is used to segment images and extract the objects. Color, texture, and geometric features are extracted from each object. IBGSA is used as a feature selection method to find the best subset of features for classifying the desired objects. The purpose of using IBGSA is to decrease complexity by selecting salient features. Finally, the selected features are used in the ANN for detecting objects. Experimental results on detecting hand tools show that the proposed method can find salient features for object detection.

Keywords: object detection, feature extraction, artificial neural network, improved binary gravitational search algorithm, feature selection.

1 INTRODUCTION

Object detection is a challenging problem in cluttered backgrounds; in addition, objects can appear in different poses and lighting conditions. Object detection has various applications such as robotics, image mining, surgery, and quality control. In this regard, part-based object recognition [4, 5, 6] and affine invariant features [7, 8, 9] have shown promising results. Part-based approaches encode the object structure by using a set of patches covering important parts of the object. Ref [1] proposed an object detection algorithm in a new edge color distribution space (ECDS). In the 3-D ECDS, the edges of different objects are segregated and the spatial relations within the same object are preserved, which makes object detection easier. Ref [2] utilized a method for object detection that combined the feature reduction and feature selection abilities of kernel PCA and AdaBoost. Ref [3] proposed an approach for object detection using spatial histogram features. Ref [10] presented object detection for real-time video surveillance systems that used morphological, textural, temporal, and periodical features and an SVM classifier for classification. Ref [11] described a method for circular object detection in color images in which an isotropic edge detector merged with spatial information and region-based analysis is employed to extract image edges. Ref [12] developed a practical and rotation-invariant framework for multi-class geospatial object detection and geographic image classification based on a collection of part detectors (COPD). Ref [13] presented feature selection using the improved binary gravitational search algorithm (IBGSA) with the goal of improving classification accuracy. By choosing appropriate features for detecting distinct objects, IBGSA can reduce the processing time and increase the precision.

2 The proposed method

In the current paper, watershed segmentation, an artificial neural network (ANN), and IBGSA are used for object detection. Many features (texture, color, and geometric features) are extracted from the objects. Applying all of these features is time consuming and increases the computational complexity of training the ANN. Determining appropriate features for recognizing the objects is therefore vital. In some cases, prior knowledge can be used for this goal; for example, some objects have colors that are very different from each other. Here, a method is proposed which automatically finds proper features for object detection. The flow diagram of the proposed method is shown in Fig. 1. In this method, selecting appropriate features is achieved with IBGSA. Subsets of features extracted from the training objects are evaluated according to a classification-based fitness function. The k-nearest neighbor (KNN) classifier, which works based on Euclidean distance, is used to compute the fitness function. The KNN classifier has low accuracy, but because the classifier is called many times during feature selection and KNN is fast, it is used in this step. Using IBGSA, the optimal set of features is selected with the aim of optimizing the evaluation function, which is the precision of the KNN classifier. After selecting the features, the ANN is used as the classifier due to its high efficiency. The selected features are used for training the ANN.
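As a rough illustration of this evaluation step, the sketch below scores one candidate binary feature mask by cross-validated KNN accuracy. The function name knn_fitness, the 3-fold cross-validation, and the use of scikit-learn's KNeighborsClassifier are illustrative assumptions, not the authors' exact implementation.

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    def knn_fitness(mask, X_train, y_train, k=1):
        """Fitness of one candidate subset: cross-validated KNN accuracy of the
        feature columns selected by the binary mask (0.0 if nothing is selected)."""
        selected = np.flatnonzero(mask)
        if selected.size == 0:
            return 0.0
        knn = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
        return cross_val_score(knn, X_train[:, selected], y_train, cv=3).mean()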

Figure 1: The block diagram of feature selection using IBGSA. Training objects pass through feature extraction and feature selection; each candidate subset is scored by KNN classification (object detection rate), the loop repeats while iteration < max_it, and the selected features are then passed to the ANN.

After training the ANN, for every new image, the target objects are extracted and detected using the algorithm depicted in Fig. 2. First, images are enhanced by illumination compensation. Then, object boundaries are determined by the watershed algorithm. The selected features are extracted and fed to the ANN to determine the class of each object.
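The per-image flow just described can be summarized by the hedged sketch below. The helper functions compensate_illumination, segment_objects, and extract_features are hypothetical placeholders for the steps of Sections 2.1-2.3, selected_idx stands for the IBGSA-selected feature indices, and the trained ANN is assumed to expose a scikit-learn-style predict method.

    import numpy as np

    def detect_objects(image, ann, selected_idx):
        """Detection pipeline of Fig. 2 for a single RGB image (illustrative only)."""
        image = compensate_illumination(image)       # Section 2.1 (placeholder helper)
        regions = segment_objects(image)             # watershed, Section 2.2 (placeholder helper)
        labels = []
        for region in regions:
            feats = extract_features(region)         # all features of Section 2.3 (placeholder helper)
            x = np.asarray(feats)[selected_idx].reshape(1, -1)   # keep only the selected features
            labels.append(ann.predict(x)[0])         # class of this object
        return labels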

Figure 2: The block diagram of the object detection process: input image → illumination compensation → watershed segmentation → feature extraction → classification → object detection.


2.1 Illumination compensation

Due to environmental conditions, illumination compensation is used to obtain good-quality images. The mathematical description of the illumination compensation is as follows [14]:

ε = min(R_mean, G_mean, B_mean)
R = R × R_mean / ε
G = G × G_mean / ε
B = B × B_mean / ε

where R, G, and B are the color components and R_mean, G_mean, and B_mean are the averages of these components.
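A minimal sketch of these formulas for a float RGB array follows; it is written directly from the equations above rather than from the reference implementation of [14].

    import numpy as np

    def compensate_illumination(rgb):
        """Scale each channel by its own mean divided by the smallest channel mean."""
        out = rgb.astype(np.float64).copy()
        means = out.reshape(-1, 3).mean(axis=0)      # R_mean, G_mean, B_mean
        eps = means.min()                            # ε = min(R_mean, G_mean, B_mean)
        for c in range(3):
            out[..., c] *= means[c] / eps            # channel × channel_mean / ε
        return out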

2.2 Segmentation

To separate the objects in each frame, the watershed segmentation algorithm [15] is used.
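A hedged sketch of such a separation with scikit-image's marker-based watershed is given below; the Otsu threshold, the assumption of dark objects on a lighter background, and the peak spacing are illustrative choices, since the paper only cites the algorithm of [15].

    import numpy as np
    from scipy import ndimage as ndi
    from skimage.color import rgb2gray
    from skimage.feature import peak_local_max
    from skimage.filters import threshold_otsu
    from skimage.measure import regionprops
    from skimage.segmentation import watershed

    def segment_objects(rgb):
        """Return one region per separated object (illustrative marker strategy)."""
        gray = rgb2gray(rgb)
        mask = gray < threshold_otsu(gray)                 # assume dark objects, light background
        distance = ndi.distance_transform_edt(mask)
        coords = peak_local_max(distance, min_distance=20, labels=mask)
        markers = np.zeros(distance.shape, dtype=int)
        markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
        labels = watershed(-distance, markers, mask=mask)  # flood from the markers
        return regionprops(labels)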

2.3 Feature extraction

Color, texture, and geometric features are utilized for object detection. The adopted features are as follows.

One of the adopted features is hue. Under different lighting conditions, hue is almost invariant, so the average value of hue is used as a feature.

The histogram of oriented gradients (HOG) is a feature descriptor that counts occurrences of gradient orientations in localized portions of an image [16]. To obtain an equal number of features for every object, the object images are resized to 50×50, and PCA is then used for dimension reduction to keep the dominant directions. The settings of the HOG descriptor parameters are given in Table 1.

TABLE 1: Settings of the HOG parameters
Cell size           8
Block size          8
Orientation range   0-180
Number of bins      8
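A sketch of the hue and HOG features under these settings is shown below; the (1, 1) block layout and the use of skimage.feature.hog are this sketch's interpretation of Table 1, and the PCA reduction to 100 components (reported in Section 3) is assumed to be fitted on the training set.

    import numpy as np
    from skimage.color import rgb2gray, rgb2hsv
    from skimage.feature import hog
    from skimage.transform import resize
    from sklearn.decomposition import PCA

    def hue_and_hog(rgb):
        """Mean hue of the object plus a HOG vector of the 50x50 resized patch."""
        hue_mean = rgb2hsv(rgb)[..., 0].mean()             # hue channel average
        patch = resize(rgb2gray(rgb), (50, 50))
        hog_vec = hog(patch, orientations=8,               # 8 bins over 0-180 degrees
                      pixels_per_cell=(8, 8),
                      cells_per_block=(1, 1))
        return hue_mean, hog_vec

    # PCA, fitted on the training objects, then reduces the stacked HOG vectors:
    # pca = PCA(n_components=100).fit(hog_matrix)
    # hog_reduced = pca.transform(hog_matrix)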

The scalable color descriptor (SCD) is defined in the HSV color space [17]. The HSV space is uniformly quantized into a total of 256 bins.

The edge histogram descriptor (EHD) captures the spatial distribution of edges [17]. Here, eighty EHD features are extracted from each object.

The wavelet decomposition is an alternative representation of the image data. The filter outputs are the LL, LH, HL, and HH sub-bands. The variance and average of the output vectors are extracted as features. In addition, the average of the object image is another extracted feature.

The height-to-width ratio of the rectangle that surrounds the object is extracted as a geometric feature.
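The wavelet and geometric features could be computed as in the sketch below, using PyWavelets; the Haar mother wavelet is an assumption, since the paper does not name the wavelet, and the bounding box is taken in (min_row, min_col, max_row, max_col) form.

    import numpy as np
    import pywt

    def wavelet_and_geometric(gray_object, bbox):
        """Mean and variance of the LL, LH, HL, HH sub-bands, the mean of the
        object image, and the height/width ratio of the surrounding rectangle."""
        LL, (LH, HL, HH) = pywt.dwt2(gray_object, "haar")   # single-level 2-D decomposition
        feats = []
        for band in (LL, LH, HL, HH):
            feats.extend([band.mean(), band.var()])
        feats.append(gray_object.mean())                    # average of the object image
        min_r, min_c, max_r, max_c = bbox
        feats.append((max_r - min_r) / (max_c - min_c))     # height/width ratio
        return np.asarray(feats)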

2.4 Feature selection

To select appropriate features for classification, IBGSA is used. The aim of IBGSA in feature selection is to find an optimal binary vector in which each bit corresponds to a feature. Each subset of features is evaluated according to a classification-based fitness function [13]. KNN is used as the classifier in this evaluation because of its high speed.
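To make the role of the binary vectors concrete, the sketch below evaluates candidate masks with the knn_fitness function sketched in Section 2 and keeps the best one. The random mask generation is only a placeholder for the actual IBGSA position-update rules, which are given in [13].

    import numpy as np

    def select_features(X_train, y_train, n_agents=20, max_it=50, seed=None):
        """Search for a good binary feature mask (random search stands in for IBGSA)."""
        rng = np.random.default_rng(seed)
        n_features = X_train.shape[1]
        best_mask, best_fit = np.ones(n_features, dtype=int), -np.inf
        for _ in range(max_it):
            for _ in range(n_agents):
                mask = rng.integers(0, 2, size=n_features)   # one candidate subset per agent
                fit = knn_fitness(mask, X_train, y_train)    # fitness sketch from Section 2
                if fit > best_fit:
                    best_mask, best_fit = mask, fit
        return np.flatnonzero(best_mask), best_fit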

3 Experiments and results

In this paper, 6 objects, including a hand and hand tools such as a meter, nipper, screwdriver, spanner, and tongs, are used as target objects. In addition, one class is considered for all other objects, so this is a classification problem with 7 classes. Some examples of the target objects are presented in Fig. 3. The dataset contains RGB images of size 480×720, in which each image contains several objects. Training objects are separated after applying illumination compensation and the watershed algorithm.


Figure 3: Some examples of target objects (hand, nipper, screwdriver, spanner, tongs, and meter).

The number of samples for each target object is 50 (350 objects in total). 80% of the samples are used for feature selection and training the ANN, and 20% are used for testing the ANN. In total, 447 features are extracted from each object: 257 color features, 89 texture features, 100 HOG features, and 1 geometric feature. With the settings of Table 1, 4500 HOG values are extracted; after applying PCA, they are reduced to 100. After applying IBGSA, the number of features is reduced from 447 to 227. The selected features are fed to the ANN. The ANN has 227 nodes in the input layer, 25 nodes in the hidden layer, and 7 nodes in the output layer. The node with the maximum output determines the object's class. After training the ANN with the back-propagation algorithm, the categories of the test data are determined according to the ANN's output.
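A hedged sketch of this training setup with scikit-learn is given below; MLPClassifier is used as a stand-in for the back-propagation network described above, and X, y, and selected_idx are assumed to hold the extracted feature matrix, the class labels, and the IBGSA-selected column indices.

    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # X: (n_objects, 447) feature matrix, y: class labels (7 classes),
    # selected_idx: indices of the 227 IBGSA-selected features (all assumed given).
    X_sel = X[:, selected_idx]
    X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, test_size=0.2, stratify=y)
    ann = MLPClassifier(hidden_layer_sizes=(25,), max_iter=2000)   # one 25-node hidden layer
    ann.fit(X_tr, y_tr)                          # predict() picks the maximum-output class
    print("test recognition rate:", ann.score(X_te, y_te))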

All features are given to the KNN and the ANN, and the recognition rate in both cases is calculated. In addition, appropriate features are selected according to the method illustrated in Fig. 1; the number of features is reduced by 49.21 percent, and these selected features are again given to the ANN and the KNN. The recognition rates in these cases are also calculated. The results are given in Table 2.

TABLE 2: Performance of ANN with selected features in object detection (results are averaged over 5 independent runs of training)

                                  ANN with selected   ANN with total   KNN with selected   KNN with total
                                  features            features         features            features
Recognition rate (%)              91.7                91.71            61.4285             63.82
Elapsed time for training (sec)   76.75               288.8157         -                   -

According to Table 2, the recognition rate of the KNN classifier is very low and not acceptable. Also, the training time for the ANN with the selected features plus the time consumed running IBGSA is 50.64 + 76.75 = 127.39 sec, which is much less than the training time for the ANN with all features, whereas the recognition rate is almost the same in both cases. Therefore, the ANN with IBGSA is preferred for object detection. An example of hand detection results using the proposed method is shown in Fig. 4.



Figure 4: Example images of hand detection using the proposed method; the right image shows a successful detection and the left image shows a failure.


4 Conclusions

In this paper, object detection with an ANN and IBGSA has been presented. The KNN classifier is used to compute the fitness function in feature selection because of its high speed. After this stage, the selected features are utilized for classification, but due to the low accuracy of KNN, an ANN is used as the final classifier. The target objects are a hand and some hand tools. The recognition rate and elapsed time of this method show that the approach is appropriate for automatically selecting salient features for object detection in specific tasks. This approach could be used in machine vision and robotic applications.

References

[1] J. Song, M. Cai, and M. R. Lyu, "Edge Color Distribution Transform: An Efficient Tool for Object Detection in Images," in Proc. 2002 IEEE 16th International Conference on Pattern Recognition, vol. 1, pp. 608-611.
[2] S. Ali and M. Shah, "A Supervised Learning Framework for Generic Object Detection in Images," in Proc. 2005 IEEE 10th International Conference on Computer Vision, vol. 2, pp. 1347-1354.
[3] H. Zhang and W. Gao, "Object detection using spatial histogram features," Image and Vision Computing, vol. 24, pp. 327-341, Apr. 2006.
[4] S. Agarwal, A. Awan, and D. Roth, "Learning to detect objects in images via a sparse, part-based representation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26, pp. 1475-1490, Nov. 2004.
[5] H. Schneiderman and T. Kanade, "Object Detection Using the Statistics of Parts," International Journal of Computer Vision, vol. 56, pp. 151-177, Feb. 2004.
[6] E. Bart, E. Byvatov, and S. Ullman, "View-Invariant Recognition Using Corresponding Object Fragments," 8th European Conference on Computer Vision, vol. 3022, pp. 152-165, May 2004.
[7] D. G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, vol. 60, pp. 91-110, Nov. 2004.
[8] K. Mikolajczyk and C. Schmid, "An affine invariant interest point detector," 7th European Conference on Computer Vision, Copenhagen, vol. 2350, pp. 128-142, Apr. 2002.
[9] T. Tuytelaars and L. Van Gool, "Content-Based Image Retrieval Based on Local Affinely Invariant Regions," Visual Information and Information Systems, vol. 1614, pp. 493-500, 1999.
[10] Y. Gurwicz, R. Yehezkel, and B. Lachover, "Multiclass object classification for real-time video surveillance systems," Pattern Recognition Letters, vol. 32, pp. 805-815, Apr. 2011.
[11] Y. Liu and S. Goto, "An efficient and accurate approach of circular object detection in color images," Computers and Electrical Engineering, vol. 40, pp. 26-36, Nov. 2014.
[12] G. Cheng, J. Han, P. Zhou, and L. Guo, "Multi-class geospatial object detection and geographic image classification based on collection of part detectors," ISPRS Journal of Photogrammetry and Remote Sensing, vol. 98, pp. 119-132, 2014.
[13] E. Rashedi and H. Nezamabadi-pour, "Feature subset selection using improved binary gravitational search algorithm," Journal of Intelligent & Fuzzy Systems, vol. 26, pp. 1211-1221, 2014.
[14] A. Gupta, V. Kumar Sehrawat, and M. Khosla, "FPGA Based Real Time Human Hand Gesture Recognition System," 2nd International Conference on Communication, Computing & Security, vol. 6, pp. 98-107, 2012.
[15] L. Vincent and P. Soille, "Watersheds in digital spaces: an efficient algorithm based on immersion simulations," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, pp. 583-598, Jun. 1991.
[16] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection," in Proc. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 886-893.
[17] B. S. Manjunath, J. Ohm, V. Vasudevan, and A. Yamada, "Color and Texture Descriptors," IEEE Trans. Circuits and Systems for Video Technology, vol. 11, pp. 703-715, Jun. 2001.