Improved Object Tracking Using Radial Basis Function Neural Networks

Alireza Asvadi, MohammadReza Karami-Mollaie, Yasser Baleghi, Hosein Seyyedi-Andi
Department of ECE, DSP Lab., Babol University of Technology, Babol, Iran
[email protected], [email protected], [email protected], [email protected]
Abstract—In this paper, an improved method for object tracking using Radial Basis Function Neural Networks (RBFNNs) is proposed. Pixel-based color features of the object are used to develop an extended background model. The object and extended-background color features are then used to train an RBF Neural Network, and the trained RBFNN detects and tracks the object in subsequent frames. The performance of the proposed tracker is tested on several video sequences. Owing to its low computational complexity, the proposed tracker is shown to be suitable for real-time object tracking.

Keywords—object tracking; k-means segmentation; radial basis function neural networks
I. INTRODUCTION
Tracking is a basic task in several computer vision applications, e.g., video monitoring systems, traffic monitoring, and automated surveillance. In its simplest form, tracking can be defined as the problem of estimating the trajectory of an object in the image plane as it moves around a scene. Important challenges in visual tracking include non-rigid objects, complex object shapes, occlusion, scale and appearance changes, and real-time processing. Considerable work has already been done to address these challenges [1]. Many existing algorithms [2]-[5] segment each video frame to determine the objects; this can be computationally expensive, and it is unnecessary if the goal is only to determine the moving objects. In recent decades, neural networks have been used successfully in applications such as pattern recognition, remote sensing, dynamic modeling, and medicine. Their increasing popularity in many fields is mainly due to their ability to learn complex nonlinear input-output mappings and to generalize them. As one of the most popular neural network models, the radial basis function network has attracted much attention, both for improving its approximation ability and for constructing its architecture. Learning-based algorithms have rarely been used for general-purpose object tracking, owing to the difficulty of adapting neural networks to the tracking task [6]. In [7] a global approach is proposed for adapting neural networks to tracking; neural networks are also used for object tracking in [6] and [8], but those methods are computationally expensive.
Here a fast tracking algorithm is proposed that uses k-means color segmentation to detect the object in the first frame. Color features are then extracted from the object and the background, and a neural network is trained on these features to separate the object from the background in subsequent frames. The rest of the paper is organized as follows: Section II describes the proposed algorithm, covering k-means segmentation, feature extraction, background extension, and the radial basis function neural network. Experimental results are shown in Section III, and conclusions are given in Section IV.

II. PROPOSED ALGORITHM
Fig. 1 shows the process of the presented method; each block is described briefly in the following subsections. The flow is:

1. Select the object in the first frame.
2. K-means segmentation.
3. Background extension.
4. RBFNN training.
5. Locate the object centroid.
6. Insert the next frame.
7. Detect the object using the RBFNN.
8. Locate the object centroid; if the centroid changed, repeat from step 7; otherwise assign the current centroid as the object centroid and continue from step 6.

Figure 1: System overview.
The first step in object tracking is to develop a model of the object of interest from the given initial video frame; this starts by approximately selecting the object manually in the first frame (Fig. 2).

Figure 2: Selecting the object manually.

Next, k-means color segmentation separates the object from the background, which results in a binary image. The features used for k-means segmentation are simple pixel-based color features, namely the values in the R-G-B color space. In the next step, the background modeled in the previous stage is extended. Both the object features and the extended background features are then used to train the RBFNN. For subsequent frames, the trained RBFNN separates the object from the background, and the object is followed by locating its centroid.
A. Segmentation
In the first frame we select a box that contains the object. Within that box we must segment the image to separate the object from the background. In this case we have two classes, and we need a segmentation algorithm that is computationally simple and fast. K-means, one of the most popular clustering algorithms, is used for this segmentation. Three color features (R, G, B) are used for clustering, so we have two classes and three features. We denote the points by {x_1, x_2, …, x_n}, where x_i = (R_i, G_i, B_i). A brief description of the K-means algorithm used in this paper is as follows:

Step 1: choose 2 initial cluster centers z_1, z_2 randomly from the n points {x_1, x_2, …, x_n}.
Step 2: assign point x_i, i = 1, 2, …, n to cluster C_j, j = 1, 2, if ||x_i − z_j|| < ||x_i − z_p||, p = 1, 2 and j ≠ p.
Step 3: compute the new cluster centers z_j as follows:

z_j = (1 / n_j) Σ_{x_i ∈ C_j} x_i ,  j = 1, 2   (1)

where n_j is the number of elements belonging to cluster C_j.
Step 4: if the new centers satisfy z_j = z_j (unchanged) for j = 1, 2, terminate; otherwise continue from Step 2.

If the process does not terminate normally at Step 4, it is executed for a maximum fixed number of iterations [9]; in the presented work this number is chosen as 50. Fig. 3 shows the manually selected object and the k-means color segmentation result.

Figure 3: (a) Manually selected object. (b) K-means color segmentation result.

B. Background Extension
After segmenting the first frame and detecting the object, color features of the object and the background are extracted. The simplest approach would be to train the RBFNN directly on the object and background features; however, since the background may change in consecutive frames, the background features extracted from the first frame alone are not reliable. In the presented work a new approach is proposed: points are distributed randomly over the whole feature space, except at the object feature points, and these distributed points are treated as background features. We call this process background extension. Fig. 4 illustrates it. The extended background features and the object features are used to train the RBFNN, which is discussed in the next section.

Figure 4: (a) Object and background color features in R-G-B color space. (b) Object features and extended background features.
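The two first-frame stages above — 2-cluster k-means on R-G-B pixel values (Steps 1-4, Eq. (1)) and background extension — can be sketched as follows. The function names are our own, and the rejection margin around object features is an illustrative assumption, since the paper does not specify how "except at the object feature points" is enforced.

```python
# Sketch of first-frame processing: 2-cluster k-means segmentation (Steps 1-4)
# and background extension. The 50-iteration cap follows the paper; the
# `margin` parameter of extend_background is our own assumption.
import numpy as np

def kmeans_two_clusters(pixels, max_iter=50, seed=0):
    """pixels: (n, 3) array of R-G-B values. Returns labels in {0, 1}."""
    rng = np.random.default_rng(seed)
    # Step 1: choose 2 initial cluster centers randomly from the n points.
    centers = pixels[rng.choice(len(pixels), size=2, replace=False)].astype(float)
    for _ in range(max_iter):
        # Step 2: assign each point to the nearest cluster center.
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Step 3: recompute each center as its cluster mean (Eq. (1)).
        new_centers = np.array([pixels[labels == j].mean(axis=0) for j in (0, 1)])
        # Step 4: terminate when the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def extend_background(obj_feats, n_points=2000, margin=20.0, seed=0):
    """Scatter random R-G-B points over [0, 255]^3, rejecting any point
    closer than `margin` to an object feature point."""
    rng = np.random.default_rng(seed)
    kept = []
    while len(kept) < n_points:
        cand = rng.uniform(0, 255, size=(n_points, 3))
        d = np.linalg.norm(cand[:, None, :] - obj_feats[None, :, :], axis=2)
        kept.extend(cand[d.min(axis=1) > margin])
    return np.array(kept[:n_points])
```

The rejection margin keeps the synthetic background samples from overlapping the object's color cluster, so the RBFNN trained on them sees a clean two-class problem.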
C. Neural Network
The purpose of this procedure is to teach the neural network how to segment the object; this requires less computational effort than methods that segment every frame. For the NN input, we use the R-G-B values of the segmented object's pixels as foreground features and random R-G-B values as extended background features. A radial basis function network is an artificial neural network that uses radial basis functions as activation functions. Fig. 5 shows the structure of the RBF Neural Network: a three-layer feed-forward network. The first layer is linear and only distributes the input signal; the second layer is nonlinear and uses Gaussian functions; the third layer linearly combines the Gaussian outputs. Only the weights between the hidden layer and the output layer are modified during training. An RBF Neural Network has five parameters to set:

1. the weights between the hidden layer and the output layer;
2. the activation function;
3. the centers of the activation functions;
4. the widths (distributions) of the activation functions;
5. the number of hidden neurons.

The weights between the hidden layer and the output layer are calculated using the Moore-Penrose generalized pseudo-inverse. This overcomes many issues of traditional gradient-based algorithms, such as choosing a stopping criterion, learning rate, and number of epochs, and avoiding local minima; thanks to its short training time and good generalization ability, it is suitable for real-time applications. The activation function is the Gaussian kernel commonly selected for pattern recognition. The centers of the activation functions should have characteristics similar to the data, so random numbers in [0, 255] are used as centers; the widths are likewise chosen as random numbers in [0, 255]. By universal approximation theory, the centers and widths need not be set deterministically, provided the number of hidden neurons is sufficient.
By the universal approximation property, a single-hidden-layer feed-forward network with a sufficient number of hidden neurons can approximate any function to an arbitrary level of accuracy.
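The training scheme just described — random Gaussian centers and widths drawn from [0, 255] and output weights solved in one step with the Moore-Penrose pseudo-inverse — can be sketched as follows, together with the centroid-based object location used in the tracking loop. The class and function names, the lower bound on the widths, and the detection threshold are our own assumptions, not taken from the paper.

```python
# Minimal sketch of the RBF network (Eqs. (2)-(4)): random centers and widths
# in [0, 255], a Gaussian hidden layer, and output weights via pseudo-inverse.
# The width lower bound of 1 (to avoid division by ~0) is our assumption.
import numpy as np

class RBFNN:
    def __init__(self, n_hidden=30, seed=0):
        rng = np.random.default_rng(seed)
        self.mu = rng.uniform(0, 255, size=(n_hidden, 3))   # Gaussian centers
        self.sigma = rng.uniform(1, 255, size=n_hidden)     # Gaussian widths

    def _hidden(self, U):
        # phi[i, k] = exp(-||U_i - mu_k||^2 / sigma_k^2): the N x K matrix of Eq. (3)
        d2 = ((U[:, None, :] - self.mu[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / self.sigma ** 2)

    def fit(self, U, y):
        # Eq. (4): alpha = pinv(phi_K) @ Y, a one-shot least-squares solution.
        self.alpha = np.linalg.pinv(self._hidden(U)) @ y
        return self

    def predict(self, U):
        # Eq. (2): weighted sum of the Gaussian hidden-neuron outputs.
        return self._hidden(U) @ self.alpha

def locate_centroid(net, frame, threshold=0.5):
    """Classify every pixel of an HxWx3 frame and return the centroid
    (row, col) of the pixels scored as object, or None if none are found."""
    h, w, _ = frame.shape
    scores = net.predict(frame.reshape(-1, 3).astype(float)).reshape(h, w)
    ys, xs = np.nonzero(scores > threshold)
    return (ys.mean(), xs.mean()) if len(ys) else None
```

In the paper, the centroid estimate is refined iteratively on each new frame until it stops moving; for brevity, the sketch above computes a single centroid over the whole frame.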
In Fig. 5, U is an N × 3 input feature matrix, where 3 is the number of R-G-B features and N is the number of pixels. Let µ_k and σ_k be the center and width of the kth Gaussian hidden neuron, and let α_k be the interconnection weight between the output neuron and the kth Gaussian neuron. The ith output y_i of an RBF Neural Network with K hidden neurons has the following form:

y_i = Σ_{k=1}^{K} α_k exp( −||U_i − µ_k||² / σ_k² )   (2)

Eq. (2) can be written in matrix form as:

Y = φ_K α   (3)

where

φ_K(µ, σ, U) = [ φ_1(µ_1, σ_1, U_1)  …  φ_K(µ_K, σ_K, U_1) ]
               [        ⋮                       ⋮           ]
               [ φ_1(µ_1, σ_1, U_N)  …  φ_K(µ_K, σ_K, U_N) ]

and φ is the Gaussian function. The matrix φ_K (of dimension N × K) is called the hidden layer output matrix of the neural network; its kth column is the kth hidden neuron's output with respect to the inputs U_1, U_2, …, U_N. For most practical problems, the number of hidden neurons is assumed to be less than the number of training samples; keeping the hidden layer smaller than the number of available training examples helps prevent the curse of dimensionality and over-fitting. If φ_K has full column rank, the output weights are calculated by the least squares solution

α = φ_K† Y   (4)

where φ_K† = (φ_K^T φ_K)^{−1} φ_K^T is the Moore–Penrose generalized pseudo-inverse of the hidden layer output matrix [6], [10].

Figure 5: The structure of the Radial Basis Function Neural Network.

D. Object Location
In the first frame the object is selected approximately, and the selected region is then segmented to separate the object from the background more accurately. In subsequent frames, in order to find the object pixels, features are extracted at the current location and tested with the RBFNN classifier. The search starts at the centroid of the segmented object: the displacement of the object is given by the shift in the centroid of the pixels classified as object. Object location in a given frame is thus achieved by iteratively re-estimating the object centroid with the RBFNN; the iteration terminates when the centroid remains unchanged over two consecutive iterations [6].

III. EXPERIMENTAL RESULTS
The proposed algorithm has been tested on several complex video sequences; here we present the PETS 2001 sequence used in our experiments. The tracked object is shown inside a rectangular region with an orange border. The tracking result for the PETS 2001 sequence is shown in Fig. 6. Here, the object walks such that its body undergoes partial occlusion, as well as appearance, illumination, and background changes, over time. It is observed that the proposed tracker is able to track the object when the background changes or the object is partially occluded. Fig. 7 shows the tracking results in binary format. One advantage of the NN is that a temporary occlusion of the object does not adversely affect the ultimate classification. A further advantage of this method is its robustness to background clutter, such as leaves blowing in the wind, and to rotation and scaling of the object.
Figure 6: Tracking result of the proposed system for the 'PETS' sequence.

Figure 7: Tracking result of the proposed system for the 'PETS' sequence in binary format.
IV. CONCLUSIONS
In this paper, we have proposed an improved object tracker using k-means color segmentation and a radial basis function neural network. K-means color segmentation is used to detect the object in the initial frame. Background extension improves RBF neural network performance when background colors change during tracking. The RBF neural network is trained in an offline stage on the result of background extension, and the trained network is then used to classify object and non-object pixels in the remaining frames. Object localization is achieved by estimating the object position in each frame with the RBF neural network. Our method is able to track the detected moving object in a scene. Since both the k-means color segmentation and the RBF neural network incur very low computational load, the proposed algorithm is suitable for real-time applications. We are actively continuing to evaluate the performance of our approach.

REFERENCES
[1] A. Yilmaz, O. Javed, and M. Shah, "Object tracking: a survey," ACM Computing Surveys, vol. 38, no. 4, Article 13, 2006.
[2] D. Wang, "Unsupervised video segmentation based on watersheds and temporal tracking," IEEE Trans. Circuits Syst. Video Technol., vol. 8, no. 5, pp. 539-546, Sep. 1998.
[3] G. L. Foresti, "Object recognition and tracking for remote video surveillance," IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 7, pp. 1045-1062, Oct. 1999.
[4] P. Salembier, F. Marques, M. Pardas, J. R. Morros, I. Corset, S. Jeannin, L. Bouchard, F. Meyer, and B. Marcotegui, "Segmentation-based video coding system allowing the manipulation of objects," IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 1, pp. 60-74, Feb. 1997.
[5] A. J. Lipton, H. Fujiyoshi, and R. S. Patil, "Moving target classification and tracking from real-time video," in Proc. Fourth IEEE Workshop on Applications of Computer Vision (WACV '98), pp. 8-14, 1998.
[6] R. V. Babu, S. Suresh, and A. Makur, "Online adaptive radial basis function networks for robust object tracking," Computer Vision and Image Understanding, vol. 114, no. 3, pp. 297-310, 2010.
[7] J. B. Kim and H. J. Kim, "Unsupervised moving object segmentation and recognition using clustering and a neural network," in Proc. Int. Joint Conf. on Neural Networks (IJCNN '02), vol. 2, pp. 1240-1245, 2002.
[8] M. Bouzenada, M. C. Batouche, and Z. Telli, "Neural network for object tracking," Information Technology Journal, vol. 6, no. 4, pp. 526-533, 2007.
[9] U. Maulik and S. Bandyopadhyay, "Genetic algorithm-based clustering technique," Pattern Recognition, vol. 33, pp. 1455-1465, 2000.
[10] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed., Pearson Education, 1999.