A Multi-target Tracking Algorithm using Texture for Real-time Surveillance

Zhixu Zhao, Shiqi Yu

ABSTRACT In this paper, we present a texture-based multi-target tracking algorithm. Moving objects are described by Local Binary Patterns (LBP), a discriminative texture descriptor. The Kalman filter is introduced into the algorithm to predict each blob's new position and size, and blobs are searched in the neighborhood of the Kalman predictions. If more than one blob is found, the LBP distance, which our experiments show to be valid for distinguishing blobs, is applied to locate the tracking target. Cooperating with the LBP distance, the Kalman filter is efficient in dealing with collisions. Tracking results demonstrate the effectiveness of the algorithm, which has been implemented on PC and DSP platforms and achieves real-time performance.

KEYWORDS Object tracking; texture feature; local binary patterns
I Introduction
Vol. 3 No.5 / May. 2009

Moving object tracking has gained increasing interest in recent years because of the needs of many applications. In visual surveillance, object tracking can be used for intrusion detection, abnormal behavior analysis, traffic violation detection, etc. Object tracking can also improve the perception capability of robots, for example to follow a user or to monitor an intruder. Moving object tracking, especially multi-target tracking, is challenging because many factors can degrade performance, including illumination changes, complex backgrounds, leaf shaking, and occlusion.

In recent years, many approaches have been proposed for object tracking. Assuming that an object generates one measurement in each frame and that a measurement can originate from at most one object [1], data association techniques such as Nearest Neighbor (NN), the Joint Probabilistic Data Association Filter (JPDAF), Multiple Hypothesis Tracking (MHT) and their derivatives are popularly used for filtering data and managing track states [2,3,4,5]. However, computational complexity is a major problem for the JPDAF and MHT methods. NN offers lower computational complexity but low reliability, which can be improved by selecting an appropriate object model [2]. Color features are very useful in tracking; Meanshift [6], for example, is color-based and has been used on various occasions. However, color alone cannot provide all-around invariance to different imaging conditions, especially in outdoor environments where the background is complex.

Texture is an important characteristic for image analysis. The LBP was introduced by Ojala et al. [7]. Because of its discriminative power and low computational complexity, the LBP operator has been widely used in many applications, such as face recognition [8] and motion analysis [9]. In this paper, we choose a texture feature for multi-target tracking. The LBP descriptor is chosen in part because DSPs are not good at floating-point operations. First, blobs are searched in the neighborhood of the Kalman predictions, and the blobs that satisfy certain conditions are added to a candidate sequence. If just one blob is found, that blob is considered the tracking target. If multiple blobs are found, the χ² LBP distance is calculated between the blob and its candidates, and the candidate with the minimum χ² distance is taken to be the tracking target. Temporary collisions are partially handled by the Kalman filter in cooperation with the LBP distance, referring to the tracking results. Experiments validate that the LBP distance is feasible for distinguishing blobs, and the system runs at 25 fps on the DSP platform.

The rest of the paper is organized as follows. Section 2 introduces the texture descriptor, the local binary pattern. Section 3 presents the tracking algorithm. Section 4 gives the experimental results and analysis. Section 5 concludes the paper.
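The candidate-matching step summarized above — take a single candidate directly, otherwise pick the candidate with the minimum χ² LBP distance — can be sketched as follows. This is a minimal Python illustration; `match_target` and the `(blob_id, histogram)` data layout are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two normalized LBP histograms."""
    return float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

def match_target(target_hist, candidates):
    """Pick the candidate blob whose LBP histogram is closest to the target.

    `candidates` is a list of (blob_id, histogram) pairs found near the
    Kalman-predicted position of the tracked blob."""
    if len(candidates) == 1:          # single candidate: take it directly
        return candidates[0][0]
    return min(candidates, key=lambda c: chi_square(target_hist, c[1]))[0]
```

A small epsilon guards the division when both histograms have an empty bin.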
II Local Binary Pattern

The LBP operator is a powerful means of describing texture. It is defined as an illumination-invariant texture measure, derived from a general definition of texture in a local neighborhood. Moreover, LBP feature extraction involves no floating-point operations, so an LBP-based method can be easily implemented on a DSP platform. In a (P,R) neighborhood, a binary code describing the local texture pattern is built by thresholding the gray values of the P sampling points on a circle of radius R against the gray value of the center pixel. This binary code, i.e. the LBP operator, is denoted LBP(P,R). The LBP neighborhood can be divided into two groups, the square neighborhood and the circular neighborhood, as shown in Fig. 1a and Fig. 1b. Fig. 1c illustrates how the LBP operator is computed in the (8,1) square neighborhood; the binary code of this example is 11010010b. Thus, the LBP operator at the center pixel g_c can be defined as:

LBP_{P,R}(g_c) = Σ_{p=0}^{P−1} s(g_p − g_c)·2^p,  where s(x) = 1 if x ≥ 0 and s(x) = 0 otherwise,   (1)

where g_p (p = 0, ..., P−1) are the gray values of the P sampling points.

Fig 1 (a) the square neighborhood (b) the circle neighborhood, where the values of g0, g2, g4, g6 are interpolated (c) LBP operator computing in a (8,1) square neighborhood.

Uniform LBPs, extended from the basic LBPs, can describe the fundamental properties of local image texture. A pattern is called uniform if it contains at most two bitwise transitions (from 0 to 1 or from 1 to 0) when the binary pattern is traversed circularly, e.g. 00000000b and 00011110b. Ojala et al. found that uniform patterns provide the vast majority of the texture patterns in their experiments, nearly 90% in the (8,1) neighborhood and around 70% in the (16,2) neighborhood [10]. Non-uniform patterns, in contrast with the uniform ones, provide statistically unreliable information. Only uniform patterns are used in our algorithm.

For each blob in the foreground, we set up a W×H window to collect texture information, where W and H are the width and height of the blob respectively. The LBP code of each pixel in the window is computed, but only uniform patterns are reserved for the histogram of the blob, denoted H_LBP. For a given (P,R) neighborhood, the uniform pattern codes are determined, which means they can be pre-calculated; for example, there are 59 uniform-pattern bins in the (8,1) neighborhood. The texture histogram H can be defined as:

H_i = Σ_{x,y} I{ LBP_{P,R}(x,y) = i },  i = 0, ..., N−1,   (2)

where N is the number of patterns, and I{A} = 1 only if A is true. Among the 256 bins, only the 59 bins for uniform patterns are reserved to describe an object.

III Multi-target Tracking

Our objective is to implement the algorithm on a TI DM642 DSP platform, whose limited computational resources, compared with a standard PC, pose a great challenge for real-time tracking. Thus, to reduce computational complexity, we choose the single-Gaussian method for background modeling and foreground segmentation instead of a Mixture of Gaussians (MoG), though MoG is better at handling leaf shaking and illumination changes. Morphology operations are applied to remove isolated small spots and to fill holes in the foreground (Fig. 2). After these steps, we obtain some isolated regions, called blobs, in the foreground of every frame. Undesirable blobs, either too small or too large, are deleted by a size filter. Each blob is characterized by the following features:

1. the centroid (x, y) of the blob,
2. the blob width W and height H,
3. the uniform LBP histogram H_LBP.

So the ith blob in the mth frame can be represented as:

B_i^m = { (x_i, y_i), W_i, H_i, H_LBP^i }.   (3)
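The uniform-LBP blob descriptor described above might be computed as in the following Python sketch of the square (8,1) neighborhood. The neighbor ordering and the use of a single catch-all bin for non-uniform codes are assumptions of this illustration, not details taken from the paper.

```python
import numpy as np

def uniform_lut(P=8):
    """Map each P-bit LBP code to a bin: each uniform code (at most two
    circular bitwise transitions) gets its own bin; all non-uniform codes
    share one catch-all bin. For P = 8 this yields 59 bins."""
    def transitions(code):
        bits = [(code >> i) & 1 for i in range(P)]
        return sum(bits[i] != bits[(i + 1) % P] for i in range(P))
    lut = np.full(2 ** P, -1, dtype=np.int32)
    next_bin = 0
    for code in range(2 ** P):
        if transitions(code) <= 2:
            lut[code] = next_bin
            next_bin += 1
    lut[lut == -1] = next_bin          # catch-all bin for non-uniform codes
    return lut, next_bin + 1           # number of histogram bins

def lbp_histogram(window, lut, n_bins):
    """Normalized uniform-LBP histogram of a grayscale blob window."""
    g = window.astype(np.int32)
    c = g[1:-1, 1:-1]                  # center pixels (interior of the window)
    # Eight square-neighborhood offsets; bit p is s(g_p - g_c) * 2^p.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for p, (dy, dx) in enumerate(offs):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.int32) << p
    hist = np.bincount(lut[code].ravel(), minlength=n_bins).astype(np.float64)
    return hist / max(hist.sum(), 1.0)
```

Pre-computing the lookup table once per (P,R), as the text suggests, keeps the per-pixel work to comparisons, shifts, and a table lookup — no floating-point operations until the final normalization.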
Fig.2. Moving object segmentation

The whole tracking procedure is shown in Fig. 3. First, the Kalman filter is applied to predict each blob's new position, and then collision detection is executed. If just one candidate is found in the neighborhood, that candidate is considered the tracking target. Otherwise, if more than one blob is detected without a collision, the LBP histogram is chosen as the feature to distinguish the blobs.

Fig. 3. Flowchart of the tracking system

3.1 Kalman Filter

The Kalman filter is an efficient recursive filter that can estimate the state of a dynamic system from a series of incomplete and noisy measurements [11,12]. Suppose y_t represents the output data, or measurements, of a system whose corresponding state is x_t. The state transition from t−1 to t in the Kalman filter can be described by:

x_t = A·x_{t−1} + w_{t−1}.   (4)

Its measurement y_t is:

y_t = H·x_t + v_t,   (5)

where A is termed the state transition matrix and H the measurement matrix, and w_t and v_t are the process and measurement noise respectively, assumed to be independent, white, and normally distributed. In our system, the blob's X-Y position (x, y), size (w, h) and position change (dx, dy) are used as the state, represented by the vector x = [x, y, w, h, dx, dy]^T. The measurement vector is y = [x, y, w, h]^T. Thus, the state transition matrix is:

A = [ 1 0 0 0 1 0
      0 1 0 0 0 1
      0 0 1 0 0 0
      0 0 0 1 0 0
      0 0 0 0 1 0
      0 0 0 0 0 1 ]

The measurement matrix is:

H = [ 1 0 0 0 0 0
      0 1 0 0 0 0
      0 0 1 0 0 0
      0 0 0 1 0 0 ]

Fig. 4. Tracking results.

Two steps are completed in the Kalman filter process: the time update step and the measurement update step. The time update step is in charge of prediction, and the measurement update step is responsible for feedback, or correction. In normal tracking, the predicted blob is used for target searching, while in the case of a collision, the blob's position and size are updated by the Kalman predictions. We found the Kalman filter efficient in handling temporary collisions.
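As an illustration of this state-space model, a minimal predict/correct cycle might look like the following Python sketch. The noise covariances Q and R are placeholder values for the sketch, not values taken from the paper.

```python
import numpy as np

# State: [x, y, w, h, dx, dy]; measurement: [x, y, w, h].
# A couples position to velocity (x += dx, y += dy); H selects the
# first four state components.
A = np.eye(6)
A[0, 4] = 1.0
A[1, 5] = 1.0
H = np.hstack([np.eye(4), np.zeros((4, 2))])
Q = np.eye(6) * 1e-2   # process noise covariance (placeholder)
R = np.eye(4) * 1.0    # measurement noise covariance (placeholder)

def predict(x, P):
    """Time update: project the state and its covariance forward."""
    x = A @ x
    P = A @ P @ A.T + Q
    return x, P

def correct(x, P, y):
    """Measurement update: fold in the observed blob position and size."""
    S = H @ P @ H.T + R                # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
    x = x + K @ (y - H @ x)
    P = (np.eye(6) - K @ H) @ P
    return x, P
```

In the scheme described above, `predict` supplies the search position each frame; during a detected collision only the prediction is used, and `correct` is applied once the blob is observed in isolation again.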
3.2 LBP Histogram Distance
As described in Section 2, the LBP histogram is used to describe blobs, and an N-length texture histogram
is created for each blob. Many kinds of measures can be used to gauge the similarity between blobs; the most commonly used are correlation, intersection, and the Chi-square distance. These measures are defined as follows:

● Correlation:

d_corr(H1, H2) = Σ_i H1'(i)·H2'(i) / sqrt( Σ_i H1'(i)² · Σ_i H2'(i)² ),   (6)

where H_k'(i) = H_k(i) − (1/N)·Σ_j H_k(j).

● Intersection:

d_int(H1, H2) = Σ_i min( H1(i), H2(i) ).   (7)

● Chi-square (χ²) distance:

χ²(H1, H2) = Σ_i ( H1(i) − H2(i) )² / ( H1(i) + H2(i) ).   (8)

Among these measures, correlation is the most time-consuming, since it requires multiplications and a square root while the others need only simple operations. According to Ahonen's experiments, the Chi-square distance performs slightly better than histogram intersection [8], so we choose the Chi-square distance in our experiments.

3.3 Blob Classification

Suppose a blob B is detected in the tth frame and its Kalman prediction in the (t+1)th frame is B̂; we want to track it in the (t+1)th frame. Let the blobs detected in the (t+1)th frame be B_i, i = 1, ..., n. If a blob B_i is near the predicted blob B̂, that is, if |x_{B_i} − x_{B̂}| < S·W and |y_{B_i} − y_{B̂}| < S·H, where (x_B, y_B) denotes the center of a blob B, then B_i is considered a candidate of B. All candidates are added to a sequence. The parameter S is used to adjust the search window so that blobs which move too fast can also be found. If just one blob is in the candidate sequence, that blob is the tracking target and its properties are added to the track of blob B. If multiple blobs exist, we choose the blob most similar to the tracking target, with similarity measured by the Chi-square (χ²) distance introduced in the previous subsection.

IV Experimental Results and Analysis

The proposed algorithm has been implemented on PC and TI DSP platforms and can run at a frame rate of 25 FPS. We tested the algorithm on the PETS2001 dataset and on "OUTDOOR" videos captured from a fixed camera outside our lab. The PETS datasets [13] are designed for testing visual surveillance algorithms and have been widely used by many researchers. The PETS2001 dataset used in our experiments is multi-view (2 cameras) with a frame size of 768×576; the frame size of the "OUTDOOR" series is 320×240. There are pedestrians, vehicles and bicycles in the test videos, and collisions exist in both the PETS2001 and "OUTDOOR" datasets. The detailed information is listed in Table 1.

Table 1 Test Video Details

Experiments on the PETS dataset show that the LBP distance is valid for distinguishing blobs. We collect each blob's texture histogram from each frame and calculate its LBP distance to the blob texture histograms collected in the next frame using equation (8). The LBP neighborhood is P = 8, R = 1 in our experiments. To ensure that all the collected data are valuable, spurious blobs caused by leaf shaking and illumination variation are discarded, and mistaken tracks, such as lost targets or confused target IDs, are fixed manually. Fifteen blobs were detected in PETS dataset 1, camera 1. Excluding incorrect foreground detections (blobs 0, 2, 11 and 12), eleven blobs were in fact correctly detected; among these, blobs 1 and 14 exchanged their IDs at frame 2245, so 9 blobs were correctly tracked. The same procedure was applied to the tracking of camera 2. The experimental results are shown in Fig. 5. In most cases, the average LBP distance between different blobs is greater than that of the same blob.

Fig. 5. LBP Distance of PETS 2001 Dataset 1: (a) camera 1, (b) camera 2.

Tracking results are illustrated in Fig. 4. In all results, the tracked blobs are indicated by black rectangles in each frame, and the trajectory of each blob is drawn in the same color as its corresponding blob. The following scenes appear in the test videos: (1) moving in one direction without collision, (2) wandering in the scene, and (3) collision. The tracking results also confirm that obvious differences exist between same-blob and different-blob LBP distances: as a whole, the same-blob LBP distance is far smaller than the different-blob LBP distance. In our "OUTDOOR" tracking experiments, about 85% of same-blob LBP distances are smaller than 70, while most different-blob distances are greater than 100.

The Kalman filter, combined with the LBP distance, demonstrates satisfactory performance in dealing with collisions and occlusions. If a collision is detected, the blobs' properties are updated by the Kalman predictions, and the LBP distance is used to pick out the most similar tracking target before blobs merge and split. Blobs are correctly predicted, and most trajectories under collision or occlusion are continued reliably in our system. Fig. 4b is an example: blobs 19 and 22 collided at frame 3936, and they were properly tracked with their IDs correctly preserved by our algorithm. A limitation of the algorithm is quick illumination variation, a problem that could be handled with MoG. Typical tracking failures, as shown in Fig. 4c, happen when blobs suddenly change their motion; these failures generally occur at the edge of the camera view. In this experiment, the white car (blob 432) made a stop at frame 46164, where a silver car was passing by, and the unexpected motion change led to the loss of the white car's track.

V Conclusion
In this paper, we have presented a texture-based multi-target tracking algorithm for real-time outdoor tracking. Uniform LBPs are used to describe moving objects. LBP-based features have low computational complexity, so the proposed method can be implemented efficiently on a DSP platform. Experimental results indicate that LBP is feasible for distinguishing blobs. Thanks to the texture feature, the algorithm can cope with slow illumination changes and slight leaf shaking. The Kalman filter, working together with the LBP distance, is employed to handle collisions and proves effective in collision tracking. Experimental results show that the algorithm works well for multi-object tracking. Currently, the robustness of the proposed method depends greatly on foreground segmentation, and the single-Gaussian background model cannot handle sudden light changes. In future work we will improve the robustness of the foreground segmentation algorithm. We will also study the fusion of texture and color features for tracking, which should make tracking more accurate.
Acknowledgement The authors would like to thank Ms. Zhiyan Hu at Xinxiang Medical University, and Ms. Jia Wu and Mr. Shushang Chai at the Shenzhen Institute of Advanced Technology, for their insightful suggestions.
References
[1] P. Kumar, S. Ranganath, K. Sengupta, et al., "Cooperative multitarget tracking with efficient split and merge handling," IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, no. 12, pp. 1477–1490, December 2006.
[2] K. Choeychuen, P. Kumhom, and K. Chamnongthai, "An efficient implementation of the nearest neighbor based visual objects tracking," in Proc. International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS2006), Yonago Convention Center, Tottori, Japan, 2006, pp. 574–577.
[3] C. Kreucher, K. Kastella, and A. O. Hero, "Multitarget tracking using the joint multitarget probability density," IEEE Transactions on Aerospace and Electronic Systems, vol. 41, no. 4, pp. 1396–1414, 2005.
[4] K. C. Chang and Y. Bar-Shalom, "Joint probabilistic data association for multitarget tracking with possibly unresolved measurements and maneuvers," IEEE Transactions on Automatic Control, vol. 29, no. 7, pp. 585–594, 1984.
[5] D. B. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, 1979.
[6] D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-based object tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 5, pp. 564–577, 2003.
[7] T. Ojala, M. Pietikäinen, and D. Harwood, "A comparative study of texture measures with classification based on featured distributions," Pattern Recognition, vol. 29, no. 1, pp. 51–59, 1996.
[8] T. Ahonen, A. Hadid, and M. Pietikäinen, "Face recognition with local binary patterns," in Proc. European Conference on Computer Vision (ECCV'04), vol. 3021, pp. 469–481, 2004.
[9] V. Kellokumpu, G. Zhao, and M. Pietikäinen, "Texture based description of movements for activity analysis," in Proc. International Conference on Computer Vision Theory and Applications (VISAPP'07), 2008, pp. 206–213.
[10] T. Ojala, M. Pietikäinen, and T. Mäenpää, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971–987, 2002.
[11] G. Welch and G. Bishop, "An introduction to the Kalman filter," ACM SIGGRAPH 2001 Course Notes, 2001.
[12] N. Funk, "A study of the Kalman filter applied to visual tracking," University of Alberta, 2003.
[13] PETS01 datasets, http://visualsurveillance.org, The University of Reading, UK.
About the Authors
Zhixu Zhao is a graduate student (class of 2006) at the School of Mechatronics Engineering, University of Electronic Science and Technology of China, and a visiting student at the Intelligent Bionic Research Center. His research interests are industrial automation and intelligent video surveillance.
Shiqi Yu's biography appears on page 46 of this issue.