An Efficient and Robust Tracking System using Kalman Filter
Raquel R. Pinho, FEUP - Faculdade de Engenharia da Universidade do Porto, Portugal; INEGI - Instituto de Engenharia Mecânica e Gestão Industrial
João Manuel R. S. Tavares, FEUP (DEMEGI - Dep. de Engenharia Mecânica e de Gestão Industrial), INEGI
Miguel V. Correia, FEUP (DEEC - Dep. de Engenharia Electrotécnica e de Computadores); INEB - Instituto de Engenharia Biomédica
Abstract— In this paper we address the problem of tracking features efficiently and robustly along image sequences. To estimate the undergoing movement we use an approach based on Kalman filtering. The measured data is incorporated by optimizing the global correspondence set based on an efficient approximation of the Mahalanobis Distance (MD). Along the image sequence, to deal with the incoming and previously existing features a new management model is considered, so that each occluded feature may be kept on tracking or it may be excluded depending on its historical behavior. This approach handles adequately occlusion, disappearance and (re)appearance of features while tracking efficiently movement in the image scene. It also allows feature tracking in long sequences at low computational cost. Some experimental results are presented. Index Terms— Tracking, Kalman Filter, Occlusion, Multiple Features, Mahalanobis Distance.
I. INTRODUCTION
Feature tracking is a complex problem whose automatic detection and execution have evolved considerably in the past decade. Applications of movement tracking are numerous: surveillance, object deformation analysis, traffic monitoring, etc. [1], [2], [3], [4]. Many tracking applications require tracking several objects simultaneously, and must deal with the appearance and disappearance of those objects in an image scene that may be acquired over long periods of time. The complexity of the tracked objects and the interactions of the various underlying parameters have stimulated new technologies, such as high-speed cameras, and also the appearance of computerized movement analysis laboratories, which have allowed new insights into tracking. Automated movement observation systems can often provide very significant advantages: in particular, events are recorded more reliably, because the computerized algorithm always applies the same criteria, and the system does not suffer from observer fatigue or drift, so the observation and analysis processes can continue almost indefinitely. Moreover, movement analysis with motion capture video systems and interactive modeling systems can support the analysis, diagnosis, and documentation of movements with tools that may be very useful in many application areas.
A. Related Work
To track features in image sequences a compromise should be achieved between the accuracy of the motion estimation and the related computational cost; for instance, to obtain on-line results or to be able to track multiple features simultaneously. Although computational performance has improved significantly in recent years, tracking systems that are able to capture, analyze and output results in real time often use some kind of simplification to speed up the process. For instance, the Pfinder [5] is a real-time system for tracking people and interpreting their behavior. It does not require accurate initialization and segments the tracked person from the image background in real time on a standard computer, but it expects only one individual to be in the image scene, and the scene is supposed to be significantly less dynamic than the tracked person. In [6] Bayesian networks (BN) are used to perform tracking even in situations of occlusion, group formation and splitting. The on-line object tracking is performed by gradually forgetting the influence of past information on the current decisions, thereby avoiding a combinatorial explosion and keeping the network complexity within reasonable bounds. In fact, many current works use a probabilistic representation of the uncertainty and stochastic filters to fuse and validate sensor information, as well as to estimate parameters describing the environment. For instance, in [7] the Kalman filter (KF) is used with a statistical background model to detect moving objects and a coarse 3D human shape model to constrain the shape of an upright human in complex situations. By doing so, multiple humans may be tracked in case of occlusion, cast shadow or reflection. To update the position of the tracked persons, a search starts from the mean predicted position and is conducted over all positions in the neighborhood.
If the tracked features overlap a joint likelihood is used but it is computationally expensive. In the case of occlusion the object follows the prediction of the KF. The data association problem is combined with filtering techniques to track ground targets using ground moving target indicator (GMTI) reports obtained from an airborne sensor in [8]. In [9],
a method for detection and tracking of multiple moving objects is presented, using particle filters to estimate the object states, and sample based joint probabilistic data association filters to perform the assignment between the features detected in the input sensor data and filters. The KF is a widespread technique for object tracking, but recently particle filters have become common [10]. The KF rests on the assumption that the disturbances and initial state vector are normally distributed, and it can be shown that under these circumstances the mean of the conditional distribution of a state is an optimal estimator in the sense that it minimizes the mean square error. If the normality assumption is dropped, there is no guarantee that the KF will give the conditional mean of the state vector [11]. Particle filters were therefore presented as a good alternative, because they represent the conditional distribution with several particles, which allows multimodal state distributions [12]. However, particle filters have revealed some problems too: they have difficulties in tracking multiple objects and articulated objects; they may not perform well if the system noise or the observation variance is very small; and the set of samples may collapse to a single (wrong) peak. To overcome these difficulties several solutions have been presented, such as the Scatter Search Particle Filter [13] and the Kernel Particle Filter [14]. Still, particle filters remain an expensive solution [15]. The matching between the estimated features and the observations detected after a sensing operation is determined using data association techniques. Data association algorithms may include a hypothesis-validation step, which may be based on the Mahalanobis distance (MD) and its validation through the $\chi^2$ distribution.
The MD is commonly used for associating data in tracking systems because it gives good performance, but few approaches have tried to speed up this procedure. In [16] a method is presented for tracking and identifying moving persons from video images taken by a fixed field-of-view camera, and the MD is used to classify pixels as either foreground or background. To speed up the calculation only the diagonal of the covariance matrix is used in the MD calculation. This corresponds to the same simplification we use, but applied in a different tracking approach. In contrast, the MD and its simplifications are extensively used for pattern recognition. As the computation time of the MD is $O(n^2)$ for n-dimensional feature vectors, several approximations of the MD have been proposed to reduce the computational cost: the quasi-Mahalanobis distance [17], the modified Mahalanobis distance [18], and the modified quadratic discriminant function [19]. All these approaches were proposed to improve recognition accuracy but not to approximate the quadratic discriminant function. On the other hand, many tracking approaches suppose that features are permanent and only temporary occlusion is considered. In [20] human motion is captured with infra-red computer aided gait analysis systems, and accurate estimation of marker positions plays an important role in estimating the different joint angles in gait analysis. During occlusion the 3D coordinates of each marker are predicted and interpolated
in each frame with Radial Basis Function Neural Networks. Occlusion among multiple humans is handled in [21] with the Extended Kalman Filter (EKF) for trajectory estimation, but the system expects that merged blobs will eventually separate in the next frames. The approach proposed in [22] also builds on the idea of object permanence, and long period tracking is performed at two levels: at the region level, a customized Genetic Algorithm (GA) is used to search for optimal region tracks; at the object level, each object is located based on adaptive appearance models, spatial distributions and inter-occlusion relationships. Few approaches consider the case of definitive disappearance of tracked features from the image scene. Yet the definitive disappearance of features is not unusual; for instance, in a surveillance system people may move on to another compartment. The simplest approaches keep tracking features during some previously defined number of frames or simply discard each feature when it is not detected. In [7] when occlusion occurs the object follows the prediction of the KF, and if it is occluded for a predetermined number of frames, then it is discarded. To increase the robustness of tracking people in video sequences a prediction model is also used in [23] to track during occlusion. In [6] a two layer solution is used to deal with total occlusions as well as group merging and splitting. The first layer produces a set of spatio-temporal strokes based on low level operations. The second layer performs a consistent labeling of detected segments using a statistical model based on BN; if an object is not detected during some instances of time then that label ceases. In all these approaches the number of instances during which the tracking of missing features is maintained is user specified, and no guarantee is given on the correctness and adequacy of that choice.
In [5] the Pfinder tracks people without explicit reasoning about occlusions: blobs are deleted when occluded and added again after the occlusion. With this approach data is discarded whenever a feature is not detected; so in the case of noisy or cluttered images this approach can lead to the loss of valuable information. All of these approaches discard features independently of their previous behavior (e.g. it does not matter if they are frequently not detected), and the total computational cost is not taken into account (the decision does not depend on the number of tracked features).
B. Our Approach
To track the movement of features we used the KF [24], [10], and combined it with optimization techniques for data association in order to increase the filter's robustness to occlusion and non-linear movement. The correspondence between each feature's prediction and the new measurement data is based on the MD, which can be used to evaluate the quality of correspondences, and the actual set of matches between predictions and measurements is obtained by optimizing the sum of all the involved MDs [25]. By doing so, the best global correspondence set is guaranteed. To simplify this correspondence process between features we propose the use of an efficient approximation of the MD. The combination of the KF with the
optimization of correspondences allows good tracking results even if the KF restrictions are not satisfied (a frequent situation in many tracking applications). In this paper we also use a management model which can deal with the appearance, occlusion and disappearance of tracked features. It manages the decision to keep on tracking each feature, taking into account its previous behavior. Thus, features which keep on appearing in the image scene will obviously continue to be tracked; but if no measurement data has been introduced for a feature in the previous frames, then its tracking might cease. In our previous work we had also used a management model which associated a confidence value to each tracked feature in a frame [25], [26]. While tracking a feature, if its predicted state had been updated with a new measurement, then its confidence value would increase, otherwise it would decrease. However, in that earlier management model the decision to keep tracking a feature only considered the number of frames during which it appeared and disappeared. The management model we use now is based on an investment model from economics, the Net Present Value (NPV) [27]. We use it because many resemblances may be found between evaluating projects for investment and choosing to keep tracking missing features (for instance, projects are managed according to their own specificities and to the global market situation, while missing features can be managed according to each feature's behavior and to the global tracking system performance). The NPV method validates the use of cost functions to deal with the tracking of missing features. Thereby, to keep on tracking missing features we consider the number of tracked features and the quality of the previous matching and tracking results. The simplicity of our approach allows efficient and robust tracking results, with the computational cost being reduced to what is strictly necessary.
C. Overview
This paper is organized as follows.
In the next section a brief introduction to the KF is given. In Section III, we describe our solution to the correspondence problem by using an optimization technique with an efficient approximation of the MD. In Section IV we explain how the new NPV management model deals with the tracked features, as well as their appearance and disappearance in the image sequence. Then some experimental results are shown on synthetic and real movement sequences. In the last section some conclusions are drawn.
II. THE KALMAN FILTER
The KF is an optimal recursive Bayesian stochastic method. It provides optimal estimates that minimize the mean squared error of the modeled process. From a Bayesian stochastic viewpoint, the filter propagates the conditional probability density of the system state, conditioned on the knowledge of the actual data acquired with the measuring devices. The equations for the KF fall into two steps: time update (or prediction) and measurement update (or correction). The
time update equations are responsible for projecting forward (in time) the current state and error covariance to obtain the a priori estimates for the next time step. The measurement update equations deal with the feedback, that is, new measurements are incorporated into the a priori estimates to obtain improved a posteriori values [24]. The prediction step is based on the Chapman-Kolmogorov equation for a first order Markov process:
$$x_t^- = \Phi x_{t-1}^+ \tag{1}$$
where $\Phi$ relates the system state $x_{t-1}^+$ at the previous time step $t-1$ to the state $x_t^-$ at the current step $t$. The superscripts $+$ and $-$ indicate whether measurement data have or have not been incorporated, respectively. The related uncertainty is given by:
$$P_t^- = \Phi P_{t-1}^+ \Phi^T + Q \tag{2}$$
where $P$ is the prediction covariance matrix and $Q$ models the process noise. The correction equations that update the predicted estimates upon the incorporation of new measurements $u_t$ are given by:
$$K_t = P_t^- H^T \left[ H P_t^- H^T + R_t \right]^{-1} \tag{3}$$
$$x_t^+ = x_t^- + K_t \left[ u_t - H x_t^- \right] \tag{4}$$
$$P_t^+ = \left[ I - K_t H \right] P_t^- \tag{5}$$
where K is chosen to be the gain that minimizes the a posteriori error covariance, H performs the coordinate transformation between the predicted and the measurement spaces, Rt is the measurement noise involved, and I is the identity matrix [24], [10]. In our tracking system each feature's state is composed of its position, as well as its velocity and acceleration; the captured measurements are composed of their position coordinates; and each of the tracked features has its own KF. One of the drawbacks of the KF is the restrictive assumption of Gaussian posterior density functions at every time step, as many tracking problems involve non-linear movement. In the next section we will present our solution to overcome this problem.
III. CORRESPONDENCE WITH AN OPTIMIZATION TECHNIQUE AND AN EFFICIENT APPROXIMATION OF THE MAHALANOBIS DISTANCE
To associate the newly acquired measurement data with the previously tracked features at each step of the KF, a criterion of correspondence (matching) must be used. Thus, for each feature it is assumed there may exist at most one new measurement to correct its predicted state. In the usual Kalman approach the search area for each feature's position in the image plane is given by an ellipse centered on the previously predicted position, whose axes are determined by the eigenvectors of the reduced covariance matrix, and whose radii are given by the associated eigenvalues [26]. As the filter converges, better estimates will be given and the search areas will successively decrease, so the computational cost due to the correspondences decreases [26].
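As an illustration only, the prediction and correction steps of equations (1) to (5), together with the search ellipse described above, can be sketched in Python. The sketch assumes a constant-acceleration transition matrix with a unit frame interval and illustrative values for Q, R and the initial covariance; none of these values, nor the helper names (make_model, predict, correct, search_ellipse), come from the original system.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def inv2(S):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def make_model(dt=1.0):
    """Assumed constant-acceleration model; state = [px, py, vx, vy, ax, ay]^T."""
    Phi = identity(6)
    for i in range(2):
        Phi[i][i + 2] = dt             # position <- velocity
        Phi[i][i + 4] = 0.5 * dt * dt  # position <- acceleration
        Phi[i + 2][i + 4] = dt         # velocity <- acceleration
    H = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]]  # only the position is measured
    return Phi, H

def predict(x, P, Phi, Q):
    """Time update, equations (1) and (2)."""
    x_prior = matmul(Phi, x)
    P_prior = add(matmul(matmul(Phi, P), transpose(Phi)), Q)
    return x_prior, P_prior

def correct(x_prior, P_prior, u, H, R):
    """Measurement update, equations (3) to (5)."""
    Ht = transpose(H)
    S = add(matmul(matmul(H, P_prior), Ht), R)                      # H P H^T + R
    K = matmul(matmul(P_prior, Ht), inv2(S))                        # eq. (3)
    x_post = add(x_prior, matmul(K, sub(u, matmul(H, x_prior))))    # eq. (4)
    P_post = matmul(sub(identity(len(P_prior)), matmul(K, H)), P_prior)  # eq. (5)
    return x_post, P_post

def search_ellipse(P):
    """Eigenvalues and major-axis angle of the 2x2 position block of P."""
    a, b, c = P[0][0], P[0][1], P[1][1]
    mean, half = 0.5 * (a + c), math.hypot(0.5 * (a - c), b)
    angle = 0.5 * math.atan2(2.0 * b, a - c)
    return mean + half, mean - half, angle

# Example: a feature moving 2 pixels per frame along x, observed without noise.
Phi, H = make_model()
x = [[0.0] for _ in range(6)]
P = [[100.0 * v for v in row] for row in identity(6)]  # vague initial state
Q = [[0.01 * v for v in row] for row in identity(6)]   # illustrative process noise
R = [[1.0, 0.0], [0.0, 1.0]]                           # illustrative measurement noise
for t in range(1, 11):
    x, P = predict(x, P, Phi, Q)
    x, P = correct(x, P, [[2.0 * t], [0.0]], H, R)
major, minor, angle = search_ellipse(P)
```

After a few frames the two eigenvalues returned by search_ellipse shrink well below their initial value, which mirrors the decreasing search areas mentioned above.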
However, this usual approach has some drawbacks: there may not exist any measurement in the search area; there might be several features in the same area; and even if there is only one correspondence for each feature, there is no guarantee that the best global set of correspondences has been achieved. Further on, we will outline an approach that overcomes such ambiguities: an optimization technique is used to obtain the best global association between the previous predictions and the actual set of measurements; and the cost of each correspondence is given by an efficient approximation of the MD. To find the best global set of correspondences between the filters' predictions and the newly captured measurements, several optimization methods could be used. In our work, we use the Simplex algorithm. It is a widespread iterative algebraic procedure used to determine at least one optimal solution for each problem [28]. As a linear optimization method, the Simplex algorithm optimizes a function which is subject to some restrictions. In the case of movement tracking, we wish to minimize the global cost of the association between the set of captured measurements and the feature estimates given by the KF. To do so it should be noticed that for each estimate there will be at most one measurement, and that each new measurement will thereby correspond to a feature's position. So, we use the assignment formulation of the Simplex algorithm [28]. To evaluate correspondences we use a cost function based on the MD, also known as statistical distance, which is a distance that for each of its components (the variables) takes the variability of that variable into account when determining its distance from the center. So, components with high variability receive less weight than components with low variability. This is done by rescaling the components.
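A minimal sketch of the global correspondence step follows. It is not the Simplex-based assignment formulation used in this work: for brevity it enumerates all one-to-one assignments by brute force, which reaches the same global optimum but is only practical for a handful of features. The function names and the cost values are hypothetical, and a nearest-first (greedy) matcher is included only to show why a global criterion matters.

```python
from itertools import permutations

def best_global_assignment(cost):
    """Return (total, pairs) minimizing the summed cost.

    cost[j][i] is the cost of matching prediction j with measurement i.
    Each prediction and each measurement is used at most once; when the two
    sets differ in size, the surplus items stay unmatched.
    """
    n_pred = len(cost)
    n_meas = len(cost[0]) if cost else 0
    best_total, best_pairs = float("inf"), []
    if n_meas <= n_pred:
        # Pick a distinct prediction for every measurement.
        candidates = ([(j, i) for i, j in enumerate(p)]
                      for p in permutations(range(n_pred), n_meas))
    else:
        # More measurements than predictions: pick a distinct measurement
        # for every prediction (the remaining ones may start new features).
        candidates = ([(j, i) for j, i in enumerate(p)]
                      for p in permutations(range(n_meas), n_pred))
    for pairs in candidates:
        total = sum(cost[j][i] for j, i in pairs)
        if total < best_total:
            best_total, best_pairs = total, sorted(pairs)
    return best_total, best_pairs

def greedy_assignment(cost):
    """Nearest-first matching, shown only for comparison."""
    entries = sorted((c, j, i) for j, row in enumerate(cost) for i, c in enumerate(row))
    used_pred, used_meas, total, pairs = set(), set(), 0.0, []
    for c, j, i in entries:
        if j not in used_pred and i not in used_meas:
            used_pred.add(j)
            used_meas.add(i)
            total += c
            pairs.append((j, i))
    return total, sorted(pairs)

# Hypothetical costs: rows are predictions, columns are measurements.
cost = [[1.0, 2.0],
        [1.5, 9.0]]
global_total, global_pairs = best_global_assignment(cost)  # total 3.5
greedy_total, greedy_pairs = greedy_assignment(cost)       # total 10.0
```

In this example the greedy matcher takes the cheapest pair first and is then forced into a very poor second match, whereas the global criterion accepts a slightly worse first pair to obtain a much lower total cost.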
The MD is a standard manner of associating data for tracking features in image sequences, and for two points $X_i = (x_{1i}, x_{2i}, \ldots, x_{ni})$ and $Y_j = (y_{1j}, y_{2j}, \ldots, y_{nj})$ it is given by:
$$d_M = \sqrt{(X_i - Y_j)^T C^{-1} (X_i - Y_j)} \tag{6}$$
where $C_{(n \times n)}$ is a non-singular covariance matrix (therefore symmetric positive definite). However, the calculation of the MD is one of the most time-consuming operations of the matching process. After a sensing operation, $M$ feature location estimates and $N$ measurements are available. The problem is how to associate each measurement, $X_i$ $(i = 1, \ldots, N)$, with a feature estimate, $Y_j$ $(j = 1, \ldots, M)$. This matching procedure is time-consuming because a matrix inversion is required, as well as the computation of matrix $C$ and vector $v = X_i - Y_j$, which involves linearizations. To save computational cost some data-association techniques perform a validation test for each correspondence hypothesis in order to work with only a reduced set of validated hypotheses. The validation is performed using a statistical test based on the MD and its approximation by the $\chi^2$ distribution:
$$v^T C^{-1} v \leq \chi^2 \tag{7}$$
where $v$ is the vector between a predicted feature state and an acquired measurement. This test should theoretically be
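computed for $M \times N$ hypotheses. In our tracking system the captured measurements are composed of their position coordinates in the image plane, so $C$ and $v$ are given by:
$$C = \begin{bmatrix} c_{11} & c_{12} \\ c_{12} & c_{22} \end{bmatrix} \tag{8}$$
and
$$v = \begin{bmatrix} v_1 & v_2 \end{bmatrix}^T. \tag{9}$$
So the MD in this case is given by:
$$d_M = \frac{c_{22} v_1^2 - 2 c_{12} v_1 v_2 + c_{11} v_2^2}{c_{11} c_{22} - c_{12}^2}. \tag{10}$$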
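For the two-dimensional case, the closed form of equation (10) can be compared numerically with the diagonal-only simplification mentioned in Section I-A, in which only the variances $c_{11}$ and $c_{22}$ are kept. The sketch below is illustrative: the function names are hypothetical, and the gate value is an assumption (the 95% quantile of the $\chi^2$ distribution with 2 degrees of freedom), not a value taken from this work.

```python
def md_exact(v1, v2, c11, c12, c22):
    """Quadratic form v^T C^{-1} v for a 2x2 covariance, as in equation (10)."""
    det = c11 * c22 - c12 * c12
    if c11 <= 0.0 or det <= 0.0:
        raise ValueError("C must be symmetric positive definite")
    return (c22 * v1 * v1 - 2.0 * c12 * v1 * v2 + c11 * v2 * v2) / det

def md_diagonal(v1, v2, c11, c22):
    """Diagonal-only simplification: the covariance term c12 is ignored."""
    return v1 * v1 / c11 + v2 * v2 / c22

# Assumed gate: 95% quantile of chi-square with 2 degrees of freedom.
CHI2_2DOF_95 = 5.991

def passes_gate(distance, threshold=CHI2_2DOF_95):
    """Validation test in the style of equation (7)."""
    return distance <= threshold

# Hypothetical innovation and covariance values.
v1, v2 = 3.0, 1.0
exact_uncorrelated = md_exact(v1, v2, 4.0, 0.0, 2.0)   # c12 = 0
exact_correlated = md_exact(v1, v2, 4.0, 0.5, 2.0)     # c12 = 0.5
approx = md_diagonal(v1, v2, 4.0, 2.0)
error = exact_correlated - approx                      # vanishes when c12 = 0
```

When the off-diagonal term is zero the two values coincide; otherwise the difference grows with the magnitude of $c_{12}$, so the simplification is best suited to weakly correlated position components.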
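Rearranging the terms in the equation above we can obtain
$$d_M = \frac{v_1^2}{c_{11}} + \frac{v_2^2}{c_{22}} - \frac{2 c_{12} v_1 v_2}{c_{11} c_{22}} + \frac{c_{12}^2 c_{22} v_1^2 + c_{12}^2 c_{11} v_2^2 - 2 c_{12}^3 v_1 v_2}{c_{11} c_{22} \left( c_{11} c_{22} - c_{12}^2 \right)}. \tag{11}$$
So, the efficient approximation of the MD that we propose consists of:
$$d_M \approx \hat{d}_M = \frac{v_1^2}{c_{11}} + \frac{v_2^2}{c_{22}} \tag{12}$$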
which is thereby affected by an error, $\Delta \hat{d}_M$, of:
$$\Delta \hat{d}_M = -\frac{2 c_{12} v_1 v_2}{c_{11} c_{22}} + \frac{c_{12}^2 c_{22} v_1^2 + c_{12}^2 c_{11} v_2^2 - 2 c_{12}^3 v_1 v_2}{c_{11} c_{22} \left( c_{11} c_{22} - c_{12}^2 \right)}. \tag{13}$$
Although better approximations of the MD can be used, their computational cost can be questioned; on the other hand, the approximation that we propose is quite efficient, as it only involves 5 arithmetic operations for each pair of features instead of the 18 operations involved in the exact calculation of the MD with equation (10).
IV. A NPV MANAGEMENT MODEL TO DEAL WITH MISSING FEATURES
To accept or reject the tracking process of each feature, according to the NPV method, a cost function is used. In the case of tracking features with the KF and global optimization of the MD, several items can be taken into account to estimate the cost and revenue of each tracked feature:
• the number of tracked features - if the computational hardware is overloaded with the tracking of many features, then it would be a good idea to discard useless data as soon as possible to free computational resources; however, if a small number of features are being tracked then the tracking process is not a computational burden, and it may not be necessary to discard features so easily;
• the predicted state uncertainty - if there is little certainty about a feature's state, then it may not be necessary to keep tracking it, as we do not have reliable results; on the other hand, if the obtained estimates are very reliable, then it may be convenient to keep them;
• the MD of the previous correspondences - if a feature diverges more from the filter's estimate, then it may not be a trustworthy result and can be discarded more easily; if it has small divergence from the predicted position, then
it might be kept for tracking because the filter's results are usually quite accurate.
So, we compiled this information into the expected net cash receipt and the initial investment outlay functions, which are used to evaluate whether a feature should continue to be tracked. We propose to evaluate the initial investment outlay for each feature $i$ by:
$$I_{0i} = \frac{\sum_{t=0}^{m} k_t \, c_{ti} \, d_{ti}}{m} \tag{14}$$
where $k_t$ is the number of tracked features at time instant (image frame) $t$, $c_{ti}$ is the sum of the main entries in the filter's reduced prediction covariance matrix, $d_{ti}$ is the efficient approximation to the MD used for data association of that feature, and $m$ is the number of frames/time instants during which the feature has been successfully tracked. For the expected net cash receipt of feature $i$ at time instant $t$ we propose:
$$S_{ti} = k_t \, c_{ti} \, d'_{ti} \tag{15}$$
where $d'_{ti}$ is the efficient approximation to the MD of the last data association for that feature in the previous time instants. Thus, if $I_{0i}$ corresponds to an average cost of tracking that feature, we would only choose to keep tracking it, according to the NPV method, if the current revenue is lower than the average cost. To employ the NPV method, the user should specify the discount rate, which corresponds to a measure, defined by the implementation user, of how vulnerable the system should be to tracking features without any need of rejecting them incorrectly.
V. EXPERIMENTAL RESULTS
In each frame of the presented examples the predicted position is represented with a +, with the uncertainty area circumscribed by a solid ellipse; each measurement is the center of the detected contour; and the corrected position is represented by a x. The association between each prediction and measurement is represented with a solid segment. For the first example, Figure 1, consider a synthetic sequence of 9 frames. In the beginning of the sequence only two blobs are visible.
The circular blob disappears definitively, but the tracking approach keeps on trying to track it during the subsequent frames, although with gradually higher uncertainty (in frame (e) the uncertainty region extends beyond the image border). In the second frame a triangular blob appears, and in the third frame the square blob disappears momentarily. In the fourth frame the captured blobs overlap, and with the image processing techniques used only one measurement is captured and associated with a blob, but both features continue to be correctly tracked. From the seventh frame onwards 25 blobs are tracked. The results presented in Figure 1 were obtained by data association with the efficient MD; there is no visible difference from those obtained with the usual MD calculation, while the computational cost associated with the efficient MD is lower (Figure 2). Indeed, when few features are tracked the computational load of the MD is not significant, but as the number of tracked features increases the advantages of the efficient MD become more notable.
What is more advantageous to our tracking application is that the efficient approximation of the MD can efficiently and adequately sort out the features to build the correct correspondences along the image sequences; only in some rare cases may the data association differ between the usual MD and its efficient approximation, and usually the tracking process recovers correctly in the next frames. For the next example consider a real image sequence of 547 frames with 3 mice in a lab (figures 3 and 4). Several difficulties are associated with tracking the center of the mice's bodies in the captured images. One of them is the fast movement of the mice, as they may go back and forth, drastically changing their movement direction (figure 3), or may move quickly along some direction (figure 4). This nonlinear movement, which is not handled by the usual KF approach, can give rise to differences of up to 45 pixels (in 320x240 images) between the predicted estimates and each associated measurement (both in xx and yy components), figure 5. The results for component yy are not presented because they are analogous. The noise series is due to a noisy measurement which was momentarily captured in frame 293; it was not validated with new measurements, so it was discarded by the management model. On the other hand, the proposed approach recovers from such discrepancies, as in figure 5 the relative maxima are quite often followed by relatively low values (below 10 pixels). For the next example consider a sequence of real images of an outdoor campus scene from the PETS (Performance Evaluation of Tracking and Surveillance) 2001 datasets. In the presented sequence a person will be partially occluded by a car. As they approach each other, the processing techniques used only detect one region of motion and the only captured measurement is attributed to the car.
However, the person's tracking is maintained during 3 frames if an internal rate of 0.02 is used (for higher internal rates it would be tracked longer; for instance, an internal rate of 4.0 would keep tracking during 6 frames). When the person and the car split into different regions of motion the tracking of the person is initialized as a new feature to track. This example shows how the NPV model manages the missing features, by ceasing or continuing their tracking according to their historical behavior and to the global tracking process.
VI. CONCLUSION
In this paper we have proposed a methodology to robustly and efficiently track moving features in image sequences. To do so, we used a KF which is able to predict and correct the tracked features' position, as well as their velocity and acceleration. The correct incorporation of the data captured in each image frame was done with an optimization technique, which matches features globally in a new manner using an efficient approximation of the MD. We also use a new NPV management model to deal with features that disappear from the image scene or are not visible during some instances of time. According to the NPV method a feature should continue to be tracked if its revenues are higher than its costs, for a user determined discount rate which depends on the application.
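The cost functions of Section IV, equations (14) and (15), can be sketched as follows. Only the two functions mirror the equations; the decision rule in keep_tracking is an assumption made for illustration (receipts over the missing frames are discounted at the user-specified rate and compared with the outlay, keeping the feature while they stay below it), and all names and numbers are hypothetical. The divisor in initial_outlay is taken to be the number of history entries, which is one reading of m in equation (14).

```python
def initial_outlay(history):
    """Equation (14): mean of k_t * c_ti * d_ti over the tracked frames.

    history is a list of (k_t, c_ti, d_ti) tuples, one per frame in which
    the feature was successfully tracked.
    """
    m = len(history)
    if m == 0:
        raise ValueError("the feature must have been tracked at least once")
    return sum(k * c * d for k, c, d in history) / m

def expected_receipt(k_t, c_ti, d_last):
    """Equation (15): k_t * c_ti * d'_ti, with d'_ti the last association MD."""
    return k_t * c_ti * d_last

def keep_tracking(history, missing, d_last, rate):
    """ASSUMED rule, not taken from the paper: discount the receipts of the
    frames in which the feature was missing and keep tracking while their
    sum stays below the initial outlay.

    missing is a list of (k_t, c_ti) tuples for the frames without measurement.
    """
    outlay = initial_outlay(history)
    discounted = sum(expected_receipt(k, c, d_last) / (1.0 + rate) ** (n + 1)
                     for n, (k, c) in enumerate(missing))
    return discounted < outlay

# Hypothetical feature: tracked for two frames, then missing for one frame.
history = [(2, 1.0, 0.5), (2, 1.0, 1.5)]
missing = [(2, 1.5)]
drop_at_low_rate = not keep_tracking(history, missing, d_last=1.5, rate=0.02)
keep_at_high_rate = keep_tracking(history, missing, d_last=1.5, rate=4.0)
```

Under this assumed rule a higher rate keeps a missing feature for longer, which agrees qualitatively with the behavior reported for the PETS sequence, but the exact expression used in the experiments may differ.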
Fig. 2: Data association time while using the efficient MD or the usual MD on a Mobile AMD Athlon(tm) 4 at 1.20 GHz and 256 MB RAM.
The proposed revenue/cost functions for tracking features are based on the results obtained from our tracking approach: the prediction covariance values from the KF, the efficient approximation of the MD used for the data association, and the number of features being tracked. We showed that with the proposed approach, features can be tracked correctly even in the case of quick changes of direction, as well as in cases of temporary or permanent occlusion.
ACKNOWLEDGMENTS
We would like to thank Prof. Hemerson Pistori and Prof. João Bosco Monteiro (Dom Bosco Catholic University - Brazil) for kindly sharing the mice images presented in the first experimental example of this paper. This work was partially done in the scope of the project Segmentation, Tracking and Motion Analysis of Deformable (2D/3D) Objects using Physical Principles, reference POSC/EEA-SRI/55386/2004, financially supported by FCT - Fundação para a Ciência e a Tecnologia from Portugal. The first author would like to acknowledge the support of the PhD grant SFRH/BD/12834/2003 of the FCT.
REFERENCES
Fig. 1: Tracking blobs along a 9 frame image sequence: (a) original 1st frame; (b)-(i) KF results: search area defined by solid ellipses, the predicted position for each marker is given by a +, and the corrected position is represented with a x.
[1] A. Azarbayejani, C. Wren, and A. Pentland, “Real-time 3d tracking of the human body,” in IMAGE’COM 96 - International Conference Communicating by Image and Multimedia, Bordeaux, France, 1996.
[2] C. W. Chen, T. Huang, and M. Arrott, “Modeling, analysis, and visualization of left ventricle shape and motion by hierarchical decomposition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 4, pp. 342–356, 1994.
[3] R. Cucchiara, C. Grana, M. Piccardi, and A. Prati, “Statistic and knowledge-based moving object detection in traffic scenes,” in IEEE Conference on Intelligent Transportation Systems, Dearborn, USA, 2000.
[4] A. Feldman and T. Balch, “Automatic identification of bee movement using human trainable models of behaviour,” in International Conference on the Mathematics and Algorithms of Social Insects, Atlanta, USA, 2003.
[5] C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland, “Pfinder: Real-time tracking of the human body,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 780–785, 1997.
Fig. 3: Tracking mice in a lab environment during 547 frames: (a)-(f) significant changes in movement direction can be tracked correctly.
Fig. 4: Tracking mice in a lab during 547 frames: quick movement can be tracked correctly.
[6] P. Jorge, A. Abrantes, and J. Marques, “On-line tracking groups of pedestrians with bayesian networks,” in Workshop PETS, Prague, Czech Republic, 2004.
[7] T. Zhao and R. Nevatia, “Tracking multiple humans in complex situations,” IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 1208–1221, 2004.
[8] L. Lin and Y. Bar-Shalom, “New assignment-based data association for tracking move-stop-move targets,” IEEE Transactions on Aerospace and Electronic Systems, vol. 40, no. 2, pp. 714–725, 2004.
[9] A. Almeida, J. Almeida, and R. Araujo, “Real-time tracking of multiple moving objects using particle filters and probabilistic data association,” Automatika, vol. 46, no. 1-2, pp. 39–48, 2005.
[10] M. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A tutorial on particle filters for online nonlinear/non-gaussian bayesian tracking,” IEEE Transactions on Signal Processing, vol. 50, no. 2, pp. 174–188, 2002.
[11] P. Maybeck, Stochastic Models, Estimation, and Control. Mathematics In Science and Engineering, 1979, vol. 141.
[12] A. Blake and M. Isard, Active contours: the application of techniques from graphics, vision, control theory and statistics to visual tracking of
(b)
Fig. 5: Tracking mice in a lab environment during 547 frames: (a) - differences between predicted positions and associated measurements in xx component; (b) - Enlargement of (a) between frames 100 and 120. figure
shapes in motion. London; New York: Springer, 1998. [13] J. Pantrigo, A. Sanchez, K. Gianikellis, and A. Montemayor, “Combining particle filter and population-based metaheuristics for visual articulated motion tracking,” Electronic Letters on Computer Vision and Image Analysis, vol. 5, no. 3, pp. 68–83, 2005. [14] C. Chang and R. Ansari, “Kernel particle filter for visual tracking,” IEEE Signal Processing Letters, vol. 12, no. 3, pp. 242–245, 2005. [15] T. Petrie, “Tracking bouncing balls using kalman filters and condensation,” University of Colorado, Tech. Rep., 2004. [16] P. O’Malley, M. Nechyba, and A. Arroyo, “Human activity tracking for wide-area surveillance,” in 2002 Florida Conference on Recent Advances in Robotics, Miami, USA, 2002. [17] M. Kurita, S. Tsuruoka, S. Yokoi, and Y. Miyake, “Handprinted ”kanji” and ”hiragana” character recognition using weighting direction index histograms and quasi-mahalanobis distance,” IEICE PRL82-79, Tech. Rep., 1983. [18] N. Kato, M. Abe, and Y. Nemoto, “A handwritten character recognition system using modified mahalanobis distance,” Transactions Institute of Electronics, Information and Communication Engineers (IEICE), vol. J79-D, no. 1, pp. 45–52, 1996. [19] F. Kimura and M. Shridhar, “Handwritten numerical recognition based on multiple algorithms,” Pattern Recognition, vol. 24, pp. 969–983, 1991. [20] H. Lakany, G. Hayes, M. Hazlewood, and S. Hillman, “Human walking: Tracking and analysis,” in IEEE Electronic & Communications, Colloquium on Motion Analysis and Tracking, vol. 5, 1999, pp. 1–14. [21] R. Rosales and S. Sclaroff, “Improved tracking of multiple humans with trajectory prediction and occlusion modelling,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Workshop on the Interpretation of Visual Motion, Santa Barbara, USA, 1998. [22] Y. Huang and I. 
Essa, “Tracking multiple objects through occlusions,” in IEEE Conference on Computer Vision and Pattern Recognition 2005 (CVPR 2005), vol. II, San Diego, USA, 2005, pp. 1051–1058. [23] R. Plnkers and P. Fua, “Articulated soft objects for video-based body
8
(a)
(b)
(c)
(d)
(e)
(f)
Fig. 6: Tracking a person and a car during partial occlusion: Until frames (a)-(b) the person and the car are tracked separately; In (c)-(e) their motion regions overlap, and the only captured measurement is assigned to the car, so the persons tracking is done with higher uncertainty until it is ceased; In (f) features are tracked independently. figure
[24] [25]
[26] [27]
[28]
modelling,” in International Conference on Computer Vision and Pattern Recognition, Vancouver, Canada, 2001, pp. 394–401. G. Welch and G. Bishop, “An introduction to kalman filter,” Technical Report, University of North Carolina at Chapel Hill, 1995. R. Pinho, J. Tavares, and M. Correia, “Human movement tracking and analysis with kalman filtering and global optimization techniques,” in II International Conference On Computational Bioengineering, Lisbon, Portugal, 2005. J. Tavares and A. Padilha, “Matching lines in image sequences with geometric constraints,” in 7 Congresso Portuguˆes de Reconhecimento de Padr˜oes, Aveiro, Portugal, 1995. R. Pinho, M. Correia, and J. Tavares, “An improved management model for tracking multiple features in long image sequences,” in 6th WSEAS Int. Conf. on Signal Processing, Computational Geometry Artificial Vision (ISCGAV’06), Crete; Greece, 2006. F. Hillier and G. Lieberman, Introduction to Operations Research. McGraw-Hill International Editions, 2001.
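The behaviour described in the captions of Fig. 5 and Fig. 6 can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a 1-D constant-velocity Kalman filter with one frame per step. The class name `Track`, the noise variances `q` and `r`, and the exclusion threshold `max_missed` are all hypothetical choices made for illustration; the paper's management model decides exclusion from a feature's historical behaviour, which is only approximated here by a count of consecutive missed measurements. When a measurement is available, the method returns the difference between the measurement and the predicted position (the quantity plotted in Fig. 5). When the feature is occluded, only the prediction is applied, so the position variance grows (as in Fig. 6 (c)-(e)).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Track:
    """1-D constant-velocity Kalman track (illustrative sketch only)."""
    pos: float
    vel: float = 0.0
    # Entries of the symmetric covariance [[p_pp, p_pv], [p_pv, p_vv]].
    p_pp: float = 1.0
    p_pv: float = 0.0
    p_vv: float = 1.0
    q: float = 0.01        # assumed process-noise variance
    r: float = 0.25        # assumed measurement-noise variance
    missed: int = 0        # consecutive frames without a measurement
    max_missed: int = 3    # hypothetical exclusion threshold
    alive: bool = True

    def step(self, z: Optional[float]) -> Optional[float]:
        """Advance one frame; return measurement minus prediction, or None."""
        # Prediction with F = [[1, 1], [0, 1]] and Q = q * I.
        self.pos += self.vel
        self.p_pp += 2.0 * self.p_pv + self.p_vv + self.q
        self.p_pv += self.p_vv
        self.p_vv += self.q

        if z is None:
            # Occluded feature: keep the prediction and count the miss.
            self.missed += 1
            if self.missed > self.max_missed:
                self.alive = False
            return None

        # Correction with H = [1, 0] (only the position is measured).
        residual = z - self.pos
        s = self.p_pp + self.r
        k_pos, k_vel = self.p_pp / s, self.p_pv / s
        self.pos += k_pos * residual
        self.vel += k_vel * residual
        self.p_vv -= k_vel * self.p_pv
        self.p_pv *= 1.0 - k_pos
        self.p_pp *= 1.0 - k_pos
        self.missed = 0
        return residual


if __name__ == "__main__":
    track = Track(pos=0.0, vel=1.0)
    for z in [1.1, 2.0, None, None, 5.2]:
        residual = track.step(z)
        print(z, residual, round(track.p_pp, 3), track.alive)
```

Printing `p_pp` frame by frame shows the position variance shrinking after each associated measurement and growing while the measurement is missing; a track that exceeds `max_missed` consecutive misses is marked as no longer alive.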