ICACSIS 2014

Tracking Efficiency Measurement of Dynamic Models on Geometric Particle Filter using KLD-Resampling

Alexander A S Gunawan1, Wisnu Jatmiko2, Vektor Dewanto2, F. Rachmadi3, F. Jovan4
1 Mathematics Department, School of Computer Science, Bina Nusantara University, Jakarta, Indonesia
2 Faculty of Computer Science, University of Indonesia, Depok, Indonesia
3 University of Edinburgh, 4 University of Birmingham
Email: [email protected], [email protected], [email protected]

Abstract: The particle filter has emerged as a useful tool for visual object tracking. Its efficiency depends mostly on the number of particles used in the estimation. This paper measures the efficiency of the particle filter via the Kullback–Leibler distance (KLD). The basis of the method is similar to Fox’s KLD-sampling, but it is implemented differently, using resampling. The benefit of this approach is that the underlying distribution is exactly the posterior distribution. By means of batch KLD-resampling, we measure the efficiency of several dynamic models by calculating the average number of samples needed. In experiments, we found that (i) the efficiency of the particle filter can be measured reliably enough using batch KLD-resampling, and (ii) dynamic models affect the efficiency of the particle filter, but their performance depends mostly on the particular case.

I. INTRODUCTION

Particle filter for visual object tracking is a well-known method that can cope with nonlinearity and non-Gaussianity in the transition model and the observation model. The particle filter is a practical tool for implementing dynamic state estimation in a Bayesian framework. The basic idea is to approximate the posterior distribution using a finite set of weighted samples, or particles, given a sequence of sensor measurements. Since the efficiency of the particle filter relies on the number of samples used for state estimation, some efforts have focused on designing more efficient particle filters and on how to measure their efficiency.

The typical implementation of the particle filter keeps the number of samples fixed during the entire process. This approach may be very inefficient, since the complexity of the tracking can vary significantly over time. Fox [1] proposed the KLD-sampling method for adapting the number of samples during the estimation process. KLD is an abbreviation of Kullback-Leibler distance, which is used to measure the approximation error of the particle filter representation. A recent development by Li et al. [2] modified it by introducing the KLD-resampling approach. The benefit of this approach over Fox’s KLD-sampling is that the underlying distribution is exactly the posterior distribution, while the latter uses the predictive belief.

One of the sources that influence the efficiency of the particle filter is the predictive power of the state dynamics in the transition model. In this paper, we use the batch KLD-resampling approach to measure the efficiency of several dynamic models by calculating the average number of samples needed to track the entire dataset. In visual tracking, it is well known that the states of the transition model evolve in a certain transformation space, which is not a vector space. Thus, a geometric approach is used to build the transition model so that the underlying geometry of the transformation space is taken into account precisely. Kwon et al. [3] called this approach the geometric particle filter. In the experiments, the supporting observation model is based on our previous research [4], which uses deep learning and the extreme learning machine (ELM).

The remainder of this paper is organized as follows. The next section outlines the geometric particle filter and discusses the different dynamic models used in the experiments. In Section III, we introduce the batch KLD-resampling approach for measuring the efficiency of several dynamic models on the geometric particle filter. Experiment results, which show the measurements on several datasets, are presented in Section IV. Finally, the last section presents the main conclusions of this work.

II. GEOMETRIC PARTICLE FILTER

The main purpose of visual object tracking is to estimate the posterior distribution $p(x_t \mid z_{1:t})$. It can be derived from the equations below, which consist of a prediction stage and an update stage [5]:

Prediction:
$$p(x_t \mid z_{1:t-1}) = \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid z_{1:t-1})\, dx_{t-1} \tag{1}$$

Update:
$$p(x_t \mid z_{1:t}) = \frac{p(z_t \mid x_t)\, p(x_t \mid z_{1:t-1})}{p(z_t \mid z_{1:t-1})} \tag{2}$$

The denominator in equation (2), $p(z_t \mid z_{1:t-1})$, is the normalization factor that ensures the posterior distribution integrates to 1, in order to satisfy the probability axioms. The integral in equation (1) is called the Chapman-Kolmogorov equation. The solution of the integral gives the predicted state $p(x_t \mid z_{1:t-1})$. After receiving the measurement at time $t$, the predicted state is corrected by the likelihood factor $p(z_t \mid x_t)$ in equation (2).

The particle filter was designed to numerically implement the recursive Bayesian scheme in equations (1) and (2); it approximates the posterior distribution $p(x_t \mid z_{1:t})$ using a finite set of weighted samples. The basic idea of the particle filter is Monte Carlo simulation [5], and it is frequently formulated as the sequential importance resampling (SIR) method. In the SIR method, the first step is to design the importance density $q(x_t \mid x_{t-1}, z_t)$. The importance density can be thought of as a scaled version of the posterior distribution with a different scaling factor at each $x_t$, but it should be chosen as a probability distribution function that is easy to sample from. Next, SIR draws particles from the importance density, such that particles of the state $x_t$ are obtained by predicting from the particles at time $t-1$ and the current measurement $z_t$. Given $n$ particles $x_{t-1}^{(1)}, \ldots, x_{t-1}^{(n)}$ and their weights $w_{t-1}^{(1)}, \ldots, w_{t-1}^{(n)}$, the particle weights at time $t$ can be calculated, up to normalization, by:

$$w_t^{(i)} \propto w_{t-1}^{(i)}\, \frac{p(x_t^{(i)} \mid x_{t-1}^{(i)})\, p(z_t \mid x_t^{(i)})}{q(x_t^{(i)} \mid x_{t-1}^{(i)}, z_t)}, \qquad i = 1, \ldots, n \tag{3}$$

and the posterior distribution is:

$$p(x_t \mid z_{1:t}) \approx \sum_{i=1}^{n} w_t^{(i)}\, \delta(x_t - x_t^{(i)}) \tag{4}$$

978-1-4799-8075-8/14/$31.00 © 2014 IEEE
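The weight update in (3) and the particle approximation in (4) can be illustrated with a minimal sketch for a scalar state. Everything model-specific here is an assumption made for illustration and is not taken from the paper: the transition is a Gaussian random walk, the likelihood is Gaussian, and the importance density $q$ is set equal to the transition prior (the bootstrap choice), so the ratio in (3) reduces to the likelihood. The function names `sir_step` and `run_demo` are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def sir_step(particles, weights, z, rng, motion_std=0.5, obs_std=1.0):
    """One SIR update for a scalar state (illustrative models, not the paper's)."""
    n = len(particles)
    # Draw x_t^(i) from the importance density; q is assumed to be the
    # transition prior, here a Gaussian random walk.
    predicted = [x + rng.gauss(0.0, motion_std) for x in particles]
    # Equation (3): with q equal to the transition prior, p/q cancels and each
    # weight is multiplied by the likelihood p(z_t | x_t^(i)).
    unnormalized = [w * gaussian_pdf(z, x, obs_std) for w, x in zip(weights, predicted)]
    total = sum(unnormalized)
    if total > 0.0:
        new_weights = [u / total for u in unnormalized]
    else:
        new_weights = [1.0 / n] * n  # degenerate case: fall back to uniform weights
    # Equation (4): the posterior is a weighted sum of point masses, so its
    # mean is the weighted average of the particles.
    estimate = sum(w * x for w, x in zip(new_weights, predicted))
    # Resampling: draw n particles in proportion to their weights, reset weights.
    resampled = rng.choices(predicted, weights=new_weights, k=n)
    return resampled, [1.0 / n] * n, estimate


def run_demo(seed=0, n=500, steps=10, z=3.0):
    """Track a constant measurement z starting from a broad uniform prior."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n)]
    weights = [1.0 / n] * n
    estimate = None
    for _ in range(steps):
        particles, weights, estimate = sir_step(particles, weights, z, rng)
    return estimate, weights
```

Calling `run_demo()` returns a posterior-mean estimate that settles near the repeated measurement. The number of particles `n` stays fixed here, which is exactly the inefficiency that adaptive schemes such as KLD-sampling and KLD-resampling address.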

A. State Dynamics in Geometric Transition Model

Recently, the observation that the hidden states of the transition model do not evolve in a vector space has encouraged the use of a geometric approach. Geometric computing facilitates building a transition model that takes the underlying geometry of the transformation space into account precisely. The transformation space is a curved space that possesses an interesting structure as a Lie group. The common approach to modeling the state dynamics is to regard the curved space as a vector space, choose a set of local coordinates, and apply existing vector space methods. This approach frequently produces results that depend on the chosen local coordinates, and the performance of such local coordinate-based approaches is unreliable, mainly in extreme and unusual cases. For this reason, we built the particle filter on Lie groups, mainly on the affine group.

A Lie group is an analytic manifold with a group structure such that the group operations are analytic. Lie groups arise in a natural way as transformation groups of geometric objects. The tangent space g at the identity element of a Lie group G has a rule of composition (X,Y)→[X,Y] derived from the bracket operation on the left invariant vector fields on G. The vector space g with this rule of composition is called the Lie algebra of G. The structures of g and G are related by the exponential mapping exp: g → G, which sends straight lines through the origin in g onto one-parameter subgroups of G (see Fig 1). The power of the Lie algebra comes from its linearity: the object moves along a geodesic without deforming its shape.

Fig 1. Relation of Lie group and Lie algebra (Source: opticalengineering.spiedigitallibrary.org)

In this paper, we focus on using the particle filter on the 2D affine transformation space Aff(2). Consider an object template coordinate point $p = (p_x, p_y)^T$. The 2D affine transformation of the object coordinates is implemented by multiplying its homogeneous coordinates $\tilde{p} = (p_x, p_y, 1)^T$ with a transformation matrix $A \in \mathrm{Aff}(2)$,

$$A = \begin{bmatrix} G & t \\ 0 & 1 \end{bmatrix} \tag{5}$$

where $G$ is an invertible 2×2 real matrix and $t \in \mathbb{R}^2$ is a translation vector. This matrix possesses an interesting structure as a Lie group and is called the 2D affine group Aff(2). The 2D affine group Aff(2) is associated with its Lie algebra aff(2), represented as:

$$a = \begin{bmatrix} U & v \\ 0 & 0 \end{bmatrix} \tag{6}$$

where $U$ is a 2×2 real matrix and $v \in \mathbb{R}^2$. A detailed explanation of Lie groups and Lie algebras can be found in [6]. Based on the geometric approach, the discretized state equation on the affine group [3] can be written as:

$$X_k = X_{k-1} \cdot \exp\left( A(X,t)\,\Delta t + dW_k \sqrt{\Delta t} \right) \tag{7}$$

where $dW_k$ represents the Wiener process noise on aff(2) with a covariance $P \in \mathbb{R}^{6 \times 6}$ as follows:

$$dW_k = \sum_{i=1}^{6} \varepsilon_{k,i} E_i \tag{8}$$

with $\varepsilon_k = (\varepsilon_{k,1}, \ldots, \varepsilon_{k,6})$ a six-dimensional Gaussian noise sampled from $N(0, P)$. $E_i$ are the basis elements of aff(2), represented by the following matrices:

$$E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix};\quad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix};\quad E_3 = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

$$E_4 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix};\quad E_5 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix};\quad E_6 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

The geometric transformation mode corresponding to each $E_i$ is shown in Fig 2.

Fig 2. The geometric transformation modes induced by basis elements Ei of aff(2) [7]
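One propagation step of the state equation (7) with the noise of (8) can be sketched as follows. This is a dependency-free illustration under stated assumptions, not the authors' implementation: the covariance $P$ is assumed diagonal (one standard deviation per basis element), and the matrix exponential is a truncated Taylor series, which is adequate only for small-norm arguments. The helper names (`mat_exp`, `sample_noise`, `propagate`) are hypothetical.

```python
import math
import random

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Basis elements E_1..E_6 of aff(2) as listed in the text.
BASIS = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 0]],   # E1: isotropic scaling
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],  # E2: stretch along x, shrink along y
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],  # E3: rotation
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],   # E4: shear
    [[0, 0, 1], [0, 0, 0], [0, 0, 0]],   # E5: translation in x
    [[0, 0, 0], [0, 0, 1], [0, 0, 0]],   # E6: translation in y
]


def mat_mul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def mat_exp(m, terms=30):
    """Matrix exponential by truncated Taylor series (assumes small-norm m)."""
    result = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for n in range(1, terms + 1):
        term = [[v / n for v in row] for row in mat_mul(term, m)]  # m^n / n!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


def sample_noise(sigmas, rng):
    """dW_k = sum_i eps_i E_i, assuming a diagonal covariance P = diag(sigmas^2)."""
    dw = [[0.0] * 3 for _ in range(3)]
    for sigma, basis in zip(sigmas, BASIS):
        eps = rng.gauss(0.0, sigma)
        for r in range(3):
            for c in range(3):
                dw[r][c] += eps * basis[r][c]
    return dw


def propagate(x_prev, a, sigmas, rng, dt=1.0):
    """Equation (7): X_k = X_{k-1} exp(A dt + dW_k sqrt(dt))."""
    dw = sample_noise(sigmas, rng)
    step = [[a[r][c] * dt + dw[r][c] * math.sqrt(dt) for c in range(3)] for r in range(3)]
    return mat_mul(x_prev, mat_exp(step))
```

Because every element of aff(2) has a zero last row, the exponential always has last row (0, 0, 1), so a propagated particle stays in Aff(2) without any re-projection. A production implementation would more likely use a scaling-and-squaring routine such as `scipy.linalg.expm` instead of the Taylor series.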

The term $A(X,t) \in \mathrm{aff}(2)$ in (7) is the state dynamics on the Lie group, which determines the particle propagation. In the experiments, we compare the efficiency of several dynamic models. Three models are used:

1. The random walk (Brownian) model, the simplest and most common choice for the state dynamics: $A(X,t) = 0$.
2. The first-order autoregressive (AR) process on Aff(2). This first-order AR based state dynamics model can be understood as an infinitesimal constant velocity model. The state dynamics based on the AR process on Aff(2) can be represented as $A_{k-1} = a \log\left( X_{k-2}^{-1} X_{k-1} \right)$, where $a$ is the AR process parameter.
3. The increment model, in which the above AR process parameter is gradually increased from 0 up to 1 during the iterations of KLD-resampling.
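The difference between the random walk and the first-order AR dynamics can be made concrete with a small, noise-free sketch. It assumes a unit time step and uses series expansions for the matrix exponential and logarithm; the logarithm series is valid only when consecutive transformations are close to each other (norm of the relative motion minus the identity below 1). All function names are hypothetical.

```python
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def mat_mul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def mat_exp(m, terms=30):
    """Matrix exponential by truncated Taylor series (assumes small-norm m)."""
    result = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for n in range(1, terms + 1):
        term = [[v / n for v in row] for row in mat_mul(term, m)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


def mat_log(x, terms=60):
    """log(I + M) = M - M^2/2 + M^3/3 - ..., assuming ||M|| < 1 or M nilpotent."""
    m = [[x[r][c] - IDENTITY[r][c] for c in range(3)] for r in range(3)]
    result = [[0.0] * 3 for _ in range(3)]
    power = [row[:] for row in IDENTITY]
    for n in range(1, terms + 1):
        power = mat_mul(power, m)
        sign = 1.0 if n % 2 == 1 else -1.0
        result = [[r + sign * p / n for r, p in zip(rr, pr)] for rr, pr in zip(result, power)]
    return result


def affine_inverse(x):
    """Inverse of [G t; 0 1], which is [G^-1, -G^-1 t; 0 1]."""
    (a, b, tx), (c, d, ty) = x[0], x[1]
    det = a * d - b * c
    ia, ib, ic, idd = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)], [ic, idd, -(ic * tx + idd * ty)], [0.0, 0.0, 1.0]]


def ar_velocity(x_prev2, x_prev1, a):
    """A_{k-1} = a * log(X_{k-2}^{-1} X_{k-1}); a = 0 gives the random walk model."""
    relative = mat_mul(affine_inverse(x_prev2), x_prev1)
    return [[a * v for v in row] for row in mat_log(relative)]


def predict(x_prev2, x_prev1, a):
    """Noise-free prediction X_k = X_{k-1} exp(A_{k-1}), unit time step assumed."""
    return mat_mul(x_prev1, mat_exp(ar_velocity(x_prev2, x_prev1, a)))
```

For two pure translations, by (1, 1) and then (3, 2), the relative motion is a translation by (2, 1). With `a = 0.5` the prediction advances by half of it, to (4, 2.5); with `a = 0` the prediction stays at (3, 2); with `a = 1` it continues at constant velocity to (5, 3). The increment model would sweep `a` between these extremes.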





III. KLD RESAMPLING

The Kullback-Leibler distance (KLD), also known as relative entropy, is a measure of the information lost when a proposal distribution Q is used to approximate the true distribution P. For discrete probability distributions P and Q, the KLD of Q from P is defined as:

$$d_{KL}(P \,\|\, Q) = \sum_i P(i) \ln \frac{P(i)}{Q(i)} \tag{9}$$

Fox [1] derived the required number $N$ of samples so that, with probability $1-\delta$, the KLD between the sample-based maximum likelihood estimate (MLE) and the true distribution is less than an error bound threshold $\varepsilon$:

$$N = \frac{1}{2\varepsilon}\, \chi^2_{k-1,\,1-\delta} \tag{10}$$

where $k$ is the number of different bins. The quantiles of the chi-square distribution $\chi^2_{k-1,\,1-\delta}$ are specified by:

$$P\left( \chi^2_{k-1} \le \chi^2_{k-1,\,1-\delta} \right) = 1 - \delta \tag{11}$$

To compute this efficiently, Fox [1] proposed approximating the quantile $\chi^2_{k-1,\,1-\delta}$ by the Wilson-Hilferty transformation:

$$N \approx \frac{k-1}{2\varepsilon} \left( 1 - \frac{2}{9(k-1)} + \sqrt{\frac{2}{9(k-1)}}\; z_{1-\delta} \right)^3 \tag{12}$$

where $z_{1-\delta}$ is the upper $1-\delta$ quantile of the standard normal distribution.

The basic idea of KLD-sampling is to bound the approximation error that the sample-based representation of the particle filter introduces in the sampling process. The number of bins $k$ is estimated by counting during the sampling process, before the true distribution is generated: for each generated sample, it is checked whether the sample falls into an empty bin or not. Thus Fox [1] uses the predictive belief state as an estimate of the true posterior distribution. Finally, the sampling process is stopped when the number of samples exceeds the threshold $N$ given in (12).

The problem with KLD-sampling is that the bound is derived using an approximation from the proposal distribution rather than from the true posterior distribution, so the mismatch between the true and the proposal distribution is ignored. To fix this problem, Soto [8] proposed compensating for the degradation in the estimation caused by the importance function using the relative numerical efficiency (RNE). Recently, Li et al. [2] introduced the KLD-resampling approach to overcome the problem. The benefit of this approach over Fox’s KLD-sampling is that the underlying distribution is exactly the posterior distribution, while the latter uses the predictive belief. KLD-resampling uses the result of (12) in the resampling process to determine the total number of particles to resample.
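The bound in (12) and the resampling loop it drives can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: states are scalars binned on a regular grid of width `bin_size`, the case $k = 1$ (where (12) is undefined) is assumed to need a single sample, and a hypothetical `max_samples` cap guards against runaway loops. The `batch` argument corresponds to the batch size of Table I; the loop condition mirrors its `while i ≤ N`.

```python
import math
import random
from statistics import NormalDist


def kld_bound(k, epsilon, delta):
    """Sample size from equation (12) for k occupied bins."""
    if k < 2:
        return 1  # (12) is undefined for k = 1; assume one sample suffices
    z = NormalDist().inv_cdf(1.0 - delta)  # upper 1 - delta quantile of N(0, 1)
    c = 2.0 / (9.0 * (k - 1))
    n = (k - 1) / (2.0 * epsilon) * (1.0 - c + math.sqrt(c) * z) ** 3
    return max(1, math.ceil(n))


def batch_kld_resample(particles, weights, epsilon, delta, bin_size, batch, rng,
                       max_samples=10000):
    """Resample batch by batch until the sample count exceeds the bound of (12)."""
    occupied = set()  # bins that already hold at least one resampled particle
    resampled = []
    required = 1      # N in Table I, updated after every batch
    while len(resampled) <= required and len(resampled) < max_samples:
        # Draw a whole batch from the weighted particle set (the posterior).
        for x in rng.choices(particles, weights=weights, k=batch):
            resampled.append(x)
            occupied.add(math.floor(x / bin_size))  # marks the bin as non-empty
        required = kld_bound(len(occupied), epsilon, delta)
    return resampled, len(occupied)
```

A posterior concentrated in one bin stops after the first batch, while a posterior spread over many bins keeps drawing until the count exceeds the bound for the number of occupied bins. Averaging `len(resampled)` over all frames of a sequence gives the kind of efficiency figure the paper reports.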
In our experiments, we modify KLD-resampling so that each iteration updates multiple samples, batch by batch. The objective of the modification is to minimize the number of iterations and thus speed up the resampling process. An update step of the batch KLD-resampling particle filter is given in Table I.

TABLE I
The batch KLD-resampling algorithm

Inputs: bounds ε and δ, bin size, batch size
Initialization: k = 0; i = 0; N = 1; all bins are zero-resampled
Iteration:
  while i ≤ N do
    Resample the particles according to the weights
    if the new resampled particle falls into an empty bin b then
      k := k + 1          // update the number of resampled bins
      b := non-empty
    end
    i := i + batch        // update the number of generated samples
    Update N using equation (12)
  end

IV. EXPERIMENT RESULTS

To evaluate the proposed method, we conducted experiments on several video datasets, each tracking a certain object. The experiments are implemented on an i3 2.53 GHz CPU (without GPU) with 2 GB RAM. The six video datasets pose various challenges: illumination change, partial occlusion, object deformation, camera perspective, fast motion, and image blur. The datasets used are: woman [9], car4, davidin [10], person [11], cube, vase [7]. For initialization, the target object in the first frame is chosen carefully by drawing a rectangular box. Then the diagonal of the covariance matrix is adjusted to suitable values based on the object dynamics.

The parameters of KLD-resampling, except the batch size, are chosen to produce a reasonable number of samples. The quantile 1−δ of the standard normal distribution is set to 0.5. The pre-specified error bound threshold ε, which represents the target KLD between the proposal distribution and the true distribution, is set to 0.1. Finally, the bin size is set to 0.1. The batch size, in contrast, depends on the video dataset. To ensure fairness, each experiment, which consists of three dynamic models, was repeated several times with different batch sizes, and the best result of each experiment was chosen as the final outcome. To compare the efficiency of the three dynamic models, the average number of needed samples for each video sequence is listed in Table II, together with the batch size and the number of frames of each video dataset.

TABLE II
The particle filter efficiency in 6 video sequences.

DATA SET   AVERAGE PARTICLE                          BATCH SIZE   #FRAME
           BROWNIAN   1ST ORDER AR   INCREMENT
WOMAN      39.78      49.34          45.09           20           550
CAR4       25.26      26.13          27.69           15           659
DAVIDIN    79.48      65.45          85.19           50           770
PERSON     29.46      31.88          28.59           15           948
CUBE       34.87      34.15          31.05           15           271
VASE       99.36      98.42          98.41           50           316

V. CONCLUSION

In this paper, we have enhanced our previous tracking algorithm [4] to adapt the number of samples during the estimation process using batch KLD-resampling. The benefit of this approach compared with Fox’s KLD-sampling [1] is that the underlying distribution is exactly the posterior distribution. In Table II, we list the average number of needed samples for the three dynamic models in order to measure their efficiency. Based on these experiment results, we found that (i) the efficiency of the particle filter can be measured reliably enough using batch KLD-resampling, compared with using the running time: the average number of needed samples is invariant in repeated experiments, whereas the running time, which is measured in frames per second (fps), depends strongly on the computer performance; and (ii) dynamic models affect the efficiency of the particle filter, but their performance depends mostly on the particular case. Thus it cannot be claimed that one dynamic model is better than the others, which suggests using a multiple-model approach.

REFERENCES
[1] Dieter Fox, "Adapting the Sample Size in Particle Filters Through KLD-Sampling," International Journal of Robotics Research, vol. 22, no. 12, pp. 985-1003, 2003.
[2] Tiancheng Li, Shudong Sun, and Tariq Pervez Sattar, "Adapting sample size in particle filters through KLD-resampling," Electronics Letters, vol. 49, no. 12, pp. 740-742, June 2013.
[3] Junghyun Kwon, Hee Seok Lee, F.C. Park, and Kyoung Mu Lee, "A Geometric Particle Filter for Template-Based Visual Tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 4, pp. 625-643, April 2014.
[4] Alexander A S Gunawan, Mohamad Ivan Fanany, and Wisnu Jatmiko, "Deep Extreme Tracker Based on Bootstrap Particle Filter," Journal of Theoretical and Applied Information Technology, vol. 67, no. 1, 2014.
[5] A. J. Haug, Bayesian Estimation and Tracking: A Practical Guide. Wiley, 2012.
[6] Robert Gilmore, Lie Groups, Physics, and Geometry: An Introduction for Physicists, Engineers and Chemists. Cambridge: Cambridge University Press, 2008.
[7] Junghyun Kwon, Kyoung Mu Lee, and F.C. Park, "Visual tracking via geometric particle filtering on the affine group with optimal importance functions," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, 2009, pp. 991-998.
[8] Alvaro Soto, "Self adaptive particle filter," in International Joint Conference on Artificial Intelligence (IJCAI) 19th, Edinburgh, Scotland, 2005, pp. 1398-1403.
[9] Naiyan Wang and Dit-Yan Yeung, "Learning a Deep Compact Image Representation for Visual Tracking," in Proceedings of Twenty-Seventh Annual Conference on Neural Information Processing Systems NIPS, Lake Tahoe, Nevada, 2013, pp. 1-9.
[10] David Ross, Jongwoo Lim, Ruei-Sung Lin, and Ming-Hsuan Yang, "Incremental Learning for Robust Visual Tracking," International Journal of Computer Vision, vol. 77, no. 1-3, pp. 125-141, May 2008.
[11] Dominik A. Klein. (2010) BoBoT - Bonn Benchmark on Tracking. [Online]. http://www.iai.uni-bonn.de/~kleind/tracking/
