Pseudo Measurement Based Multiple Model Approach for Robust Player Tracking

Xiaopin Zhong, Nanning Zheng, and Jianru Xue

Institute of Artificial Intelligence and Robotics, Xi’an JiaoTong University, Xi’an, China
{xpzhong, nnzheng, jrxue}@aiar.xjtu.edu.cn

Abstract. This paper presents a robust player tracking method for sports video analysis. To track agile players stably and robustly, we employ a multiple model method in which a mean shift procedure is associated with each model for player localization. Furthermore, we define a pseudo measurement by fusing the measurements obtained from the mean shift procedures. The fusion coefficients are built from two likelihood functions: an image based likelihood and a motion based association probability. Experimental results show the effectiveness of our method on hard cases of player tracking.

1 Introduction and Related Work

Object tracking is one of the crucial problems in computer vision. Good solutions to this problem (i.e. real-time and robust tracking) have a variety of applications, such as navigation [6][8], missile defense [6], surveillance, human computer interfaces [7], intelligent transportation systems and so on. Applied to the sports domain, tracking also supports the analysis of individual movement in team sports [4].

Usually, object tracking consists of two major components: target representation and filtering [5]. In radar tracking, the target is a simple echo represented only by its coordinates, so filtering, which addresses the target dynamics, is more important than the target representation model. In visual tracking, by contrast, the target is large enough to be modeled by its appearance, shape, color or other specific features. When the sample rate of the image sequence is high enough, the motion of the visual target between two consecutive frames is negligible, so target representation is usually more important than filtering in visual tracking. However, we pay attention to both target representation and filtering because of the characteristics of player tracking.

In sports video analysis, it is quite difficult to track players stably: they are highly non-rigid, teammates are dressed identically, their dynamics are uncertain and occlusion by teammates occurs frequently. The method based on template matching [9] drifts easily, so it is prone to inaccurate tracking and to losing the track. [10] and [11] used the top view to track handball players indoors. [12] tracked multiple players in a video of American football. [13] used a boosting technique to reinforce the proposal distribution of a particle filter for tracking multiple hockey players. [14] tracked athletes via multiple features of multiple views. Each of these methods concentrates on only one of the two aspects, target representation or filtering.

Much research addresses how to remove camera motion from a game video [2]; this is outside the scope of this paper, and we use a static camera here. Even though accuracy matters less when the goal is only to help the audience enjoy the sport, filtering helps considerably with the difficulties of player tracking, such as the uncertain dynamics of agile players, partial occlusion of two or more players and cluttered background. We place such emphasis on a good motion model that we choose the multiple model approach for hybrid systems [8] to track players.

Consequently, a new algorithm, the pseudo measurement based multiple model method, is proposed for player tracking. It employs a multiple model method in which a mean shift procedure is associated with each model for player localization. The pseudo measurement is built by linear fusion using two likelihood functions: an image based likelihood and a motion based association probability. An important motivation for this idea is cue integration between image and motion to overcome the weakness of each individual cue. Hence, the pseudo measurement based multiple model algorithm adapts to some hard problems in player tracking, such as non-rigid targets and agile motion.

We begin in Section 2 with player localization. In Section 3, the proposed pseudo measurement based multiple model method is introduced. Experimental results and some minor problems are presented in Section 4. Finally, the conclusion and future work are discussed in Section 5.

P.J. Narayanan et al. (Eds.): ACCV 2006, LNCS 3852, pp. 781–790, 2006. © Springer-Verlag Berlin Heidelberg 2006

2 Mean Shift Based Player Localization

In this section, we first recall the well-known mean shift procedure for player localization, and then discuss the hard situations in the localization of players. Mean shift is a nonparametric estimator of the density gradient. When used in computer vision, color based mean shift is robust and fast [5][15]. Color based mean shift models the target by its color histogram. Let $\{x_i\}_{i=1}^{n}$ be a set of $n$ points in $R^2$ representing the pixel locations of the target. The probability of color $u$ in the target model is then derived by employing a convex and monotonically decreasing kernel function $k : [0, \infty) \to R$. [5] defines a distance metric,

$$d(y) = \Big(1 - \sum_{u} \sqrt{\hat{p}_u(y)\,\hat{q}_u}\Big)^{1/2}$$

based on the Bhattacharyya coefficient, to denote how well the candidate and the model match. Minimizing this distance (see [5] for details) yields the mean shift vector computed with kernel $k$ and its bandwidth $h$:

$$M_h(y) = \frac{\sum_{i=1}^{n} x_i\, k\left(\left\|\frac{y - x_i}{h}\right\|^2\right)}{\sum_{i=1}^{n} k\left(\left\|\frac{y - x_i}{h}\right\|^2\right)} - y \tag{1}$$


[5] recommends the Epanechnikov kernel function. After a few iterations, the mean shift vector converges to zero. Localization alone is not sufficient for filtering, because the filter also needs the measurement uncertainty. We assume that the measurement uncertainty is Gaussian and, following [17], approximate this Gaussian from the sum of squared differences (SSD) values at three special points.
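As a minimal sketch of the iteration described above, the following Python code evaluates the mean shift vector of Eq. (1) on a set of 2-D sample locations and moves the window centre until the shift is negligible. It is an illustration under simplifying assumptions, not the tracker of [5]: the profile `epanechnikov_profile`, the function names, the tolerance and the sample points are all hypothetical, and the per-pixel color histogram weights used in [5] are omitted.

```python
import math


def epanechnikov_profile(r):
    """Epanechnikov-style profile: k(r) = 1 - r for 0 <= r <= 1, else 0."""
    return 1.0 - r if r <= 1.0 else 0.0


def mean_shift_vector(y, points, h):
    """Evaluate M_h(y) of Eq. (1) for 2-D sample locations `points`."""
    num_x = num_y = den = 0.0
    for px, py in points:
        # Squared distance to the window centre, scaled by the bandwidth h.
        r = ((y[0] - px) ** 2 + (y[1] - py) ** 2) / (h * h)
        k = epanechnikov_profile(r)
        num_x += px * k
        num_y += py * k
        den += k
    if den == 0.0:
        return (0.0, 0.0)  # no sample falls inside the window
    return (num_x / den - y[0], num_y / den - y[1])


def mean_shift(y0, points, h, tol=1e-3, max_iter=50):
    """Move the window centre by M_h(y) until the shift is shorter than tol."""
    y = y0
    for _ in range(max_iter):
        dx, dy = mean_shift_vector(y, points, h)
        y = (y[0] + dx, y[1] + dy)
        if math.hypot(dx, dy) < tol:
            break
    return y


# Hypothetical samples clustered around (5, 5); start slightly off-centre.
samples = [(4.0, 5.0), (6.0, 5.0), (5.0, 4.0), (5.0, 6.0), (5.0, 5.0)]
mode = mean_shift((5.5, 5.2), samples, h=3.0)
```

Because the procedure only climbs to the nearest local mode, a different starting point `y0` can end at a different mode, which is the sensitivity to initialization discussed in this section.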

Fig. 1. Three modes (marked with a red ×, a red + and a red ◦) form near two teammates. Consequently, localization with the mean shift procedure is inaccurate.

Note that only color histogram information is used, so spatial information is lost and the similarity can vary greatly between adjacent locations on the image lattice. Moreover, the mean shift algorithm searches for a local density extremum and is therefore sensitive to its initial placement. In particular, when two players (teammates) get close, the mean shift procedure is likely to return a wrong localization (e.g. Fig. 1). To tackle this difficulty, we use the probabilities that the measurements belong to the real target, i.e. the pseudo measurement presented in Section 3, to constrain the localization results.

3 Pseudo Measurement Based IMM Filtering

A principled choice of dynamics is essential for good tracking results. However, players are highly maneuvering targets, which is why a single fixed dynamic model leads to poor player tracks. Considerable research has been undertaken on hybrid system estimation theory [6] in the radar tracking literature, which allows several dynamic models to be used simultaneously to characterize the target's motion. In our research, we pick the IMM (interacting multiple model) method, a suboptimal filtering technique, along with a pseudo measurement to fuse multiple models. In this section, we first introduce the pseudo measurement into the IMM framework via Bayesian filtering theory, and then refine the pseudo measurement with an image based likelihood function and a motion based likelihood function. Finally, the complete pseudo measurement based multiple model filtering algorithm is listed.

3.1 Pseudo Measurement Based Multiple Model Filtering Framework

In the radar tracking literature, IMM has been shown to be one of the best compromises between optimal filtering and computational complexity [6]. Fig. 2 illustrates our framework.

[Figure 2: block diagram with an interacting mixer, Predictors 1 to n, one mean shift procedure per model, a pseudo measurement integrator, Estimators 1 to n, and a state and covariance combination block.]

Fig. 2. Pseudo Measurement based MM Filtering Framework

According to Fig. 2, we define $n$ motion models to form an IMM filter. The state transition equation and measurement equation of the $i$th model are written as

$$\begin{cases} x^i(k+1) = F_i\, x^i(k) + v^i(k) \\ z^i(k+1) = H_i\, x^i(k+1) + w^i(k+1) \end{cases} \tag{2}$$

where $x^i(k)$ and $z^i(k)$ are the state vector and measurement vector of the $i$th model at time $k$. The process noise $v^i(k)$ and measurement noise $w^i(k)$ are independent zero-mean Gaussian noises with covariances $Q_i(k)$ and $R_i(k)$ respectively. In order to locate the targets (players), a mean shift procedure is employed for each motion model, so $n$ measurements are produced. Let $\{z_i(k)\}_{i=1}^{m(k)}$ be the measurements at time $k$, i.e. the localizations obtained by the mean shift procedures in this paper. However, only one integrated measurement, called the pseudo measurement here, is used to drive the IMM filter. Hence we define the pseudo measurement $\bar{z}(k)$ as

$$\bar{z}(k) = \sum_{i=1}^{m(k)} \omega_i(k) \cdot z_i(k) \tag{3}$$

$$\omega_i(k) = \frac{p_i(k)}{\sum_{j=1}^{m(k)} p_j(k)} \tag{4}$$

Here, $m(k)$ is the number of measurements at time $k$, and $\omega_i(k)$ is the weighting factor determined by the likelihood $p_i$ that each candidate measurement belongs to the real target. The likelihood $p_i$ is made precise in the next subsection. $\bar{z}(k)$ in (3) is named the pseudo measurement and is similar to, but distinct from,


probabilistic data association [17], which is a radar tracking fusion strategy that handles the data association problem when more than one measurement, or none, emerges. In our situation, each mean shift procedure returns a convergence point, so the number of measurements is constant, i.e. $m(k) = n$. The fusion in (3) is intended to make the pseudo measurement more accurate than any individual measurement. When teammates get close, their localizations are easily confused, so it is reasonable for the player tracking system to give motion information priority over appearance information. We therefore employ the prediction of the pseudo measurement to emphasize the motion information from the multiple motion models. Let $M_j(k)$ be the $j$th model at time $k$; then the model probability conditioned on the measurement history is

$$p(M_j(k) \mid Z^{k-1}) = \sum_{i=1}^{n} p(M_j(k) \mid M_i(k-1), Z^{k-1}) \cdot p(M_i(k-1) \mid Z^{k-1}) \tag{5}$$

where $Z^{k-1}$ is the measurement history up to time $k-1$, $p(M_j(k) \mid M_i(k-1), Z^{k-1})$ is the model transition probability, which is preset, and $p(M_i(k-1) \mid Z^{k-1})$ is the previous model probability conditioned on the measurement history. For each model, the corresponding filter (such as a standard Kalman filter) calculates a measurement prediction, denoted by $\hat{z}_j(k)$. We then obtain the pseudo measurement prediction by

$$\hat{z}(k) = \sum_{j=1}^{n} p(M_j(k) \mid Z^{k-1}) \cdot \hat{z}_j(k) \tag{6}$$
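Eqs. (5) and (6) can be illustrated with a small Python sketch. It assumes two hypothetical models, a constant velocity and a constant acceleration model of the kind used later in Section 4.1, and for brevity lets both models predict from the same state, whereas the IMM filter keeps one mixed state per model. The function names, the transition matrix and all numbers are illustrative assumptions.

```python
def predict_position(state, dt, use_acceleration):
    """Predicted (x, y) for state [x, vx, ax, y, vy, ay] under a CV or CA model."""
    x, vx, ax, y, vy, ay = state
    px, py = x + vx * dt, y + vy * dt
    if use_acceleration:
        px += 0.5 * ax * dt * dt
        py += 0.5 * ay * dt * dt
    return (px, py)


def predict_model_probabilities(transition, prev_probs):
    """Eq. (5): sum_i p(M_j(k) | M_i(k-1)) * p(M_i(k-1) | Z^{k-1}) for each j."""
    n = len(prev_probs)
    return [sum(transition[i][j] * prev_probs[i] for i in range(n))
            for j in range(n)]


def predict_pseudo_measurement(pred_probs, pred_measurements):
    """Eq. (6): probability-weighted sum of the per-model predictions."""
    return tuple(sum(p * z[d] for p, z in zip(pred_probs, pred_measurements))
                 for d in range(2))


# Hypothetical two-model example (CV and CA); all numbers are illustrative.
transition = [[0.9, 0.1],   # transition[i][j] = p(M_j(k) | M_i(k-1))
              [0.2, 0.8]]
prev_probs = [0.5, 0.5]
state = [10.0, 2.0, 1.0, 20.0, 0.0, 0.0]

pred_probs = predict_model_probabilities(transition, prev_probs)
z_cv = predict_position(state, dt=1.0, use_acceleration=False)
z_ca = predict_position(state, dt=1.0, use_acceleration=True)
z_hat = predict_pseudo_measurement(pred_probs, [z_cv, z_ca])
```

The prediction `z_hat` depends only on the motion models and the previous model probabilities, so it remains available when the image provides no reliable measurement.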

This pseudo measurement prediction is crucial in our method when players occlude each other and no real measurement is obtained (see the next subsection for details).

3.2 Measurement Likelihood

In this subsection, we build a straightforward likelihood function for $p_i$ in (4) using appearance information as well as motion information. In our method, the likelihood function of the measurement is defined as

$$p_i = (L_{ai})^{\alpha} \cdot (L_{mi})^{\beta} \tag{7}$$

where $L_{ai}$ denotes the likelihood from the target appearance and $L_{mi}$ the likelihood from the target motion. $\alpha$ and $\beta$ are weights expressing the reliability of the appearance based and motion based information respectively, satisfying $0 \le \alpha, \beta \le 1$. Eq. (7) makes the likelihood $p_i$ more rigorous by taking into account both points of view in the tracking literature: target representation and filtering. In our experiments, we fix $\alpha$ and $\beta$ for simplicity, despite their significance for adaptiveness.

Firstly, the image based likelihood function $L_{ai}$ can be any of many similarity functions, such as an image based template matching function, a feature based template matching function or even a statistics based likelihood function. Without loss of generality and for simplicity, we apply the Bhattacharyya coefficient, which has been defined in the mean shift procedure [15], to get a robust image based likelihood function. Hence, we define $L_{ai}$ as

$$L_{ai} = \exp(\gamma \cdot \rho_i) \tag{8}$$

$$\rho_i = \sum_{l} \sqrt{\hat{p}_l(z_i)\, q_l} \tag{9}$$

Here, $\rho_i$ is the Bhattacharyya coefficient between the color distribution of the model $q$ and that of the candidate measurement $\hat{p}(z_i)$; it is also an intermediate result of the mean shift procedure. Notice that (8) is a nonlinear function and $\gamma$ is another parameter that adjusts the impact of the appearance based likelihood. The influence of $\gamma$ is easier to grasp from our experiments.

Secondly, when player occlusion occurs, the appearance information of the target fades out and the motion information should take over the tracker. We assume that the measurement innovation, which is obtained via the pseudo measurement prediction, obeys a Gaussian distribution. Similar to IMM's mode likelihood definition, we define $L_{mi}$ as

$$L_{mi} = \frac{1}{\sqrt{2\pi |S_i|}} \exp\left(-\frac{(z_i - \hat{z})^T \cdot S_i^{-1} \cdot (z_i - \hat{z})}{2}\right) \tag{10}$$

where $\hat{z}$ is the pseudo measurement prediction introduced in the previous subsection and $S_i$ is the innovation covariance, which is calculated from the measurement covariance $R_i$ in the standard Kalman filter. The motion based likelihood function $L_{mi}$ biases the pseudo measurement toward the motion prediction, to a degree controlled by the parameters $\alpha$ and $\beta$.

3.3 Pseudo Measurement Based MM Filtering

In this subsection, the detailed steps of the pseudo measurement based MM filtering algorithm for player tracking are summarized in Table 1. Some of the steps are taken directly from the IMM algorithm (see [6] for details).

Table 1. Detailed steps of pseudo measurement based MM filtering in one cycle

1. Calculate the mixing probabilities:
   $$\mu_{k-1|k-1}(i,j) = \frac{p(i,j) \cdot \mu_{k-1}^{(i)}}{\sum_{i} p(i,j) \cdot \mu_{k-1}^{(i)}}$$
2. Redo the filters' initialization:
   $$\hat{x}_{k-1|k-1}^{(j),0} = \sum_{i} \hat{x}_{k-1|k-1}^{(i)}\, \mu_{k-1|k-1}(i,j)$$
   $$\nu_{k-1}(i,j) = \hat{x}_{k-1|k-1}^{(i)} - \hat{x}_{k-1|k-1}^{(j),0}$$
   $$P_{k-1|k-1}^{(j),0} = \sum_{i} \mu_{k-1|k-1}(i,j) \cdot \left[ P_{k-1|k-1}^{(i)} + \nu_{k-1}(i,j) \cdot \nu_{k-1}(i,j)^T \right]$$
3. Filters' prediction: $\hat{z}_j = H_j \cdot \bar{x}_{k|k-1}^{(j)} = H_j \cdot F_j \cdot \hat{x}_{k-1|k-1}^{(j),0}$;
4. Calculate the pseudo measurement prediction $\hat{z}(k)$ in (6);
5. Run the mean shift procedure from $\hat{z}_j$ to obtain the player localization $z_j$, and use SSD for its uncertainty $R_j$;
6. Get the appearance likelihood $L_{ai}$ via (8) and (9);
7. Obtain the motion based likelihood $L_{mi}$ by (10);
8. Calculate the measurement likelihood $p_i$ in (7);
9. Combine the pseudo measurement $\bar{z}$ via (3) and (4);
10. Run all filters as standard Kalman filters;
11. Update the model likelihoods and probabilities:
    $$\Lambda_k^{(j)} = N\left(\bar{z} - h(\hat{x}_{k|k-1}^{(j),0});\ 0,\ S_k^{(j)}\right)$$
    $$\eta_k^{(j)} = \Lambda_k^{(j)} \sum_{i} p(i,j) \cdot \mu_{k-1}^{(i)}$$
    $$\mu_k^{(j)} = \frac{\eta_k^{(j)}}{\sum_{i} \eta_k^{(i)}}$$
12. Estimate and covariance combination:
    $$\hat{x}_{k|k} = \sum_{i} \hat{x}_{k|k}^{(i)}\, \mu_k^{(i)}$$
    $$P_{k|k} = \sum_{i} \mu_k^{(i)} \left\{ P_{k|k}^{(i)} + [\hat{x}_{k|k}^{(i)} - \hat{x}_{k|k}] \cdot [\hat{x}_{k|k}^{(i)} - \hat{x}_{k|k}]^T \right\}$$
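Steps 6 to 9 of Table 1 can be sketched in Python as follows. This is a hedged illustration of Eqs. (3), (4) and (7) to (10) for 2-D measurements, not the authors' implementation: the function names, the toy histograms and the covariance values are hypothetical, and the color histograms are assumed to be already normalized.

```python
import math


def bhattacharyya(p_hat, q):
    """Eq. (9): rho = sum_l sqrt(p_l * q_l) for two normalized histograms."""
    return sum(math.sqrt(a * b) for a, b in zip(p_hat, q))


def appearance_likelihood(p_hat, q, gamma):
    """Eq. (8): L_a = exp(gamma * rho)."""
    return math.exp(gamma * bhattacharyya(p_hat, q))


def motion_likelihood(z, z_pred, S):
    """Eq. (10) for a 2-D measurement z and a 2x2 innovation covariance S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    dx, dy = z[0] - z_pred[0], z[1] - z_pred[1]
    # (z - z_pred)^T S^{-1} (z - z_pred) via the closed-form 2x2 inverse.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * maha) / math.sqrt(2.0 * math.pi * det)


def pseudo_measurement(measurements, cand_hists, model_hist, z_pred, covs,
                       alpha=1.0, beta=1.0, gamma=1.0):
    """Fuse the candidate measurements into one pseudo measurement."""
    likelihoods = []
    for z, p_hat, S in zip(measurements, cand_hists, covs):
        l_a = appearance_likelihood(p_hat, model_hist, gamma)
        l_m = motion_likelihood(z, z_pred, S)
        likelihoods.append((l_a ** alpha) * (l_m ** beta))  # Eq. (7)
    total = sum(likelihoods)
    weights = [p / total for p in likelihoods]               # Eq. (4)
    z_bar = tuple(sum(w * z[d] for w, z in zip(weights, measurements))
                  for d in range(2))                         # Eq. (3)
    return z_bar, weights


# Hypothetical data: the first candidate matches the model in both cues.
model_hist = [0.5, 0.3, 0.2]
cand_hists = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
measurements = [(10.5, 20.0), (14.0, 21.0)]
covs = [[[4.0, 0.0], [0.0, 4.0]], [[4.0, 0.0], [0.0, 4.0]]]
z_bar, weights = pseudo_measurement(measurements, cand_hists, model_hist,
                                    (10.0, 20.0), covs)
```

In this toy case the first candidate receives the larger weight, so `z_bar` is pulled toward it. A practical implementation would also need to guard against all likelihoods underflowing to zero before the normalization in Eq. (4).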

4 Implementation and Results

The proposed method, the pseudo measurement based multiple model approach, has been tested on various football game videos. To evaluate its performance, we compared our tracking results with manually marked ground truth and with other tracking strategies, namely mean shift and mean shift with Kalman filtering.

4.1 Experiment Configuration

The implementation is configured as follows. To describe the player state, we use $x(k) = [x, v_x, a_x, y, v_y, a_y]_k^T$, where $(x, y)$ is the coordinate of the player location in the image plane, $(v_x, v_y)$ is its velocity and $(a_x, a_y)$ its acceleration. Since we can only "see" the player's position in the image sequence, the system measurement is $z_k = [x, y]_k^T$ only. Three models are used to characterize the player motion: a constant velocity model (CV), a constant acceleration model with small noise (LowCA) and a constant acceleration model with large noise (HighCA). Since the camera looks down on the football field, the size of the player varies only slightly; therefore, we do not adapt the model size in our experiments. For simplicity, $\alpha$ and $\beta$ are both set to 1. Since $\gamma$ adjusts the impact of the appearance based likelihood, we set $\gamma$ to different values to test the effectiveness of our method (see the next subsection).

4.2 Implementation Results

In this subsection, we test our algorithm on the video sequence "football.avi" and compare it with two other common algorithms (mean shift only, and mean shift with CV based Kalman filtering) in several phases. In the video "football.avi", a particular target with agile motion is selected to be tracked.

Firstly, the sequences with the estimated position marked with a red cross are presented in Fig. 3, where only frames 6, 10, 27, 28 and 29 are shown as key frames. The mean shift method fails when two teammates are very close to each other from frame 6 to frame 10, because mean shift cannot distinguish them well by appearance alone. From frame 27 to frame 29, the mean shift + Kalman method also fails, since the player position predicted by the Kalman filter falls into the region of another similar player. In contrast, our approach is robust enough to succeed in these hard cases.

[Figure 3: three rows of key frames 06, 10, 27, 28 and 29.]

Fig. 3. In these tracking result sequences, the estimated player position is marked with a red cross. The first row displays the mean shift only tracking result. The second row shows the result of the Kalman + mean shift method. The result of our method, the pseudo measurement based multiple model approach, is shown in the third row.

Fig. 4. The left figure shows the motion model probabilities of the selected player. The right one demonstrates how γ adjusts the effectiveness of the image based likelihood.

Secondly, the left plot in Fig. 4 shows the history of the motion model probabilities for the selected player under our algorithm. The motion model probability is not as stable as in the radar literature, because the mean shift procedure is not stable for player localization. Thirdly, we rerun our method changing only the parameter γ, and compare the square root position error against the hand-marked ground truth (the right plot in Fig. 4). This experimental result indicates that the image based likelihood helps to improve player tracking.
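The comparison against ground truth can be sketched as below. The paper does not define its "square root position error" precisely, so this Python sketch assumes it is the per-frame Euclidean distance between the estimated and the hand-marked position, and adds a root-mean-square summary over the sequence. The function names and sample tracks are hypothetical.

```python
import math


def position_errors(track, ground_truth):
    """Per-frame Euclidean distance between estimated and true positions."""
    return [math.hypot(x - gx, y - gy)
            for (x, y), (gx, gy) in zip(track, ground_truth)]


def rms_error(track, ground_truth):
    """Root-mean-square position error over the whole sequence."""
    errors = position_errors(track, ground_truth)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical three-frame track and ground truth, in pixels.
estimated = [(103.0, 54.0), (110.0, 60.0), (118.0, 66.0)]
truth = [(100.0, 50.0), (110.0, 60.0), (118.0, 69.0)]
errors = position_errors(estimated, truth)
overall = rms_error(estimated, truth)
```

Running the tracker once per value of γ and comparing the resulting error curves would give the kind of comparison shown in the right plot of Fig. 4.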

5 Conclusion and Future Work

In this paper, we first presented the challenges of player tracking, for instance the unknown motion mode and the unknown noise level. To localize the player, we applied the mean shift procedure, which has been shown to be robust in visual systems. However, the mean shift procedure depends so strongly on its initialization that a single initialization is not enough for robust player tracking. We therefore imported a multiple model method, designed for hybrid systems in the radar tracking literature, to obtain multiple measurements that include the true measurement and false measurements. To handle these multiple measurements, a pseudo measurement is constructed from two likelihood functions: a motion based likelihood and an image based likelihood. The experimental results show the performance of our method in player tracking. However, several minor problems need further consideration. For example, the constantly changing appearance of the non-rigid player challenges the model of the image based likelihood, so a better model updating scheme may considerably improve accuracy. In addition, how to choose the parameters α, β and γ adaptively in different cases needs more research.

References

1. Bar-Shalom, Y., Fortmann, T.: Tracking and Data Association. Academic Press Inc. (1988)
2. http://vismod.media.mit.edu/vismod/demos/football/tracking.htm
3. Collins, R., Lipton, A., Fujiyoshi, H., Kanade, T.: Algorithms for cooperative multisensor surveillance. Proceedings of the IEEE. Vol. 89, No. 10 (2001) 1456–1477
4. http://www.prowess.com.au/infodoc.html
5. Comaniciu, D., Ramesh, V., Meer, P.: Kernel-based object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol. 25, No. 5 (2003) 564–577
6. Mazor, E., Averbuch, A., Bar-Shalom, Y., Dayan, J.: Interacting multiple model methods in target tracking: a survey. IEEE Transactions on Aerospace and Electronic Systems. Vol. 34, No. 1 (1998) 103–123
7. Kyung-Nam, K., Ramakrishna, R.S.: Vision-based eye-gaze tracking for human computer interface. IEEE SMC’99 Conference Proceedings. Vol. 2 (1999) 324–329
8. Bar-Shalom, Y., Li, X.R.: Estimation and applications to tracking and navigation. Academic Press Inc. (1995)
9. Seo, Y., Choi, S.H., Kim, H.W., Hong, K.S.: Where are the ball and players? Soccer game analysis with color-based tracking and image mosaic. In Proceedings of International Conference on Image Analysis and Processing (1997) 196–203
10. Pers, J., Kovacic, S.: Tracking People in Sport: Making Use of Partially Controlled Environment. In Proceedings of the 9th International Conference on Computer Analysis of Images and Patterns (2001) 374–382
11. Pers, J., Kovacic, S.: Computer Vision System for Tracking Players in Sports Games. In Proceedings of the First International Workshop on Image and Signal Processing and Analysis (2000) 81–86
12. Intille, S., Bobick, A.: Closed-world tracking. In Proceedings of International Conference on Computer Vision (1995) 672–678
13. Okuma, K., Taleghani, A., Freitas, N., Little, J., Lowe, D.: A Boosted Particle filter: Multitarget detection and tracking. In Proceedings of European Conference on Computer Vision 2004 (2004) 28–39
14. Misu, T., Gohshi, S., Izumi, Y., Fujita, Y., Naemura, M.: Robust Tracking of Athletes Using Multiple Features of Multiple Views. In Journal of WSCG, Vol. 12, No. 1-3 (2004)
15. Comaniciu, D., Ramesh, V.: Mean Shift and Optimal Prediction for Efficient Object Tracking. In Proceedings of the IEEE International Conference on Image Processing (2000) 70–73
16. Kirubarajan, T., Bar-Shalom, Y., Blair, W.D., Watson, G.A.: IMMPDAF for Radar Management and Tracking Benchmark with ECM. IEEE Transactions on Aerospace and Electronic Systems, Vol. 34, No. 4 (1998) 1115–1134
17. Nickels, K., Hutchinson, S.: Estimating Uncertainty in SSD-Based Feature Tracking. Image and Vision Computing, Vol. 20 (2002) 47–58
