Distributed Multi-Dimensional Hidden Markov Model: Theory and Application in Multiple-Object Trajectory Classification and Recognition

Xiang Ma, Dan Schonfeld and Ashfaq Khokhar
Department of Electrical and Computer Engineering, University of Illinois at Chicago, 851 South Morgan Street, Chicago, IL, U.S.A.

ABSTRACT
In this paper, we propose a novel distributed multi-dimensional hidden Markov model (DHMM). The proposed model can represent, for example, multiple motion trajectories of objects and their interaction activities in a scene; it is capable of conveying not only the dynamics of each trajectory, but also the interaction information between multiple trajectories, which can be critical in many applications. We first provide a solution for the non-causal, multi-dimensional hidden Markov model (HMM) by distributing the non-causal model into multiple distributed causal HMMs. We approximate the simultaneous solution of multiple HMMs on a sequential processor by an alternate updating scheme. Subsequently, we provide three algorithms for the training and classification of our proposed model. A new Expectation-Maximization (EM) algorithm suitable for estimation of the new model is derived, where a novel General Forward-Backward (GFB) algorithm is proposed for recursive estimation of the model parameters. A new conditional-independent subset-state sequence structure decomposition of state sequences is proposed for the 2D Viterbi algorithm. The new model can be applied to many other areas such as image segmentation and image classification. Simulation results in classification of multiple interacting trajectories demonstrate the superior performance and higher accuracy rate of our distributed HMM in comparison to previous models.

Keywords: Trajectory Modelling, Activity Recognition, Hidden Markov Models

1. INTRODUCTION

Motion trajectories of objects have been shown to provide critical information in the representation of object dynamics for retrieval and classification. In many applications, the simultaneous interaction among multiple objects provides critical information that is invaluable in retrieval and classification applications, e.g. characterizing interaction activities and group dynamics. Examples of interactions of multiple objects are depicted in Fig. 1. It is important to point out that classification of motion activities involving multiple interacting objects can only be ascertained by modelling their trajectories simultaneously. In particular, motion trajectory classification of each object separately cannot be used to extract interactive events. For example, in Fig. 1(a), if we look at one person only, we can say "one person walks." However, the underlying event "two people walk and meet" cannot be determined.

The hidden Markov model (HMM) is a very powerful tool for modelling the temporal dynamics of processes, and has been successfully applied to many applications such as speech recognition,1 gesture recognition,2 and musical score following.3 F. Bashir et al.4 presented a novel classification algorithm for object motion trajectories based on the 1D HMM. They segmented a single trajectory into atomic segments called subtrajectories based on the curvature of the trajectory; the subtrajectories were then represented by their principal component analysis (PCA) coefficients. Temporal relationships among subtrajectories were represented by fitting a 1D HMM. However, all the above applications rely on a one-dimensional HMM structure. Simple combinations of 1D HMMs cannot be used to characterize multiple trajectories, since 1D models fail to convey the interaction information of multiple interacting objects. The major challenge here is to develop a new model that will semantically preserve and convey the "interaction" information.

Further author information: (Send correspondence to X. Ma.) E-mails: {mxiang, ds, ashfaq}@ece.uic.edu

Figure 1. Examples of multiple interactive trajectories: (a) "Two people walk and meet". (b) "Two airplanes fly towards each other and pass by".

Various HMM model structures have been proposed as extensions of 1D HMMs. Early efforts devoted to extending 1D-HMMs to higher dimensions were presented by pseudo 2D-HMMs.5,6 The model is called "pseudo 2D" in the sense that it is not a fully connected 2D-HMM. The basic assumption is that there exists a set of "superstates" that are Markovian and within each superstate there is a set of simple Markovian states. To illustrate this model for higher dimensional systems, let us consider a two-dimensional image. The transition between superstates is modeled as a first-order Markov chain and each superstate is used to represent an entire row of the image; a simple Markov chain is then used to generate observations in the column, as depicted in Fig. 2(a). Thus, superstates relate to rows while simple states relate to columns of the image. Later efforts to represent 2D data using 1D HMMs were proposed using coupled HMMs (CHMMs).7,8 In this framework, each state of the 1D HMM is used as a meta-state to represent a collection of states, as depicted in Fig. 2(b). For example, image representation based on CHMMs would rely on a 1D HMM where each state represents an entire column of the image. In certain applications, these models perform better than the classical 1D-HMM.5 However, the performance of pseudo 2D-HMMs and CHMMs remains limited since these models capture only part of the two-dimensional hidden state information.

The first analytic solution to true two-dimensional HMMs was presented by Li, Najmi and Gray.9 They proposed a causal two-dimensional HMM and presented its application to image classification. In this model, the state transition probability for each node is conditioned on the states of the nearest neighboring nodes in the horizontal and vertical directions, as depicted in Fig. 2(c). The limitation of this approach is that the state dependence of a specific node may arise from any direction and from any of its neighbors. Thus, the analytic solution to the two-dimensional model presented in9 will only capture partial information. In particular, the training and classification algorithms presented in9 rely on the causality of the model. Hence, direct extension of these algorithms to general 2D-HMMs, which can represent state dependencies from neighbors in all directions, is not possible since such a model is inherently non-causal.

We propose a novel distributed multi-dimensional hidden Markov model (DHMM) for modelling of interacting trajectories involving multiple objects. In our model, each object trajectory is modelled as a separate hidden Markov process, while "interactions" between objects are modelled as dependencies of the state variables of one process on the states of the others. The intuition behind our work is that the HMM is a very powerful tool for modelling the temporal dynamics of each process (trajectory); each process (trajectory) has its own dynamics, while it may influence or be influenced by others. In our proposed model, "influence" or "interaction" among processes (trajectories) is modelled as dependencies of state variables among processes (trajectories). Our model is capable of conveying not only the dynamics of each trajectory, but also the interaction information between multiple trajectories, while requiring no semantic analysis. We first provide a solution for non-causal, multi-dimensional HMMs by distributing the non-causal model into multiple distributed causal HMMs.
We approximate the simultaneous solution of multiple distributed HMMs on a sequential processor by an alternate updating scheme; one possible scheme is depicted in Fig. 3. The numbers {1, 2, 3, 4, ...} indicate the order in which the model parameters are updated.
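To make the alternate updating concrete, the following is a minimal sketch (not taken from the paper; the object interface `update_parameters` and the sweep count are illustrative assumptions) of how M distributed HMMs could be re-estimated one at a time on a sequential processor, each update conditioned on the latest parameters of the other models:

def alternate_updating(models, observations, n_sweeps=10):
    """Cycle through the M distributed HMMs; each model is re-estimated while
    the other models' current parameter estimates are held fixed."""
    for _ in range(n_sweeps):
        for m, model in enumerate(models):
            # The other chains supply the "neighboring" state dependencies
            # required by the distributed causal structure.
            others = [models[k] for k in range(len(models)) if k != m]
            model.update_parameters(observations, others)
    return models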

Figure 2. Various two-dimensional hidden Markov models: (a) Pseudo 2D-HMM (b) Coupled HMM (CHMM) (c) Causal 2D-HMM9 with two nearest neighbors in vertical and horizontal directions (d) Proposed general non-causal 2D-HMM

Figure 3. Sequential alternate updating scheme of multiple distributed HMMs

Subsequently, we extend the training and classification algorithms presented in9 to a general causal model. A new Expectation-Maximization (EM) algorithm for estimation of the new model is derived, where a novel General Forward-Backward (GFB) algorithm is proposed for recursive estimation of the model parameters. A new conditional-independent subset-state sequence structure decomposition of state sequences is proposed for the 2D Viterbi algorithm. The new model can be applied to many problems in pattern analysis and classification. For simplicity, the presentation in this paper focuses primarily on a special case of our model in two dimensions, which we refer to as the distributed 2D hidden Markov model (2D DHMM).

Figure 4. Distributed 2D Hidden Markov Models: (a) Non-causal 2D Hidden Markov Model. (b) Distributed 2D Hidden Markov Model 1. (c) Distributed 2D Hidden Markov Model 2.

2. DISTRIBUTED 2D HIDDEN MARKOV MODEL

Suppose there are M ∈ N interacting objects in a scene. Recall that in our model, each object trajectory is modelled as a hidden Markov process in time, while "interactions" between object trajectories are modelled as dependencies of the state variables of one process on those of the others. We constrain the probabilistic dependencies of the state of one process (trajectory) at time t to its own state at time t−1, as well as to the states of the other processes (trajectories) that "interact" with or influence it at times t and t−1, i.e.

Pr(s(m, t) | s(l, t), s(n, 1:t−1)) = Pr(s(m, t) | s(n, t−1), s(l, t)),   (1)

where m, n, l ∈ {1, ..., M} are indexes of processes (trajectories), l ≠ m. The above constraint on state dependencies makes the desired model non-causal: since each process (trajectory) can influence the others, there is no guarantee that the influence is directional or causal. Fig. 4(a) shows an example of our non-causal 2D HMM, which is used to model two interacting trajectories. Each node S(i, t) in the figure represents one state at a specific time t for trajectory i, where t = {1, 2, ..., T}, i = {1, 2}; each node O(i, t) represents the observation corresponding to S(i, t), and each arrow indicates a transition of states (its reverse direction indicates a dependency of states). The first row of states is the state sequence for trajectory 1, and the second row corresponds to trajectory 2. As can be seen, each state in one HMM chain (trajectory) depends on its past state, the past state of the other HMM chain (trajectory), and the concurrent state of the other HMM chain (trajectory). The above model is capable of modelling multiple processes and their interactions, but it is intractable since it is non-causal. We propose a novel and effective solution in which we "decompose" it into M causal 2D hidden Markov models with multiple dependencies of states, such that each HMM can be executed in parallel in a distributed framework. In each of the distributed causal HMMs, state transitions (or state dependencies) must follow the same causality rule. For example, we distribute the non-causal 2D HMM in Fig. 4(a) into two causal 2D HMMs, shown in Figs. 4(b) and 4(c), respectively. In Fig. 4(b), state transitions follow the same rule, as do the state transitions in Fig. 4(c). These rules ensure the homogeneous structure of each distributed HMM, which further enables us to develop relatively tractable training and classification algorithms. When the number of trajectories is M = 3, our non-causal 2D hidden Markov model is depicted in Fig. 5(a); we use the same distributing scheme to obtain 3 distributed 2D hidden Markov models, as shown in Figs. 5(b), 5(c), and 5(d), respectively. Using the same distributing scheme, we can distribute any non-causal 2D hidden Markov model characterizing M (> 3) trajectories into M distributed causal 2D hidden Markov models.
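As a small illustration of the causal dependency rule of Fig. 4(b), the sketch below (Python/numpy; the array layout and helper name are assumptions of this illustration, not the paper's notation) samples the next state of chain 1 conditioned on its own previous state and on the previous and concurrent states of chain 2:

import numpy as np

rng = np.random.default_rng(0)
N = 4                                  # number of states per chain
# A[i, j, k, l] ~ Pr(s(1,t) = l | s(1,t-1) = i, s(2,t-1) = j, s(2,t) = k);
# each slice A[i, j, k, :] is a valid probability distribution over l.
A = rng.dirichlet(np.ones(N), size=(N, N, N))

def next_state_chain1(s1_prev, s2_prev, s2_curr):
    """Sample s(1,t) given s(1,t-1), s(2,t-1) and s(2,t)."""
    return rng.choice(N, p=A[s1_prev, s2_prev, s2_curr])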

3. DHMM TRAINING AND CLASSIFICATION

Define the observed feature vector set O = {o(m, t), m = 1, 2, ..., M; t = 1, 2, ..., T} and the corresponding hidden state set S = {s(m, t), m = 1, 2, ..., M; t = 1, 2, ..., T}, and assume each state takes one of N possible values. The model

Figure 5. Distributed 2D Hidden Markov Models with application to 3 trajectories: (a) Non-causal 2D Hidden Markov Model that treats 3 object trajectories as one system (only 2 adjacent time slots of the system states are shown). (b) Distributed 2D Hidden Markov Model 1 for Object Trajectory 1. (c) Distributed 2D Hidden Markov Model 2 for Object Trajectory 2. (d) Distributed 2D Hidden Markov Model 3 for Object Trajectory 3. (Note that in Figures (b), (c) and (d), only the state transitions into one state point are shown; other state points follow the same rules, respectively.)

parameters are defined as a set Θ = {Π, A, B}, where Π is the set of initial state probabilities Π = {π(m, n)}; A is the set of state transition probabilities A = {a_{i,j,k,l}(m)}, with a_{i,j,k,l}(m) = Pr(s(m, t) = l | s(m', t) = k, s(m, t−1) = i, s(m', t−1) = j); and B is the set of probability density functions (PDFs) of the observed feature vectors given the corresponding states, where we assume B is a set of Gaussian distributions with means μ_{m,n} and covariances Σ_{m,n}, for m, m' = 1, ..., M; m ≠ m'; n, i, j, k, l = 1, ..., N; t = 1, ..., T. Due to space limitations, we discuss the case M = 2 in detail. Note that in all of the following discussions, the time durations and the numbers of trajectories in one scene are assumed to be the same. This assumption is easily violated, since different objects appear and disappear at different moments; however, since our aim is to model multiple interacting trajectories, we are only interested in those time durations during which the interacting objects co-exist and interact with each other. For a long time duration within which the number of objects changes, we usually divide it into smaller time durations within which the number of interacting trajectories remains the same. Moreover, single-object trajectory segments can easily be modelled by 1D HMMs, and only multiple interacting trajectories are modelled by DHMMs.
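For concreteness, a minimal sketch of the parameter set Θ = {Π, A, B} for M = 2 trajectories as numpy arrays is given below; the shapes follow the definitions above, while the variable names and the random initialization are assumptions of this sketch:

import numpy as np

M, N, d = 2, 4, 2                      # trajectories, states per chain, feature dimension

Pi = np.full((M, N), 1.0 / N)          # initial state probabilities π(m, n)
# A[m, i, j, k, l] = a_{i,j,k,l}(m): probability of moving to state l of chain m
# given its own previous state i and the other chain's previous state j and
# concurrent state k (normalized over the last axis l).
A = np.random.dirichlet(np.ones(N), size=(M, N, N, N))
# Gaussian emission parameters B: means μ_{m,n} and covariances Σ_{m,n}.
mu = np.zeros((M, N, d))
Sigma = np.tile(np.eye(d), (M, N, 1, 1))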

3.1. Expectation-Maximization Algorithm

We propose a new Expectation-Maximization (EM) algorithm suitable for estimating the parameters of the M distributed 2D hidden Markov models in an M-trajectory system, which is analogous to, but different from, the EM algorithm for the 1D HMM.10 The EM algorithm was first proposed by Dempster, Laird and Rubin,11 and there are many applications of the EM algorithm to hidden Markov models.10,1 We take the DHMM applied to 2 trajectories, specifically the DHMM in Fig. 4(c), as an example to explain how the EM algorithm works for the DHMM; the estimation of the parameters of the DHMM applied to M (> 2) trajectories proceeds in the same way. Recall the distributed 2D hidden Markov model in Fig. 4(c), and assume the duration of the trajectories is T (the case of trajectories with different durations was discussed above). We have the observation sequence O = {o(i, j), i = 1, 2; j = 1, ..., T} and hidden states S = {s(i, j), i = 1, 2; j = 1, ..., T}, where o(1, j), o(2, j) refer to the observation (feature) sets of objects 1 and 2, and s(1, j), s(2, j) refer to the state sets of objects 1 and 2. Here S refers to the union of those two state sets. Assume the number of states is N, so s(1, j), s(2, j) ∈ {1, 2, ..., N} for j = 1, 2, ..., T.

The incomplete data is O, and the complete data is (O, S). The incomplete-data likelihood function is P(O|Θ) and the complete-data likelihood function is P(O, S|Θ). We would like to maximize the complete-data likelihood function P(O, S|Θ). According to the EM algorithm, the Q function is

Q(\Theta, \Theta') = \sum_{S \in V} \log\big(P(O, S \mid \Theta)\big)\, P(S \mid O, \Theta'),   (2)

where

P(O, S \mid \Theta) = \pi_{s(2,0)} \prod_{t=1}^{T} a_{m,n,k,l}\, b_{s(1,t)}(o(1,t))\, b_{s(2,t)}(o(2,t)).   (3)

Here Θ' is the current (known) estimate of the parameters, Θ is the new (unknown) estimate of the parameters that maximizes the likelihood function, and V is the space of all possible state sequences of length T. The joint probability of the observations O and states S is P(O, S|Θ). Substituting (3) into (2), we get

Q(\Theta, \Theta') =
\underbrace{\sum_{S \in V} \log(\pi_{s(2,0)})\, P(S \mid O, \Theta')}_{A^*}
+ \underbrace{\sum_{S \in V} \sum_{t=1}^{T} \log(a_{m,n,k,l})\, P(S \mid O, \Theta')}_{B^*}
+ \underbrace{\sum_{S \in V} \sum_{t=1}^{T} \log\big(b_{s(2,t)}(o(2,t))\big)\, P(S \mid O, \Theta')}_{C^*}
+ \underbrace{\sum_{S \in V} \sum_{t=1}^{T} \log\big(b_{s(1,t)}(o(1,t))\big)\, P(S \mid O, \Theta')}_{D^*}.   (4)
In the above equations, a_{m,n,k,l} is the transition probability from states s(2, t−1), s(1, t−1), s(1, t) to state s(2, t), when s(2, t−1) is in state m, s(1, t−1) is in state n, s(1, t) is in state k and s(2, t) is in state l, with m, n, k, l ∈ {1, 2, ..., N}; b_{s(m,t)}(o(m, t)) is the probability of observation o(m, t) from trajectory m given the corresponding state s(m, t), m = 1, 2. We assume the observations follow a d-dimensional Gaussian distribution; when the corresponding state is i (i ∈ {1, 2, ..., N}),

b_{m,i}(o(m,t)) = \frac{1}{(2\pi)^{d/2} |\Sigma_{m,i}|^{1/2}} \exp\Big(-\tfrac{1}{2}\big(o(m,t) - \mu_{m,i}\big)^{T} \Sigma_{m,i}^{-1} \big(o(m,t) - \mu_{m,i}\big)\Big),   (5)

where μ_{m,i} is the d-dimensional mean vector, Σ_{m,i} is the d × d covariance matrix, and d is the dimensionality of the observation (feature) vector.
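Eq. (5) can be evaluated directly; the helper below is a small sketch of such an evaluation (a numerically more careful implementation would work with log densities, which the paper does not discuss):

import numpy as np

def emission_prob(o, mu, Sigma):
    """d-dimensional Gaussian density b_{m,i}(o(m,t)) for a single state,
    evaluated exactly as written in Eq. (5)."""
    d = o.shape[0]
    diff = o - mu
    norm = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(Sigma))
    expo = -0.5 * diff @ np.linalg.solve(Sigma, diff)
    return np.exp(expo) / norm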

To maximize Q(Θ, Θ'), we maximize each of (A*), (B*), (C*) and (D*). By maximizing (A*), (B*), (C*) and (D*), we obtain the iterative updating formulas for the parameters of our distributed 2D HMM.

Define F^{(p)}_{m,n,k,l}(i, j) as the probability that the state corresponding to observation o(i−1, j) is m, the state corresponding to observation o(i−1, j−1) is n, the state corresponding to observation o(i, j−1) is k and the state corresponding to observation o(i, j) is l, given the observations and model parameters,

F^{(p)}_{m,n,k,l}(i,j) = P\big(m = s(i-1,j),\; n = s(i-1,j-1),\; k = s(i,j-1),\; l = s(i,j) \,\big|\, O, \Theta^{(p)}\big),   (6)

and define G^{(p)}_{m}(i, j) as the probability that the state corresponding to observation o(i, j) is m, i.e.

G^{(p)}_{m}(i,j) = P\big(s(i,j) = m \,\big|\, O, \Theta^{(p)}\big).   (8)

We obtain the following iterative updating formulas for the parameters of the proposed model:

\pi^{(p+1)}_{m} = P\big(G^{(p)}_{m}(1,1) \,\big|\, O, \Theta^{(p)}\big).   (9)

a^{(p+1)}_{m,n,k,l} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} F^{(p)}_{m,n,k,l}(i,j)}{\sum_{l=1}^{M} \sum_{i=1}^{I} \sum_{j=1}^{J} F^{(p)}_{m,n,k,l}(i,j)}.   (10)

\mu^{(p+1)}_{m} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} G^{(p)}_{m}(i,j)\, o(i,j)}{\sum_{i=1}^{I} \sum_{j=1}^{J} G^{(p)}_{m}(i,j)}.   (11)

\Sigma^{(p+1)}_{m} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} G^{(p)}_{m}(i,j)\, \big(o(i,j) - \mu^{(p+1)}_{m}\big)\big(o(i,j) - \mu^{(p+1)}_{m}\big)^{T}}{\sum_{i=1}^{I} \sum_{j=1}^{J} G^{(p)}_{m}(i,j)}.   (12)

In the above equations, p is the iteration step number. F^{(p)}_{m,n,k,l}(i, j) and G^{(p)}_{m}(i, j) are unknown in these formulas; next we propose a General Forward-Backward (GFB) algorithm for their estimation.
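The updates in Eqs. (10)-(12) amount to weighted counts and weighted sample moments. The sketch below illustrates such an M-step for one chain, assuming the posteriors F and G have already been estimated; the array names, shapes and the small epsilon guard are assumptions of this sketch, not the paper's notation:

import numpy as np

def m_step(F, G, O, eps=1e-12):
    """F: (N, N, N, N, I, J) posteriors of state 4-tuples, G: (N, I, J) state
    posteriors, O: (I, J, d) observations. Returns updated (A, mu, Sigma)."""
    counts = F.sum(axis=(-2, -1))                              # sum over positions (i, j)
    A = counts / (counts.sum(axis=-1, keepdims=True) + eps)    # Eq. (10): normalize over l
    N, d = G.shape[0], O.shape[-1]
    mu = np.zeros((N, d))
    Sigma = np.zeros((N, d, d))
    for m in range(N):
        w = G[m]                                               # weights G_m(i, j)
        mu[m] = (w[..., None] * O).sum(axis=(0, 1)) / (w.sum() + eps)             # Eq. (11)
        diff = O - mu[m]
        Sigma[m] = np.einsum('ij,ijk,ijl->kl', w, diff, diff) / (w.sum() + eps)   # Eq. (12)
    return A, mu, Sigma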

3.2. General Forward-Backward (GFB) Algorithm

The Forward-Backward algorithm was first proposed by Baum et al.12 for the 1D hidden Markov model and later modified by Li et al.9 Here, we generalize the Forward-Backward algorithm of12,9 so that it can be applied to any HMM; the proposed algorithm is called the General Forward-Backward (GFB) algorithm. The GFB algorithm can be applied to any HMM whose state sequence satisfies the following property: the probability of the all-state sequence S can be decomposed as a product of probabilities of conditionally independent subset-state sequences U_0, U_1, ..., i.e., P(S) = P(U_0) P(U_1|U_0) ... P(U_i|U_{i−1}) ..., where U_0, U_1, ..., U_i, ... are subsets of the all-state sequence of the HMM system, which we call subset-state sequences. Define the observation sequence corresponding to each subset-state sequence U_i as O_i. The subset-state sequences for our model are shown in Fig. 6(b)(c). The new structure enables us to use the General Forward-Backward (GFB) algorithm to estimate the model parameters.

3.2.1. Forward and Backward Probability

Define the forward probability α_{U_u}(u), u = 1, 2, ..., as the probability of observing the observation sequences O_v (v ≤ u) corresponding to the subset-state sequences U_v (v ≤ u) and having the state sequence for the u-th product component in the decomposition formula equal to U_u, given the model parameters Θ, i.e. α_{U_u}(u) = P{S(u) = U_u, O_v, v ≤ u | Θ}; and define the backward probability β_{U_u}(u), u = 1, 2, ..., as the probability of observing the observation sequences O_v (v > u) corresponding to the subset-state sequences U_v (v > u), given that the state sequence for the u-th product component is U_u and the model parameters are Θ, i.e. β_{U_u}(u) = P(O_v, v > u | S(u) = U_u, Θ). The recursive updating formulas for the forward and backward probabilities are

\alpha_{U_u}(u) = \Big[\sum_{U_{u-1}} \alpha_{U_{u-1}}(u-1)\, P\{U_u \mid U_{u-1}, \Theta\}\Big] P\{O_u \mid U_u, \Theta\}.   (13)

\beta_{U_u}(u) = \sum_{U_{u+1}} P(U_{u+1} \mid U_u, \Theta)\, P(O_{u+1} \mid U_{u+1}, \Theta)\, \beta_{U_{u+1}}(u+1).   (14)

Then, the estimation formulas for F_{m,n,k,l}(i, j) and G_{m}(i, j) are:

G_{m}(i,j) = \frac{\alpha_{U_u}(u)\, \beta_{U_u}(u)}{\sum_{u: U_u(i,j) = m} \alpha_{U_u}(u)\, \beta_{U_u}(u)}.   (15)

F_{m,n,k,l}(i,j) = \frac{\alpha_{U_{u-1}}(u-1)\, P(U_u \mid U_{u-1}, \Theta)\, P(O_u \mid U_u, \Theta)\, \beta_{U_u}(u)}{\sum_{u} \sum_{u-1} \big[\alpha_{U_{u-1}}(u-1)\, P(U_u \mid U_{u-1}, \Theta)\, P(O_u \mid U_u, \Theta)\, \beta_{U_u}(u)\big]}.   (16)
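A schematic implementation of the GFB recursions (13)-(14) over the chain of subset-state sequences is sketched below; `candidates[u]` (the candidate values of U_u), `prior_prob` returning P(U_0), `trans_prob` returning P(U_u | U_{u-1}, Θ), and `obs_prob` returning P(O_u | U_u, Θ) are assumed helpers, since their exact form depends on the decomposition structure of Fig. 6:

def gfb(candidates, prior_prob, trans_prob, obs_prob, observations):
    """Forward-backward pass over subset-state sequences U_0, ..., U_{K-1}.
    Returns dictionaries alpha[u][U] and beta[u][U]."""
    K = len(candidates)
    alpha = [dict() for _ in range(K)]
    beta = [dict() for _ in range(K)]
    for U in candidates[0]:                                # initialization
        alpha[0][U] = prior_prob(U) * obs_prob(observations[0], U)
    for u in range(1, K):                                  # Eq. (13)
        for U in candidates[u]:
            s = sum(alpha[u - 1][V] * trans_prob(V, U) for V in candidates[u - 1])
            alpha[u][U] = s * obs_prob(observations[u], U)
    for U in candidates[K - 1]:                            # termination
        beta[K - 1][U] = 1.0
    for u in range(K - 2, -1, -1):                         # Eq. (14)
        for U in candidates[u]:
            beta[u][U] = sum(trans_prob(U, V) * obs_prob(observations[u + 1], V)
                             * beta[u + 1][V] for V in candidates[u + 1])
    return alpha, beta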

Figure 6. Various HMM models and their corresponding conditional-independent subset-state sequence decomposition structures for the GFB algorithm: (a) Distributed 2D Hidden Markov Model 1 for 3 trajectories. (b) Distributed 2D Hidden Markov Model 2 for 3 trajectories. (c) Causal 2D Hidden Markov Model. (d) 1D Hidden Markov Model.

3.3. Viterbi Algorithm

For classification, we employ a two-dimensional Viterbi algorithm13 to search for the combination of states with the maximum a posteriori probability and map each block to a class. This process is equivalent to searching for the state of each block using an extension of the variable-state Viterbi algorithm presented in,9 based on the new structure in Fig. 6(b)(c). If we search over all combinations of states, and the number of states in each subset-state sequence U_u is w(u), then the number of possible sequences of states at every position is M^{w(u)}, which is computationally infeasible. To reduce the computational complexity, we only retain the N sequences of states with the highest likelihoods out of the M^{w(u)} possible sequences.
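The pruning step can be implemented as a simple beam search: at each position only the N highest-likelihood candidate subset-state sequences are retained. A minimal sketch (the dictionary layout is an assumption of this illustration) is:

def prune_candidates(scores, n_best):
    """Keep the n_best candidate subset-state sequences with the highest
    accumulated log-likelihood; `scores` maps a candidate (e.g. a tuple of
    states) to its accumulated log-likelihood."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:n_best])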

3.4. Summary of DHMM Training and Classification Algorithms

-Training:
1. Assign initial values to {π_{m,n}, a_{i,j,k,l}, μ_{m,n}, Σ_{m,n}}.
2. Update the forward and backward probabilities according to eqns. (13) and (14) using the proposed GFB algorithm, and calculate the old log P(O|Θ').
3. Update F_{m,n,k,l}(i, j) and G_{m}(i, j) according to eqns. (15) and (16).
4. Update π_{m,n}, a_{i,j,k,l}(m), μ_{m,n} and Σ_{m,n} according to eqns. (9)-(12) using the proposed EM algorithm.
5. Go back to step 2 and calculate the new log P(O|Θ); stop if log P(O|Θ) − log P(O|Θ') is below a pre-set threshold.

-Classification: Use the two-dimensional Viterbi algorithm to search for the combination of states with the maximum a posteriori (MAP) probability.
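Putting the pieces together, a high-level sketch of this training loop is given below; `gfb_pass`, `estimate_F_G`, `m_step_all` and `log_likelihood` are assumed wrappers around Eqs. (13)-(14), (15)-(16), (9)-(12) and the data likelihood, respectively, and are not part of the paper:

def train_dhmm(theta, observations, tol=1e-4, max_iter=100):
    """Iterate GFB estimation (E-step) and parameter updates (M-step) until
    the gain in log-likelihood falls below a pre-set threshold."""
    old_ll = float('-inf')
    for _ in range(max_iter):
        alpha, beta = gfb_pass(theta, observations)             # Eqs. (13)-(14)
        F, G = estimate_F_G(alpha, beta, theta, observations)   # Eqs. (15)-(16)
        theta = m_step_all(F, G, observations)                  # Eqs. (9)-(12)
        new_ll = log_likelihood(theta, observations)
        if new_ll - old_ll < tol:                               # stopping rule (step 5)
            break
        old_ll = new_ll
    return theta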

4. EXPERIMENTAL RESULTS: MULTIPLE-OBJECT TRAJECTORY CLASSIFICATION

For simplicity, we tested our DHMM model-based multiple-trajectory classification algorithm only on the M = 2 trajectory case. We test the classification performance of the proposed distributed 2D HMM-based classifier, a causal 2D HMM-based classifier and a traditional 1D HMM-based classifier on 2 datasets: (A) a synthetic multiple-trajectory dataset; (B) a subset of the Context Aware Vision using Image-based Active Recognition (CAVIAR) dataset∗, which contains video clips of multiple trajectories with interactions.

Figure 7. Multiple-trajectories samples of two classes in Synthetic dataset: (a)(b) two multiple-trajectories (M=2) samples from class 1; (c)(d) two multiple-trajectories (M=2) samples from class 2.

Figure 8. Multiple-trajectories samples of two classes in CAVIAR dataset: (a) multiple-trajectories (M=2) sample and one frame from class 1: “Two people meet and walk together”; (b) multiple-trajectories (M=2) sample and one frame from class 2: “Two people meet, fight and run away”.

Figure 9. ROC curve of DHMM, Strictly Causal 2D HMM and 1D HMM for Synthetic data

The results are reported in terms of 3 criteria:

1. The average Receiver Operating Characteristic (ROC) curve. The ROC curve captures the trade-off between the false positive rate and the true positive rate as the threshold on the likelihood at the output of the classifier is varied. The resulting ROC curves are shown in Figs. 9 and 10. As a baseline case, the performance of a uniformly distributed random classifier is also presented in the ROC curves.

2. The Area Under the Curve (AUC). The AUC is a convenient way of comparing classifiers; it varies from 0.5 (random classifier) to 1.0 (ideal classifier). The AUCs for the two datasets are shown in Figs. 9 and 10, respectively.

3. Classification accuracy. The classification accuracy is defined as

P_Accuracy = 1 − |F|/|S|,   (17)

where |F| represents the cardinality of the false-positive set, and |S| represents the cardinality of the whole dataset.
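Eq. (17) translates directly into a one-line helper (an illustrative sketch, not code from the experiments):

def classification_accuracy(num_false_positives, num_samples):
    """P_Accuracy = 1 - |F| / |S| from Eq. (17)."""
    return 1.0 - num_false_positives / num_samples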

We first test on the synthetic dataset. We construct the synthetic 2-trajectory dataset with 45 classes of 2-trajectories, each of which has 30 samples, for a total of 1350 two-trajectory samples in the dataset. Fig. 7 shows four multiple-trajectory (M = 2) samples from 2 classes in our synthetic dataset. We use 50% of the samples as training data and the rest as testing data. The ROC curve is shown in Fig. 9. The test results show that our DHMM-based classifier achieves a high classification accuracy rate of 91.25%, followed by the strictly-causal 2D HMM-based classifier, which is 8% lower than ours, and the 1D HMM-based classifier, which is 15% lower than ours, as shown in Table 1. We then test on our CAVIAR dataset; we select data classes that have 2 people interacting with each other, and use 50% of the ground-truth trajectory samples as training data and the rest as testing data. There are 9 classes of 180 two-people interacting trajectories. Fig. 8 shows multiple-trajectory (M = 2) samples from two of the 9 classes in our CAVIAR dataset. The ROC curve is shown in Fig. 10. Here we can see that the performance of the DHMM, which models the two-object trajectories as one system, is better than that of the causal 2D HMM and the 1D HMM. As shown in Table 1, the average classification accuracy of our DHMM-based classifier reaches 92.04%, which is 8% higher than the strictly-causal 2D HMM-based classifier and 12% higher than the 1D HMM-based classifier.

∗The CAVIAR dataset is from the EC Funded CAVIAR project/IST 2001 37540 (http://homepages.inf.ed.ac.uk/rbf/CAVIAR/).

Figure 10. ROC curve of DHMM, Strictly Causal 2D HMM and 1D HMM for CAVIAR data

Table 1. Average Classification Accuracy Rates

Method–Dataset              SYNTHETIC (1350)    CAVIAR (180)
1D HMM                      0.7654              0.8097
Strictly Causal 2D HMM      0.8319              0.8420
DHMM                        0.9125              0.9204

5. CONCLUSION

We propose a novel distributed 2D hidden Markov model (DHMM) for the modelling of multiple-object trajectories and their interactions. This model can capture multi-trajectory interaction information that is lost in previously proposed models such as the 1D HMM and the causal 2D HMM. For estimation of the DHMM model parameters, we derive a new EM algorithm suitable for our model and propose a novel General Forward-Backward (GFB) algorithm for recursive estimation of the model parameters. Simulation results on 1530 two-trajectory samples from the synthetic and CAVIAR datasets show the superior performance and higher accuracy of our proposed distributed 2D hidden Markov model.

REFERENCES

1. L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE 77, pp. 257–286, 1989.
2. T. Starner and A. Pentland, "Real-time American Sign Language recognition from video using hidden Markov models," Technical Report, MIT Media Lab, Perceptual Computing Group 375, 1995.
3. C. Raphael, "Automatic segmentation of acoustic musical signals using hidden Markov models," IEEE Transactions on Pattern Analysis and Machine Intelligence 21, pp. 360–370, 1999.
4. F. I. Bashir, A. A. Khokhar, and D. Schonfeld, "HMM based motion recognition system using segmented PCA," IEEE International Conference on Image Processing (ICIP'05) 3, pp. 1288–1291, 2005.
5. S. S. Kuo and O. E. Agazzi, "Machine vision for keyword spotting using pseudo 2D hidden Markov models," Proceedings of International Conference on Acoustics, Speech and Signal Processing 5, pp. 81–84, 1993.
6. C. C. Yen and S. S. Kuo, "Degraded documents recognition using pseudo 2D hidden Markov models in gray-scale images," Proceedings of SPIE 2277, pp. 180–191, 1994.
7. M. Brand, "Coupled hidden Markov models for modeling interacting processes," Technical Report, MIT Media Lab, Perceptual Computing Group 405, 1997.
8. M. Brand, N. Oliver, and A. Pentland, "Coupled hidden Markov models for complex action recognition," IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'97), p. 994, 1997.
9. J. Li, A. Najmi, and R. M. Gray, "Image classification by a two-dimensional hidden Markov model," IEEE Trans. on Signal Processing 48, pp. 517–533, 2000.
10. J. A. Bilmes, "A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models," Technical Report, Dept. of EECS, U. C. Berkeley, TR-97-021, 1998.
11. A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society: Series B 39, pp. 1–38, 1977.
12. L. E. Baum, T. Petrie, G. Soules, and N. Weiss, "A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains," Ann. Math. Stat. 41, pp. 164–171, 1970.
13. D. Schonfeld and N. Bouaynaya, "A new method for multidimensional optimization and its application in image and video processing," IEEE Signal Processing Letters 13, pp. 485–488, 2006.
