Automatic Object Trajectory-Based Motion Recognition Using Gaussian Mixture Models
Faisal I. Bashir, Ashfaq A. Khokhar, Dan Schonfeld
Electrical and Computer Engineering,
University of Illinois at Chicago. Chicago, IL, USA.
Motivation
Importance of trajectory-based processing
- In poor-quality videos (surveillance), features such as face, appearance, and color are not visible.
- Need to process data from non-video sensors, e.g., wired gloves, radar, GPS, CNS.
Major application areas
- Video surveillance (intelligent video).
- Sign language recognition.
- Sports video analysis for teams and viewers.
- Animal mobility experiments.
- Moving object databases.
ICME 2005
Multimedia Systems Lab, University of Illinois Chicago
Problem Statement
Develop scalable classification algorithms for recognition of trajectories obtained from video or non-video sources.
- Optimally compact representation at low level
- Easily extendible to high level
- Not tied to video sensors
Related Work - Trajectory Indexing and Retrieval
- Sahouria & Zakhor [ICIP 99]: X and Y processed separately using Haar wavelets; first 8 coefficients stored as the index; Euclidean distance.
- W. Chen, S.F. Chang [SPIE 00]: wavelet-based segmentation; feature vector with acceleration, velocity, subtrajectory length, etc.
- Lei-Chen & Oria [ACM MIR 04]: X and Y transformed to a movement sequence quantized into 8x8 bins; normalized edit distance.
Related Work - Trajectory Modeling and Recognition
- Rao, Shah [IJCV 04]: view-invariant representation of actions based on curvature. For each ‘dynamic instant’, the frame number, the location of the hand, and the ‘sign’ of the instant are stored. Matching is done on trajectories with the same number of instants and the same sign permutations.
- Vaswani, Chellappa [IEEE TIP, to appear]: models activity performed by a group of moving and interacting objects. Objects in the video are taken as points, and the ‘shape’ formed by these points is tracked over time. Abnormality is detected as a perturbation in this ‘shape’.
Challenges & Proposed Solutions
- Compact indexing: use a Principal Component Analysis (PCA)-based representation (optimal energy compaction).
- Partial trajectories due to occlusion, noise, etc.: segment trajectories into small chunks of subtrajectories.
- Estimating high-dimensional multimodal PDFs of activity classes: use Gaussian Mixture Models to estimate arbitrarily complex PDFs.
Outline
- Trajectory segmentation using hypothesis testing on curvature
- PCA-based representation
- Gaussian Mixture Models for class density estimation
Trajectory Segmentation - Motivation
- Partial queries can be answered:
  - if one portion of a trajectory is not available due to occlusion, etc.;
  - if two objects follow the same pattern of motion for a while and then go their different ways.
- Implicit dimensionality reduction.
Segmentation using Hypothesis Testing based on Curvature
Segmentation is based on the curvature of the trajectory (x(n), y(n)):
$$\kappa(n) = \frac{x'(n)\,y''(n) - x''(n)\,y'(n)}{\left(x'(n)^2 + y'(n)^2\right)^{3/2}}$$
- Take two non-overlapping windows from the curvature sequence.
- Likelihood ratio test: do the two windows come from the same distribution?
- Find distinct maxima in the distance measure to locate segmentation points.
- Segmented subtrajectories are normalized for spatial invariance.
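As a rough illustration of the curvature formula, the sketch below approximates the derivatives with central differences on the sampled trajectory. The finite-difference scheme, the function names, and the small `eps` guard are assumptions made for this example; the slides do not say how the derivatives are computed.

```python
import math


def central_diff(v):
    """First derivative by central differences (one-sided at the two ends).

    Assumes at least two samples.
    """
    n = len(v)
    d = [0.0] * n
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        d[i] = (v[hi] - v[lo]) / (hi - lo)
    return d


def curvature(x, y, eps=1e-12):
    """kappa(n) = (x' y'' - x'' y') / (x'^2 + y'^2)^(3/2) for sampled x(n), y(n)."""
    dx, dy = central_diff(x), central_diff(y)
    ddx, ddy = central_diff(dx), central_diff(dy)
    kappa = []
    for x1, y1, x2, y2 in zip(dx, dy, ddx, ddy):
        speed_sq = x1 * x1 + y1 * y1
        # eps (an illustrative guard) avoids division by zero for a stationary object
        kappa.append((x1 * y2 - x2 * y1) / (speed_sq ** 1.5 + eps))
    return kappa


# Usage: a circle of radius 2 traversed counter-clockwise has curvature 1/2.
t = [0.01 * i for i in range(200)]
circle_kappa = curvature([2 * math.cos(a) for a in t],
                         [2 * math.sin(a) for a in t])
```

Away from the end points the circle example recovers 1/r closely, and a straight line gives zero curvature.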
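Hypothesis Testing on Curvature - Likelihood Ratio Test
$$L(X;\theta_1) = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\!\left(-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right)$$
$$L(Y;\theta_2) = \frac{1}{\sqrt{2\pi}\,\sigma_2} \exp\!\left(-\frac{(x-\mu_2)^2}{2\sigma_2^2}\right)$$
$$L(Z;\theta_3) = \frac{1}{\sqrt{2\pi}\,\sigma_3} \exp\!\left(-\frac{(x-\mu_3)^2}{2\sigma_3^2}\right), \qquad \theta_i = (\mu_i, \sigma_i)$$
$$L_0 = \frac{1}{\sqrt{2\pi}\,\sigma_3} \exp\!\left(-\frac{(x-\mu_3)^2}{2\sigma_3^2}\right)$$
$$L_1 = \frac{1}{2\pi\,\sigma_1\sigma_2} \exp\!\left(-\frac{1}{2}\left[\frac{(x-\mu_1)^2}{\sigma_1^2} + \frac{(x-\mu_2)^2}{\sigma_2^2}\right]\right)$$
$$\lambda_L = \frac{L_0}{L_1}$$
$$d(X,Y) = -\log(\lambda_L) = -\log\!\left(\frac{\sqrt{2\pi}\,\sigma_1\sigma_2}{\sigma_3}\right) + \frac{1}{2}\left[\frac{(x-\mu_3)^2}{\sigma_3^2} - \frac{(x-\mu_1)^2}{\sigma_1^2} - \frac{(x-\mu_2)^2}{\sigma_2^2}\right]$$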
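One way to turn the likelihood ratio into a distance between two curvature windows is sketched below. It rests on assumptions the slide does not state: Z is taken to be the union of windows X and Y, each likelihood is a product over the samples of its window, and the parameters are maximum-likelihood estimates. The names `gaussian_fit`, `window_distance`, and `distance_profile` and the `min_sigma` floor are hypothetical.

```python
import math


def gaussian_fit(samples, min_sigma=1e-6):
    """Maximum-likelihood mean and standard deviation of a 1-D sample."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((s - mu) ** 2 for s in samples) / n
    # min_sigma (illustrative) keeps the log finite for constant windows
    return mu, max(math.sqrt(var), min_sigma)


def log_likelihood(samples, mu, sigma):
    """Sum over samples of log[1/(sqrt(2 pi) sigma) * exp(-(x - mu)^2 / (2 sigma^2))]."""
    const = -math.log(math.sqrt(2 * math.pi) * sigma)
    return sum(const - (s - mu) ** 2 / (2 * sigma ** 2) for s in samples)


def window_distance(X, Y):
    """d(X, Y) = -log(L0 / L1): L0 models X and Y jointly, L1 models them separately."""
    Z = list(X) + list(Y)  # assumption: Z is the union of the two windows
    log_L0 = log_likelihood(Z, *gaussian_fit(Z))
    log_L1 = (log_likelihood(X, *gaussian_fit(X))
              + log_likelihood(Y, *gaussian_fit(Y)))
    return log_L1 - log_L0


def distance_profile(kappa, w):
    """(n, d) pairs comparing the w samples before and after each candidate point n."""
    return [(n, window_distance(kappa[n - w:n], kappa[n:n + w]))
            for n in range(w, len(kappa) - w + 1)]


# Usage: a synthetic curvature sequence whose level jumps at sample 20.
signal = ([0.1 * ((i % 3) - 1) for i in range(20)]
          + [5 + 0.1 * ((i % 3) - 1) for i in range(20)])
profile = distance_profile(signal, 6)
best_n = max(profile, key=lambda p: p[1])[0]
```

On the synthetic sequence, the distance is largest at the sample where the statistics change, which is the kind of maximum the segmentation step looks for.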
Segmentation Results
Figure: Segmentation of trajectories from different signers. ‘Norway’: (a)-(b); ‘Alive’: (c)-(d).
Principal Component Analysis
- Data-dependent orthonormal bases (PCs), as opposed to the generic bases of the DFT, DWT, etc.
- Let x be a vector of p random variables. PCA finds:
  - a linear function $\alpha_1' x$ of the elements of x with maximum variance;
  - a linear function $\alpha_2' x$, uncorrelated with $\alpha_1' x$, with maximum variance; and so on.
- If the covariance matrix of x is known, the coefficient vector $\alpha_k$ of the kth PC is its eigenvector corresponding to the kth largest eigenvalue.
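The link between PCs and eigenvectors of the covariance matrix can be illustrated with a small dependency-free sketch. Power iteration is swapped in here for a full eigen-solver purely to keep the example short; it recovers only the first PC, and the slides do not say which solver was used. The function names and the demo data are hypothetical.

```python
import math


def covariance_matrix(data):
    """Sample covariance (p x p) of a list of p-dimensional observations."""
    n, p = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - mean[j] for j in range(p)] for row in data]
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
            for a in range(p)]


def first_principal_component(cov, iters=500):
    """Power iteration: approximates the eigenvector of the largest eigenvalue.

    Fails if the start vector is orthogonal to that eigenvector.
    """
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    # Rayleigh quotient v' C v gives the matching eigenvalue (variance of the PC)
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return eigenvalue, v


# Usage: points spread mostly along the direction (1, 2), with small
# perturbations along the orthogonal direction (2, -1).
ts = [-2, -1, 0, 1, 2]
es = [0.1, -0.1, 0.0, -0.1, 0.1]
demo = [[t + 2 * e, 2 * t - e] for t, e in zip(ts, es)]
eigenvalue, pc1 = first_principal_component(covariance_matrix(demo))
```

For the demo data the first PC points along (1, 2)/sqrt(5), the direction of maximum variance.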
Principal Component Analysis
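- Projection: $y = \Phi_q' x$
- y is maximally uncorrelated: $\det(\Sigma_y)$ is maximized.
- How many PCs should be retained? Use the cumulative percentage of total variance captured by the first m PCs:
$$t_m = 100 \times \frac{\sum_{j=1}^{m} \lambda_j}{\sum_{j=1}^{p} \lambda_j}$$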
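A minimal sketch of the retention rule: compute $t_m$ for every m and keep the smallest m that reaches a chosen percentage. The threshold value and the function names are assumptions for illustration; the slides do not state which cut-off was used.

```python
def cumulative_variance_percent(eigenvalues):
    """t_m for m = 1..p, with eigenvalues taken in decreasing order."""
    lam = sorted(eigenvalues, reverse=True)
    total = sum(lam)
    percentages, running = [], 0.0
    for value in lam:
        running += value
        percentages.append(100.0 * running / total)
    return percentages


def components_to_retain(eigenvalues, threshold=90.0):
    """Smallest m with t_m >= threshold (threshold is an illustrative choice)."""
    for m, t_m in enumerate(cumulative_variance_percent(eigenvalues), start=1):
        if t_m >= threshold:
            return m
    return len(eigenvalues)


# Usage with made-up eigenvalues 4, 3, 2, 1 (t_m is roughly 40, 70, 90, 100).
m_80 = components_to_retain([4.0, 3.0, 2.0, 1.0], threshold=80.0)
m_95 = components_to_retain([4.0, 3.0, 2.0, 1.0], threshold=95.0)
```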
PCA-Based Combined X- and Y-Representation
- Segment based on 2-D spatio-temporal curvature.
- Represent both x- and y-projections using a single set of PCA coefficients.
- Trajectory data from the x- and y-projections of each segment is stacked to form one vector per subtrajectory.
- PCA is performed on these stacked vectors.
- PCA feature vectors are used to train the GMMs.
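The stacking step can be sketched as follows. Stacked vectors must share one length before PCA can be applied, so this example resamples each subtrajectory to a fixed number of points by linear interpolation. That resampling, the value of `n_points`, and the function names are assumptions; the slides do not describe how segments of different lengths are aligned.

```python
def resample(values, n_points):
    """Linearly interpolate a sequence to n_points evenly spaced samples (n_points >= 2)."""
    if len(values) == 1:
        return [float(values[0])] * n_points
    out = []
    for k in range(n_points):
        pos = k * (len(values) - 1) / (n_points - 1)
        i = min(int(pos), len(values) - 2)
        frac = pos - i
        out.append(values[i] * (1 - frac) + values[i + 1] * frac)
    return out


def stack_subtrajectory(x, y, n_points=16):
    """One feature vector per subtrajectory: [x_1 .. x_n, y_1 .. y_n]."""
    return resample(x, n_points) + resample(y, n_points)


# Usage: a short segment with 2 x-samples and 3 y-samples, stacked into 10 values.
stacked = stack_subtrajectory([0, 10], [1, 1, 1], n_points=5)
```

Each stacked vector then plays the role of x in the PCA step, so one set of coefficients describes both coordinates.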
Gaussian Mixture Models
$$P(y \mid \Theta) = \sum_{i=1}^{N_c} \pi_i\, \mathcal{N}(y; \mu_i, \Sigma_i)$$
- $\mathcal{N}(y; \mu, \Sigma)$: M-dimensional Gaussian density
- $\mu$: mean vector
- $\Sigma$: covariance matrix
- $\pi_i$: mixing parameters of the Gaussian components, satisfying $\sum_{i=1}^{N_c} \pi_i = 1$
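To make the mixture density concrete, here is a small sketch that evaluates $P(y \mid \Theta)$. It assumes diagonal covariance matrices to stay dependency-free; the slides use a general covariance matrix $\Sigma_i$, so this is a simplification, and the function names are hypothetical.

```python
import math


def gaussian_pdf_diag(y, mean, var):
    """M-dimensional Gaussian density with diagonal covariance diag(var)."""
    log_p = 0.0
    for yi, mi, vi in zip(y, mean, var):
        log_p += -0.5 * math.log(2 * math.pi * vi) - (yi - mi) ** 2 / (2 * vi)
    return math.exp(log_p)


def gmm_pdf(y, weights, means, variances):
    """P(y | Theta) = sum_i pi_i N(y; mu_i, Sigma_i), with the pi_i summing to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing parameters must sum to 1")
    return sum(w * gaussian_pdf_diag(y, m, v)
               for w, m, v in zip(weights, means, variances))


# Usage: a two-component 1-D mixture with means -1 and +1, evaluated at 0.
p_single = gmm_pdf([0.0], [1.0], [[0.0]], [[1.0]])
p_mix = gmm_pdf([0.0], [0.5, 0.5], [[-1.0], [1.0]], [[1.0], [1.0]])
```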
Expectation Maximization for GMM Parameter Estimation
- E-Step:
$$h_i^k(t) = \frac{\pi_i^k\, \mathcal{N}(y^t; \mu_i^k, \Sigma_i^k)}{\sum_{j=1}^{N_c} \pi_j^k\, \mathcal{N}(y^t; \mu_j^k, \Sigma_j^k)}$$
- M-Step:
$$\pi_i^{k+1} = \frac{\sum_{t=1}^{N_T} h_i^k(t)}{\sum_{j=1}^{N_c} \sum_{t=1}^{N_T} h_j^k(t)}$$
$$\mu_i^{k+1} = \frac{\sum_{t=1}^{N_T} h_i^k(t)\, y^t}{\sum_{t=1}^{N_T} h_i^k(t)}$$
$$\Sigma_i^{k+1} = \frac{\sum_{t=1}^{N_T} h_i^k(t)\,(y^t - \mu_i^{k+1})(y^t - \mu_i^{k+1})^T}{\sum_{t=1}^{N_T} h_i^k(t)}$$
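The E- and M-step updates can be sketched for the 1-D case, where each covariance matrix reduces to a scalar variance. The 1-D restriction, the variance floor `var_floor`, the fixed iteration count, and the demo data are assumptions made to keep the example short; they are not taken from the slides.

```python
import math


def normal_pdf(y, mu, var):
    """1-D Gaussian density with mean mu and variance var."""
    return math.exp(-(y - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_step(data, weights, means, variances, var_floor=1e-6):
    """One EM iteration for a 1-D GMM; returns updated (weights, means, variances)."""
    n_c, n_t = len(weights), len(data)
    # E-step: h[i][t] is the posterior probability of component i given y^t
    h = [[0.0] * n_t for _ in range(n_c)]
    for t, y in enumerate(data):
        joint = [weights[i] * normal_pdf(y, means[i], variances[i])
                 for i in range(n_c)]
        total = sum(joint)  # assumed non-zero: y is not far from every component
        for i in range(n_c):
            h[i][t] = joint[i] / total
    # M-step: re-estimate mixing parameters, means, and variances
    grand_total = sum(sum(row) for row in h)
    new_w, new_mu, new_var = [], [], []
    for i in range(n_c):
        resp = sum(h[i])  # assumed non-zero: component i explains some data
        mu = sum(h[i][t] * data[t] for t in range(n_t)) / resp
        var = sum(h[i][t] * (data[t] - mu) ** 2 for t in range(n_t)) / resp
        new_w.append(resp / grand_total)
        new_mu.append(mu)
        new_var.append(max(var, var_floor))  # illustrative guard against collapse
    return new_w, new_mu, new_var


# Usage: two well-separated clusters around -2 and +2.
data = [-2.1, -1.9, -2.0, 2.0, 1.9, 2.1]
params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
for _ in range(50):
    params = em_step(data, *params)
weights, means, variances = params
```

A production implementation would normally work in the log domain and stop on a likelihood-change criterion rather than a fixed number of iterations.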
Training and Test Data Sets
- ASL I: 207 trajectories from 3 classes in the Australian Sign Language dataset. Training on half; testing on the other half.
- ASL II: Same data as ASL I. Training on half; testing on all.
- HJSL I: 108 trajectories from the High Jump and Slalom Skiing dataset. Training on half; testing on the other half.
- HJSL II: Same data as HJSL I. Training on half; testing on all.
Results - GMM Learning
Figure: 1-sigma contours of the GMMs learned from three classes in the Australian Sign Language dataset. (a) ‘Norway’. (b) ‘Alive’. (c) ‘Crazy’.
Results - Classification
Figure: ROC curves for the three classifiers on the ASL II dataset for: (a) Class 1 ‘Norway’. (b) Class 2 ‘Alive’. (c) Class 3 ‘Crazy’. (d) Average performance across all classes.
Results - Accuracy
accuracy =1-
false alarms test set
Method
ASL I
ASL II
HJSL I
HJSL II
GMM
85.29
92.75
79.63
89.81
PCA Density Estimation
86.27
93.24
38.88
45.37
GMM Global
69.61
73.91
62.96
63.89
Classification Accuracy Results for Three classifiers in Four experimental setups.
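The accuracy measure can be computed as in the sketch below. It assumes that a "false alarm" means a test trajectory assigned to the wrong class, and it reports a percentage to match the table. The function name and the toy labels are illustrative only, not data from the experiments.

```python
def accuracy_percent(true_labels, predicted_labels):
    """100 * (1 - false alarms / test set size)."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    # assumption: a false alarm is any test item whose predicted class is wrong
    false_alarms = sum(1 for truth, guess in zip(true_labels, predicted_labels)
                       if truth != guess)
    return 100.0 * (1.0 - false_alarms / len(true_labels))


# Usage with made-up labels: one of four test trajectories is misclassified.
toy_accuracy = accuracy_percent(["Norway", "Alive", "Crazy", "Alive"],
                                ["Norway", "Alive", "Crazy", "Crazy"])
```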
Questions?
Contact information: Faisal I. Bashir.
[email protected]