Detecting Irregularity in Videos Using Kernel Estimation and K-D Trees
Yun Li1, Chunjing Xu1, Jianzhuang Liu1, Xiaoou Tang1,2
1 Department of Information Engineering, The Chinese University of Hong Kong, China
{yli5, chjixu, jzliu}@ie.cuhk.edu.hk
2 Visual Computing Group, Microsoft Research Asia, Beijing, China
[email protected]
ABSTRACT
Automatic event understanding is the ultimate goal of many visual surveillance systems. In this paper, we propose a novel approach for on-line detection of unusual human activities in videos without the need to explicitly define all valid configurations. Within the framework of Bayesian inference, the detection process is formulated as an MAP estimation problem in which we determine whether the activities in new video segments have similar activities in a video database. Our approach makes three contributions: firstly, we build the statistical representation of normal behaviors in the database using nonparametric kernel density estimation; secondly, local feature descriptors are highly compressed using PCA and stored in a K-D tree structure, making the search for behavior-based similarity fast and effective; thirdly, the K-D trees are used to generate multiple hypotheses which compete for the optimal classification. The approach requires no tracking, no explicit motion estimation, and no predefined class-based templates. Experimental results have validated our approach on many real-world video sequences.
Categories and Subject Descriptors: I.5.4 [Pattern Recognition]: Application.
General Terms: algorithms, experimentation.
Keywords: Irregularity detection, K-D trees, Kernels, MAP estimation.
1. INTRODUCTION
Detecting irregular human behaviors is a challenging research topic and is vitally important for video surveillance. It is not easy to give an explicit definition of irregularity because it depends on the context in which the activities happen. Suspicious events in one situation may be normal in another. In this paper, irregular activities are defined as the salient patterns which are quite different from the activities in a given video database.
Previous approaches to irregularity detection can be generally categorized into two classes. The first class consists of model-based methods. For a small number of object activities, model-based methods assume a predefined set of rules and parameters for all valid configurations. Using extracted features (e.g., appearances, shapes, and motion speeds), models for the normal activities are developed either by hand or by applying supervised learning techniques [9]. Graphical models have been widely used for this purpose [7], [11], [6]. Many of them require fixed backgrounds [8], [16] and involve tracking the body parts, which often fails due to occlusion or sudden motion changes. Hidden Markov Models are also used to detect actions and classify objects [13], [14], [3]. The performance of model-based methods in detecting irregularity is expected to be good in situations where all normal events are well modeled. Observations which cannot be interpreted by these models are considered irregular. In the real world, however, the large number of normal events makes it very difficult to build models comprehensively.
The second class contains non-model-based methods. The hard-to-define but easy-to-discriminate property of irregular events enables us to detect irregularity in a new video from a video database without predefining object models explicitly. Recently, methods that exploit local features have been proposed, and good experimental results have been reported on some real-world images and video sequences [15], [10], [2], [12]. However, dense unconstrained motion estimation such as optical flow fields [10] is noisy and unreliable for dynamic scenes, particularly when dynamic events contain unstructured objects, such as a fountain. The scheme of using spatiotemporal corner points to characterize actions [12] is sensitive to occlusions of the interest points. The method in [2], [15], which uses space-time correlation of video segments, becomes significantly more time-consuming as the database grows. In [17], an unsupervised technique using Normalized Cut is developed to detect irregularity in videos, but the features it adopts are too global to discriminate local unusual activities.
In this paper, we present an algorithm based on behavior similarity comparison for irregular human activity detection. The underlying observation is that newly observed normal activities are highly correlated with previous activities in the database, while irregular activities are not. The detection is carried out without the need for explicit background subtraction or motion estimation. We can also detect unusual behaviors when there are considerable changes in clothing, illumination, and background. Fig. 1 illustrates how the whole system works.
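To make the idea of comparing new observations against a database concrete, the following is a minimal sketch, not the implementation used in this paper. It assumes the feature descriptors have already been compressed to short vectors (the PCA step is omitted), stores them in a K-D tree, retrieves the nearest stored descriptors for a query, and evaluates a Gaussian kernel density estimate on those neighbours. A plain density threshold stands in for the MAP decision, which is not reproduced here. All function names, the bandwidth, and the threshold are illustrative assumptions.

```python
import math

def build_kdtree(points, depth=0):
    """Build a K-D tree as nested tuples (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                 # median point becomes the node
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_nearest(node, query, k, best=None):
    """Return the k nearest stored points as sorted (squared distance, point) pairs."""
    if best is None:
        best = []
    if node is None:
        return best
    point, axis, left, right = node
    best.append((sq_dist(point, query), point))
    best.sort(key=lambda t: t[0])
    del best[k:]                           # keep only the k best so far
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    k_nearest(near, query, k, best)
    # The far side can only help if the splitting plane is closer than the k-th best.
    if len(best) < k or diff * diff < best[-1][0]:
        k_nearest(far, query, k, best)
    return best

def kernel_density(query, neighbours, n_total, bandwidth):
    """Gaussian KDE at `query`, truncated to the retrieved neighbours (an approximation)."""
    d = len(query)
    norm = n_total * (2.0 * math.pi * bandwidth ** 2) ** (d / 2.0)
    return sum(math.exp(-sq / (2.0 * bandwidth ** 2)) for sq, _ in neighbours) / norm

def is_irregular(tree, query, n_total, k=5, bandwidth=0.5, threshold=1e-3):
    """Flag a query whose estimated density under the database falls below a threshold.
    The threshold is a hypothetical stand-in for the MAP decision."""
    neighbours = k_nearest(tree, query, k)
    return kernel_density(query, neighbours, n_total, bandwidth) < threshold

# Toy database: a regular 3-D grid standing in for compressed descriptors of normal activity.
coords = (-1.0, -0.5, 0.0, 0.5, 1.0)
database = [(x, y, z) for x in coords for y in coords for z in coords]
tree = build_kdtree(database)
print(is_irregular(tree, (0.1, -0.2, 0.0), len(database)))  # query close to stored descriptors
print(is_irregular(tree, (8.0, 8.0, 8.0), len(database)))   # query far from every stored descriptor
```

Restricting the density estimate to the retrieved neighbours is what makes the tree useful: descriptors far from the query contribute almost nothing to a Gaussian kernel sum, so only a few stored points need to be visited per query.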
2. FEATURE VECTORS
In this paper, we process grey-level videos for simplicity. Color videos are converted to grey-level before processing. For each frame in a video, we extract objects of
Copyright is held by the author/owner(s). MM’06, October 23–27, 2006, Santa Barbara, California, USA. ACM 1-59593-447-2/06/0010.
[Fig. 1: system flowchart; box labels recovered from the figure. System training (offline): Database, Feature Extraction, Database Blocks, PCA, Compressed Features, Principal Bases, Kernel Density Estimation (PDF of Normality), K-D Trees. Activity detection (online): Input Videos, Feature Extraction, Input Blocks, Subspace Projection, Compressed Features, MAP Estimation, Regular? (No: Irregular Activity).]
3.1 Estimation of PDFs