ADAPTIVE TEXTURE-COLOR BASED BACKGROUND SUBTRACTION FOR VIDEO SURVEILLANCE

Teck Wee Chua, Yue Wang, Karianto Leman
Institute for Infocomm Research, A*STAR (Agency for Science, Technology and Research), 1 Fusionopolis Way, Singapore
{tewchua, ywang, karianto}@i2r.a-star.edu.sg

ABSTRACT

Texture and color are two primitive forms of features that can be used to describe a scene. While conventional local binary pattern (LBP) texture based background subtraction performs well on texture-rich regions, it fails to detect uniform foreground objects in a large uniform background. As such, color information can be used to complement the texture feature. In this study, we propose to incorporate a local color feature based on the Improved Hue, Luminance, and Saturation (IHLS) color space, and we introduce an adaptive scheme that automatically adjusts the weight between texture and color similarities based on the pixel's local properties: texture uniformity and color saturation. Experiments on eight challenging sequences demonstrate the effectiveness of the proposed method compared to state-of-the-art algorithms.

Index Terms— Background subtraction, texture, color, adaptive weight, local binary pattern

1. INTRODUCTION

While all LBP-based algorithms are more invariant to local illumination changes, they are unable to detect uniform foreground objects in a large uniform background except at the objects' edges. Yao et al. [6] proposed to use both color and texture information, modeling each pixel with LBP and photometric-invariant RGB color features. However, their learning process requires clean background frames. In this paper, we propose a robust texture-color based background subtraction algorithm with an adaptive weighting scheme. The method relies on the complementary nature of texture and color features, and it dynamically adjusts the feature weights based on each pixel's local properties. Compared to the work in [1, 2, 3, 6, 7], our method achieves the most consistent and accurate foreground detection across eight dynamic scenes.

2. LOCAL TEXTURE FEATURE

We use the uniform LBP [8] to extract local texture features. Given a center pixel $(x_c, y_c)$ with gray value $g_c$ and its $P$ equally spaced neighboring pixels on a circle of radius $R$ with gray values $g_p$, the LBP is defined as:

$$\mathrm{LBP}_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c + a)\,2^p, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{1}$$

where $a$ is an offset that makes the operator more robust to noise in uniform regions [2].
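To make Eq. (1) concrete, below is a minimal C++ sketch of the LBP operator for the setting P = 8, R = 1 used in our experiments. This is an illustrative sketch, not the paper's implementation: the function name and image layout are assumptions, the unit circle is approximated by integer offsets, and the uniform-pattern mapping of [8] is omitted.

```cpp
#include <cstdint>
#include <vector>

// Minimal sketch of Eq. (1) for P = 8, R = 1 on an 8-bit grayscale
// image stored row-major. Integer offsets approximate the unit circle;
// the uniform-pattern mapping of [8] is omitted for brevity.
// The caller must ensure 1 <= x < width-1 and 1 <= y < height-1.
uint8_t lbp_8_1(const std::vector<uint8_t>& img, int width,
                int x, int y, int a) {
    static const int dx[8] = { 1, 1, 0, -1, -1, -1,  0,  1 };
    static const int dy[8] = { 0, 1, 1,  1,  0, -1, -1, -1 };
    const int gc = img[y * width + x];
    uint8_t code = 0;
    for (int p = 0; p < 8; ++p) {
        const int gp = img[(y + dy[p]) * width + (x + dx[p])];
        if (gp - gc + a >= 0)                       // s(g_p - g_c + a)
            code |= static_cast<uint8_t>(1u << p);  // weight 2^p
    }
    return code;
}
```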

3. BACKGROUND MODELING AND FOREGROUND DETECTION

The local color feature is derived from the Improved Hue, Luminance and Saturation (IHLS) color space [9]; each color channel is quantized into $Q$ levels to build a local color (LCP) histogram. Each pixel is modeled by $K$ background layers, where each layer holds an LBP histogram and an LCP histogram together with a weight $w_k$. The layers are sorted in descending order of weight, and the first $B$ layers are selected as the background:

$$B = \arg\min_b \left( \sum_{k=1}^{b} w_k > T_B \right) \tag{4}$$

where $T_B$ is a user-defined threshold; a larger $T_B$ means that more layers are selected to represent the background. The current frame histograms are compared with the selected $B$ layers. The similarity measure between two histograms $h_1$ and $h_2$ is defined as:

$$\cap(h_1, h_2) = \frac{\sum_{i=0}^{L-1} \min(h_{1i}, h_{2i})}{\min\left( \sum_{i=0}^{L-1} h_{1i},\ \sum_{i=0}^{L-1} h_{2i} \right)} \tag{5}$$

where $L$ stands for the histogram length. Since the background model is represented by both LBP and LCP histograms, the texture and color similarities are combined with the adaptive weights, and the combined similarity is compared against the threshold $T_p$ to classify the pixel as background or foreground.
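The layer selection of Eq. (4) and the histogram intersection of Eq. (5) translate directly into code. The following C++ sketch is illustrative only; it assumes the layer weights are already sorted in descending order and that the histograms are equal-length and non-empty.

```cpp
#include <algorithm>
#include <vector>

// Sketch of Eq. (4): given layer weights sorted in descending order,
// return the smallest B whose cumulative weight exceeds T_B.
int select_background_layers(const std::vector<double>& sorted_weights,
                             double T_B) {
    double cumulative = 0.0;
    for (std::size_t b = 0; b < sorted_weights.size(); ++b) {
        cumulative += sorted_weights[b];
        if (cumulative > T_B)
            return static_cast<int>(b) + 1;  // B layers model the background
    }
    return static_cast<int>(sorted_weights.size());  // fallback: all layers
}

// Sketch of Eq. (5): normalized histogram intersection of two
// equal-length, non-empty histograms; the result lies in [0, 1].
double histogram_similarity(const std::vector<double>& h1,
                            const std::vector<double>& h2) {
    double inter = 0.0, sum1 = 0.0, sum2 = 0.0;
    for (std::size_t i = 0; i < h1.size(); ++i) {
        inter += std::min(h1[i], h2[i]);
        sum1  += h1[i];
        sum2  += h2[i];
    }
    return inter / std::min(sum1, sum2);
}
```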

For a pixel classified as foreground, the model histograms are gradually absorbed toward the current observation:

$$H = \alpha H^c + (1 - \alpha) H \tag{8}$$

where $\alpha \in [0, 1]$ is the foreground decay rate and $H^c$ denotes the current pixel histograms. The foreground decay rate determines how fast a foreground pixel is absorbed into the background. For a pixel classified as background, the histograms of the best matched background layer $k'$ (the one with the highest similarity) are updated as follows:

$$H^{k'} = \beta H^c + (1 - \beta) H^{k'} \tag{9}$$

The weights of the model layers are also updated:

$$w_k = \beta Y_k + (1 - \beta) w_k, \qquad Y_k = \begin{cases} 1, & k = k' \\ 0, & \text{otherwise} \end{cases} \tag{10}$$

where $\beta \in [0, 1]$ is the background learning rate.
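A sketch of the update rules is given below, assuming the reconstruction of Eqs. (8)-(10) above; the surrounding bookkeeping (layer matching and re-sorting by weight) is omitted, and all names are illustrative.

```cpp
#include <vector>

// Sketch of Eq. (8): absorb a foreground pixel's histogram toward the
// current observation at rate alpha.
void absorb_foreground(std::vector<double>& hist,
                       const std::vector<double>& current, double alpha) {
    for (std::size_t i = 0; i < hist.size(); ++i)
        hist[i] = alpha * current[i] + (1.0 - alpha) * hist[i];
}

// Sketch of Eqs. (9)-(10): update the best matched background layer's
// histogram and all layer weights. best_k is the index k'.
void update_background(std::vector<double>& best_hist,
                       const std::vector<double>& current,
                       std::vector<double>& weights,
                       int best_k, double beta) {
    for (std::size_t i = 0; i < best_hist.size(); ++i)          // Eq. (9)
        best_hist[i] = beta * current[i] + (1.0 - beta) * best_hist[i];
    for (std::size_t k = 0; k < weights.size(); ++k) {           // Eq. (10)
        const double Yk = (static_cast<int>(k) == best_k) ? 1.0 : 0.0;
        weights[k] = beta * Yk + (1.0 - beta) * weights[k];
    }
}
```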

4. EXPERIMENT RESULTS

We evaluate the performance of the proposed method on eight challenging sequences with dynamic scenes, as summarized in Table 1. The proposed algorithm is compared with the state-of-the-art algorithms GMM [1], ACM [7], Yao [6], ViBe [3], and LBP [2].

Table 1. Summary of benchmark sequences

No.  Sequence        Summary
1    WavingTree^a    Wavering trees in contrast to the clear sky (outdoor)
2    Camouflage^a    Static foreground occludes a flickering monitor (indoor)
3    TimeOfDay^a     Light gradually brightens, simulating the moving sun (indoor)
4    Campus^b        Campus with wavering vegetation and moving shadows (outdoor)
5    Escalator^b     Moving escalators (indoor)
6    Fountain^b      Falling water with moving vehicles in the background (outdoor)
7    Lobby^b         Office lobby with lights switching on/off (indoor)
8    WaterSurface^b  Rippling water surface (outdoor)

^a Available at [10]: http://research.microsoft.com/en-us/um/people/jckrumm/WallFlower/TestImages.htm
^b Available at [7]: http://perception.i2r.a-star.edu.sg/bk_model/bk_index.html

For the Yao method, the first 100 frames of every sequence were used as training frames, and we used the implementation provided by the author¹. Likewise, the implementation of ViBe was obtained from the author's website². For GMM and ACM, we used the OpenCV implementations. We omitted morphological operations for all algorithms and used the default parameters recommended by the respective authors. The LBP algorithm and our proposed algorithm shared the same set of parameters: K = 3, P = 8, R = 1, a = 3, N = 9, M = 4, T_B = 0.9, β = 0.01. The only parameter that we varied is the threshold T_p: 0.55 (sequence 1), 0.6 (sequences 2, 8), 0.65 (sequences 4, 5, 6), and 0.7 (sequences 3, 7). The parameter α is unique to our algorithm and was set to 0.005. Each color channel of the IHLS color space was quantized to 16 levels (i.e., Q = 16). The algorithm was implemented in C++ and executed on a PC with an Intel Xeon 2.4 GHz CPU. The processing speed of the unoptimized code is about 20 fps at a frame size of 160 × 120 with the parameters above, which is modest for many applications.

As in [2, 4], the performance of the algorithms is evaluated both visually and numerically. Fig. 2 shows the qualitative comparison of the different methods. The proposed algorithm performs significantly better than the other algorithms across all test sequences. In particular, our method deals well with dynamic backgrounds such as swaying vegetation, rippling water, flickering monitors, illumination changes, and moving escalators. It can be clearly seen that the conventional LBP method has trouble with uniform objects (see the upper body in 'Camouflage'); this problem is solved in our algorithm. Moreover, the 'TimeOfDay' sequence makes it evident that for the Yao algorithm, the background models learned from the initial 100 frames are too inaccurate to cater for the gradually varying illumination in later frames. It is worth mentioning that, unlike the GMM, ACM, and ViBe algorithms, the background masks produced by LBP and our proposed algorithm contain less noise thanks to the region-based approach. This comes with a slight trade-off: most false positives occur on the edges of the foreground objects.

Table 2. Comparison of F-Measure for the test sequences

Sequence       GMM    ACM    Yao    ViBe   LBP    Our
WavingTree     0.82   0.76   0.82   0.89   0.72   0.88
Camouflage     0.88   0.88   0.96   0.88   0.69   0.96
TimeOfDay      0.43   0.45   0.15   0.40   0.64   0.85
Campus         0.43   0.32   0.49   0.32   0.52   0.73
Escalator      0.40   0.39   0.33   0.52   0.43   0.63
Fountain       0.71   0.47   0.53   0.48   0.64   0.78
Lobby          0.36   0.54   0.58   0.20   0.49   0.72
WaterSurface   0.87   0.83   0.92   0.81   0.82   0.88

* For sequences 1-3, only a single frame was used for benchmarking; the frame was selected according to [10].
* For sequences 4-8, 20 frames were selected according to [7] and the results were averaged.

Nevertheless, as observed in Fig. 2, this drawback is fairly minimal in all the test sequences. The numerical evaluation is based on the F-Measure:

$$F = \frac{2 \cdot \mathrm{Recall} \cdot \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}}, \qquad \mathrm{Recall} = \frac{N_c}{N_{gt}}, \qquad \mathrm{Precision} = \frac{N_c}{N_{det}}$$

where $N_c$ denotes the number of correctly classified foreground pixels, $N_{gt}$ the number of foreground pixels in the ground truth, and $N_{det}$ the number of detected foreground pixels. Table 2 reports the corresponding numerical evaluation, which further confirms the effectiveness of our algorithm. Our method achieves the highest F-Measure averaged across all sequences (0.80), followed by LBP (0.62), GMM (0.61), Yao (0.60), ACM (0.58), and lastly ViBe (0.56). In particular, our algorithm achieves the best F-Measure in every sequence except 'WavingTree' and 'WaterSurface', where it is only slightly lower than the ViBe and Yao methods, respectively. However, the ability of the Yao algorithm to deal with a wide range of dynamic scenes is questionable, especially for the 'TimeOfDay' sequence, which simulates the illumination changes caused by the moving sun; this scenario is very common in video surveillance applications.
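For reference, the F-Measure can be computed from a detected mask and a ground-truth mask as in the following illustrative C++ sketch.

```cpp
#include <vector>

// Sketch of the F-Measure from binary masks (1 = foreground pixel).
// Assumes the masks have equal size; names are illustrative.
double f_measure(const std::vector<int>& detected,
                 const std::vector<int>& ground_truth) {
    double Nc = 0.0, Ngt = 0.0, Ndet = 0.0;
    for (std::size_t i = 0; i < detected.size(); ++i) {
        if (ground_truth[i] == 1) Ngt += 1.0;
        if (detected[i] == 1)     Ndet += 1.0;
        if (detected[i] == 1 && ground_truth[i] == 1) Nc += 1.0;
    }
    if (Nc == 0.0) return 0.0;  // no overlap: recall = precision = 0
    const double recall = Nc / Ngt;
    const double precision = Nc / Ndet;
    return 2.0 * recall * precision / (recall + precision);
}
```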

¹ http://www.idiap.ch/~odobez/human-detection/index.html
² http://www2.ulg.ac.be/telecom/research/vibe/


Fig. 2. Qualitative comparison of the results on the selected frames.

5. CONCLUSION

A robust texture-color based background subtraction algorithm with an adaptive weighting scheme is proposed in this paper. In the experiments, our proposed method delivers more consistent and better performance than the state-of-the-art methods under comparison; in particular, it outperforms the conventional LBP algorithm by a large margin. The ability to adapt the weights of the color and texture information makes our algorithm well suited for video surveillance applications, especially those involving dynamic scenes.

References

[1] C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1999, pp. 246–252.

[2] Marko Heikkilä and Matti Pietikäinen, "A texture-based method for modeling the background and detecting moving objects," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, pp. 657–662, 2006.

[3] O. Barnich and M. Van Droogenbroeck, "ViBe: A universal background subtraction algorithm for video sequences," IEEE Trans. Image Process., vol. 20, no. 6, pp. 1709–1724, June 2011.

[4] Gengjian Xue, Li Song, Jun Sun, and Meng Wu, "Hybrid center-symmetric local pattern for dynamic background subtraction," in Proc. of IEEE International Conference on Multimedia and Expo (ICME), 2011.

[5] Shengping Zhang, Hongxun Yao, and Shaohui Liu, "Dynamic background modeling and subtraction using spatio-temporal local binary patterns," in Proc. of IEEE International Conference on Image Processing (ICIP), 2008, pp. 1556–1559.

[6] Jian Yao and J.-M. Odobez, "Multi-layer background subtraction based on color and texture," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007.

[7] Liyuan Li, Weimin Huang, Irene Y. H. Gu, and Qi Tian, "Foreground object detection from videos containing complex background," in Proc. of ACM International Conference on Multimedia (ACMMM), 2003.

[8] T. Ojala, M. Pietikäinen, and T. Mäenpää, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 971–987, July 2002.

[9] Allan Hanbury, "A 3D-polar coordinate colour representation well adapted to image analysis," in Proc. of Scandinavian Conference on Image Analysis (SCIA), 2003.

[10] K. Toyama, J. Krumm, B. Brumitt, and B. Meyers, "Wallflower: principles and practice of background maintenance," in Proc. of IEEE International Conference on Computer Vision (ICCV), 1999, vol. 1, pp. 255–261.
