The Eighth PSU Engineering Conference 22-23 APRIL 2010
Background and Foreground Segmentation from Sequence Images by Using Mixture of Gaussian Method and K-Means Clustering

Theekapun Charoenpong1, Ajaree Supasuteekul2, Chaiwat Nuthong3
1 Department of Electrical Engineering, Faculty of Engineering, Srinakharinwirot University, Nakornnayok, Thailand, 26120, E-mail: [email protected]
2 Department of Mechanical Engineering, Faculty of Engineering, Srinakharinwirot University, Nakornnayok, Thailand, 26120, E-mail: [email protected]
3 International College, King Mongkut's Institute of Technology Ladkrabang, Thailand, E-mail: [email protected]
Abstract
Background and foreground segmentation is an essential technique in vision systems, including foreground segmentation, object tracking, and video surveillance. The Mixture of Gaussians (MOG) is a popular method for modeling an adaptive background in much research; however, the clustering technique and the number of clusters differ depending on the application. In this paper, we propose a novel method for segmenting the foreground from the background image by using the Gaussian Mixture Model and the K-Means clustering technique. Sequence images are used as input. The intensity of each pixel at the same position across the image sequence is collected, and the distribution of intensities is analyzed by the Gaussian Mixture Model. Based on the intensities of the background cluster and the foreground cluster, the Gaussian distribution is divided into two clusters by the K-Means clustering technique. The intensities in the cluster with the maximum number of members are averaged, and the average intensity is used as the background model. The foreground is extracted by using the background subtraction technique. Experiments were carried out on nineteen image sequences. The results show the feasibility of the proposed method.
Keywords: Background Modeling, K-Means, Surveillance System, Gaussian Mixture Model.
1. Introduction
Background and foreground segmentation is a crucial step in many applications such as foreground segmentation, moving object tracking, and video surveillance. Many studies have addressed handling dynamic natural environments such as waving trees, rippling water, etc. [1, 2, 3, 4, 5]. With current methods, it is difficult to find a good set of parameter values, and training data collected over a long period of time is needed. In some applications, such as a video surveillance system for an Automated Teller Machine (ATM) [6], there are only a few moving objects in the scene. Therefore, it is important to develop a new technique that is simpler and uses a small amount of training data. In this paper, we improve the Mixture of Gaussians (MOG) by using the K-Means clustering method to construct an adaptive background model, based on the assumption that there are only a few moving objects in the scene.
The purpose of background and foreground segmentation is to extract the non-stationary objects. By subtracting the new frame from the background image, the moving object is extracted. Based on the period of time over which the training data are accumulated, background modeling algorithms are classified into two types: non-adaptive and adaptive. In the non-adaptive method, the initial background is selected manually from a frame in the image sequence. Without re-initialization, errors in the background accumulate over time; therefore, this method is appropriate only for short-time tracking. In the adaptive method, the training data are accumulated over time to construct the background model. The background from the adaptive method reflects the current scene better than the one from the non-adaptive method. Some adaptive background modeling methods are therefore reviewed in the following.
M. Heikkilä proposed a texture-based method [7] for modeling the background and detecting moving objects from a video sequence. Each pixel is modeled as a group of adaptive local binary pattern histograms calculated over a circular region around the pixel. This method has relatively many parameters, so it is difficult to find a good set of parameter values for each scene being modeled. K. Kim [8, 9] used a codebook model for segmenting foreground and background. Sample background values at each pixel are quantized into codebooks, which represent a compressed form of the background model for a long image sequence. This method shows effective performance when a long training image sequence is used; however, it cannot model very slow scene illumination changes over a long period. H.S. Mohamad [10] proposed a two-layer codebook to improve on the weakness of the codebook method of Refs. [8, 9] when the illumination in the scene changes very slowly over a long time.
However, that algorithm requires a large amount of training data to perform effectively. Stauffer and Grimson [11] proposed the Mixture of Gaussians method, using an on-line approximation to update the model. Each pixel is classified based on whether the Gaussian distribution that represents it most effectively is considered part of the background model. As this method is flexible enough to handle illumination variation, moving scene clutter, and dynamic natural environments, there are many improvements of this MOG model, as reviewed in [4]. K. Wagstaff [12] used the COP-KMEANS algorithm to classify the data of each pixel of a Gaussian distribution into k clusters; that method was developed for detecting road lanes from GPS data. T.W. Chen [13] defined the number of clusters of the K-means algorithm by using the Bayesian information criterion. These methods are appropriate for multi-cluster classification.
In this paper, we propose a novel method for segmenting the foreground from the background by using the Gaussian Mixture Model and the K-Means clustering technique. The input of the system is an image sequence. The intensity level of a pixel is accumulated across the image sequence, and the number of clusters of the Gaussian distribution is defined by the K-means algorithm. Based on the assumption that there are only a few moving objects in the scene, we define the number of clusters to be two: one is the background cluster, the other is the foreground cluster. The background cluster is the cluster with the maximum number of members. By averaging the intensities of the background cluster, the background is modeled. The foreground is extracted by using the background subtraction technique. The proposed method is described in detail in the following sections, beginning with the Background Modeling Method, followed by the Experimental Result and Discussion, and finally the Conclusion.

Figure 1. Background and foreground segmentation flow chart.
2. Background Modeling Method
In this section, the system concept, the Mixture of Gaussian method, and the K-means algorithm are described.

2.1 System Concept
The input of the system is an image sequence (Fig. 1.a). Images in RGB mode are converted to gray scale (Fig. 1.b). The intensity level of each pixel at coordinate (x, y), I(x, y, t), from the image sequence at time t is accumulated:

G_1, ..., G_t = { I(x, y, t) : 1 ≤ t ≤ t_0 }    (1)

where G_t is the gray-level intensity of a pixel at coordinate (x, y) at time t, as shown in Fig. 1.c. The histogram of I(x, y) is used:

h(I(x, y)) = n_d(x, y)    (2)

where n_d(x, y) is the number of intensities falling in the rank [r_{d-1}, r_d] and d is the rank number. We use MOG to analyze the distribution of G_t. The members of each cluster are classified by K-means clustering (Fig. 1.d). By averaging the intensities of the background cluster, the background is modeled (Fig. 1.e). By subtracting an image at time i (Fig. 1.f) from the background model, the foreground is extracted (Fig. 1.f).

2.2 Mixture of Gaussian Method (MOG)
The gray level of each pixel, G_t, is modeled by a mixture of K Gaussian distributions [11] as shown in Eq. 3. The probability of observing the current pixel value is

P(G_t) = Σ_{i=1}^{K} ω_{i,t} · η(G_t, μ_{i,t}, Σ_{i,t})    (3)

where K is the number of distributions and ω_{i,t} is an estimate of the weight of the i-th cluster of the Gaussian distribution at time t, with the weights summing to 1 in the proposed method.
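The accumulation of a pixel's intensity history (Eq. 1) and the mixture probability of Eq. 3 can be sketched as follows. This is a minimal one-dimensional illustration, not the paper's implementation; the intensity history, weights, means, and variances are hypothetical values chosen for the example.

```python
import math

def mixture_probability(g, weights, means, variances):
    """Eq. 3: P(G_t) as a weighted sum of 1-D Gaussian densities (Eq. 4)."""
    p = 0.0
    for w, mu, var in zip(weights, means, variances):
        # One-dimensional Gaussian probability density function
        density = math.exp(-0.5 * (g - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)
        p += w * density
    return p

# Hypothetical intensity history of one pixel over t frames (Eq. 1):
# mostly background near 100, occasional foreground near 180
history = [100, 102, 99, 101, 180, 100, 98, 179, 101, 100]

# Hypothetical K = 2 mixture: a background and a foreground cluster
weights = [0.8, 0.2]      # weights sum to 1, as in the proposed method
means = [100.0, 180.0]
variances = [4.0, 9.0]

for g in history:
    print(g, mixture_probability(g, weights, means, variances))
```

Background-like intensities score a high mixture probability dominated by the heavier cluster, which is the property the K-means step below exploits.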
Figure 3. Experimental results. The top row shows samples of the image sequences; the middle row shows the background models; the bottom row shows the moving objects, i.e., the difference in intensity between an image and the background.
μ_{i,t} is the mean value of the i-th cluster of the Gaussian distribution at time t, Σ_{i,t} is the covariance matrix of the i-th cluster of the Gaussian distribution at time t, and η is a Gaussian probability density function:

η(G_t, μ, Σ) = (1 / ((2π)^{n/2} |Σ|^{1/2})) · exp( −(1/2)(G_t − μ_t)^T Σ^{−1} (G_t − μ_t) )    (4)

K is determined by the available memory and computational power; in this proposed method, K = 2 is used. The members of each cluster are classified by the K-means algorithm as described in subsection 2.3.

2.3 K-means Algorithm
K-means clustering is an iterative algorithm for cluster analysis. It partitions n observations into k clusters in which each observation belongs to the cluster with the nearest mean. The standard algorithm [13, 14] is stated as follows:
Step 1: Begin by deciding the number of clusters K. In this paper, K is two.
Step 2: Determine the initial mean values of the K clusters.
Step 3: Assign the intensity of a pixel at time t, G_t, to the cluster with the nearest mean value, as shown in Eq. 5:

S_i^{(n)} = { I_j : ‖I_j − μ_i^{(n)}‖ ≤ ‖I_j − μ_{i*}^{(n)}‖ for all i* = 1, 2, ..., K }    (5)
where I_j is the gray level of an input pixel, and μ_i^{(n)} and S_i^{(n)} represent the mean value and the set of pixels of the i-th cluster in the n-th iteration, respectively.
Step 4: Update each new mean to be the center of the observations in its cluster:

μ_i^{(n+1)} = (1 / |S_i^{(n)}|) Σ_{I_j ∈ S_i^{(n)}} I_j    (6)

Step 5: The iteration stops when

|μ_i^{(n+1)} − μ_i^{(n)}| ≤ 1    (7)

Otherwise, return to Step 3 and increment n.
The mean of the cluster with the maximum number of members is taken as the intensity of the pixel at the corresponding coordinate of the background model.
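Steps 1-5 above can be sketched on the one-dimensional intensity history of a single pixel. This is an illustrative sketch, not the paper's code; the sample history and initial means are hypothetical, and the stopping tolerance of 1 gray level follows Eq. 7.

```python
def kmeans_1d(samples, init_means, tol=1.0, max_iter=100):
    """K-means on scalar gray levels (Steps 1-5); K = len(init_means)."""
    means = list(init_means)                      # Steps 1-2: K clusters, initial means
    clusters = [[] for _ in means]
    for _ in range(max_iter):
        # Step 3: assign each sample to the cluster with the nearest mean (Eq. 5)
        clusters = [[] for _ in means]
        for s in samples:
            i = min(range(len(means)), key=lambda i: abs(s - means[i]))
            clusters[i].append(s)
        # Step 4: recompute each mean as the centre of its cluster (Eq. 6)
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        # Step 5: stop when no mean moves by more than tol (Eq. 7)
        if all(abs(n - m) <= tol for n, m in zip(new_means, means)):
            return new_means, clusters
        means = new_means
    return means, clusters

# Hypothetical history of one pixel: mostly background (~100), some foreground (~180)
history = [100, 102, 99, 101, 180, 100, 98, 179, 101, 100]
means, clusters = kmeans_1d(history, init_means=[0.0, 255.0])

# Background model for this pixel: mean of the cluster with the most members
bg = means[max(range(len(clusters)), key=lambda i: len(clusters[i]))]
print(bg)  # prints the mean of the larger (background) cluster
```

Running this over every pixel coordinate yields the background model of Fig. 1.e.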
3. Experimental Result and Discussion
In this section, we report experiments on 19 image sequences. The color images have 240x320 pixels and a frame rate of 30 fps; 60 frames are used as training data. The algorithm was implemented with the OpenCV library, version 1.0. The results are shown in Fig. 3. The top row shows scenes containing moving objects. The background modeled by the proposed method is presented in the middle row. The bottom row shows the foreground images, which are the difference between a frame and the background image. For the foreground image in column a), there is much noise in the scene because the camera we used is sensitive to the lighting conditions. Observing the moving objects in columns b) and c), some pixels inside the moving objects are eliminated because the background intensity is close to the moving-object intensity; this data is lost during subtraction and noise elimination. The computational time of the proposed method is 3.0 seconds to model a background using an Intel Core (TM) Duo CPU 3.00 GHz processor with 2 GB of RAM.
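The background subtraction step that produces the foreground images can be sketched as below. This is a minimal illustration rather than the paper's implementation; the frames are small hypothetical gray-level arrays, and the threshold of 30 gray levels is an assumed value for suppressing sensor noise.

```python
THRESHOLD = 30  # hypothetical noise threshold in gray levels

def extract_foreground(frame, background, threshold=THRESHOLD):
    """Mark a pixel as foreground when |frame - background| exceeds threshold."""
    return [
        [255 if abs(f - b) > threshold else 0 for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]

# Hypothetical 2x2 background model and incoming frame (nested lists of intensities)
background = [[100, 100], [100, 100]]
frame = [[101, 180], [99, 175]]

mask = extract_foreground(frame, background)
print(mask)  # -> [[0, 255], [0, 255]]: 255 marks moving-object pixels
```

As the discussion above notes, foreground pixels whose intensity is close to the background fall under the threshold and are lost, which is why holes appear inside the moving objects.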
4. Conclusion
We proposed a novel method for segmenting the foreground from the background image by using the Gaussian Mixture Model and the K-Means clustering technique. Sequence images are used as input. The number of clusters is defined as two, corresponding to the background and foreground clusters. The foreground is extracted by using the background subtraction technique. Experiments were carried out on 19 image sequences, and the results show the feasibility of the proposed method. The advantages of the proposed method over existing research are as follows: first, the parameter values are simple to define, whereas in Heikkilä's method [7] it is difficult to find a good set of parameter values; second, the amount of training data used in the proposed method is less than that of the codebook methods [8, 9, 10]; and third, our method is appropriate for non-dynamic backgrounds. For further enhancement, an algorithm to improve the performance of noise elimination is necessary.
Acknowledgments
This research was financially supported by the Faculty of Engineering, Srinakharinwirot University, contract no. 029/2553.
References
[1] Zheng J., Wang Y., Nihan N., Hallenbeck E. 2006. Extracting Roadway Background Image: A Mode-Based Approach. Transportation Research Record, 1994: 82-88.
[2] Cheung S., Kamath C. 2005. Robust Background Subtraction with Foreground Validation for Urban Traffic Video. EURASIP J. Appl. Signal Proc., Special Issue on Advances in Intelligent Vision Systems: Methods and Applications, 14: 2330-2340.
[3] Elhabian S., El-Sayed K., Ahmed S. 2008. Moving Object Detection in Spatial Domain using Background Removal Technique - State-of-Art. Recent Pat. on Comput. Sci., 1(1): 32-54.
[4] Bouwmans T., El Baf F., Vachon B. 2008. Background Modeling using Mixture of Gaussians for Foreground Detection - A Survey. Recent Pat. on Comput. Sci., 1(3): 219-237.
[5] Piccardi M. 2004. Background Subtraction Techniques: A Review. Proc. of the Int. Conf. on Systems, Man and Cybernetics (SMC 2004), The Hague, The Netherlands, October 2004: 3099-3104.
[6] Jaywoo K., Younghum S., Sang M.Y., Bo G.Y. 2005. A New Video Surveillance System Employing Occluded Face Detection. 18th Int. Conf. on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems (IEA/AIE 2005), Italy, June 2005: 65-68.
[7] Heikkilä M., Pietikäinen M. 2006. A Texture-Based Method for Modelling the Background and Detecting Moving Objects. IEEE TPAMI, 28(4): 657-662.
[8] Kim K., Chalidabhongse T.H., Harwood D., Davis L. 2004. Background Modeling and Subtraction by Codebook Construction. Int. Conf. on Image Processing (ICIP 2004), October 2004, Vol. 5: 3061-3064.
[9] Kim K., Chalidabhongse T.H., Harwood D., Davis L. 2005. Real-time Foreground-Background Segmentation Using Codebook Model. Elsevier Real-Time Imaging, 11: 172-185.
[10] Mohamad H.S., Mahmood F. 2008. Real-time Background Modeling/Subtraction using Two-Layer Codebook Model. Proc. of the Int. Multiconf. of Engineers and Computer Scientists 2008 (IMECS 2008), March 2008, Hong Kong.
[11] Stauffer C., Grimson W. 1999. Adaptive Background Mixture Models for Real-time Tracking. Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 1999): 246-252.
[12] Wagstaff K., Cardie C., Rogers S., Schroedl S. 2001. Constrained K-means Clustering with Background Knowledge. Proc. of the 18th Int. Conf. on Machine Learning (ICML 2001): 577-584.
[13] Chen T.W., Hsu S.C., Chien S.Y. 2007. Robust Video Object Segmentation Based on K-means Background Clustering and Watershed in Ill-Conditioned Surveillance Systems. IEEE Int. Conf. on Multimedia and Expo, July 2007: 787-790.
[14] K-means clustering - Wikipedia, http://en.wikipedia.org/wiki/K-means_clustering