Foreground Detection using Background Subtraction with Histogram

Muhammad Nawaz, Prof John Cosmas, Awais Adnan, Muhammad Inam Ul Haq, Eman Alazawi

Abstract: In the background subtraction method, one of the core issues is how to set the threshold value precisely at run time; a well-chosen threshold overcomes several weaknesses of this approach to foreground detection. The proposed algorithm relies on motion, the key feature of any foreground detection algorithm. Because the threshold value cannot be obtained from the original motion histogram, a smoothed motion histogram is used in a systematic way to obtain it. The main focus is a better estimate of the threshold, computed dynamically from the histogram at run time. If the proposed algorithm is applied to both motion magnitude and motion direction, it can distinguish accurately between background and foreground, and between camera motion and object motion.

Index Terms: Background Subtraction, Foreground Detection, Histogram

I. INTRODUCTION

Motion plays a very important role in any foreground detection algorithm. The precision of motion detection directly affects both the performance measures and the subjective quality of the detected foreground. Motion detection approaches differ for static and dynamic backgrounds. The algorithm presented in this paper is intended for a static background: any detected change in a pixel is treated as part of a foreground object.

Based on the temporal information carried by the difference between frames, two approaches are recommended: background (frame) subtraction and techniques based on temporally adjacent frames. Background (frame) subtraction uses the first frame as the reference image; if a pixel changes significantly in a subsequent frame, it is considered part of a foreground object. This simple and useful solution is encountered in two situations. In the ideal situation there is no foreground object in the reference image, so the foreground result is good, as shown in Figure 1. In the general, real-world situation the reference image may itself contain a foreground object. When the current image then contains a new foreground object, both the new and the old foreground objects are detected; the false detection caused by the object in the reference image is known as ghosting.

[Figure 1 panels: reference frame, current frame and background subtraction result for two events, ideal and general. In the ideal case the white mask is that of the current frame; in the general case the yellow and blue foreground masks are those of the current and reference images, respectively.]

Figure 1 Background Subtraction

Time-differencing, or the technique based on temporally adjacent frames, suggests that a pixel is moving if and only if its intensity has changed significantly between the previous and the current frame:

$$|I_t(x, y) - I_{t-1}(x, y)| > Th \qquad (1)$$

In Equation 1, $(x, y)$ is the position of a pixel that belongs to a moving object, $I_t$ represents the current frame at time $t$, $I_{t-1}$ is the previous frame at time $t-1$, and $Th$ is the threshold value.

There is no doubt that this is an easy approach, but it works only if the object speed and the frame rate are known in advance. Otherwise the technique leads to two types of problems [1]: foreground aperture and ghosting. A solution to these problems is the double-difference image [2]; the algorithm is shown in pictorial form in Figure 2.
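The differencing schemes discussed so far can be illustrated with a minimal sketch. It assumes NumPy, toy one-dimensional frames and a hand-picked threshold of 50; the function names are illustrative and do not come from the paper. The double-difference image is taken here as the logical AND of two consecutive thresholded frame differences.

```python
import numpy as np


def background_subtraction(reference, current, th):
    """Pixels whose intensity differs from the reference frame by more than th."""
    return np.abs(current.astype(np.int32) - reference.astype(np.int32)) > th


def frame_difference(previous, current, th):
    """Equation 1: a pixel moves if |I_t - I_(t-1)| exceeds the threshold."""
    return np.abs(current.astype(np.int32) - previous.astype(np.int32)) > th


def double_difference(previous, current, following, th):
    """Logical AND of the thresholded differences (t-1, t) and (t, t+1)."""
    return frame_difference(previous, current, th) & frame_difference(current, following, th)


def make_frame(start, width=8, size=2):
    """Toy 1-D frame (assumption): background intensity 10, object intensity 200."""
    frame = np.full(width, 10, dtype=np.uint8)
    frame[start:start + size] = 200
    return frame


f0, f1, f2 = make_frame(1), make_frame(3), make_frame(5)

# The reference f0 already contains the object, so its old position (pixels 1, 2)
# is reported together with the new one (pixels 3, 4): ghosting.
ghosted = np.flatnonzero(background_subtraction(f0, f1, th=50))

# The double difference keeps only the object position in the middle frame.
clean = np.flatnonzero(double_difference(f0, f1, f2, th=50))
```

In this toy case `ghosted` covers both the old and the new object position, while `clean` covers only the position in the middle frame. The frames are cast to a signed type before subtraction so that unsigned 8-bit arithmetic does not wrap around.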

Figure 2: Double-Difference Image Algorithm

This approach requires thresholding the difference between the frames at times t-1 and t, and between the frames at times t and t+1; the two results are then combined by a logical AND operator. However, this approach has the drawback that it cannot find the precise position of an object in real time. Similarly, accurate motion detection becomes a problem if there is not enough texture.

Another algorithm that has been proposed for motion detection is the hybrid algorithm [1]. It is based on three-frame differencing, which finds the difference between the frames at times t and t-1 and the difference between the frames at times t and t-2, in order to determine the regions of reasonable motion and to overcome the ghosting problem. Adaptive background (frame) subtraction [3] was applied to overcome the foreground aperture problem. This background update procedure has three main problems [7]: it fails when a foreground object begins or ends its motion, it is sensitive to luminance variation, and it has difficulty with shots of variable depth, since it is mainly used for outdoor shooting with low depth-of-field images.

In order to avoid the need for prior background learning, robust pixel foreground classification was introduced in [4]. For proper pixel classification, background subtraction and frame-by-frame differencing are used jointly, as given in Algorithm 1; this is known as the joint difference algorithm [4]. Finally, the background model is selectively updated using Equation 2, as suggested in [5]:

$$B_t = (1 - \alpha) B_{t-1} + \alpha F_t \qquad (2)$$

In Equation 2, the value of $\alpha$ depends on the pixel classification, $B_{t-1}$ is the background at time $t-1$, and $F_t$ is the foreground at time $t$.

Algorithm 1 Joint difference algorithm
if |I_t(x, y) - I_{t-1}(x, y)| > Th AND |I_t(x, y) - B_{t-1}(x, y)| > Th then
    Foreground pixel;
else if |I_t(x, y) - I_{t-1}(x, y)| < Th AND |I_t(x, y) - B_{t-1}(x, y)| > Th then
    Collect pixels in blobs;
    if # ... # ... then
        Foreground pixel; // foreground aperture problem solution
    else
        Background pixel; // background object suddenly starts moving at time t
    end if
else if |I_t(x, y) - I_{t-1}(x, y)| > Th AND |I_t(x, y) - B_{t-1}(x, y)| < Th then
    Background pixel; // ghosting problem solution
else // |I_t(x, y) - I_{t-1}(x, y)| < Th AND |I_t(x, y) - B_{t-1}(x, y)| < Th
    Background pixel;
end if

Here $I_t$ is the current frame, $I_{t-1}$ the previous frame, $B_{t-1}$ the background model at time $t-1$, and $Th$ the threshold value.

The good performance of this algorithm has been confirmed in [7]. However, it comes at the cost of two disadvantages: first, the method fails completely when the foreground object stops or moves slowly; second, it is difficult to set a proper threshold value [7-8]. The thresholding problem can be addressed with clustering-based, entropy-based and object-attribute-based methods; two other families are spatial and local methods. The proposed algorithm solves it with a histogram shape-based method, which uses the peaks, valleys and curvatures of the smoothed histogram. The histogram shape-based method was selected for its higher accuracy and shorter execution time. The difficulty with the original (non-normalized) histogram is illustrated in Figure 3: the histograms of both the reference image and the current frame have so many peaks that choosing a threshold directly from them is not practical.

[Figure 3 panels: the reference image and the current frame, each with its respective non-normalized histogram. x-axis: pixel value / pixel intensity; y-axis: number of pixels / pixel frequency.]

Figure 3 Original Histogram

II. PROPOSED ALGORITHM AND EXPERIMENTAL RESULTS

1. The video is analysed frame by frame.

2. The macro block/vector size is set to 4x4 = 16 pixels: a smaller block increases the execution time, while a larger block decreases the accuracy of motion detection.

3. The search distance is set to 5. It sets the sensitivity and the possible values of motion: for example, if it is 2 the maximum motion will be around 2, and if it is 10 the motion vectors will be calculated over 10 values. During testing, the values 5, 10 and 15 were tried; the results were almost the same, while 5 was faster in execution time than the other two. Furthermore, a search distance of 5 was also found suitable for 32 bins.

4. Motion vectors are calculated in terms of magnitude and direction; however, given the scope of the topic, only magnitude is considered in this paper.

5. The motion information of the previous 5 frames is collected. A large window would ignore short motion: there are normally 12 to 25 frames per second, so a window of 5 frames measures motion over roughly 0.2 to 0.4 seconds, and a larger window would not be suitable. On the other hand, a small window of 2 or 3 frames gives too little to average. A mid-range value is therefore appropriate; in addition, different compression methods normally use 9 frames as a set, and 5 is the middle of that range. The average over the previous 5 frames is therefore used to obtain sufficient motion information, whereas a single previous frame does not carry enough information to be used effectively.

6. The motion vector magnitude is converted into 32 values for the x-axis of the histogram (i.e., 0 to 32). For decision making, the smoothed values of the histogram are considered, as shown in Figure 4.

7. Overall, the histogram of the proposed algorithm has the motion vector magnitude (from 0 to 32) on the x-axis and, on the y-axis, the number of 4x4 blocks having motion in that range (32 bins in total); it is used to decide the threshold value, as shown in Figure 4.

8. Threshold value. In the proposed algorithm, the x-axis of the histogram is divided into two equal portions: the first half runs from 0 to 16 and the second from above 16 to 32, so that the major portion of motion is covered in both halves. This is expressed mathematically in Equations 3 and 4:

$$H_1 = \{\, x : 0 \le x \le 16 \,\} \qquad (3)$$

$$H_2 = \{\, x : 16 < x \le 32 \,\} \qquad (4)$$
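Steps 2 to 7 can be sketched as follows. This is a minimal sketch under several assumptions that the paper does not state: exhaustive block matching with a sum-of-absolute-differences cost, linear quantisation of the magnitude into 32 bins, and a moving-average filter of width 3 for smoothing. All function names and the toy frames are hypothetical.

```python
import numpy as np

BLOCK = 4    # macro block size from step 2
SEARCH = 5   # search distance from step 3
WINDOW = 5   # number of previous frames averaged in step 5
BINS = 32    # histogram bins from steps 6 and 7


def block_motion_magnitude(prev, curr, block=BLOCK, search=SEARCH):
    """Exhaustive SAD block matching; one motion magnitude per block of `curr`."""
    prev = prev.astype(np.int32)
    curr = curr.astype(np.int32)
    height, width = curr.shape
    mags = np.zeros((height // block, width // block))
    for r in range(height // block):
        for c in range(width // block):
            y, x = r * block, c * block
            target = curr[y:y + block, x:x + block]
            best_sad, best_mag = None, 0.0
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > height or xx + block > width:
                        continue  # candidate window falls outside the frame
                    sad = int(np.abs(target - prev[yy:yy + block, xx:xx + block]).sum())
                    mag = float(np.hypot(dy, dx))
                    # On equal cost prefer the shorter vector, so static blocks get zero motion.
                    if best_sad is None or sad < best_sad or (sad == best_sad and mag < best_mag):
                        best_sad, best_mag = sad, mag
            mags[r, c] = best_mag
    return mags


def windowed_mean(magnitude_maps, window=WINDOW):
    """Average the block magnitudes of the most recent `window` frame pairs."""
    return np.mean(magnitude_maps[-window:], axis=0)


def motion_histogram(mags, bins=BINS, search=SEARCH):
    """Count blocks per motion level after linear quantisation (an assumption)."""
    max_mag = np.hypot(search, search)  # longest vector the search can return
    levels = np.minimum((mags / max_mag * bins).astype(int), bins - 1)
    return np.bincount(levels.ravel(), minlength=bins)


def smooth(hist, width=3):
    """Moving-average smoothing; the paper does not name its smoothing filter."""
    return np.convolve(hist, np.ones(width) / width, mode="same")


# Toy example: a bright 4x4 square moves 3 pixels to the right in a 16x16 frame.
prev_frame = np.zeros((16, 16), dtype=np.uint8)
curr_frame = np.zeros((16, 16), dtype=np.uint8)
prev_frame[4:8, 4:8] = 255
curr_frame[4:8, 7:11] = 255

mags = block_motion_magnitude(prev_frame, curr_frame)
hist = motion_histogram(windowed_mean([mags]))
smoothed = smooth(hist)
```

In the toy example the two blocks covered by the moving square receive a magnitude of 3, and the remaining 14 blocks fall into bin 0. A real implementation would likely replace the Python loops with a vectorised or library-based motion estimator; the loops are kept here only to make the search explicit.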

Figure 4 Histogram

From the histogram, the two largest peaks are selected and the valley between them is located. If the first peak is in the first half and lies in the first bin, where motion is zero, it is considered background, and the remaining part, if any, represents foreground, as can be seen in Figure 5, which shows the first frame. It is worth mentioning that at this point the camera is still, so this peak corresponds to the background part of the frame. Conversely, if the camera is in motion, the first peak will not be in the first bin: a peak in the first bin means that most of the blocks have zero motion, which is not possible when the camera is moving, as can be seen in Figure 4.
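The first-bin rule above amounts to checking where the dominant peak lies. The sketch below assumes a 32-bin histogram of block counts stored as a NumPy array; the function name and the two example histograms are hypothetical.

```python
import numpy as np


def camera_is_still(hist):
    """True when the dominant peak sits in bin 0, i.e. most blocks show zero motion."""
    return int(np.argmax(hist)) == 0


still = np.zeros(32)
still[0], still[20] = 40, 12      # most blocks static, one moving object

panning = np.zeros(32)
panning[6], panning[22] = 35, 10  # dominant peak away from bin 0
```

Here `camera_is_still(still)` holds, while `camera_is_still(panning)` does not, which matches the observation that a moving camera shifts the dominant peak away from the zero-motion bin.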

Figure 5 Histogram of 1st frame

Figure 6 shows the foreground results of the proposed algorithm on various video sequences, along with the respective original frames; the black mask marks the background, which is static at that particular instant of time.

[Figure 6 panels: original frames paired with their foreground results, for Frame No. 1, 3, 7, 20, 39, 67, 126, 183, 207, 213, 214, 229, 232, 339, 354, 359, 393, 406, 892 and 985.]

The largest peaks are taken from both halves in order to obtain a better valley and to cover the maximum motion area. The second peak, found by searching the y-axis range from 255 down to 1, represents another major movement. Here, instead of considering only the magnitude of the motion, the number of motion blocks is also considered, as this gives a better result. The valley, i.e., the minimum between the two largest peaks, is then searched; it is the final threshold value. Values below the threshold are background and values above it are foreground, as given in Equations 5 and 6.

The valley is chosen as the threshold for the following reason. Being the minimum between the two peaks (possibly zero), it does not represent any cluster. Each of the two peaks on either side corresponds to a major motion segment, since a peak means that many blocks share that motion, so the valley is the best estimate of the boundary between the two major values. The valley thus divides the feature vector into two parts with at least one peak on each side: values less than or equal to the threshold belong to the background part of the respective frame, while values above it belong to the foreground part:

$$\text{Background} = \{\, m : m \le T \,\} \qquad (5)$$

$$\text{Foreground} = \{\, m : m > T \,\} \qquad (6)$$

where $m$ is the motion magnitude of a block and $T$ is the threshold value at the valley.
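Once a threshold is available, the background/foreground split described above is a single comparison per block. The sketch below assumes the block motion levels are held in a NumPy array and that each block covers 4x4 pixels as in step 2; the function names, the sample levels and the threshold of 3 are hypothetical.

```python
import numpy as np


def classify_blocks(levels, threshold):
    """True = foreground (level > threshold), False = background (level <= threshold)."""
    return np.asarray(levels) > threshold


def to_pixel_mask(block_mask, block=4):
    """Expand a per-block mask to pixel resolution."""
    return np.repeat(np.repeat(block_mask, block, axis=0), block, axis=1)


levels = np.array([[0, 2], [9, 13]])   # hypothetical quantised block motion levels
block_mask = classify_blocks(levels, threshold=3)
pixel_mask = to_pixel_mask(block_mask)
```

With these sample values the two blocks in the top row are classified as background and the two in the bottom row as foreground, so the lower half of the 8x8 `pixel_mask` is set.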


Referring to Figure 3, peak selection proceeds as follows: first take the maximum (i.e., the largest peak), then select the second largest peak in such a way that the two peaks fall in separate portions (one peak in the first half and the other in the second half, or vice versa, for better motion estimation). The minimum value between the two peaks is the valley, and this is the threshold point of the proposed algorithm.
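The peak and valley selection can be sketched as below. The sketch assumes a smoothed 32-bin histogram indexed from 0, so that the two halves are bins 0 to 15 and bins 16 to 31, and it simplifies "second largest peak" to the largest value in the other half; the function name and the example histogram are hypothetical.

```python
import numpy as np


def valley_threshold(hist, split=16):
    """Return (first_peak, second_peak, valley) bin indices of a smoothed histogram."""
    hist = np.asarray(hist, dtype=float)
    first_peak = int(np.argmax(hist))
    # Restrict the second peak to the half that does not contain the first peak.
    if first_peak < split:
        other_half = np.arange(split, len(hist))
    else:
        other_half = np.arange(0, split)
    second_peak = int(other_half[np.argmax(hist[other_half])])
    low, high = sorted((first_peak, second_peak))
    # Valley: first minimum between the two peaks.
    valley = low + int(np.argmin(hist[low:high + 1]))
    return first_peak, second_peak, valley


hist = np.zeros(32)
hist[[0, 1, 2]] = [40, 10, 3]     # static background blocks
hist[[19, 20, 21]] = [5, 12, 6]   # a moving object
result = valley_threshold(hist)
```

For this histogram the peaks are found at bins 0 and 20 and the valley at bin 3. When the valley is a flat run of equal minima, `np.argmin` returns the first of them; choosing the middle of the run instead would be an equally plausible reading of the text.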

[Figure 6 panels (continued): original frames paired with their foreground results, for Frame No. 519, 523, 528, 535, 625, 631, 640, 668, 738, 798, 899 and 902.]

Figure 6 Foreground results

III. CONCLUSION

In this paper, we presented a simple technique that obtains the threshold value for the background subtraction method from a smoothed motion histogram, which precisely separates the foreground (motion area) from the background (static area). The proposed algorithm was tested on eight different video sequences of various kinds and produced visually good results. The experimental results make clear that the proposed algorithm works very well where the camera and the background are fixed and only the foreground objects are in motion; however, when the camera is moving or the motion of the foreground objects is very slow, the proposed algorithm may produce erroneous results.

ACKNOWLEDGMENT

We are grateful to the Institute of Management Sciences Peshawar (http://imsciences.edu.pk/), through the Higher Education Commission Islamabad, Pakistan (http://hec.gov.pk/), for financially supporting this project. We are also thankful to Abdulkadir Audu, PhD scholar, who helped in conducting this research.

REFERENCES

1. Collins, R., Lipton, A., Kanade, T., Fujiyoshi, H., Duggins, D., Tsin, Y., Tolliver, D., Enomoto, N., Hasegawa, O., Burt, P., Wixson, L.: A system for video surveillance and monitoring. Tech. rep., Carnegie Mellon University, Pittsburgh, PA (2000)

2. Kameda, Y., Minoh, M.: A human motion estimation method using 3-successive video frames. In: International Conference on Virtual Systems and Multimedia (VSMM), pp. 135–140. Gifu, Japan (1996)

3. Kanade, T., Collins, R., Lipton, A., Burt, P., Wixson, L.: Advances in cooperative multi-sensor video surveillance. In: DARPA Image Understanding Workshop, vol. I, pp. 3–24. Morgan Kaufmann (1998)

4. Migliore, D., Matteucci, M., Naccari, M.: A revaluation of frame difference in fast and robust motion detection. In: 4th ACM International Workshop on Video Surveillance and Sensor Networks (VSSN), pp. 215–218. Santa Barbara, California (2006)

5. Wren, C., Azarbayejani, A., Darrell, T., Pentland, A.: Pfinder: Real-time tracking of the human body. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 19(7), 780–785 (1997)

6. Hati, K.K., Sa, P.K., Majhi, B.: LOBS: Local background subtracter for video surveillance (2012)

7. Martínez-Martín, E.: Robust motion detection in real-life scenarios (2012)

8. Sun, S.W., Wang, Y.C.F., Huang, F., Liao, H.Y.M.: Moving foreground object detection via robust SIFT trajectories. Journal of Visual Communication and Image Representation (2012)
