Non-linear IR Scene Prediction for Range Video Surveillance

Mehmet Celenk, James Graham, and Kai-Jen Cheng
School of Electrical Engineering and Computer Science
Ohio University, Stocker Center, Athens, OH 45701 USA
{celenk, jg193404, kc134905}@ohio.edu
Abstract

This paper describes a non-linear IR (infra-red) scene prediction method for range video surveillance and navigation. A Gabor-filter bank is selected as a primary detector for any changes in a given IR range image sequence. The detected ROI (region of interest) involving arbitrary motion is fed to a non-linear Kalman filter for predicting the next scene in time-varying 3D IR video. Potential applications of this research are mainly in indoor/outdoor heat-change based range measurement, synthetic IR scene generation, rescue missions, and autonomous navigation. Experimental results reported herein show that non-linear Kalman filtering based scene prediction can perform more accurately than linear estimation of future frames in range and intensity driven sensing. The low least mean square error (LMSE), averaging about 2% using a bank of 8 Gabor filters, also demonstrates the reliability of the IR scene estimator (or predictor) developed in this work.
1. Introduction

Scene prediction in 2-D or 3-D realms has been a highly popular field of exploration during the last decade [1,2,3,4,5,6,7]. Increased attention toward such areas as unmanned navigation and guidance, surveillance, tracking, mapping, virtual world simulation, precision manufacturing, multimedia networking, animation, and rescue missions has driven research in image-processing-related fields. Two popular tools for these endeavors are the Kalman and Gabor filters. The Kalman filter (KF) is one of the most widely used methods for tracking and estimation due to its simplicity, optimality, tractability, and robustness. Roumeliotis and Bekey [8,9] have detected and extracted indoor features using dual-Kalman filtering, and have explored Bayesian estimation combined with Kalman filtering to aid in the localization of mobile robots. In [10], Ulhaas has presented a Kalman filter based optical tracking model that works reliably in real time and is capable of coping with display lags or communication delays. Campagnola et al. [11] have
1-4244-1180-7/07/$25.00 ©2007 IEEE
presented a laser scanning based method for nonlinear optical imaging of live cells using the second harmonic optical phenomenon. In this work, we predict the changes in a nonlinear system indirectly using the Kalman filter. Since a direct analytical approach to nonlinear system modeling can be intractable, the most common method for handling this situation is to use an extended Kalman filter (EKF) [12,13] as an estimator. The EKF retains the basic structure of the Kalman filter by linearizing the nonlinear system parameters so that the traditional linear Kalman filter equations can be applied [14]. Gabor filtering is a popular method for filtering based on texture differences within an image [15,16], with applications such as texture segmentation, document analysis, edge detection, retina identification, fingerprint processing [17], and image coding and representation. This is because the Gabor filter can separate information of different textures. Gabor filters achieve the minimum bound on simultaneous localization in the spatial and frequency domains [18]. Moreover, an extension of the Gabor filter, called the Gabor filter bank, comprises a set of filters with different frequencies and orientations that provides a complete cover of the spatial-frequency domain, yielding versatile information for texture description. Montillo et al.'s work [19] is based on a Gabor filter bank in which the filter's orientation and the radial frequency and angle of the filter's sinusoidal grating are adjusted to extract information about the deformation of tissue. Another example is Macenko et al.'s work [20], which provides both a good explanation of the approach to using Gabor filtering and a highly relevant practical application in lesion detection within the brain. A slightly older work by Yang and Lishman [21] describes the process of using Gabor filtering to help detect changes in terrain.
In this work, the main objective is the prediction of frame-to-frame movement in a given IR range image sequence. Estimation is carried out by using a bank of Gabor filters to determine the region of interest (ROI), followed by the application of a nonlinear Kalman filter to the ROI to predict movement. This approach differs from other work in that it uses Gabor-based ROI detection and Kalman filtering to predict future IR images.
This has not been attempted previously to the best of our knowledge. Gabor filtering is used mainly to detect shapes via texture differences and is not typically applied to an estimation problem such as this one. Kalman filtering, while popular, is likewise not typically used in image prediction; to our knowledge, this is the first time that the extended Kalman filter has been applied to IR image prediction. The following sections describe the approach summarized above in detail. The experimental setup and the results obtained are then presented. Conclusions and future study are given at the end.
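At a high level, the pipeline summarized above amounts to "detect the changing region with a Gabor bank, then predict only that region with a Kalman filter." The following toy sketch illustrates that structure; the helper names and the trivial stand-in bodies are our own illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def detect_roi(frame, background):
    """Stand-in for the Gabor-bank saliency ROI detector (Sec. 2).
    Here a toy absolute-difference change mask is used instead."""
    return np.abs(frame - background) > 10.0

def predict_roi(roi_pixels):
    """Stand-in for the extended-Kalman-filter predictor (Sec. 2).
    Here a toy constant (zero-velocity) model is used instead."""
    return roi_pixels

def predict_next_frame(frame, background):
    """Predict the next frame: only the ROI pixels are estimated,
    the rest of the frame is carried over unchanged."""
    mask = detect_roi(frame, background)
    pred = frame.copy()
    pred[mask] = predict_roi(frame[mask])
    return pred
```

The point of the structure is that the expensive per-pixel filtering is confined to the ROI, which is what later yields the reported speedup over full-frame Kalman filtering.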
2. Description of Overall Approach
This paper takes a new approach to the scene prediction problem by considering IR images and applying both Kalman- and Gabor-based filtering. Prediction of an entire image is not necessarily useful or even practical. Because of this, Gabor filtering was chosen to help determine an ROI in which we generate prediction results. The basic algorithm flow is shown in Figure 1.

Figure 1. Algorithm flowchart.

In the proposed system, the current frame is fed to a Gabor filter bank that calculates the output images for a series of Gabor filters with varying orientations. The filter bank, which has a set of frequencies and orientations, covers the spatial-frequency space and captures as much shape information as possible. From these Gabor output images, a combined Gabor saliency image is produced. The saliency image is used to determine the region of interest (ROI). Once the ROI is specified, the relevant portion of the image is passed on to the Kalman filter.

Before continuing with the description of the method, it is necessary to define our implementation of the Gabor filter. Here, the Gabor function is taken as defined in [22] by

    h(x, y) = g′(x, y) · e^{j(ω_x x + ω_y y)}                      (1)

    g′(x, y) = (1 / (S_x S_y)) · g(x′/S_x, y′/S_y)                 (2)

    g(x, y) = (1 / 2π) · e^{-(x² + y²)/2}                          (3)

    x′ = x cos θ + y sin θ
    y′ = -x sin θ + y cos θ                                        (4)

That is, g′(x, y) is a version of the Gaussian g(x, y) that has been spatially scaled and rotated by θ. S_x and S_y are the variances along the x and y axes and are equal to λσ and σ, respectively. The parameter σ is the spatial scaling, which controls the width of the filter impulse response. The value λ defines the aspect ratio of the filter, which determines its directionality. The orientation angle θ is usually chosen to be in the direction of the filter's center circular frequency as θ = tan⁻¹(ω_x / ω_y). The values ω_x and ω_y are functions of the filter's center circular frequency ω given by ω = √(ω_x² + ω_y²). The filter can also be modified to include an offset value. In the frequency domain, the Gabor filter can be defined as

    H(u, v) = e^{-(σ²/2)[λ²(u′ - ω′_x)² + (v′ - ω′_y)²]}           (5)

    u′ = u cos θ + v sin θ
    v′ = -u sin θ + v cos θ                                        (6)

where ω′_x and ω′_y are the corresponding versions of ω_x and ω_y rotated by θ.

The standard linear Kalman filter is limited to linear estimation. However, most systems are non-linear; the non-linearity can be present in the process model, the observation model, or both. Hence, a variation of the Kalman filter, referred to as the extended Kalman filter, was considered to overcome the related shortcomings of linear Kalman filtering. The standard Kalman estimator relies on a linear prediction model that can be expressed as [23]

    x_{k+1} = Φ_k x_k + w_k                                        (7)

    w_k = [..., pn_{(i-1),j}, pn_{i,j}, pn_{(i+1),j}, ..., vn_{(i-1),j}, vn_{i,j}, vn_{(i+1),j}, ...]^T   (8)
    Q_k = E[w_k w_k^T]                                             (9)

where the state vector x_k will be of length 3 × m × n in the form

    x_k = [..., x_{(i-1),j}, x_{i,j}, x_{(i+1),j}, ..., ẋ_{(i-1),j}, ẋ_{i,j}, ẋ_{(i+1),j}, ..., ẍ_{(i-1),j}, ẍ_{i,j}, ẍ_{(i+1),j}, ...]^T   (10)

Here, m × n is the size of the window to which the filter is being applied, x_k is the current value in the frame, ẋ_k is the current velocity estimate, ẍ_k is the current acceleration estimate, Φ_k is the Kalman filter state transition matrix, w_k is the process noise vector, and Q_k is the covariance matrix of the process noise vector. The values pn_{ij} and vn_{ij} are the pixel range value process noise and the axial range velocity noise, respectively. The indices k and k+1 represent the current and future state values, and T denotes matrix transposition. The second set of Kalman equations dictates how the current state is related to the current measurement made by the range camera. They are given by

    z_k = H_k x_k + v_k                                            (11)

    v_k = [..., σ²_{(i-1),j}, σ²_{i,j}, σ²_{(i+1),j}, ...]^T       (12)

    R_k = E[v_k v_k^T]                                             (13)

    H_k = [I | O]                                                  (14)
where z_k is a vector representation of the measurement at time t_k, the vector v_k is the measurement error vector, and R_k is the measurement error covariance matrix. H_k is the state relationship matrix showing how the previously predicted value x_k is related to the measured value z_k. The extended Kalman filter, however, represents the state transition and measurement models not as linear functions but as a differentiable set of functions:
    x_{k+1} = f(x_k, w_k)                                          (15)

    z_k = h(x_k, v_k)                                              (16)

The function f can be used to compute the predicted state from the previous estimate, and similarly the function h can be used to compute the predicted measurement from the predicted state. However, f and h cannot be applied to the covariance directly. Instead, a matrix of partial derivatives (the Jacobian) is computed. At each time step, the Jacobian is evaluated at the current predicted state. These Jacobian matrices can then be used in the Kalman filter equations. This process essentially linearizes the non-linear function piecewise around the current estimate.

The linear Kalman filter is typically implemented as shown in Figure 2. The process and measurement noise vectors are assumed to remain constant through the Kalman prediction. The error covariance vector is continuously updated through the Kalman filtering process with an initial estimate of zero. The filter is given the previous error covariance estimate P⁻_k as the covariance of the current state. If the velocity is determined to be either above or below a selected threshold, the filter estimate is considered invalid and iteration is terminated. The filter provides estimates of the error within the local m × n window. This covariance can be examined to help determine whether the current window estimate should be used. Furthermore, tracking or navigation can benefit from determining which part of the image is moving in free space by examining the covariance and the local velocity estimates.

Figure 2. Kalman linear estimator (from [23]).

The extended Kalman filter uses essentially the same procedure; however, the equations are represented slightly differently:

    K_k = P⁻_k H_k^T (H_k P⁻_k H_k^T + R_k)⁻¹                      (17)

    x̂_k = x̂⁻_k + K_k (z_k - h(x̂⁻_k, 0))                           (18)

    P_k = (I - K_k H_k) P⁻_k                                       (19)

    x̂_{k+1} = Φ_k x̂_k                                              (20)

    P⁻_{k+1} = Φ_k P_k Φ_k^T + Q_k                                 (21)
with H_k and Φ_k representing the derivatives of equations (15) and (16) with respect to x, instead of the static values they held for the linear Kalman filter.
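For concreteness, one cycle of the extended Kalman filter described above can be sketched as follows. This is a toy NumPy sketch under our own simplifications (additive noise, finite-difference Jacobians in place of analytic ones), not the paper's implementation:

```python
import numpy as np

def num_jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian of f at x: the linearization step
    that replaces the static matrices of the linear filter."""
    fx = f(x)
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (f(x + dx) - fx) / eps
    return J

def ekf_step(f, h, x_hat, P, z, Q, R):
    """One EKF cycle: gain, state and covariance update, then
    prediction of the next state and covariance."""
    F = num_jacobian(f, x_hat)    # Phi_k, the Jacobian of f
    Hj = num_jacobian(h, x_hat)   # H_k, the Jacobian of h
    # Kalman gain
    K = P @ Hj.T @ np.linalg.inv(Hj @ P @ Hj.T + R)
    # state update with the nonlinear measurement model
    x_upd = x_hat + K @ (z - h(x_hat))
    # covariance update
    P_upd = (np.eye(x_hat.size) - K @ Hj) @ P
    # predict the next state and covariance
    x_next = f(x_upd)
    P_next = F @ P_upd @ F.T + Q
    return x_next, P_next
```

With linear f and h this reduces to the ordinary Kalman filter, which is the sense in which the EKF "retains the basic structure" of the linear filter.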
3. Experimental Results

We used a data set from the Terravic Research Motion IR Database [24]. The video provided in the database is in 2-D intensity format. Since depth information is not provided, the Kalman filter models pixel intensity rather than range.
The 2-D scene data used for this experiment are from a static surveillance camera, meaning the camera's position is fixed: only the scene contents move while the camera remains stationary. The images are in JPEG format with a resolution of 240x320, collected at 30 fps. Figure 3 shows a sample set of images from the selected database, which involves the motion of a person from left to right within the screen window.
Figure 3. Database example.

Figure 4. Gabor filter results for frame #215.
We have adopted the same formulation for the Gabor filter as Macenko [20], which specifies the Gabor filter variables to be S_x = 1, S_y = 1, and θ = {0, π/4, ..., π, ..., 7π/4}. Eight orientations for the Gabor bank are adopted, since more would not provide any significant improvement and fewer would likely not discern enough about the image. Upon passing the image through the filter bank, a combined saliency image is created. The saliency image has the background saliency image subtracted to leave only the correct region of interest (ROI). The resulting ROI image is then passed through a noise reduction and blocking filter to remove "specks" that result from small background changes and to "block out" the ROI to give it slightly better coverage. Figure 4 illustrates the process of determining the ROI. Image (a) shows the Gabor saliency image for the background, while image (b) illustrates the Gabor saliency for the current frame. The next image, (c), depicts the result of the noise removal and black-and-white conversion of the previous image. The final image, (d), shows the results of "blocking" the ROI. The blocking ensures that most of the pixels immediately surrounding the region of interest also get included in the Kalman filter estimations. Next, the superimposed frame containing the selected ROI is passed on to the Kalman filter, which is then applied to the region of interest.
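The Gabor function of equations (1)-(4) and the eight-orientation bank just described can be sketched as follows. This is a minimal NumPy sketch; the kernel size, center frequency, FFT-based convolution, and relative threshold are our own assumptions, not the paper's implementation:

```python
import numpy as np

def gabor_kernel(theta, freq=0.5, Sx=1.0, Sy=1.0, size=15):
    """Gabor function: a scaled, rotated Gaussian envelope modulated
    by a complex sinusoid with center frequency `freq` along `theta`."""
    wx, wy = freq * np.cos(theta), freq * np.sin(theta)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # rotated coordinates
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    # scaled, rotated Gaussian envelope
    g = np.exp(-0.5 * ((xr / Sx) ** 2 + (yr / Sy) ** 2)) / (2 * np.pi * Sx * Sy)
    # complex sinusoidal modulation
    return g * np.exp(1j * (wx * x + wy * y))

def bank_saliency(frame):
    """Combined saliency: sum of response magnitudes over the eight
    orientations theta = 0, pi/4, ..., 7*pi/4 (FFT convolution)."""
    sal = np.zeros(frame.shape)
    F = np.fft.fft2(frame)
    for k in range(8):
        K = np.fft.fft2(gabor_kernel(k * np.pi / 4), s=frame.shape)
        sal += np.abs(np.fft.ifft2(F * K))
    return sal

def roi_mask(frame, background, rel_thresh=0.2):
    """Background-subtracted saliency, thresholded to an ROI mask."""
    diff = bank_saliency(frame) - bank_saliency(background)
    return diff > rel_thresh * np.abs(diff).max()
```

A practical implementation would add the noise-reduction ("speck" removal) and blocking steps described above before handing the mask to the Kalman filter.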
To alleviate computation time issues and better handle the uneven borders of the ROI, the filter is run on 3x3 subsets, or blocks, of the total image. A 3x3 pixel filter is run for each frame, and the predicted results are then combined to create a full scene image array. In experimentation, the pixel noise value (pn_ij) is assumed to be zero, and the velocity noise (vn_ij) is taken to be 1 m/s. The state transition matrix Φ_k is adjusted for a 3x3 window based Kalman filter realization as the 27x27 matrix

    Φ_k = [ I  I  I ]
          [ 0  I  I ]
          [ 0  0  I ]

where I is a 9x9 identity matrix and 0 is a 9x9 zero matrix. The noise variance (σ²_ij) is considered as white Gaussian noise with a value of 0.25. The prediction error for the described method is calculated in the least mean square sense, which is given by
    e = (1 / (M × N)) Σ_i Σ_j ( f̂_{k+1}(i, j) - f_{k+1}(i, j) )²    (22)
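Equation (22) is the squared frame difference normalized by the M × N frame size; a minimal sketch:

```python
import numpy as np

def lmse(pred, actual):
    """Least mean square error between a predicted frame f_hat and the
    actual frame f, normalized by the frame size M x N (Eq. 22)."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    M, N = actual.shape
    return np.sum((pred - actual) ** 2) / (M * N)
```

To quote the error as a percentage, as in the figures below, one would additionally normalize by the squared intensity range of the frames; the exact normalization used in the paper is not stated, so this sketch reports the raw value.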
Figure 5 gives an example of the prediction results for two original and predicted frame pairs. The measured frames represent the actual frames, while each predicted frame is the frame predicted from the previous cycle. The images show that the results are of good quality aside from a few barely noticeable discrepancies. One such discrepancy can be seen in the lower right of Predicted Frame #230, just above the "3" in the "300." The discrepancy is not really the fault of the prediction algorithm; it occurred because that section of the image was not part of the region of interest but instead part of the background and, thus, was not tracked.
Figure 5. Prediction results.

Figure 6. LMSE results for the linear Kalman filter.

Figure 6 depicts the LMSE results of a linear Kalman filter on a series of frames depicting the person from Figures 4 and 5 moving from outside the left side of the screen toward the center. The fluctuation of the results from less than 0.5% to beyond 6% in some places is likely a result of changes in the position of the region of interest. As the ROI moves into a previously unhandled region, the filter has to update itself to that region; once it does, the prediction immediately performs better. Another factor contributing to the error is the fluctuating changes in the background, which are not handled by the filter and tend to come and go at regular intervals within the selected database. Figure 7 presents the LMSE results for the same series of frames as in Figure 6, but using the extended nonlinear Kalman filter. As can be seen from comparing the two graphs, the nonlinear filter performs better. However, it retains the same fluctuating nature as the linear Kalman filter, because most of the problems behind the high error rate of the linear Kalman filter remain for the extended Kalman filter. The issue with the background is independent of the ROI and is, therefore, not affected by the filtering. Furthermore, it would be computationally inefficient and counterproductive to ignore the ROI and apply the filter to the background as well. In terms of computational efficiency, the nonlinear Kalman filter combined with the Gabor ROI detection operated about twice as fast as the linear Kalman filtering method, at 348 seconds vs. 761 seconds, respectively.
Figure 7. LMSE results for nonlinear Kalman filter.
4. Conclusions

The presented research study has the objective of predicting scenes as the image changes. Unlike previous work, this approach has made use of a bank of eight Gabor filters to select the ROI and an extended Kalman filter implementation to perform the image prediction. The reported experimental results demonstrate that nonlinear Kalman filtering based scene prediction performs better than linear Kalman prediction and can estimate the next frames to a reasonable degree of accuracy. Furthermore, the use of Gabor filtering has been shown to significantly improve computation time by limiting the regions to which Kalman filtering is applied. The low LMSE of the nonlinear filter, on average about half the error of the linear filter, demonstrates the reliability and robustness of this approach to IR data processing. The presented results are within the allowable error range of the low-cost cameras used for the experimentation. Potential areas for future research lie in devising a better ROI detector and improving the Kalman filtering algorithm. The magnitude of the prediction error indicates that further work is needed to improve the performance of the proposed Kalman filter model.
5. References

[1] T. Oggier, et al., "An all-solid-state optical range camera for 3D real-time imaging with sub-centimeter depth resolution (SwissRanger)," Proc. SPIE, Vol. 5249-65, St. Etienne, France, 2003.
[2] J. Kim and J. W. Woods, "3-D Kalman filter for image motion estimation," IEEE Trans. Image Processing, Vol. 7, No. 1, Jan. 1998, pp. 42-52.
[3] M. Irani and P. Anandan, "A unified approach to moving object detection in 2D and 3D scenes," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 20, No. 6, June 1998, pp. 577-589.
[4] A. Hoover, D. Goldgof, and K. Bowyer, "Egomotion estimation of a range camera using the space envelope," IEEE Trans. Systems, Man, and Cybernetics B, Vol. 33, No. 4, Aug. 2003, pp. 717-721.
[5] G. Vosselman and S. Dijkman, "3D building model reconstruction from point clouds and ground planes," Proc. Land Surface Mapping and Characterization Using Laser Altimetry, Annapolis, Oct. 22-24, 2001, pp. 37-43.
[6] H. S. Sawhney, Y. Guo, and R. Kumar, "Independent motion detection in 3D scenes," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 22, No. 10, Oct. 2000, pp. 1191-1199.
[7] S. Kim and I. S. Kweon, "3D target recognition using cooperative feature map binding under Markov chain Monte Carlo," Pattern Recognition Letters, Vol. 27, No. 7, May 2006, pp. 811-821.
[8] S. I. Roumeliotis and G. A. Bekey, "SEGMENTS: A layered, dual-Kalman filter algorithm for indoor feature extraction," Proc. IROS 2000, Vol. 1, Oct. 31-Nov. 5, 2000, pp. 454-641.
[9] S. I. Roumeliotis and G. A. Bekey, "Bayesian estimation and Kalman filtering: A unified framework for mobile robot localization," Proc. IEEE Int. Conf. on Robotics and Automation, Vol. 3, Apr. 24-28, 2000, pp. 2985-2992.
[10] K. Dorfmüller-Ulhaas, "Robust optical user motion tracking using a Kalman filter," University of Augsburg, May 2003.
[11] P. J. Campagnola, et al., "High-resolution nonlinear optical imaging of live cells by second harmonic generation," Biophysical Journal, Vol. 77, No. 6, 1999, pp. 3341-3349.
[12] N. P. Angelo and V. Haertel, "On the application of Gabor filtering in supervised image classification," March 2003.
[13] H. W. Sorenson, editor, Kalman Filtering: Theory and Application, IEEE Press, 1985.
[14] S. J. Julier and J. K. Uhlmann, "A new extension of the Kalman filter to nonlinear systems," Proc. AeroSense: 11th Int. Symp. on Aerospace/Defense Sensing, Simulation and Controls, 1997.
[15] D. M. Tsai, S. K. Wu, and M. C. Chen, "Optimal Gabor filter design for texture segmentation using stochastic optimization," Image and Vision Computing, Vol. 19, April 2001, pp. 299-316.
[16] T. P. Weldon, W. E. Higgins, and D. F. Dunn, "Efficient Gabor filter design for texture segmentation," Pattern Recognition, Vol. 29, Dec. 1996, pp. 2005-2015.
[17] M. T. Leung, W. E. Engler, and P. Frank, "Fingerprint image processing using neural network," Proc. IEEE Region 10 Conf. on Computer and Communication Systems, Hong Kong, Vol. 2, Sept. 24-27, 1990, pp. 582-586.
[18] R. Manthalkar, P. K. Biswas, and B. N. Chatterji, "Rotation and scale invariant texture classification using Gabor wavelets," Proc. 2nd Texture Workshop, Copenhagen, Denmark, 2002.
[19] A. Montillo, D. Metaxas, and L. Axel, "Extracting tissue deformation using Gabor filter banks," Proc. SPIE Medical Imaging 2004, Vol. 5369, April 2004, pp. 1-9.
[20] M. Macenko, R. Luo, M. Celenk, L. Ma, and Q. Zhou, "Lesion detection using Gabor-based saliency field mapping," Medical Imaging 2007, Proc. SPIE Vol. 6512, Feb. 17-24, 2007, San Diego, CA.
[21] F. Yang and R. Lishman, "Land cover change detection using Gabor filter texture," Proc. 3rd Int. Workshop on Texture Analysis and Synthesis, Nice, France, 2003.
[22] A. C. Bovik, "Analysis of multichannel narrow-band filters for image texture segmentation," IEEE Trans. Signal Processing, Vol. 39, No. 9, 1991, pp. 2025-2044.
[23] R. G. Brown and P. Y. C. Hwang, Introduction to Random Signals and Applied Kalman Filtering, John Wiley & Sons, New York, 1992.
[24] Terravic Research Motion IR Database, IRW08.