Concealed object detection and segmentation over Millimetric Waves Images

Oriol Martínez, Luis Ferraz, Xavier Binefa
Universitat Pompeu Fabra
CMTech group, DTIC, UPF, C/Tanger, 122-140, 08018 Barcelona, Spain

Ignacio Gómez, Carlos Dorronsoro
Alfa Imaging S.A., C/General Pardias 91, 28006 Madrid, Spain

Abstract
Millimetric Waves Images (MMW) are becoming increasingly useful for the passive detection of threatening objects made of plastic substances, such as explosives or sharp/cutting weapons. Our goal is to segment the body and the concealed threats while dealing with the inherent problems of this type of image: noise, low resolution and intensity inhomogeneity. In this work we present the results of applying the Iterative Steering Kernel Regression (ISKR) method for denoising and Local Binary Fitting (LBF) for segmentation in order to correctly segment bodies and threats over a database of 29 MMW images. These methods, which had not previously been tested in the literature on this type of image, are compared with previously applied state-of-the-art methods. Experimental results show that the proposed methods improve on the results previously obtained for MMW images.

1. Introduction
Computer Vision (CV) can be understood as the science and technology that aims to give cognitive abilities to artificial vision systems. Over the years, many sensors have been developed to explore a wider range of frequencies of the electromagnetic spectrum. These sensors go beyond our perceptual capabilities and are able to acquire new information. In this paper, we explore a specific waveband that can pass through obstacles such as cloth or polymers, as well as smoke, dust or fog, which makes it useful for security and surveillance applications [1]. As shown in Figure 1, we use the millimeter-wave (MMW) region, which goes from 30 to 300 GHz and corresponds to wavelengths between 1 cm and 1 mm [2] [3].
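The correspondence between the frequency band and the wavelength range follows from the free-space relation λ = c / f. The short sketch below simply evaluates that relation at the two band limits and at 94 GHz, the operating frequency of the imager described in Section 1.1; the function name is illustrative only.

```python
C = 299_792_458.0  # speed of light in vacuum, in m/s


def wavelength_mm(freq_ghz: float) -> float:
    """Free-space wavelength in millimetres for a frequency given in GHz."""
    return C / (freq_ghz * 1e9) * 1e3


# Band limits of the MMW region and the 94 GHz operating frequency.
for f in (30.0, 94.0, 300.0):
    print(f"{f:5.0f} GHz -> {wavelength_mm(f):.2f} mm")
```

The band limits map to roughly 10 mm and 1 mm, and 94 GHz corresponds to a wavelength of about 3.2 mm.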

Figure 1. Electromagnetic Spectrum. In green, the used MMW region.

1.1. Scope
The main goal of this article is to achieve, by means of an automatic and non-parametric process, the segmentation of concealed plastic and metallic threats on a human body in a database of images obtained with a passive MMW imaging system. The MMW imager employed in this work, developed by Alfa Imaging S.A., is a raster sensor operating indoors at 94 GHz with the subject posing at two different distances: 5 m and 10 m (see Figure 2). The use of this type of sensor, especially for stand-off detection, implies inherent limitations:
• Low resolution: the size of the obtained images is 75x52 pixels.
• Low signal levels: the use of passive radiometry implies very low signal levels and a low Signal to Noise Ratio (SNR).
• Intensity inhomogeneity: MMW images are obtained by detecting the radiometric temperature of the scene. Because the radiometric temperature of an object depends on the reflection of the sky temperature plus its own thermal emission, objects may suffer from intensity inhomogeneity.

Figure 2. Example of images used in real scenarios. From left to right: subject posing at 5m wearing a simulant of powder explosive, subject posing at 10m wearing TNT and subject posing at 5m wearing an Improvised Explosive Device (IED).

To deal with the above constraints, we have divided our work into two phases: image denoising and body/threat segmentation. In the first one, described in Section 2, we apply two state-of-the-art denoising methods whose main property is the preservation of small details: Non Local Means (NL-means) [4] and ISKR [13]. In the second one, described in Section 3, we study the segmentation of concealed plastic threats on a human body with a novel local region-based Active Contour method: LBF [8]. Finally, in Section 4, we present a comparison of the results obtained over MMW images.

1.2. State of the art in MMW
Previous work has addressed the problem of image denoising and segmentation of concealed metallic (not plastic) threats in MMW imaging systems. In [5], two segmentation methods are presented: k-means and Active Shape Models (ASM). In the first case, the background, the body and the metallic threat are segmented by applying the k-means clustering algorithm to the intensity histogram. The reported results show unconnected segmentations of the body and the need to introduce heuristics in order to determine the number of clusters (depending on whether a metallic threat is present or not). Each segment is then classified by maximum-likelihood estimation with the pre-computed Probability Density Functions (PDFs) of the background and the foreground. In order to improve the connectivity of the body segmentation, an ASM is applied to the results obtained with k-means. Their conclusions show that modelling the full range of body positions with an ASM is a very complex task (too many parameters are needed) and may not be achievable with this method without a sub-division of the model space.
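To make the intensity-clustering idea concrete, the following is a minimal sketch of k-means applied to scalar pixel intensities. It is not the implementation used in [5]: the evenly spaced initialization, the function name and the toy intensity values are assumptions made only for illustration. Note that the number of clusters k has to be supplied by the caller, which is precisely the heuristic dependence discussed above.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalar intensities; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Assumption: centers start evenly spaced over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each intensity goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical intensities forming three well-separated groups.
pixels = [10, 12, 11, 9, 50, 52, 49, 51, 90, 92]
centers, labels = kmeans_1d(pixels, k=3)
```

Because the clustering only looks at intensity values and ignores pixel positions, nothing forces pixels with the same label to form a connected region.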

In [6], which is an extension of the previous work, different weighted mixture models are estimated depending on the scenario (whether a metallic threat is present or not). In the classification stage, the Expectation Maximization (EM) algorithm is applied. The reported results show an improvement in segmentation accuracy but still present the same problems as before: unconnected segmentation and dependence on heuristics. The work of X. Shen et al. [12] introduces the application of denoising methods before the segmentation of concealed objects in Terahertz (THz) images. Although these images have different properties from MMW ones [3] [2], the goals and problems addressed in that work are very close to ours. For the denoising process, NL-means [4] and Anisotropic Diffusion [9] are applied. The main strength of these two methods is that they aim to preserve image structures. Anisotropic Diffusion does this by studying local values of the image gradient, encouraging diffusion in smooth regions (where absolute gradient values are small) and stopping it at discontinuities (where absolute gradient values are large). NL-means, on the other hand, takes advantage of the high degree of redundancy that exists in natural images, using similar patches to perform its denoising.
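The gradient-dependent behaviour of anisotropic diffusion can be illustrated with a one-dimensional sketch. The code below is a simplified explicit scheme in the spirit of Perona-Malik diffusion, not the implementation evaluated in [12]; the exponential conductance function and the values of `kappa`, `step` and `iters` are assumptions chosen for the toy signal.

```python
import math


def perona_malik_1d(signal, iters=20, kappa=10.0, step=0.2):
    """Explicit 1-D diffusion with an exponential edge-stopping function."""
    u = [float(v) for v in signal]
    for _ in range(iters):
        # Forward differences; the trailing 0.0 encodes a zero-flux boundary.
        grad = [u[i + 1] - u[i] for i in range(len(u) - 1)] + [0.0]
        # Conductance is close to 1 for small gradients and close to 0 at edges.
        flux = [math.exp(-(g / kappa) ** 2) * g for g in grad]
        u = [u[i] + step * (flux[i] - (flux[i - 1] if i > 0 else 0.0))
             for i in range(len(u))]
    return u


# Toy signal: a noisy low plateau followed by a noisy high plateau.
noisy_step = [10, 11, 9, 10, 11, 50, 51, 49, 50, 50]
smoothed = perona_malik_1d(noisy_step)
```

In this toy case the small fluctuations inside each plateau are averaged out, while the large jump between the plateaus is left almost untouched because the conductance there is practically zero.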

2. Denoising
The digitization of images produces intrinsic limitations in image accuracy, especially blur and noise. These problems remain a challenge for researchers because noise removal algorithms can introduce artifacts and blur the denoised images. In this section, we take a closer look at two denoising techniques in order to highlight their strengths and weaknesses when dealing with MMW imaging. The use of passive radiometry in MMW images implies dealing with very low signal levels and a low SNR. Therefore, segmenting an image into meaningful parts (body and threats) can be a difficult task in the presence of obtrusive noise. A large number of sophisticated image denoising algorithms can be applied to our images, with approaches ranging from the spatial and Fourier domains to the wavelet transform domain. In this work we have focused on spatial domain methods, specifically those that emphasize the preservation of small details. This decision is clearly linked to the size restrictions of passive MMW images for stand-off detection (threats, subtending only a few pixels, must not be removed as noise). Classical spatial filters can be classified into linear and non-linear. Linear filters, like the mean or Gaussian filter, tend to blur sharp edges and destroy lines and other fine image details. In spite of these problems, they are often used as the basis for non-linear noise reduction filters. On the other hand, classical non-linear filters like Median filtering and Total Variation Minimization [10] are better at maintaining details, but they also erase structures that span only a few pixels. In order to avoid these problems, newer non-linear methods have improved the results by preserving the most representative structures. As noted in the previous section, Anisotropic Diffusion and NL-means both try to preserve the details of the original images, using different strategies. The results provided in [4] clearly show that NL-means gives better results in natural images. Furthermore, in [13], a newer approach based on data-adaptive kernel regression is presented. This method, ISKR, outperforms NL-means in natural images [11]. In the following paragraphs we explain the basics of the two methods that we have used to denoise our database of MMW images: NL-means and ISKR.
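The sensitivity of median filtering to the size of a structure can be checked with a toy one-dimensional example. The sketch below is illustrative only; the signals and the helper name are hypothetical and do not come from the MMW database.

```python
from statistics import median


def median_filter_1d(signal, radius=1):
    """Sliding-window median; the window is truncated at the signal borders."""
    n = len(signal)
    return [median(signal[max(i - radius, 0):min(i + radius, n - 1) + 1])
            for i in range(n)]


one_pixel = [0, 0, 0, 0, 9, 0, 0, 0, 0]   # structure one sample wide
two_pixel = [0, 0, 0, 9, 9, 0, 0, 0]      # structure two samples wide
step_edge = [0, 0, 0, 9, 9, 9]            # edge between two regions

removed = median_filter_1d(one_pixel, radius=1)
kept = median_filter_1d(two_pixel, radius=1)
also_removed = median_filter_1d(two_pixel, radius=2)
edge = median_filter_1d(step_edge, radius=1)
```

A structure is erased as soon as it occupies fewer than half of the samples in the window, whereas the step edge survives unchanged. This is the behaviour that makes the filter risky when threats subtend only a few pixels.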

2.1. Non Local Means (NL-Means)
NL-means can be considered the natural evolution of neighborhood filters, which estimate the value of each pixel by averaging its neighbors with similar intensity values. The main difference is that, instead of using a fixed spatial neighborhood or weighting neighboring pixels by their distance to the pixel being evaluated, this method uses the information of all the pixels whose neighborhoods are close to the studied one (similar structures). This is done by substituting the value of each pixel with the weighted average of all the pixels with similar neighborhoods, which results in a robust denoising method that maintains local structures and details as well as contours. Specifically, if we take as input a discrete noisy image $I = \{ y_i = I(x_i) \mid x_i = [x_{1i}, x_{2i}]^T \in \Omega \}$, we have to compute a weighted average of all the pixels in the image,

\[ NL(x_i) = \sum_{x_j \in \Omega} w(x_i, x_j) \, y_j \tag{1} \]

where $NL(x_i)$ is a denoised pixel, $y_j$ is the noisy sample at coordinates $x_j$ and the weights depend on how similar the neighborhoods are. Let $N_k$ be the set of sample values taken in a neighborhood of a point $x_k$. In the original work of Buades [4] the weights are defined as a decreasing function of the weighted Euclidean distance between the intensity vectors $I(N_i)$ and $I(N_j)$, obtained from the corresponding square neighborhoods,

\[ w(x_i, x_j) = \frac{1}{Z(x_i)} \exp\left( -\frac{\| I(N_i) - I(N_j) \|^2}{h^2} \right) \tag{2} \]

where $Z(x_i)$ is the normalizing constant

\[ Z(x_i) = \sum_{x_j} \exp\left( -\frac{\| I(N_i) - I(N_j) \|^2}{h^2} \right) \tag{3} \]

and the parameter $h$ controls the degree of averaging. In order to improve computational performance, an interesting option is to include a search window, of fixed size and centered on the pixel being evaluated, which delimits the area of pixels that can be used for averaging. Finally, in the work of Tasdizen [14], the weights are computed by projecting the neighborhood vectors onto a lower-dimensional subspace by means of Principal Component Analysis (PCA). This modification of the non-local means method results in improved accuracy and computational performance.
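The weighted average described above, restricted to a search window, can be sketched as follows. This is a deliberately simple pure-Python version for small images and not the implementation used in our experiments: the patch distance is an unweighted squared Euclidean distance (a simplification of the weighted distance of [4]), borders are handled by replication, and the parameter names `patch`, `search` and `h` as well as their default values are assumptions.

```python
import math


def nl_means(img, patch=1, search=3, h=10.0):
    """Non-local means on a 2-D list of floats.

    patch  : patch radius (the patch side is 2 * patch + 1)
    search : radius of the search window around each pixel
    h      : parameter controlling the degree of averaging
    """
    rows, cols = len(img), len(img[0])

    def px(r, c):
        # Replicate border pixels so that every patch is fully defined.
        return img[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    def patch_dist2(r1, c1, r2, c2):
        # Unweighted squared Euclidean distance between two patches.
        return sum((px(r1 + dr, c1 + dc) - px(r2 + dr, c2 + dc)) ** 2
                   for dr in range(-patch, patch + 1)
                   for dc in range(-patch, patch + 1))

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            num = z = 0.0
            for rr in range(max(r - search, 0), min(r + search, rows - 1) + 1):
                for cc in range(max(c - search, 0), min(c + search, cols - 1) + 1):
                    w = math.exp(-patch_dist2(r, c, rr, cc) / h ** 2)
                    num += w * img[rr][cc]  # weight times the noisy sample
                    z += w                  # accumulates the normalizing constant
            out[r][c] = num / z
    return out


# Toy image: a vertical edge between a dark and a bright region.
edge_img = [[0.0] * 3 + [100.0] * 3 for _ in range(6)]
denoised = nl_means(edge_img)
```

On this toy image the edge is returned unchanged, because patches on opposite sides of the edge receive a negligible weight. Larger values of `h` make the weights more uniform and therefore increase the amount of smoothing.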

2.2. Iterative Steering Kernel Regression (ISKR)
In order to compare the results achieved by NL-means with another state-of-the-art method, we have studied the non-parametric data-adaptive kernel regression method proposed in [13], where the authors adapt and expand kernel regression ideas, using local gradient information to estimate kernel shapes according to local structures. Takeda et al. [13] assume that the data follow the model

\[ y_i = z(x_i) + \varepsilon_i, \quad i = 1, \ldots, P \tag{4} \]

where $z(\cdot)$ is the regression function to be estimated, $\varepsilon_i$ are independent and identically distributed zero-mean noise values and $P$ is the total number of samples in a neighborhood of interest. A classic second-order kernel regression method is used in order to estimate the first- and second-order terms of the Taylor expansion. So, in [13], the local regression function is estimated as,

\[ \begin{aligned} z(x_i) &\approx z(x) + \{\nabla z(x)\}^T (x_i - x) + \frac{1}{2} (x_i - x)^T \{ H z(x) \} (x_i - x) \\ &= \beta_0 + \beta_1^T (x_i - x) + \beta_2^T \{ (x_i - x)(x_i - x)^T \} \end{aligned} \tag{5} \]

where $\nabla$ and $H$ are the first-order (gradient) and second-order (Hessian) derivative operators, respectively. In order to preserve image detail as much as possible, the parameters $\{\beta_n\}_{n=0}^{2}$ must be estimated from all the samples $\{y_i\}_{i=1}^{P}$. The fitting problem is formulated as the following minimization problem,

\[ \min_{\{\beta_n\}} \sum_{i=1}^{P} \left[ y_i - \beta_0 - \beta_1^T (x_i - x) - \beta_2^T \{ (x_i - x)(x_i - x)^T \} \right]^2 K_{H_i^{steer}}(x_i - x) \tag{6} \]

where $K$ is a locally adapted kernel and $H_i^{steer}$ is the (2 x 2) steering matrix. Specifically, $H_i^{steer}$ has the form

\[ H_i^{steer} = h \mu_i C_i^{-1/2} \tag{7} \]

where $h$ is a global smoothing parameter, $\mu_i$ is a scalar that captures the local density of data samples and $C_i$ is the i-th covariance matrix, computed from the differences between local gray values. If a Gaussian kernel is chosen, the steering kernel is mathematically represented as

\[ K_{H_i^{steer}}(x_i - x) = \frac{\sqrt{\det(C_i)}}{2 \pi h^2 \mu_i^2} \exp\left\{ -\frac{(x_i - x)^T C_i (x_i - x)}{2 h^2 \mu_i^2} \right\} \tag{8} \]

Using this method, the denoising is effected most strongly along the edges (see Figure 3), rather than across them, resulting in strong preservation of details in the final output. The resulting system can be solved as a weighted linear least squares problem.

Figure 3. Inside the green circle, data-adapted kernels elongate along the red edge.

Finally, in order to achieve good denoising, the method must be applied iteratively, using the denoised image as the input of the next iteration. In this way, the estimates of the first- and second-order Taylor terms, as well as the kernels, become more accurate at each iteration.

3. Segmentation
Image segmentation can be understood as the process of dividing a digital image into several regions in order to change its representation into something that is more meaningful and easier to analyze. For the images we have to deal with, the segmentation process can be divided into two steps. The first one is to segment the body of the person that appears in the image. The second is to segment the objects located inside the body. In the obtained MMW images, apart from noise and low contrast, a sudden intensity variation appears due to the reflection of the sky radiometric temperature over the floor (see Figure 4). This change, which may vary depending on the scenario, adds another problem to MMW image segmentation. So, ideally, we are searching for a non-parametric segmentation method that is robust to illumination changes and noise and capable of segmenting the body and the threat at the same time.

Figure 4. Arrows indicate the intensity inhomogeneity present in MMW images.

Given these restrictions, classical approaches such as intensity clustering algorithms are unlikely to achieve our goals. Although they can roughly segment the body and the threat at the same time [5] [6], they require heuristics and are not robust to sudden illumination changes and noise. On the other hand, since applying an ASM could prove a hard task [5], we have decided to use a method based on region-based Active Contours, LBF [8], whose main properties (robustness to noise and to intensity inhomogeneity) fit well with the requirements of MMW images. Apart from this, the segmentation obtained with Active Contour approaches should provide better connected regions than intensity clustering algorithms.

3.1. Local Binary Fitting (LBF)
The underlying idea behind Active Contours, or deformable models, for image segmentation is simple. An initial guess contour must be specified. This contour is then moved by image-driven forces towards the boundaries of the desired objects. In such models, we can consider two types of forces: internal and external. Internal forces are defined within the curve and keep the model smooth during the deformation. External forces are computed from the underlying image data and are defined to move the model towards an object boundary. As described in [7] [8], existing active contour models can be categorized into two major classes: edge-based models and region-based models. Edge-based models use local edge information to attract the Active Contour towards the object boundaries. Region-based models aim to identify each region of interest by using a certain region descriptor to guide the motion of the Active Contour. Region-based approaches have several advantages over edge-based ones [7], such as robustness against initial curve placement and insensitivity to image noise. In this work we have used a local-region Active Contour method which emphasizes robustness to intensity inhomogeneity: LBF. The basic idea behind it is to minimize the error produced by separating each local neighborhood region (for all the points belonging to the contour) into two regions: outside and inside the object (see Figure 5). This is done by adding a weighting kernel to each local region energy computation and then converting the resulting energy functional to a level set formulation.

the contour. The second term is the length smoothing term and the last term is the level set regularization term, which penalizes the deviation of the level set function ($\phi$) from a signed distance function. Finally, minimizing the energy of Eq. 9 by standard steepest descent, with the two fitting functions $f_1(x)$ and $f_2(x)$ held fixed, gives

\[ \frac{\partial \phi}{\partial t} = -\delta_\varepsilon(\phi) (\lambda_1 e_1 - \lambda_2 e_2) + \nu \, \delta_\varepsilon(\phi) \, \mathrm{div}\left( \frac{\nabla \phi}{|\nabla \phi|} \right) + \mu \left( \nabla^2 \phi - \mathrm{div}\left( \frac{\nabla \phi}{|\nabla \phi|} \right) \right) \tag{10} \]

where $\delta_\varepsilon$ is a "smoothed Dirac delta function" (see [8] for further details) and $e_1$ and $e_2$ are the following functions,

\[ e_i(x) = \int K_\sigma(y - x) \, |I(x) - f_i(y)|^2 \, dy, \quad i = 1, 2. \tag{11} \]
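The data terms of Eq. 11 can be illustrated with a one-dimensional sketch. The code below is not the implementation used in our experiments. It assumes that the fitting functions are the kernel-weighted local means on each side of the contour, which is the usual closed form in the LBF model of [8] but is not stated in the text above, and that region 1 is the region where the level set function is positive. The arctangent-based smoothed Heaviside function, the truncated Gaussian kernel and all names and default values are likewise assumptions.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalised 1-D Gaussian weights for offsets in [-radius, radius]."""
    w = [math.exp(-d * d / (2.0 * sigma ** 2)) for d in range(-radius, radius + 1)]
    s = sum(w)
    return [v / s for v in w]


def heaviside(phi, eps=1.0):
    """Smoothed Heaviside function (assumed arctangent form)."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(phi / eps))


def lbf_terms(signal, phi, sigma=2.0, eps=1.0):
    """Return (f1, f2, e1, e2) for a 1-D signal and a level set function phi."""
    n, radius = len(signal), int(3 * sigma)
    k = gaussian_kernel(sigma, radius)

    def smooth(values, x):
        # Kernel-weighted sum around x, skipping samples outside the signal.
        return sum(k[d + radius] * values[x + d]
                   for d in range(-radius, radius + 1) if 0 <= x + d < n)

    f, e = [], []
    for side in (1, 2):
        # Membership: H(phi) for region 1 and 1 - H(phi) for region 2.
        m = [heaviside(p, eps) if side == 1 else 1.0 - heaviside(p, eps) for p in phi]
        mi = [mv * s for mv, s in zip(m, signal)]
        # Local fitting value: weighted mean intensity on this side of the contour.
        fi = [smooth(mi, x) / max(smooth(m, x), 1e-12) for x in range(n)]
        # Discretised Eq. 11: e_i(x) = sum_y K(y - x) |I(x) - f_i(y)|^2.
        ei = [sum(k[d + radius] * (signal[x] - fi[x + d]) ** 2
                  for d in range(-radius, radius + 1) if 0 <= x + d < n)
              for x in range(n)]
        f.append(fi)
        e.append(ei)
    return f[0], f[1], e[0], e[1]


# Toy signal with two regions; phi is positive on the first half.
signal = [10.0] * 10 + [50.0] * 10
phi = [9.5 - i for i in range(20)]
f1, f2, e1, e2 = lbf_terms(signal, phi)
```

For samples next to the contour, the data term of the region they belong to is the smaller of the two, so with equal weights for both regions the first term of Eq. 10 keeps the contour at the true boundary. Since each fitting value is computed only from a local window, a slow intensity drift inside a region changes the local means but not this comparison, which is the source of the robustness to intensity inhomogeneity.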

4. Experimental results

Figure 5. In both images, the point x is represented by the small dot. β(x, y) is represented by the larger red circle. Shaded regions indicate local exterior and interior regions.

Consider a given image I : Ω →