
Hua-Mei Chen, Seungsin Lee, Raghuveer M. Rao, Mohamed-Adel Slamani, and Pramod K. Varshney


Imaging for Concealed Weapon Detection [A tutorial overview of development in imaging sensors and processing]

The detection of weapons concealed underneath a person's clothing is of prime importance to improving the security of the general public as well as the safety of public assets such as airports and buildings. Manual screening procedures for detecting concealed weapons such as handguns, knives, and explosives are common in controlled access settings like airports, entrances to sensitive buildings, and public events. It is sometimes desirable to be able to detect concealed weapons from a standoff distance, especially when it is impossible to arrange the flow of people through a controlled procedure. The Concealed Weapon Detection (CWD) program was started in 1995 to address this problem (among others) under the sponsorship of the National Institute of Justice through the administration of the Air Force Research Laboratory in the United States. The goal is the eventual deployment of automatic detection and recognition of concealed weapons. This is a technological challenge that requires innovative solutions in sensor technologies and image processing. The problem also presents challenges in the legal arena. A number of sensors based on different phenomenologies, as well as image processing support, are being developed to observe objects underneath people's clothing. The main aim of this article is to provide a tutorial overview of these developments.

IMAGING SENSORS FOR CWD
A comprehensive overview of the imaging sensors being developed for CWD applications is available in NIJ's Guide to the Technologies of Concealed Weapon Imaging and Detection [1]. Table 1 gives a brief summary of these sensors based on [1], including their portability, proximity, and whether they use active or passive illumination.



It is clear from Table 1 that many imaging systems being developed for CWD applications are active and require certain forms of illumination. Though most of the required illumination is low-power radiation, legal issues such as posting warnings or seeking consent from the people being screened may alert carriers of concealed weapons and diminish the value of the CWD system. For this reason, the rest of the article is focused on passive systems only.

[TABLE 1] A SUMMARY OF THE IMAGING SENSORS BEING DEVELOPED FOR CWD BASED ON [1].

DESCRIPTION | ILLUMINATION | PROXIMITY | PORTABILITY
MAGNETIC IMAGING PORTAL | ACTIVE | NEAR | TRANSPORTABLE
MRI BODY CAVITY IMAGER | ACTIVE | NEAR | FIXED-SITE
MICROWAVE HOLOGRAPHIC IMAGER | ACTIVE | NEAR | TRANSPORTABLE
MICROWAVE DIELECTROMETER IMAGER | ACTIVE | NEAR | TRANSPORTABLE
X-RAY IMAGER | ACTIVE | NEAR | TRANSPORTABLE
MICROWAVE RADAR IMAGER | ACTIVE | FAR | TRANSPORTABLE
BROADBAND/NOISE PULSE MILLIMETER-WAVE/TERAHERTZ-WAVE IMAGER | ACTIVE | FAR | TRANSPORTABLE
MILLIMETER-WAVE RADAR DETECTOR | ACTIVE | FAR | HANDHELD
INFRARED IMAGER [2] | PASSIVE | FAR | TRANSPORTABLE
PASSIVE MMW SYSTEM [3]-[5] | PASSIVE | FAR | TRANSPORTABLE
ACTIVE MMW SYSTEM [6] | ACTIVE | FAR | TRANSPORTABLE

INFRARED IMAGER
Infrared imagers utilize the temperature distribution information of the target to form an image [2], [3]. Normally they are used for a variety of night-vision applications, such as viewing vehicles and people. The underlying theory is that the infrared radiation emitted by the human body is absorbed by clothing and then re-emitted by it. As a result, infrared radiation can be used to show the image of a concealed weapon only when the clothing is tight, thin, and stationary. For normally loose clothing, the emitted infrared radiation is spread over a larger clothing area, thus decreasing the ability to image a weapon.

PASSIVE MILLIMETER WAVE IMAGING SENSORS
Passive millimeter wave (MMW) sensors measure the apparent temperature through the energy that is emitted or reflected by sources. The output of the sensors is a function of the emissivity of the objects in the MMW spectrum as measured by the receiver. Clothing penetration for concealed weapon detection is made possible by MMW sensors due to the low emissivity and high reflectivity of objects like metallic guns. In early 1995, MMW data were obtained by means of scans using a single detector that took up to 90 minutes to generate one image. Among the first generation of MMW sensors is the focal-plane array MMW sensor by Millitech Corporation. Figure 1(a) shows a visual image of a person wearing a heavy sweater that conceals two guns, one made of metal and the other of ceramic. The corresponding 94-GHz radiometric image [Figure 1(b)] was obtained by scanning a single detector across the object plane using a mechanical scanner. The radiometric image clearly shows both firearms.

[FIG1] (a) Visible and (b) MMW image of a person concealing a metal handgun (top) and a ceramic handgun (bottom) beneath a heavy sweater.

Recent advances in MMW sensor technology have led to video-rate (30 frames/s) MMW cameras [4], [5]. One such camera is the pupil-plane array from Trex Enterprises. It is a 94-GHz radiometric pupil-plane imaging system that employs frequency scanning to achieve vertical resolution and uses an array of 32 individual waveguide antennas for horizontal resolution. This system collects up to 30 frames/s of MMW data. Figure 2(a) and (b) shows the visible and second-generation MMW images of an individual hiding a gun underneath his jacket. It is clear from Figures 1(b) and 2(b) that the image quality of the video-rate MMW camera is degraded as compared to that of the first-generation MMW sensor. To increase the performance of the second-generation (video-rate) MMW camera for weapon detection purposes, the quality of the video sequence must be improved.

[FIG2] (a) A visual image and (b) second-generation MMW image of a person concealing a handgun beneath a jacket.



[FIG3] An example of image fusion for CWD. (a) Image 1 (visual), (b) image 2 (MMW), and (c) fused image. (Images provided by Thermotex Corp.)

CWD THROUGH IMAGE FUSION
By fusing passive MMW image data and its corresponding infrared (IR) or electro-optical (EO) image, more complete information can be obtained; the information can then be utilized to facilitate concealed weapon detection. In [7], fusion of an IR image revealing a concealed weapon and its corresponding MMW image has been shown to facilitate extraction of the concealed weapon. In addition, fusion of an EO image and its corresponding MMW image may facilitate recognition of a concealed weapon by locating the human subject(s) hiding the object. This is illustrated in the example given in Figure 3. Figure 3(a) shows an image taken from a regular CCD camera, and Figure 3(b) shows a corresponding MMW image. If either one of these two images alone is presented to a human operator, it is difficult to recognize the weapon concealed underneath the rightmost person's clothing. If a fused image [as shown in Figure 3(c)] is presented, a human operator is able to respond with higher accuracy. This demonstrates the benefit of image fusion for the CWD application, which integrates complementary information from multiple types of sensors.

IMAGE PROCESSING CHAIN FOR CWD
An image processing architecture for CWD is shown in Figure 4. The input can be multisensor (i.e., MMW + IR, MMW + EO, or MMW + IR + EO) data or MMW data only. In the latter case, the blocks for registration and fusion can be removed from Figure 4.

[FIG4] An image processing architecture for CWD: the input passes through preprocessing (enhancement and/or denoising, registration and fusion) and automatic weapon detection (feature extraction/description, weapon recognition), with the output going to a human operator.
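To make the processing chain of Figure 4 concrete, here is a minimal Python sketch of how the stages could be wired together. The stage bodies are trivial placeholders and every function name is ours, introduced only for illustration; the article does not prescribe an implementation. Later sections discuss real candidates for each block.

```python
import numpy as np

# Placeholder stages: each would be replaced by the techniques described in the
# sections that follow (denoising/enhancement, MMI registration, wavelet fusion,
# segmentation, shape description, and recognition).
def denoise_and_enhance(img): return img
def register(ref, img):       return img
def fuse(imgs):               return np.mean(imgs, axis=0)
def segment(img):             return [img > img.mean()]
def describe(obj):            return np.array([obj.mean(), obj.sum()])
def classify(feature_vec):    return "weapon" if feature_vec[1] > 0 else "nonweapon"

def cwd_pipeline(mmw, ir=None, eo=None):
    """Chain the stages of Figure 4; with MMW-only input the registration and
    fusion blocks are skipped, as noted in the text."""
    sources = [denoise_and_enhance(s) for s in (mmw, ir, eo) if s is not None]
    if len(sources) > 1:
        aligned = [sources[0]] + [register(sources[0], s) for s in sources[1:]]
        image = fuse(aligned)
    else:
        image = sources[0]
    return [(obj, classify(describe(obj))) for obj in segment(image)]
```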

The output can take several forms. It can be as simple as a processed image/video sequence displayed on a screen; a cued display where potential concealed weapon types and locations are highlighted with associated confidence measures; a "yes," "no," or "maybe" indicator; or a combination of the above. The image processing procedures that have been investigated for CWD applications range from simple denoising to automatic pattern recognition.

PREPROCESSING
Before an image or video sequence is presented to a human observer for operator-assisted weapon detection or fed into an automatic weapon detection algorithm, it is desirable to preprocess the images or video data to maximize their exploitation. The preprocessing steps considered in this section include enhancement and filtering for the removal of shadows, wrinkles, and other artifacts. When more than one sensor is used, preprocessing must also include registration and fusion procedures.

MMW IMAGE DENOISING AND ENHANCEMENT
Many techniques have been developed to improve the quality of MMW images [8], [9]. In this section, we describe a technique for simultaneous noise suppression and object enhancement of passive MMW video data [10] and show some experimental results. Denoising of the video sequences can be achieved temporally or spatially. First, temporal denoising is achieved by motion-compensated filtering, which estimates the motion trajectory of each pixel and then conducts a 1-D filtering along the trajectory. This reduces the blurring effect that occurs when temporal filtering is performed without regard to object motion between frames. The motion trajectory of a pixel can be estimated by various algorithms such as optical flow methods, block-based methods, and Bayesian methods [11]. If the motion in an image sequence is not abrupt, we can restrict the search for the motion trajectory to a small region in the subsequent frames.
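As an illustration of the block-based variant of this idea (a sketch under our own assumptions, not the implementation of [10]), the following routine matches each block against its neighbors in the adjacent frames and averages along the resulting three-frame trajectory; the block size and search radius are arbitrary choices.

```python
import numpy as np

def motion_compensated_average(frames, block=8, search=4):
    """Temporal denoising sketch: for each block in frame t, find the best-matching
    block (minimum SAD) in frames t-1 and t+1 within a small search window and
    average along that motion trajectory."""
    frames = np.asarray(frames, dtype=float)
    T, H, W = frames.shape
    out = frames.copy()
    for t in range(1, T - 1):
        for y in range(0, H - block + 1, block):
            for x in range(0, W - block + 1, block):
                ref = frames[t, y:y + block, x:x + block]
                matches = [ref]
                for tt in (t - 1, t + 1):
                    best, best_sad = ref, np.inf
                    for dy in range(-search, search + 1):
                        for dx in range(-search, search + 1):
                            yy, xx = y + dy, x + dx
                            if 0 <= yy <= H - block and 0 <= xx <= W - block:
                                cand = frames[tt, yy:yy + block, xx:xx + block]
                                sad = np.abs(cand - ref).sum()
                                if sad < best_sad:
                                    best, best_sad = cand, sad
                    matches.append(best)
                out[t, y:y + block, x:x + block] = np.mean(matches, axis=0)
    return out
```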


For additional denoising and object enhancement, the technique employs a wavelet transform method that is based on the multiscale edge representation computed with the wavelet transform in Mallat and Hwang's framework [12]. The approach provides more flexibility and selectivity with less blurring. Furthermore, it offers a way to enhance objects in low-contrast images. Let \psi^1(x, y) and \psi^2(x, y) be wavelets for the x and y directions of an image, respectively. The dyadic wavelet transform of a function f(x, y) at (x, y) is defined as

W_{2^j}^k f(x, y) = f * \psi_{2^j}^k(x, y), \quad k = 1, 2   (1)

where * represents the convolution operator, j is the wavelet decomposition level, and

\psi_{2^j}^k(x, y) = \frac{1}{2^j} \psi^k\left(\frac{x}{2^j}, \frac{y}{2^j}\right), \quad k = 1, 2.   (2)

Then the vector (W_{2^j}^1 f(x, y), W_{2^j}^2 f(x, y)) contains the gradient information of f(x, y) at the point (x, y). The multiscale edge representation G_{2^j}(f) [13] of an image at level j is obtained from the magnitude \rho_{2^j} f(x, y) and angle \theta_{2^j} f(x, y) of the gradient vector (W_{2^j}^1 f(x, y), W_{2^j}^2 f(x, y)); it is defined as

G_{2^j}(f) = \{ [(x_i, y_i), \nabla_{2^j} f(x_i, y_i)] \mid \rho_{2^j} f(x_i, y_i) \text{ has a local maximum at } (x_i, y_i) \text{ along the direction } \theta_{2^j} f(x_i, y_i) \}   (3)

where the magnitude and angle of the gradient are defined by

\rho_{2^j} f(x, y) \equiv \left[ \left( W_{2^j}^1 f(x, y) \right)^2 + \left( W_{2^j}^2 f(x, y) \right)^2 \right]^{1/2}   (4)

\theta_{2^j} f(x, y) \equiv \arctan \frac{W_{2^j}^2 f(x, y)}{W_{2^j}^1 f(x, y)}.   (5)

The multiscale edge representation G_{2^j}(f) thus denotes the collection of local maxima of the magnitude \rho_{2^j} f(x, y) along the gradient direction \theta_{2^j} f(x, y). The wavelet transform-based denoising and enhancement technique is achieved by manipulating G_{2^j}(f). By suppressing the noisy edges below a predefined threshold in the finer scales, noise can be reduced while most of the true edges are preserved. To avoid accidentally removing true edges in the lower scales, where true edges generally become smaller, variable thresholds can be applied depending on the scale. Enhancement of the image contrast is performed by stretching the multiscale edges in G_{2^j}(f). A denoised and enhanced image is then reconstructed from the modified edges by the inverse wavelet transform.

Figure 5 shows the results of this technique. In Figure 5(a), which shows a frame taken from the sample video sequence, the concealed gun does not show clearly because of noise and low contrast. Figure 5(b) shows the frame denoised by motion-compensated filtering. The frame was then spatially denoised and enhanced by the wavelet transform method. Four decomposition levels were used, and edges in the fine scales were detected using the magnitude and angle of the gradient of the multiscale edge representation. The threshold for denoising was 15% of the maximum gradient at each scale. Figure 5(c) shows the final result of the contrast-enhanced and denoised frame. Note that the image of the handgun on the chest of the subject is more apparent in the enhanced frame than it is in the original frame. However, spurious features such as glint are also enhanced; higher-level procedures such as pattern recognition have to be used to discard these undesirable features.

[FIG5] (a) Original frame. (b) Denoised frame by averaging along the motion trajectory. (c) Denoised and enhanced frame by the wavelet approach.


CLUTTER FILTERING
Clutter filtering is used to remove unwanted details (shadows, wrinkles, imaging artifacts, etc.) that are not needed in the final image for human observation and that can adversely affect the performance of the automatic recognition stage. This helps improve the recognition performance, whether operator-assisted or automatic. For this purpose, morphological filters have been employed [14]. Examples of the use of morphological filtering for noise removal are provided through the complete CWD example given in Figure 6; a complete description of that example is given in a later section.
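A minimal sketch of such a morphological clean-up, assuming a grayscale opening followed by a closing with a small square structuring element (the 5 x 5 element used in the example later in the article); it relies on SciPy's generic morphology routines rather than anything specific to [14].

```python
import numpy as np
from scipy import ndimage

def clutter_filter(img, size=5):
    """Grayscale morphological smoothing: opening followed by closing with a
    size x size structuring element. Small bright and dark details (wrinkles,
    glints, shadows) are suppressed while larger structures such as a
    gun-shaped region are preserved."""
    opened = ndimage.grey_opening(img, size=(size, size))
    return ndimage.grey_closing(opened, size=(size, size))
```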

REGISTRATION OF MULTISENSOR IMAGES
As indicated earlier, making use of multiple sensors may increase the efficacy of a CWD system. The first step toward image fusion is a precise alignment of the images (i.e., image registration). Very little has been reported on the registration problem for the CWD application. Here, we describe a registration approach for images taken at the same time from different but nearly colocated (adjacent and parallel) sensors, based on the maximization of mutual information (MMI) criterion. MMI states that two images are registered when their mutual information (MI) reaches its maximum value [15]. This can be expressed mathematically as

\alpha^* = \arg \operatorname{opt}_{\alpha} I\big( F(\tilde{x}), R(T_\alpha(\tilde{x})) \big)   (6)

where F and R are the images to be registered. F is referred to as the floating image, whose pixel coordinates (\tilde{x}) are to be mapped to new coordinates on the reference image R. The reference image R is to be resampled according to the positions defined by the new coordinates T_\alpha(\tilde{x}), where T denotes the transformation model and the dependence of T on its associated parameters \alpha is indicated by the notation T_\alpha. I is the MI similarity measure calculated over the region of overlap of the two images and can be computed from the joint histogram of the two images [15]. The criterion above says that the two images F and R are registered through T_{\alpha^*} when \alpha^* globally optimizes the MI measure I. In [16], a two-stage registration algorithm was developed for the registration of IR images and the corresponding first-generation MMW images. At the first stage, two human silhouette extraction algorithms were developed, followed by a binary correlation to coarsely register the two images. The purpose was to provide an initial search point close to the final solution for the second stage of the registration algorithm, which is based on the MMI criterion. In this manner, any local optimizer can be employed to maximize the MI measure. One registration result obtained by this approach is illustrated in the example given in Figure 6. Another MI-based registration algorithm, motivated by the advent of the video-rate MMW camera and aimed at developing a modality-independent video registration algorithm, was reported in [17]. In that work, regions of interest in IR and visual sequences were registered frame by frame through the MMI procedure described above.
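For illustration, the MI similarity in (6) can be estimated from the joint histogram of the overlapping pixels, as sketched below; the optimizer and the transformation model T_alpha are omitted, and the bin count and function name are arbitrary choices of ours.

```python
import numpy as np

def mutual_information(f, r, bins=64):
    """MI of two overlapping images (same shape) estimated from their joint
    histogram, as in (6). A registration routine would warp the floating image F
    with T_alpha and search over alpha for the value that maximizes this measure."""
    hist, _, _ = np.histogram2d(f.ravel(), r.ravel(), bins=bins)
    pxy = hist / hist.sum()
    px = pxy.sum(axis=1, keepdims=True)            # marginal of F
    py = pxy.sum(axis=0, keepdims=True)            # marginal of R
    nz = pxy > 0                                   # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```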

[FIG6] A CWD example: original IR and MMW images followed by the results of registration, filtering, fusion, and segmentation, and a three-dimensional shape-descriptor plot.

IMAGE FUSION
The most straightforward approach to image fusion is to take the average of the source images, but this can produce undesirable results such as a decrease in contrast. Many of the advanced image fusion methods involve multiresolution image decomposition based on the wavelet transform [18]. First, an image pyramid is constructed for each source image by applying the wavelet transform to the source images. This transform-domain representation emphasizes important details of the source images at different scales, which is useful for choosing the best fusion rules. Then, using a feature selection rule, a fused pyramid is formed for the composite image from the pyramid coefficients of the source images. The simplest feature selection rule is choosing the maximum of the two corresponding transform values. This allows the integration of details into one image from two or more images. Finally, the composite image is obtained by taking an inverse pyramid transform of the composite wavelet representation. The process can be applied to the fusion of multiple source imagery. This type of method has been used to fuse IR and MMW images for the CWD application [7].

The first fusion example for the CWD application is given in Figure 7. Two IR images taken from separate IR cameras at different viewing angles are considered in this case. The advantage of image fusion in this case is clear, since a complete gun shape can be observed only in the fused image. The second fusion example, fusion of IR and MMW images, is provided in Figure 6. The fused image facilitates the subsequent shape extraction process, as will be illustrated later in this article. Other works related to the fusion of imagery for CWD applications can be found in [19]-[22].
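A compact sketch of the maximum-selection fusion rule described above, using PyWavelets and assuming the two sources are already registered and of equal size; the Haar wavelet and two-level decomposition echo the later example but are otherwise arbitrary.

```python
import numpy as np
import pywt

def wavelet_fuse(img_a, img_b, levels=2, wavelet="haar"):
    """Multiresolution fusion sketch: decompose both sources, keep the detail
    coefficient with the larger magnitude at every position (maximum-selection
    rule), average the coarse approximations, and reconstruct."""
    ca = pywt.wavedec2(np.asarray(img_a, dtype=float), wavelet, level=levels)
    cb = pywt.wavedec2(np.asarray(img_b, dtype=float), wavelet, level=levels)
    fused = [(ca[0] + cb[0]) / 2.0]                # approximation: simple average
    for a_bands, b_bands in zip(ca[1:], cb[1:]):
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(a_bands, b_bands)))
    return pywt.waverec2(fused, wavelet)
```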


PROCESSING TOWARD AUTOMATIC WEAPON DETECTION
After preprocessing, the images/video sequences can be displayed for operator-assisted weapon detection or fed into a weapon detection module for automated weapon detection. Toward this aim, several steps are required, including object extraction, shape description, and weapon recognition.

SEGMENTATION FOR OBJECT EXTRACTION
Object extraction is an important step toward the automatic recognition of a weapon, regardless of whether or not the image fusion step is involved. Otsu's method [23] has been successfully used to extract the gun shape from the fused IR and MMW images; this could not be achieved using the original images alone. One segmented result from the fused IR and MMW image is shown in Figure 6.

Another segmentation procedure applied successfully to MMW video sequences for the CWD application is the Slamani mapping procedure (SMP) [24], [25]. A block diagram of this procedure is given in Figure 8. The procedure computes multiple important thresholds of the image data in the automatic threshold computation (ATC) stage for 1) regions with distinguishable intensity levels and 2) regions with close intensity levels. Regions with distinguishable intensity levels have multimodal histograms, whereas regions with close intensity levels have overlapping histograms. The thresholds from both cases are fused to form the set of important thresholds in the scene. At the output of the ATC stage, the scene is quantized at each threshold value to obtain the data above and below it. Adaptive filtering is then used to perform homogeneous pixel grouping in order to obtain the "objects" present at each threshold level. The resulting scene is referred to as a component image. When the component images obtained for all thresholds are added together, they form a composite image that displays the objects with different colors. Figure 9 shows the original scene and its corresponding composite image. Note that the weapon appears as a single object in the composite image.
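Otsu's threshold, mentioned above for extracting the gun shape from the fused image, can be computed directly from the image histogram; a self-contained sketch follows (scikit-image's threshold_otsu would serve equally well).

```python
import numpy as np

def otsu_threshold(img, nbins=256):
    """Otsu's method [23]: pick the gray level that maximizes the between-class
    variance of the two resulting classes."""
    hist, edges = np.histogram(img.ravel(), bins=nbins)
    p = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(p)                       # class-0 probability for every split
    w1 = 1.0 - w0
    mu0 = np.cumsum(p * centers)            # cumulative first moment
    mu_t = mu0[-1]
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros_like(w0)
    sigma_b[valid] = (mu_t * w0[valid] - mu0[valid])**2 / (w0[valid] * w1[valid])
    return centers[np.argmax(sigma_b)]

# binary = img > otsu_threshold(img)   # object mask for the shape-description stage
```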


[FIG7] One fusion example for the CWD application: (a) and (b) are the original IR images; (c) is the fused image.

SHAPE DESCRIPTION
The task of shape description is to depict the extracted shapes effectively so as to facilitate the recognition stage that follows. In general, it is not known how the shapes appear in the observed frame: they could be rotated by an angle, scaled in size, and/or shifted in position. Therefore, the algorithms to be considered should be rotation, scale, and translation invariant. Shape descriptors based on compactness/circularity [26], Fourier descriptors [27], and moments [28] are capable of being invariant to an object's translation (position), scale, and rotation (orientation) [29], and they have been used for concealed weapon detection applications [30].

Moments
Hu [28] defines six shape descriptors based on the second- and third-order normalized moments that are translation, scale, and rotation invariant. The definitions of these six descriptors are provided below.


[FIG8] Block diagram of SMP.


SD_1 = \eta_{20} + \eta_{02}   (7)

SD_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2   (8)

SD_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2   (9)

SD_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2   (10)

SD_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]   (11)

SD_6 = (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})   (12)

where \eta_{p,q} = \mu_{p,q} / \mu_{0,0}^{(p+q+2)/2} is the normalized central moment, with \mu_{p,q} = \sum (x - \bar{x})^p (y - \bar{y})^q being the central moment. The performance of these six moment-based shape descriptors is examined in the next section.

In addition to the moments of images, moments of region boundaries can also be defined. Let the coordinates of the N contour pixels of the object be described by an ordered set (x(i), y(i)), i = 1, 2, \ldots, N. The Euclidean distance between the centroid (\bar{x}, \bar{y}) and the ordered sequence of contour pixels of the shape is denoted d(i), i = 1, 2, \ldots, N. This set forms a single-valued 1-D unique representation of the contour. Based on the set d(i), the pth moment is defined as

m_p = \frac{1}{N} \sum_{i=1}^{N} [d(i)]^p   (13)

and the pth-order central moment is defined as

M_p = \frac{1}{N} \sum_{i=1}^{N} [d(i) - m_1]^p.   (14)

Using these moment definitions, a measure that can be used to describe feature shapes is defined by [29]

F_2 = \frac{(M_4)^{1/4} - (M_2)^{1/2}}{m_1}.   (15)

Note that this value is dimensionless as well as rotation, scale, and translation invariant.
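As an illustration (not the authors' code), SD_1-SD_6 can be computed for a binary object mask directly from the definitions in (7)-(12); the helper names below are ours.

```python
import numpy as np

def normalized_moments(mask, max_order=3):
    """Central moments mu_pq of a binary shape and the normalized moments
    eta_pq = mu_pq / mu_00**((p+q+2)/2) used in (7)-(12)."""
    ys, xs = np.nonzero(mask)
    xbar, ybar = xs.mean(), ys.mean()
    mu00 = float(len(xs))
    eta = {}
    for p in range(max_order + 1):
        for q in range(max_order + 1):
            if 2 <= p + q <= max_order:
                mu = ((xs - xbar)**p * (ys - ybar)**q).sum()
                eta[(p, q)] = mu / mu00**((p + q + 2) / 2.0)
    return eta

def hu_descriptors(mask):
    """SD1-SD6 of (7)-(12); translation, scale, and rotation invariant."""
    n = normalized_moments(mask)
    e20, e02, e11 = n[(2, 0)], n[(0, 2)], n[(1, 1)]
    e30, e03, e21, e12 = n[(3, 0)], n[(0, 3)], n[(2, 1)], n[(1, 2)]
    sd1 = e20 + e02
    sd2 = (e20 - e02)**2 + 4 * e11**2
    sd3 = (e30 - 3 * e12)**2 + (3 * e21 - e03)**2
    sd4 = (e30 + e12)**2 + (e21 + e03)**2
    sd5 = ((e30 - 3 * e12) * (e30 + e12) * ((e30 + e12)**2 - 3 * (e21 + e03)**2)
           + (3 * e21 - e03) * (e21 + e03) * (3 * (e30 + e12)**2 - (e21 + e03)**2))
    sd6 = ((e20 - e02) * ((e30 + e12)**2 - (e21 + e03)**2)
           + 4 * e11 * (e30 + e12) * (e21 + e03))
    return np.array([sd1, sd2, sd3, sd4, sd5, sd6])
```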

[FIG9] Original and composite images.

Compactness/Circularity
A dimensionless measure of shape compactness or circularity, C, is defined as

C = P^2 / A   (16)

where P is the length of the region perimeter and A is the area of the region. Compactness provides a measure of contour complexity versus area enclosed. In addition, it measures how circular or elongated the object is. A shape with a rough contour including several incursions will have a high value of C, indicating low compactness. It is clear that this quantity is independent of rotation, scale, and translation.

Fourier Descriptors [27]
In this approach, the boundary pixels are used to extract the feature information. Suppose that there are N points on the contour of a shape (region). The x-y coordinates of each point in the contour are represented as a complex number in the complex plane. The ordered contour sequence may then be written as the complex sequence z_i

z_i = x_i + j y_i, \quad i = 0, 1, 2, \ldots, N - 1.   (17)

Note that the complex number sequence is periodic with each traversal of the complete boundary. The Fourier descriptors (FDs) are defined as

B(k) = \frac{1}{N} \sum_{i=0}^{N-1} z_i \exp[-j 2\pi k i / N], \quad k = 0, 1, \ldots, N - 1.   (18)
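Because (18) is simply a DFT of the complex contour sequence scaled by 1/N, the raw FDs can be obtained with NumPy's FFT, as sketched below; the normalization for translation, scale, and rotation described next would then be applied to these coefficients.

```python
import numpy as np

def fourier_descriptors(contour_xy):
    """B(k) of (18) for an ordered contour: treat each (x, y) point as the complex
    number z = x + jy and take its DFT (np.fft.fft computes the sum in (18);
    the 1/N factor is applied explicitly)."""
    x, y = np.asarray(contour_xy, dtype=float).T
    z = x + 1j * y
    return np.fft.fft(z) / len(z)
```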

Before using the FDs for shape analysis, we need to eliminate their dependence on the orientation, size, position, and starting point of the contour. Based on the following properties [27], the Fourier descriptors can be modified accordingly. 1) A change in the position of the contour (translation) changes the value of B(0) only. So, by setting B(0) = 0, the FDs become translation invariant. 2) To change the size of the contour (scaling), the z_i's need to be multiplied by a


scalar. This is equivalent to multiplying the B(k)s by a scalar, so if the FDs are divided by B(1), they become normalized and independent of scaling. 3) Rotating the contour in the complex plane is equivalent to multiplying each coordinate by \exp(j\theta), where \theta is the angle of rotation. Due to the linearity of the Fourier transform, this results in multiplying the frequency-domain coefficients B(k) by \exp(j\theta). The normalized FDs (NFDs) thus obtained are

NFD(k) = \begin{cases} 0, & k = 0 \\ B(k)/B(1), & k = 1, 2, \ldots, N/2 \\ B(k + N)/B(1), & k = -1, -2, \ldots, -(N/2) + 1 \end{cases}   (19)

From (19), a Fourier descriptor measure (FDM) that can be used to measure the feature information of the shape is defined as

FDM = \left( \sum_{k=-N/2+1}^{N/2} \frac{\| NFD(k) \|}{|k|} \right) \Big/ \left( \sum_{k=-N/2+1}^{N/2} \| NFD(k) \| \right)   (20)

where \| \cdot \| is the norm. In the measure above, the division by |k| reduces its sensitivity to high-frequency noise (which appears in shapes with rough boundaries), and the denominator normalizes the quantity and limits its range to within zero and one. Besides the shape descriptor defined above, another two descriptors based on the FDs defined in (18) were found to be effective for the CWD application [30]. Their definitions are given below:

SD_7 = \sum_{k=2}^{N} \mathrm{abs}(B_k)   (21)

SD_8 = \begin{cases} \sum_{k=2}^{N/2} \mathrm{abs}(B_k + B_{N-k+2}), & \text{if } \mathrm{rem}(N, 2) = 0 \\ \sum_{k=2}^{(N-1)/2} \mathrm{abs}(B_k + B_{N-k+2}), & \text{if } \mathrm{rem}(N, 2) = 1. \end{cases}   (22)

WEAPON RECOGNITION
In this section, we describe a standard pattern recognition approach to distinguish the shapes representing weapons from those representing nonweapons according to the shape descriptors adopted. First, two shape libraries are constructed from known weapon and nonweapon images. Each shape is run through the selected shape description algorithms. If we assume N shape descriptors are used, then N dimensionless numbers can be obtained, from which an N-dimensional vector is formed. For the known weapon shapes, pictures of different kinds of handguns are digitized, and the corresponding reference Library1 data are obtained by running the shape characterization algorithms on these images. For the known nonweapon shapes, basic synthesized geometric shapes are considered. These are circles, squares, and rectangles (with different side ratios) in several sizes and rotated positions. Reference Library2 data are obtained from these synthetic shapes in the same manner described for the Library1 data. Next, on a newly extracted shape image of unknown origin, the N algorithms are run and its shape characterization values are obtained as an N-dimensional vector. The recognition of this new shape is achieved by applying the nearest-neighbor method in this N-dimensional space. The normalized Euclidean distances between the new sample and each sample of the reference libraries are computed; the shortest distance determines the class of the new image.

EXAMPLE
The entire CWD process of Figure 4 is demonstrated here by the example shown in Figure 6. The original IR and MMW images are shown in the leftmost panels of Figure 6. Also shown in Figure 6 are the images after registration, filtering, fusion, and thresholding. After registration, each source image was morphologically filtered using a 5 × 5 structuring element to remove some of the unwanted details (an opening followed by a closing operation). Visually, the MMW images do not seem to change much after morphological filtering, but the IR images appear to contain less detail. The filtered IR images appear smoother around the arm and neck areas, but the contrast around the gun remains about the same. For fusion, two levels were used for the wavelet decomposition and reconstruction. The result of thresholding the fused image is also shown in Figure 6, with the gun shape clearly visible. Also, the three shape descriptors described in (15), (16), and (20) are calculated for the gun shape and compared to those of the other shapes in the weapon and nonweapon libraries. In the three-dimensional plot (see Figure 6), the gun shape is denoted by a black "×," the nonweapon library points are represented by red "o"s, and the weapon library points are represented by blue "+"s. From this plot, it is clear that the gun shape is closest to the points in the weapon library; it will thus be classified as a weapon.

To evaluate the performance of each individual shape descriptor, a test was designed based on the available MMW video sequence. First, a set of 30 frames was selected from a sequence of MMW data. Objects from each frame were extracted using the SMP described previously. There were 166 total objects extracted, among which 28 were weapons, as determined by observing the original video sequence. To determine the performance of each shape descriptor, the probability of detection (PD) versus the probability of false alarm (PFA) is plotted by choosing different thresholds for each of the shape descriptors.
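The threshold sweep behind such PD-versus-PFA curves can be sketched as follows; the descriptor values and labels are placeholders for the 166 extracted objects, which are not reproduced here, and the assumption that a larger descriptor value indicates a weapon may need to be flipped for a given descriptor.

```python
import numpy as np

def pd_pfa_curve(scores, is_weapon):
    """Sweep a threshold over one descriptor's values and record the probability
    of detection and of false alarm at each setting. `scores` holds one descriptor
    value per extracted object; `is_weapon` holds the corresponding labels."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(is_weapon, dtype=bool)
    pd, pfa = [], []
    for t in np.sort(np.unique(scores)):
        detected = scores >= t                # declare "weapon" above the threshold
        pd.append(detected[labels].mean())    # fraction of true weapons detected
        pfa.append(detected[~labels].mean())  # fraction of nonweapons flagged
    return np.array(pfa), np.array(pd)
```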



Figure 10 shows the plots obtained for the compactness measure C defined in (16), the moment-based descriptors SD_1-SD_6 defined in (7)-(12), and the two Fourier descriptor-based measures SD_7 and SD_8 defined in (21) and (22). For the compactness measure C, Figure 10(a) shows that when all the weapons are detected (PD = 1.00), the PFA is about 0.13. Figure 10(b) shows the results obtained when the FD-based measures SD_7 and SD_8 are used; the sum of the magnitudes of the FDs results in better performance (lower PFA) than using the magnitude of the combination of the positive and corresponding negative phases of the FDs. Finally, Figure 10(c) shows the results of applying the moment-based shape measures to the set of objects. The plots of PD versus PFA show that SD_1 and SD_2, which are based on second-order moments, are the worst behaved, whereas SD_3 through SD_6, based on third-order moments, are the best behaved and result in small values of PFA while generating very close results.

[FIG10] (a) PD versus PFA for C; (b) PD versus PFA for SD_7 and SD_8; and (c) PD versus PFA for SD_1 through SD_6.

CONCLUSIONS
Imaging techniques based on a combination of sensor technologies and processing will potentially play a key role in addressing the concealed weapon detection problem. In this article, we first briefly reviewed the sensor technologies being investigated for the CWD application. Of the various methods being investigated, passive MMW imaging sensors offer the best near-term potential for providing a noninvasive method of observing metallic and plastic objects concealed underneath common clothing. Recent advances in MMW sensor technology have led to video-rate (30 frames/s) MMW cameras. However, MMW cameras alone cannot provide useful information about the detail and location of the individual being monitored. To enhance the practical value of passive MMW cameras, sensor fusion approaches using MMW and IR, or MMW and EO, cameras are being investigated. By integrating the complementary information from different sensors, a more effective CWD system is expected. In the second part of this article, we provided a survey of the image processing techniques being developed to achieve this goal. Specifically, topics such as MMW image/video enhancement, filtering, registration, fusion, extraction, description, and recognition were discussed. A preliminary study on the performance of several shape descriptors that show promising results has also been reported. When combined with advanced CCD-based video processing systems like the ones reported in [31], an active video surveillance system capable of detecting concealed weapons is expected.

There are several challenges ahead. One critical issue is the challenge of performing detection at a distance with a high probability of detection and a low probability of false alarm. Yet another difficulty to be surmounted is forging portable multisensor instruments. Also, detection systems go hand in hand with the subsequent response by the operator, and system development should take into account the overall context of deployment.

AUTHORS
Hua-Mei Chen is an assistant professor in the Department of Computer Science and Engineering, the University of Texas at Arlington. He received his B.S. degree in electrophysics from the National Chiao Tung University, Hsin Chu, Taiwan, in 1991, and his M.S. and Ph.D. degrees in electrical engineering and computer science from Syracuse University, New York, in 2000 and 2002, respectively. His research interests include digital signal/image/video processing with applications to computer vision, medical imaging, and remote sensing. He is currently with the Institute for Research in Security (IRIS) in the College of Engineering at the University of Texas at Arlington, pursuing technologies for video surveillance systems capable of concealed weapon detection.

Seungsin Lee received the B.S. degree from Yonsei University, Seoul, Korea, in 1994 and the M.S. degree from the Rochester Institute of Technology (RIT), Rochester, NY, in


2000, all in electrical engineering. He received the Ph.D. degree from the Center for Imaging Science, RIT, in July 2004. His research interests are in the areas of signal and image processing.

Raghuveer M. Rao received the master of engineering degree in electrical communication engineering from the Indian Institute of Science, Bangalore, India, in 1981 and the Ph.D. degree in electrical engineering from the University of Connecticut, Storrs, in 1984. From 1984 to 1987, he was with Advanced Micro Devices Inc., Austin, Texas. Since December 1987, he has been with the Rochester Institute of Technology, New York, where he is a professor of electrical engineering and a member of the Ph.D. faculty in imaging science. He is an associate editor of the Journal of Electronic Imaging and a past associate editor of IEEE Transactions on Signal Processing and IEEE Transactions on Circuits and Systems II. He has received the IEEE Signal Processing Society's Best Young Author Paper Award. He is a Fellow of SPIE, the International Society for Optical Engineering.

Mohamed-Adel Slamani received the B.S. degree with the best-student award from Polytechnic of Algiers, Algeria, in 1986, and the M.S. and Ph.D. degrees in electrical and computer engineering from Syracuse University in 1991 and 1994, respectively. He is a chief scientist at ITT Industries, Advanced Engineering and Sciences Division. His areas of interest include weak signal detection, non-Gaussian problems, fusion, enhancement, data classification and clustering, and recognition.

Pramod K. Varshney received the B.S. degree in electrical engineering and computer science (with highest honors) and the M.S. and Ph.D. degrees in electrical engineering from the University of Illinois at Urbana-Champaign in 1972, 1974, and 1976, respectively. Currently he is a professor of electrical engineering and computer science and the research director of the New York State Center for Advanced Technology in Computer Applications and Software Engineering. His current research interests are in distributed sensor networks and data fusion, detection and estimation theory, wireless communications, image processing, and radar signal processing. He has published extensively and received numerous awards. He is a Fellow of the IEEE and was the 2001 president of the International Society of Information Fusion.

REFERENCES
[1] N.G. Paulter, "Guide to the technologies of concealed weapon imaging and detection," NIJ Guide 602-00, 2001 [Online]. Available: http://www.ojp.usdoj.gov/nij/pubs-sum/184432.htm
[2] P.W. Kruse and D.D. Skatrud, Eds., Uncooled Infrared Imaging Arrays and Systems (Semiconductors and Semimetals, vol. 47). San Diego, CA: Academic, 1997.
[3] L.A. Klein, Millimeter-Wave and Infrared Multisensor Design and Signal Processing. Boston, MA: Artech, 1997.
[4] A. Pergande and L. Anderson, "Video rate millimeter-wave camera for concealed weapons detection," Proc. SPIE, vol. 4373, pp. 35-39, 2001.
[5] C.A. Martin, S.E. Clark, J.A. Lovberg, and J.A. Galliano, "Real-time wide-field-of-view passive millimeter-wave imaging," Proc. SPIE, vol. 4719, pp. 341-349, 2002.
[6] E.N. Grossman and A.J. Miller, "Active millimeter-wave imaging for concealed weapons detection," in Proc. SPIE, Passive Millimeter-Wave Imaging Technology VI and Radar Sensor Technology VII, Orlando, FL, Apr. 2003, vol. 5077, pp. 62-70.
[7] P.K. Varshney, H. Chen, L.C. Ramac, M. Uner, D. Ferris, and M. Alford, "Registration and fusion of infrared and millimeter wave images for concealed weapon detection," in Proc. IEEE Int. Conf. Image Processing, Kobe, Japan, 1999, vol. 3, pp. 532-536.
[8] A.H. Lettington, M.R. Yallop, and S. Tzimopoulou, "Restoration techniques for millimeter-wave images," Proc. SPIE, vol. 4373, pp. 94-103, 2001.
[9] J.D. Silverstein, "Passive millimeter-wave image resolution improvement by linear and non-linear algorithms," Proc. SPIE, vol. 4373, pp. 132-153, 2001.
[10] S. Lee, R. Rao, and M.A. Slamani, "Noise reduction and object enhancement in passive millimeter wave concealed weapon detection," in Proc. Int. Conf. Image Processing, Rochester, NY, 2002, vol. 1, pp. I-509-512.
[11] A.M. Tekalp, Digital Video Processing. Upper Saddle River, NJ: Prentice-Hall, 1995.
[12] S. Mallat and W.L. Hwang, "Singularity detection and processing with wavelets," IEEE Trans. Inform. Theory, vol. 38, no. 2, pp. 617-643, 1992.
[13] S. Mallat and S. Zhong, "Characterization of signals from multiscale edges," IEEE Trans. Pattern Anal. Machine Intell., vol. 14, no. 7, pp. 710-732, 1992.
[14] L.C. Ramac, M.K. Uner, P.K. Varshney, M. Alford, and D. Ferris, "Morphological filters and wavelet based image fusion for concealed weapons detection," in Proc. SPIE AeroSense '98, 1998, vol. 3376, pp. 110-119.
[15] F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, "Multimodality image registration by maximization of mutual information," IEEE Trans. Med. Imag., vol. 16, no. 2, pp. 187-198, 1997.
[16] H. Chen and P.K. Varshney, "Automatic two stage IR and MMW image registration algorithm for concealed weapon detection," IEE Proc. Vision, Image and Signal Processing, vol. 148, pp. 209-216, 2001.
[17] H. Chen, P.K. Varshney, and M.A. Slamani, "On registration of regions of interest (ROI) in video sequences," presented at the IEEE Int. Conf. Advanced Video and Signal Based Surveillance, Miami, FL, 2003.
[18] Z. Zhang and R.S. Blum, "Categorization of multiscale-decomposition-based image fusion schemes with a performance study for a digital camera application," Proc. IEEE, vol. 87, pp. 1315-1326, 1999.
[19] Z. Xue and R.S. Blum, "Concealed weapon detection using color image fusion," in Proc. 6th Int. Conf. Information Fusion, Gallup, NM, July 2003, pp. 1389-1394.
[20] A. Toet, "Color image fusion for concealed weapon detection," Proc. SPIE, vol. 5071, no. 1, pp. 372-379, Sept. 2003.
[21] T. Meitzler, D. Bryk, E.J. Sohn, K. Lane, J. Raj, and H. Singh, "Fuzzy logic based sensor fusion for mine and concealed weapon detection," in Proc. SPIE, Detection and Remediation Technologies for Mines and Minelike Targets VIII, Sept. 2003, vol. 5089, pp. 1353-1362.
[22] Y. Yang and R.S. Blum, "A statistical signal processing approach to image fusion for concealed weapon detection," in Proc. 2002 Int. Conf. Image Processing, Rochester, NY, Sept. 2002, vol. 1, pp. I-513-516.
[23] N. Otsu, "A threshold selection method from gray-level histograms," IEEE Trans. Syst., Man, Cybern., vol. 9, no. 1, pp. 62-66, 1979.
[24] M.A. Slamani, P.K. Varshney, M.G. Alford, and D. Ferris, "Use of A'SCAPE and fusion algorithms for the detection of concealed weapons," presented at the 8th Int. Conf. Signal Processing Applications and Technology (ICSPAT '97), San Diego, CA, 1997.
[25] M.A. Slamani, D.D. Weiner, and V. Vannicola, "A new statistical procedure for the segmentation of contiguous nonhomogeneous regions based on the Ozturk algorithm," in Proc. SPIE Conf. Statistical and Stochastic Methods for Image Processing, Denver, CO, Aug. 1996, vol. 2823, pp. 236-246.
[26] R.C. Gonzalez and P. Wintz, Digital Image Processing, 2nd ed. Reading, MA: Addison-Wesley, 1987.
[27] B.S. Morse, "Lectures in image processing and computer vision," Department of Computer Science, Brigham Young University, 1995 [Online]. Available: http://iul.cs.byu.edu/morse/550-F95/550.html
[28] M. Hu, "Visual pattern recognition by moment invariants," IRE Trans. Inform. Theory, vol. IT-8, pp. 179-187, 1962.
[29] L. Shen, R.M. Rangayyan, and J.E. Leo Desautels, "Application of shape analysis to mammographic calcifications," IEEE Trans. Med. Imag., vol. 13, no. 2, pp. 263-274, 1994.
[30] M.A. Slamani and D.D. Ferris, "Shape descriptors based detection of concealed weapons in millimeter-wave data," in Proc. SPIE AeroSense 2001, Optical Pattern Recognition XII, Orlando, FL, Apr. 2001, vol. 4387.
[31] G.L. Foresti, C. Micheloni, L. Snidaro, P. Remagnino, and T. Ellis, "Active video-based surveillance system," IEEE Signal Processing Mag., vol. 22, no. 2, pp. 25-37, Mar. 2005.

