
Target detection and recognition using two-dimensional isotropic and anisotropic wavelets

J.-P. Antoine and P. Vandergheynst
Institut de Physique Théorique, Université Catholique de Louvain

K. Bouyoucef and R. Murenzi†
CTSPS, Clark Atlanta University, Atlanta, GA 30314, USA

ABSTRACT

Automatic target detection and recognition (ATR) requires the ability to optimally extract the essential features of an object from (usually) cluttered environments. In this regard, efficient data representation domains are required in which the important target features are both compactly and clearly represented, enhancing ATR. Since both detection and identification are important, multidimensional data representations and analysis techniques, such as the continuous wavelet transform (CWT), are highly desirable. First we review some relevant properties of the 2D CWT. Then we propose a two-step algorithm based on the 2D CWT and discuss its adequacy for solving the ATR problem. Finally we apply the algorithm to various images.

1 INTRODUCTION

The purpose of automatic target detection and recognition (ATR) is the use of computer processing to detect and recognize signatures in sensor data, especially targets embedded in a cluttered environment, with the aim of neutralizing potential threats to military and civilian populations while minimizing the required resources and the risk to human life. Such targets can be tanks, planes, other vehicles, missiles, ground troops, etc. Clutter can be grass, trees, topographical features or atmospheric phenomena (e.g. clouds, smoke). In general the situation can be modeled using the following equation:

$$ s(\vec{x}) = n(\vec{x}) + \sum_{l=1}^{N} T_l(\vec{x}), \qquad (1.1) $$

where $n(\vec{x})$ represents an additive noise (clutter plus measurement noise), $T_l(\vec{x})$ are the targets to be detected and recognized, and $s(\vec{x})$ represents the accessible measured signal.

Automatic or assisted target detection and identification requires the ability to optimally extract the essential features of an object from (usually) cluttered environments. In this regard, efficient data representation domains are required in which the important target features are both compactly and clearly represented, enhancing ATR. Since both detection and identification are important, multidimensional data representations and analysis techniques are highly desirable. Typically, to provide detection and identification of difficult targets while maintaining full surveillance coverage, a coarse resolution sensor is required for detection, while a fine resolution sensor is necessary for recognition (identification). Techniques that allow multiscale processing could greatly ease the burden on a platform's processor while providing the flexibility to use only the resolution required at each level and, perhaps, allowing optimal processing for each of the required operations.

* Boursier IRSIA, Belgium
† Supported by ARPA (Advanced Research Project Agency), Grant Nr. MDA 972-93-1-0013

Many methods have already been used for the ATR problem: classical pattern matching, model-based schemes, dyadic wavelets, subband coding, and even some attempts using neural networks [1,2]. In this paper we introduce algorithms that are based on the multidimensional continuous wavelet transform (MCWT) [3-5].

There are many reasons to use the MCWT for ATR. The typical features to be extracted from the image of a target in a cluttered environment are:

• position of the target;
• spatial extent (size) of the target: scale;
• shape of the target: isotropic and anisotropic (orientation) symmetry.

To extract such target features it is preferable not to work in the image space, but rather to map the image into the feature space for each of the above characteristics: position, scale, orientation. The multidimensional continuous wavelet transform does precisely this. Multidimensional wavelets represent a different approach to discrimination and detection, which could offer a great improvement over traditional pattern matching methods as applied to sensor images. In addition, wavelet methods yield a consistent and efficient image reconstruction algorithm. As indicated above, unlike other methods, the multidimensional wavelet transform incorporates several parameters directly relevant to the essential features of an object. Projections of the transform can thus provide a useful set of image representations for fully automated discrimination [6,7].

An additional important aspect of the MCWT approach is its directional sensitivity, which leads to robust behavior in identifying the orientation of the target. Since the transform is scale dependent, one expects sensitivity to variations in sensor resolution. Interestingly, because scale is an explicit parameter in the transform, the MCWT can be used to determine the target size, or equivalently the target distance, in ISAR images, for example (this is also applicable to optical and FLIR images).

Before going into technical details, let us emphasize that in this paper we will use exclusively the continuous WT (CWT) in two dimensions, based on the two-dimensional Euclidean group with dilations, the so-called two-dimensional similitude group. This is a very efficient and flexible tool in image analysis, particularly for the detection and measurement of certain characteristic features of images. By contrast, the discrete or dyadic WT, based on the concept of multiresolution analysis, is often more appropriate for image synthesis, for instance in data compression (see Refs.3-5 for a survey of both approaches).

In this work we propose a two-stage algorithm for target detection and recognition based on the 2D CWT and give examples of its application

• to the extraction of simple geometrical objects (such as an `L' shape) embedded in additive Gaussian noise, and to IR images;
• to images from TRIM2¹, which consists of IR images containing different types of targets (tanks, planes, etc.) with different background clutter.

¹ Data set provided by NV&ESD (Night Vision and Electronic Sensor Directorate)
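As an illustration of the signal model (1.1) and of the first test case mentioned above, the following minimal sketch builds an `L'-shaped target and embeds it in additive Gaussian noise. The image size, the dimensions of the `L', the noise level and the function names `l_shape` and `measured_signal` are assumptions made for the example, not values taken from the paper.

```python
import numpy as np

def l_shape(size=64, corner=(20, 20), arm=24, thickness=6):
    """Binary image of an 'L'-shaped target T(x) (hypothetical dimensions)."""
    img = np.zeros((size, size))
    r, c = corner
    img[r:r + arm, c:c + thickness] = 1.0              # vertical arm
    img[r + arm - thickness:r + arm, c:c + arm] = 1.0  # horizontal arm
    return img

def measured_signal(targets, noise_std=0.5, seed=0):
    """s(x) = n(x) + sum_l T_l(x), with n(x) taken as white Gaussian noise."""
    rng = np.random.default_rng(seed)
    clean = np.sum(targets, axis=0)                    # sum over the targets T_l
    return clean + rng.normal(0.0, noise_std, size=clean.shape)

target = l_shape()
s = measured_signal([target])
```

White Gaussian noise is only the simplest stand-in for $n(\vec{x})$; real clutter is generally correlated and structured.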

2 MATHEMATICS OF THE 2D CWT

2.1 Elementary operations on images

By an image, we mean a two-dimensional signal of finite energy, represented by a complex-valued function defined on the real plane $\mathbb{R}^2$ and square integrable, i.e. a function $s \in L^2(\mathbb{R}^2, d^2\vec{x})$:

$$ \|s\|^2 = \int d^2\vec{x}\, |s(\vec{x})|^2 < \infty \qquad (2.1) $$

(sometimes it is useful to take $s$ integrable as well). In practice, a black and white image will be represented by a bounded non-negative function:

$$ 0 \le s(\vec{x}) \le M < \infty, \quad \forall\, \vec{x} \in \mathbb{R}^2, \qquad (2.2) $$

the discrete values of $s(\vec{x})$ corresponding to the level of gray of each pixel. However, it is useful to keep general functions $s$ as above.

The Fourier transform of the signal $s$ is defined, as usual, by

$$ \hat{s}(\vec{k}) = \frac{1}{2\pi} \int d^2\vec{x}\, e^{-i\vec{k}\cdot\vec{x}}\, s(\vec{x}), \qquad (2.3) $$

where $\vec{k} \in \mathbb{R}^2$ is the spatial frequency and $\vec{k}\cdot\vec{x} = k_1 x_1 + k_2 x_2$ is the Euclidean scalar product. Of course, the Fourier transform is unitary (Parseval relation):

$$ \hat{s} \in L^2(\mathbb{R}^2, d^2\vec{k}) \quad \text{and} \quad \|\hat{s}\| = \|s\|. \qquad (2.4) $$

All the operations we will apply to a signal $s$ are obtained by combining three elementary transformations of the plane, namely, translations, dilations and rotations. These transformations are represented by the following unitary operators in the space $L^2(\mathbb{R}^2, d^2\vec{x})$ of signals:

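(i) translation: $(T^{\vec{b}} s)(\vec{x}) = s(\vec{x} - \vec{b}), \quad \vec{b} \in \mathbb{R}^2$; (2.5)
(ii) dilation: $(D^a s)(\vec{x}) = a^{-1} s(a^{-1}\vec{x}), \quad a > 0$; (2.6)
(iii) rotation: $(R^\theta s)(\vec{x}) = s(r_{-\theta}(\vec{x})), \quad \theta \in [0, 2\pi)$; (2.7)

where $\vec{b} \in \mathbb{R}^2$ is the displacement parameter, $a > 0$ the dilation parameter, $\theta$ the rotation angle, and the rotation matrix $r_\theta \in SO(2)$ acts on $\vec{x} = (x, y)$ as usual:

$$ r_\theta(\vec{x}) = (x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta), \quad 0 \le \theta < 2\pi. \qquad (2.8) $$

Combining now the three operators, we define the unitary operator

$$ \Omega(a, \theta, \vec{b}) = T^{\vec{b}} D^a R^\theta, \qquad (2.9) $$

which acts on a given function $s$ as

$$ (\Omega(a, \theta, \vec{b})\, s)(\vec{x}) = s_{a,\theta,\vec{b}}(\vec{x}) \equiv a^{-1}\, s\big(a^{-1} r_{-\theta}(\vec{x} - \vec{b})\big), \qquad (2.10) $$

or, equivalently, in the space of Fourier transforms:

$$ \widehat{s_{a,\theta,\vec{b}}}(\vec{k}) = a\, e^{-i\vec{b}\cdot\vec{k}}\, \hat{s}\big(a\, r_{-\theta}(\vec{k})\big). \qquad (2.11) $$

If the function $s$ is rotation invariant, we simply omit the index $\theta$:

$$ s_{a,\vec{b}}(\vec{x}) = a^{-1}\, s\big(a^{-1}(\vec{x} - \vec{b})\big). \qquad (2.12) $$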
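The combined operator of Eq. (2.10) can be illustrated numerically. The sketch below is only a check under stated assumptions: the helper names `omega` and `norm_sq`, the Gaussian test function and the integration grid are all chosen for the example. It applies the transformation to a function given in closed form and verifies by a Riemann sum that the $L^2$ norm is preserved, as expected from a unitary operator.

```python
import numpy as np

def omega(s, a, theta, b):
    """Return the function x -> a^-1 s(a^-1 r_{-theta}(x - b)) of Eq. (2.10)."""
    c, sn = np.cos(theta), np.sin(theta)
    def s_atb(x, y):
        u, v = x - b[0], y - b[1]            # undo the translation
        xr = (c * u + sn * v) / a            # r_{-theta}, then divide by a
        yr = (-sn * u + c * v) / a
        return s(xr, yr) / a
    return s_atb

def norm_sq(f, half_width=20.0, n=401):
    """Riemann-sum approximation of ||f||^2 on a square grid (assumed large enough)."""
    x = np.linspace(-half_width, half_width, n)
    X, Y = np.meshgrid(x, x)
    dx = x[1] - x[0]
    return np.sum(np.abs(f(X, Y)) ** 2) * dx ** 2

def s(x, y):
    """Anisotropic Gaussian test signal (arbitrary choice)."""
    return np.exp(-0.5 * (x ** 2 / 4.0 + y ** 2))

s_t = omega(s, a=2.0, theta=0.7, b=(1.0, -2.0))
print(norm_sq(s), norm_sq(s_t))   # the two values should agree
```

The factor $a^{-1}$ in front is what compensates the change of area under dilation; without it the norm would scale with $a$.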

2.2 Wavelets and continuous wavelet transform

By definition, a wavelet is an admissible vector, that is, a complex-valued function $\psi \in L^2(\mathbb{R}^2, d^2\vec{x})$ satisfying the condition

$$ c_\psi \equiv (2\pi)^2 \int d^2\vec{k}\, \frac{|\hat{\psi}(\vec{k})|^2}{|\vec{k}|^2} < \infty, \qquad (2.13) $$

where $\hat{\psi}$ is the Fourier transform of $\psi$ and $|\vec{k}|^2 = \vec{k}\cdot\vec{k} = (k_1)^2 + (k_2)^2$. If $\psi$ is regular enough, the admissibility condition (2.13) simply means that the wavelet must be of zero mean:

$$ \hat{\psi}(\vec{0}) = 0 \iff \int d^2\vec{x}\, \psi(\vec{x}) = 0. \qquad (2.14) $$
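As a brief worked step (a sketch, assuming that $\psi$ is integrable and $\hat{\psi}$ is continuous at the origin, which is one way to read "regular enough"), writing (2.13) in polar coordinates $\vec{k} = (k\cos\varphi, k\sin\varphi)$ shows why a nonzero mean is excluded:

$$ c_\psi = (2\pi)^2 \int_0^{2\pi} d\varphi \int_0^\infty \frac{dk}{k}\, |\hat{\psi}(k, \varphi)|^2 . $$

If $\hat{\psi}(\vec{0}) \neq 0$, the integrand behaves like $|\hat{\psi}(\vec{0})|^2 / k$ near $k = 0$ and the radial integral diverges logarithmically, so $c_\psi < \infty$ forces $\hat{\psi}(\vec{0}) = 0$. With the convention (2.3), $\hat{\psi}(\vec{0}) = (2\pi)^{-1} \int d^2\vec{x}\, \psi(\vec{x})$, which gives the zero-mean form of (2.14).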

Clearly the three unitary operators $T^{\vec{b}}, D^a, R^\theta$ preserve the admissibility condition, and therefore so does $\Omega(a, \theta, \vec{b})$. Hence any function $\psi_{a,\theta,\vec{b}} = \Omega(a, \theta, \vec{b})\psi$ obtained from a wavelet $\psi$ by translation, rotation or dilation is again a wavelet. Thus the given wavelet $\psi$ generates the whole family $\{\psi_{a,\theta,\vec{b}}\}$, indexed by the elements $a > 0$, $\theta \in [0, 2\pi)$, $\vec{b} \in \mathbb{R}^2$.

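Let now $s \in L^2(\mathbb{R}^2, d^2\vec{x})$ be an image. Its continuous wavelet transform (with respect to the fixed wavelet $\psi$), $S \equiv W_\psi s$, is the scalar product of $s$ with the transformed wavelet $\psi_{a,\theta,\vec{b}}$, considered as a function of $(a, \theta, \vec{b})$:

$$ \begin{aligned} S(a, \theta, \vec{b}) &= \langle \psi_{a,\theta,\vec{b}} \,|\, s \rangle & (2.15) \\ &= a^{-1} \int d^2\vec{x}\; \overline{\psi\big(a^{-1} r_{-\theta}(\vec{x} - \vec{b})\big)}\; s(\vec{x}) & (2.16) \\ &= a \int d^2\vec{k}\; e^{i\vec{b}\cdot\vec{k}}\; \overline{\hat{\psi}\big(a\, r_{-\theta}(\vec{k})\big)}\; \hat{s}(\vec{k}). & (2.17) \end{aligned} $$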
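The Fourier form (2.17) lends itself to an FFT evaluation at fixed $(a, \theta)$. The following is a minimal sketch, not the authors' implementation: it assumes unit pixel spacing and periodic boundary conditions, uses the isotropic Mexican hat, whose Fourier transform under the convention (2.3) is $|\vec{k}|^2 e^{-|\vec{k}|^2/2}$, and the names `mexican_hat_hat` and `cwt2d` are hypothetical.

```python
import numpy as np

def mexican_hat_hat(kx, ky):
    """Fourier transform of the isotropic Mexican hat: |k|^2 exp(-|k|^2 / 2)."""
    k2 = kx ** 2 + ky ** 2
    return k2 * np.exp(-0.5 * k2)

def cwt2d(image, psi_hat, a, theta=0.0):
    """S(a, theta, b) at every pixel position b, from Eq. (2.17) and the FFT."""
    ny, nx = image.shape
    # angular frequencies in numpy's FFT layout (unit pixel spacing assumed)
    kx, ky = np.meshgrid(2 * np.pi * np.fft.fftfreq(nx),
                         2 * np.pi * np.fft.fftfreq(ny))
    c, s = np.cos(theta), np.sin(theta)
    kxr = a * (c * kx + s * ky)       # first component of a * r_{-theta}(k)
    kyr = a * (-s * kx + c * ky)      # second component
    s_hat = np.fft.fft2(image)
    # prefactor: a from (2.17), 2*pi from discretizing with the convention (2.3)
    return 2 * np.pi * a * np.fft.ifft2(np.conj(psi_hat(kxr, kyr)) * s_hat)

# Response to a single bright pixel: a copy of the dilated wavelet centred on it
delta = np.zeros((64, 64))
delta[10, 20] = 1.0
S = cwt2d(delta, mexican_hat_hat, a=2.0)
```

Because the product in Fourier space is a circular convolution, shifting the image cyclically shifts the transform by the same amount, which is the discrete counterpart of translation covariance.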

The wavelet $\psi$ may be required to have a few vanishing moments, as in the 1D case. This condition determines the capacity of the WT to detect singularities. Indeed, if $\psi$ has $n$ vanishing moments,

$$ \int d^2\vec{x}\; x^\alpha y^\beta\, \psi(\vec{x}) = 0, \quad 1 \le \alpha + \beta \le n, \qquad (2.18) $$

then the WT $W_\psi$ is blind to polynomials of degree up to $n$. Equivalently, $W_\psi$ detects singularities in the $(n+1)$th derivative of the signal. Thus if the signal is rough, a fortiori if it is a measure (as in the analysis of fractals), it is sufficient to take a wavelet with no vanishing moment, i.e. no condition has to be imposed beyond (2.14) [8].

The main properties of the (continuous) WT $W_\psi : s \mapsto S$ may be summarized as follows [3,4,6,7,9]:

(1) $W_\psi$ is linear in the signal $s$, contrary, for instance, to the Wigner-Ville transform, which is bilinear;

(2) $W_\psi$ is covariant under translations, dilations and rotations, which means that the correspondence $W_\psi : s(\vec{x}) \mapsto S(a, \theta, \vec{b})$ implies the following ones:

$$ \begin{aligned} s(\vec{x} - \vec{b}_o) &\mapsto S(a, \theta, \vec{b} - \vec{b}_o) \\ a_o^{-1}\, s(a_o^{-1}\vec{x}) &\mapsto S(a_o^{-1} a, \theta, a_o^{-1}\vec{b}) \\ s(r_{-\theta_o}(\vec{x})) &\mapsto S(a, \theta - \theta_o, r_{-\theta_o}(\vec{b})) \end{aligned} \qquad (2.19) $$

(3) $W_\psi$ conserves energy:

$$ c_\psi^{-1} \iiint \frac{da}{a^3}\, d\theta\, d^2\vec{b}\; |S(a, \theta, \vec{b})|^2 = \int d^2\vec{x}\; |s(\vec{x})|^2, \qquad (2.20) $$

i.e. it is an isometry from the space of signals into the space of transforms.

(4) As a consequence, $W_\psi$ is invertible on its range and the inverse transformation is simply the adjoint of $W_\psi$. Thus one has an exact reconstruction formula:

$$ s(\vec{x}) = c_\psi^{-1} \iiint \frac{da}{a^3}\, d\theta\, d^2\vec{b}\; \psi_{a,\theta,\vec{b}}(\vec{x})\, S(a, \theta, \vec{b}). \qquad (2.21) $$

In other words, the 2D wavelet transform, like its 1D counterpart, provides a decomposition of the signal in terms of the analyzing wavelets $\psi_{a,\theta,\vec{b}}$, with coefficients $S(a, \theta, \vec{b})$. Remarkably enough, the wavelet transform is uniquely determined by the three conditions of linearity, covariance and energy conservation, plus some continuity [6,7].

2.3 Implementation and interpretation: the two basic representations

Images can be analyzed and reconstructed with the 2D CWT just described. In practice, however, one immediately faces a problem of computation and visualization of the CWT. Indeed $S(a, \theta, \vec{b})$ is a function of four variables: two position variables $b_x, b_y$, the scale parameter $a$ and the rotation (or anisotropy) angle $\theta$. In the 1D case, $a^{-1}$ defines the frequency scale, thus the full parameter space of the 1D WT, the time-scale half plane, is in fact a phase space [10]. Exactly the same situation prevails in 2D: the pair $(a^{-1}, \theta)$ plays the role of spatial frequency, expressed in polar coordinates, and so the full four-dimensional parameter space of the 2D WT may be interpreted as a phase space (see Ref.11 for more details).

As a consequence, there are two natural ways of presenting the CWT, using two-dimensional sections of the parameter space [9,12]:

(i) the position representation, where $a$ and $\theta$ are fixed and the CWT is considered as a function of position $\vec{b}$ alone (this amounts to taking a set of snapshots, one for each value of $(a, \theta)$, which may then be collected together into a movie);

(ii) the scale-angle representation: for fixed $\vec{b}$, the CWT is considered as a function of scale and angle $(a, \theta)$, i.e. of spatial frequency; in other words, one looks at the full CWT as through a keyhole located at $\vec{b}$, and observes all scales and all directions at once.

The position representation is the standard one, and it is useful for the general purposes of image processing: detection of position, shape and contours of objects; image filtering by resynthesis after elimination of unwanted features (for instance, noise). The scale-angle representation will be particularly interesting whenever scaling behavior (as in fractals) or angular selection is important, in particular when directional wavelets are used. In fact, both representations are needed for a full understanding of the properties of the CWT in all four variables.

For the numerical evaluation, discretization of the WT in either representation and systematic use of the FFT algorithm lead to a numerical complexity of $3 N_1 N_2 \log(N_1 N_2)$, where $N_1, N_2$ denote the number of sampling points in the variables $(b_x, b_y)$ or $(a, \theta)$. We refer to Ref.8 for a more detailed discussion.
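The scale-angle representation can be sketched as follows: the position $\vec{b}$ is kept fixed and Eq. (2.17) is evaluated on a grid of $(a, \theta)$. This is a direct, non-optimized illustration rather than the FFT scheme of Ref.8. It assumes unit pixel spacing and periodic boundaries, uses the Fourier transform $e^{-|\vec{k} - \vec{k}_o|^2/2}$ of the isotropic Gabor function with $\vec{k}_o = (0, 5.6)$, and the names `morlet_hat` and `scale_angle` are hypothetical.

```python
import numpy as np

def morlet_hat(kx, ky, k0=5.6):
    """Fourier transform of exp(i k0 y) exp(-|x|^2 / 2), correction term dropped."""
    return np.exp(-0.5 * (kx ** 2 + (ky - k0) ** 2))

def scale_angle(image, psi_hat, b, scales, angles):
    """S(a, theta, b) on a grid of (a, theta) at one fixed position b = (bx, by)."""
    ny, nx = image.shape
    kx, ky = np.meshgrid(2 * np.pi * np.fft.fftfreq(nx),
                         2 * np.pi * np.fft.fftfreq(ny))
    s_hat = np.fft.fft2(image)
    phase = np.exp(1j * (kx * b[0] + ky * b[1]))       # e^{i b.k}
    out = np.empty((len(scales), len(angles)), dtype=complex)
    for i, a in enumerate(scales):
        for j, th in enumerate(angles):
            c, s = np.cos(th), np.sin(th)
            # conjugate of psi_hat(a r_{-theta}(k))
            w = np.conj(psi_hat(a * (c * kx + s * ky), a * (-s * kx + c * ky)))
            out[i, j] = 2 * np.pi * a * np.sum(phase * w * s_hat) / (nx * ny)
    return out

# A grating varying along y with 8 periods over 64 pixels
n = 64
grating = np.cos(2 * np.pi * 8 * np.arange(n) / n)[:, None] * np.ones((1, n))
scales = [3.5, 7.0, 14.0]
angles = [0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]
M = np.abs(scale_angle(grating, morlet_hat, (32, 32), scales, angles))
```

For this grating the modulus is largest where $a$ times the grating frequency is close to $k_0$ and the rotated wave vector is aligned with the grating, which is how scale and orientation are read off.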

Whichever representation we use, we end up with a function of two variables, either in cartesian coordinates $\vec{b}$, or in polar coordinates $(a^{-1}, \theta)$. In both cases, the function will be real if the wavelet is real (and then it is advantageous to plot the function itself, since the sign of its extrema contains useful information [9]). Similarly, the transform is complex whenever the wavelet is. In that case, it will often be represented through its modulus and phase. It turns out that the phase is particularly instructive, as was already the case in 1D [9].

2.4 Choice of the analyzing wavelet

Many wavelets have been proposed, often designed for specific problems. Let us quote, for instance: a difference of Gaussians (DOG wavelet), the Morlet wavelet, the Mexican hat, the conical wavelets, and multidirectional or `fan' filter wavelets. We present here only the most popular ones: the Mexican hat and the Morlet wavelet. Both of them have been studied and calibrated systematically [9,11], and they have been used for many problems in image processing and fractal analysis [3-5].

• The 2D Mexican hat or Marr wavelet

In its isotropic version, this is simply the Laplacian of a Gaussian:

$$ \psi_H(\vec{x}) = (2 - |\vec{x}|^2)\, \exp(-\tfrac{1}{2}|\vec{x}|^2). \qquad (2.22) $$

This is a real, rotation invariant wavelet, with vanishing moments of order 0 and 1. There exists also an anisotropic version, obtained by replacing $\vec{x}$ in (2.22) by $A\vec{x}$, where $A = \mathrm{diag}[\epsilon^{-1/2}, 1]$, $\epsilon \ge 1$, is a $2 \times 2$ anisotropy matrix. However, this wavelet (shown in Figure 1) is not directional, because it still acts as a second order operator and detects singularities in all directions. Hence the Mexican hat will be efficient for a fine pointwise analysis, but not for detecting directions.

Figure 1: The anisotropic Mexican hat $\psi_H$ with $\epsilon = 5$ ($a = 0.25$, $\theta = 45°$): (a) in position space; (b) in spatial frequency space.
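The zero-mean property (2.14) and the vanishing first moment of the Mexican hat can be checked numerically. The sketch below is an illustration under assumptions: the function name `mexican_hat`, the grid extent and the sampling step are chosen for the example, and the integrals are approximated by Riemann sums.

```python
import numpy as np

def mexican_hat(x, y, eps=1.0):
    """Mexican hat (2.22) evaluated at A x, with A = diag(eps**-0.5, 1)."""
    r2 = x ** 2 / eps + y ** 2
    return (2.0 - r2) * np.exp(-0.5 * r2)

# Square grid wide enough for the eps = 5 wavelet to have decayed (assumed extent)
x = np.linspace(-16.0, 16.0, 641)
X, Y = np.meshgrid(x, x)
dx = x[1] - x[0]

mean_iso = mexican_hat(X, Y).sum() * dx ** 2             # integral of psi_H
mean_aniso = mexican_hat(X, Y, eps=5.0).sum() * dx ** 2  # same, anisotropic version
first_moment = (X * mexican_hat(X, Y)).sum() * dx ** 2   # integral of x * psi_H
print(mean_iso, mean_aniso, first_moment)                # all close to zero
```

The anisotropic version stays admissible because replacing $\vec{x}$ by $A\vec{x}$ only rescales the integral by a constant factor.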

• The 2D Morlet wavelet

This is the prototype of an oriented wavelet:

$$ \psi_{Mor}(\vec{x}) = \exp(i\vec{k}_o\cdot\vec{x})\, \exp(-\tfrac{1}{2}|A\vec{x}|^2) + \text{corr. term}. \qquad (2.23) $$

The parameter $\vec{k}_o$ is the wave vector, and $A$ the anisotropy matrix as above. The correction term enforces the admissibility condition $\hat{\psi}_{Mor}(\vec{0}) = 0$, but it is numerically negligible for $|\vec{k}_o| \ge 5.6$ and will usually be dropped. In that case, putting $\epsilon = 1$, we obtain the function

$$ \psi_G(\vec{x}) = \exp(i\vec{k}_o\cdot\vec{x})\, \exp(-\tfrac{1}{2}|\vec{x}|^2), \qquad (2.24) $$

well known in the image processing literature under the name of Gabor function [13]. The modulus of the wavelet $\psi_G$ is a Gaussian, whereas its phase is constant along the direction orthogonal to $\vec{k}_o$. Thus the wavelet $\psi_G$ smoothes the signal in all directions, but detects the sharp transitions in the direction perpendicular to $\vec{k}_o$. The angular selectivity increases with $|\vec{k}_o|$, and even more so if, in addition, one introduces some anisotropy by taking $\epsilon > 1$. Then the modulus becomes a Gaussian elongated in the $x$ direction, i.e. its `footprint' is an ellipse with large axis on the $x$-axis and ratio of axes equal to $\sqrt{\epsilon}$. Clearly this wavelet will detect preferentially singularities (edges) in the $x$ direction, and its efficiency increases with $\epsilon$. The best selectivity will be obtained by combining the two effects, i.e. by taking $\vec{k}_o$ perpendicular to the large axis of the ellipse, thus $\vec{k}_o = (0, k_o)$. The resulting complex wavelet, denoted $\psi_M$, reads:

$$ \psi_M(x, y) = e^{i k_o y}\, \exp\!\big(-\tfrac{1}{2}(x^2/\epsilon + y^2)\big). \qquad (2.25) $$

It is shown in Figure 2, for $k_o = 5.6$ and $\epsilon = 1$.

Figure 2: The Morlet wavelet $\psi_M$ with $\vec{k}_o = (0, 5.6)$, $\epsilon = 1$ ($a = 1$, $\theta = 0$): (top) real and imaginary part; (bottom) phase and modulus.
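To see how small the dropped correction term is, one can evaluate the mean of the uncorrected wavelet. For $\epsilon = 1$ and the convention (2.3), the value of its Fourier transform at the origin is $e^{-k_o^2/2}$, about $1.5 \times 10^{-7}$ for $k_o = 5.6$. The sketch below compares this with a Riemann-sum estimate; the function name `morlet` and the grid are assumptions made for the example.

```python
import numpy as np

def morlet(x, y, k0=5.6, eps=1.0):
    """Morlet wavelet (2.25) with wave vector (0, k0), correction term dropped."""
    return np.exp(1j * k0 * y) * np.exp(-0.5 * (x ** 2 / eps + y ** 2))

x = np.linspace(-12.0, 12.0, 481)
X, Y = np.meshgrid(x, x)
dx = x[1] - x[0]

psi = morlet(X, Y)
# numerical estimate of psi_hat(0) = (1 / 2 pi) * integral of psi
psi_hat_zero = psi.sum() * dx ** 2 / (2 * np.pi)
predicted = np.exp(-0.5 * 5.6 ** 2)
print(abs(psi_hat_zero), predicted)
```

The residual mean is many orders of magnitude below the peak modulus of the wavelet, which is 1 at the origin, so the uncorrected function is admissible to a very good approximation.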

3 ATR STRATEGY USING THE CWT IN 2 DIMENSIONS

3.1 Description of the strategy

Suppose we have an image containing a certain number of targets, embedded in a cluttered environment. How are we going to detect and identify the various targets in an automated way? At this stage of our investigation we propose the following two-step algorithm:

1. Compute the wavelet transform in the position representation at all relevant scales $a = a_j$ and angles $\theta = \theta_j$ (see Refs.11-12 for a complete discussion). This projection defines a collection of snapshots with varying $a$ and $\theta$, which can be visualized as a video sequence and used for image analysis. To achieve the detection, take the image obtained for each fixed $a = a_j$ and $\theta = \theta_j$, threshold the coefficients of the transform expressed in the position variables $(b_x, b_y)$, and then add all the thresholded images together. The thresholding is performed in a dynamical way, becoming more severe for smaller $a$. Its effect is to suppress the clutter information while preserving the target information. Thus, in the summation the target information is reinforced and becomes visually enhanced. Next, compute the centroids $\vec{b} = \vec{b}_i$, $i = 1, \ldots, L$, in the resulting composite image. These centroids correspond to the positions of potential targets. The thresholds are adjusted so as to allow for the possibility of false alarms. Finally, eliminate the ambiguous targets.

2. At each remaining centroid $\vec{b} = \vec{b}_k$, $k = 1, \ldots, K$ ($K \le L$), compute the wavelet transform of the composite image in the scale-angle representation. If the centroid $\vec{b}_k$ corresponds to a genuine target $T_k$, the corresponding wavelet transform will exhibit a unique maximum $(a_k, \theta_k)$, which gives the size and the orientation of the target $T_k$. Moreover, the signature of each target in the scale-angle representation allows the discrimination between different targets.

[Flowchart: Input data -> detection of potential targets (position representation, $\vec{b}_i$, $i = 1 \ldots L$; false alarms allowed) -> discrimination of non-obvious targets ($\vec{b}_k$) -> target recognition (scale-angle representation, $k = 1 \ldots K$)]
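Step 1 of the strategy (snapshots, scale-dependent thresholding, summation, centroids) can be sketched as below. This is not the authors' implementation. It assumes an isotropic Mexican hat, so there is no angle loop, unit pixel spacing and periodic boundaries. The thresholding rule (a fraction of each snapshot's maximum, stricter at small $a$) is a hypothetical choice, as are the function names and the two-blob test image.

```python
import numpy as np
from collections import deque

def mexican_hat_hat(kx, ky):
    k2 = kx ** 2 + ky ** 2
    return k2 * np.exp(-0.5 * k2)

def cwt_position(image, a):
    """Position representation S(a, b) for an isotropic wavelet (FFT form of Eq. 2.17)."""
    ny, nx = image.shape
    kx, ky = np.meshgrid(2 * np.pi * np.fft.fftfreq(nx),
                         2 * np.pi * np.fft.fftfreq(ny))
    filt = np.conj(mexican_hat_hat(a * kx, a * ky))
    return (2 * np.pi * a * np.fft.ifft2(filt * np.fft.fft2(image))).real

def centroids(mask):
    """Centroids (row, col) of the 4-connected regions of a boolean mask."""
    seen = np.zeros(mask.shape, dtype=bool)
    out = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        queue, pixels = deque([(r0, c0)]), []
        seen[r0, c0] = True
        while queue:                                   # breadth-first flood fill
            r, c = queue.popleft()
            pixels.append((r, c))
            for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= rr < mask.shape[0] and 0 <= cc < mask.shape[1]
                        and mask[rr, cc] and not seen[rr, cc]):
                    seen[rr, cc] = True
                    queue.append((rr, cc))
        out.append(tuple(np.mean(pixels, axis=0)))
    return out

def detect(image, scales, frac=0.5):
    """Threshold each snapshot, add the results, return composite and centroids."""
    composite = np.zeros(image.shape)
    for a in scales:
        snap = cwt_position(image, a)
        # hypothetical dynamical threshold: stricter for smaller a
        level = min(0.95, frac * (1.0 + 1.0 / a)) * snap.max()
        composite += np.where(snap >= level, snap, 0.0)
    return composite, centroids(composite > 0)

# Two Gaussian blobs in weak white noise (synthetic test image)
yy, xx = np.mgrid[0:64, 0:64]
blob = lambda r, c: np.exp(-((yy - r) ** 2 + (xx - c) ** 2) / (2 * 3.0 ** 2))
img = blob(16, 20) + blob(44, 40) + np.random.default_rng(1).normal(0, 0.05, (64, 64))
composite, cents = detect(img, scales=[2.0, 3.0, 4.0])
```

Step 2 would then evaluate the transform on a grid of $(a, \theta)$ at each retained centroid and look for its maximum.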
