Multi-Target Discrimination with Linear Signal Decomposition/Direction of Arrival Based ATR Barry K. Hill, David Cyganski, and Richard F. Vaz Machine Vision Laboratory Electrical and Computer Engineering Department Worcester Polytechnic Institute Worcester, Massachusetts 01609
ABSTRACT
Large computational complexity arises in model-based ATR systems because an object's image is typically a function of several degrees of freedom. Most model-based ATR systems overcome this dependency by incorporating an exhaustive library of image views. This approach, however, requires enormous storage and extensive search processing. Some ATR systems reduce the size of the library by forming composite averaged images at the expense of reducing the captured pose-specific information, usually resulting in a decrease in performance. The Linear Signal Decomposition/Direction of Arrival (LSD/DOA) system, on the other hand, forms an essential-information object model which incorporates pose-specific data into a much smaller data set, thus reducing the size of the image library with less loss of discrimination and pose estimation performance. The LSD/DOA system consists of two independent components: a computationally expensive off-line component which forms the object model and a computationally inexpensive on-line object recognition component. The focus of this paper is the development of the multi-object Generalized Likelihood Ratio Test (GLRT) as applied to the LSD/DOA ATR system. Results are presented from the testing of the LSD/DOA multi-object ATR system for SAR imagery using four targets, represented over a wide range of viewing angles.
Keywords: automatic target recognition, detection, SAR, DOA, target classification.
1 Introduction
One of the major goals of object recognition systems is to quickly and accurately distinguish one target from another. There is often a trade-off between the accuracy and the speed with which a target can be classified. The model-based automatic target recognition (ATR) technique developed by the authors incorporates a non-linear processing method known as Linear Signal Decomposition/Direction of Arrival (LSD/DOA), which will be shown to offer accurate target classification with modest processing and model storage requirements.
This work was sponsored by ARPA and the ARO through grant no. DAAH 04-93-G-0237.
1.1 LSD/DOA ATR Algorithm
The fundamental problem in model-based ATR systems is that the targets to be detected appear different, depending on the location and orientation of the target with respect to the sensing device(s). Consequently, the target models must account for all views and orientations of the target. Typically, this is accomplished by using either an exhaustive library of target views or a set of parameters which are invariant to target orientation and view. The first method, which employs an exhaustive library, requires both the storage space to hold the library and the processing time to search through it. In most cases this is not amenable to real-time applications, due to time and space constraints. Some methods have been developed which reduce the computational requirements of this approach, such as averaging or blurring over subsets of views [5], on-the-fly template generation [11], or ordered search strategies [1], but these still rely on the fundamental search-based paradigm.
The second method, which exploits pose invariant parameters, overcomes the computational and storage requirements associated with the exhaustive library. This method, however, is only effective to the extent to which the invariant properties of the imaged targets are sufficiently discriminatory of the targets and non-targets. Some implementations are based on geometric invariance properties [8], as well as generalizations of the matched spatial filter concept [9], which include synthetic discriminant function (SDF) filters [2]. The pose invariant techniques are sometimes limited in the number of views that can be simultaneously associated with a single target model, since a number of dissimilar views of a target taken together would produce a relatively featureless model.
A recently developed technique known as the Linear Signal Decomposition/Direction of Arrival (LSD/DOA) approach [4] also provides a means for model-based ATR of targets which may be viewed from unknown perspectives. Similar to the invariance-based techniques, the LSD/DOA approach does not require large target models, and the recognition process is direct rather than search-based. However, the models used in the LSD/DOA technique do not represent information which is invariant to target pose, but rather encode and exploit the relationship between target pose and signature so that the detection process simultaneously provides both pose estimates and target identity information. That is, these compact models incorporate the variation in target signature as a function of target pose; they exploit the information which is variant under changes in target orientation and position. This is quite distinct from the approaches described above, which rely on models consisting of invariant information.
The LSD/DOA algorithm effects a partitioning of the ATR problem into two stages: model construction and pose estimation/recognition. The model construction process involves solution of a large (usually overdetermined) set of equations to determine the elements of a particular basis for the image suite. This Reciprocal Basis Set (RBS) is developed such that the pose estimation/recognition stage can be performed directly and efficiently, with no searching or iteration. That is, the computational burden associated with the ATR problem is largely shifted to the model-building process in this algorithm [4].
Generation of the RBS target models involves a great reduction in data, as a complete suite of object views is reduced to a small set of RBS elements. The number of basis elements used, and hence the size of the target model, can be chosen according to cost/performance considerations, but is in any event very modest compared to the data from which the model is derived. The basis elements are generated such that linear projection of target images onto the basis elements will result in a set of inner product measures which simultaneously provide a sufficient statistic for target matching and represent the data from which target pose parameter estimates can be determined. This is due to the fact that the RBS elements are chosen to encode the target pose into these inner product results, which are called Synthetic Wavefront Samples (SWS). These are so named because, for a given target image, the SWS will be samples of a multidimensional complex exponential wave, the directional cosines of which reveal the pose parameters of the imaged target. A Direction of Arrival (DOA) algorithm then uses the SWS to solve for the target pose parameter estimates. If more RBS functions are used, then this larger target model allows generation of more SWS, which in turn can provide better pose estimates and more reliable target detection. The reader is referred to Cyganski [4] for mathematical and implementational details of the LSD/DOA algorithm; a block diagram depicting the algorithm is given in Figure 1.

Figure 1: LSD/DOA Block Diagram. (Block labels. Off-Line Process: Complete Object Model f(x, y; θ), Reciprocal Basis Set Object Model Generator. On-Line Process: Selected Object Image F(x, y), Direct Linear Signal Projection, D.O.A., output θ.)
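The statement that the SWS are samples of a complex exponential whose "direction" reveals the pose can be illustrated with a toy, single-angle case. The sketch below is only an illustration under stated assumptions: the SWS are modeled as exp(j k θ) for k = 0, 1, 2, ..., the function names `synth_sws` and `estimate_pose` are hypothetical, and the simple phase-difference estimator stands in for the DOA stage (it is not the estimator used by the authors).

```python
import cmath
import math
import random

def synth_sws(theta, n_samples, noise_std=0.0, seed=0):
    """Hypothetical SWS: samples exp(j*k*theta), k = 0..n_samples-1,
    optionally perturbed by complex Gaussian noise (toy noise model)."""
    rng = random.Random(seed)
    return [cmath.exp(1j * k * theta)
            + complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
            for k in range(n_samples)]

def estimate_pose(sws):
    """Estimate theta as the phase of the summed lag-one products,
    i.e. the average phase advance between adjacent samples."""
    acc = sum(b * a.conjugate() for a, b in zip(sws, sws[1:]))
    return cmath.phase(acc) % (2.0 * math.pi)

theta_true = math.radians(75.0)
sws = synth_sws(theta_true, n_samples=8, noise_std=0.05)
theta_hat = estimate_pose(sws)
```

No search over stored views is involved: the estimate comes from a fixed number of multiplications and one phase computation, which is the property the text attributes to the on-line stage.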
Previous work [3, 10, 7] has focused on development and enhancement of the LSD/DOA technique itself and on the accuracy with which a particular object's pose can be established under varying conditions. In this paper, we explore the effectiveness of this technique when applied to the problem of target classification. We present the means for an LSD/DOA-based ATR system which can classify predetected targets into some number of possible target classes. This classification involves a decision process which is based upon a generalized likelihood ratio test (GLRT). Results of testing with four targets are given.
2 Development of Multi-Object GLRT
ATR systems generally perform the tasks of object detection and object classification/recognition separately. The detection process, also known as pre-screening, is often based on simple energy-detection techniques. Candidate image regions thus "pre-screened" must then be subjected to a process which can classify objects according to the categories of interest for the application (target classes or particular target types) and subsequently (or simultaneously) provide any other required information, such as precise target position and pose. The classification of a candidate target into one of M classes constitutes an M-ary composite hypothesis decision system. This decision process will be explored for the LSD/DOA system, first for the case of two hypotheses and then for M hypotheses.
2.1 Two Hypothesis Decision
In the case of the LSD/DOA system, the signals to be detected are represented by the SWS, the measurements made on the images which encode both object identity and pose. As stated earlier, these SWS are generated from a linear projection of the target image onto the basis elements (RBS). Given statistics which describe the noise effects on the SWS measurements, a decision rule in the form of a generalized likelihood ratio test (GLRT) can be developed as follows:

$$ g(\underline{R}) = \frac{\max_{\theta_1} \; p_{\underline{R}_1 \mid \theta_1}(\underline{R}_1 \mid \theta_1)}{\max_{\theta_0} \; p_{\underline{R}_0 \mid \theta_0}(\underline{R}_0 \mid \theta_0)} \gtrless \lambda, $$

where $\underline{R}_1$ represents the vector of SWS for the given target assuming $H_1$ is correct, and likewise $\underline{R}_0$ are the SWS assuming $H_0$ is true. Similarly, $\theta_1$ is the pose assuming $H_1$ is present and $\theta_0$ is the pose assuming $H_0$ is present.
The global maximization over target poses that appears in the above expression brings about the problems of storage and search processing in traditional ATR systems discussed in the introduction. In the case of the LSD/DOA system, however, an immediate result of the processing of the image is a pose estimate for the given target hypothesis under test. Hence, an approximate optimizing value of $\theta_0$ assuming $H_0$ is true is the pose estimate determined by the LSD/DOA algorithm, $\hat{\theta}_0$, and similarly for $\theta_1$. To simplify the expressions that follow we will assume that the targets are equally probable with equal associated cost functions, which reduces $\lambda$ to one. Our final GLRT equation is then

$$ g(\underline{R}) = \frac{p_{\underline{R}_1 \mid \hat{\theta}_1}(\underline{R}_1 \mid \hat{\theta}_1)}{p_{\underline{R}_0 \mid \hat{\theta}_0}(\underline{R}_0 \mid \hat{\theta}_0)} \gtrless 1. $$
2.1.1 Noise Statistics of SWS
In general, the noise corrupting the SWS is due to background clutter and target speckle propagating through the system. The noise perturbations of the various SWS vector elements are generally not independent and not stationary. Furthermore, given the fashion in which the clutter and speckle induced errors are introduced into the system, a model for the SWS noise is not easily derived. However, under the simplifying assumption of Gaussian i.i.d. image noise, complete statistics for noise in the SWS image measurements can be derived.

Throughout the following we will employ the following notational scheme. All vector (denoted by an underline) and matrix (denoted by a double underline) quantities will be assumed to represent complex quantities. The real and imaginary components have been separated and stacked in a partitioned vector or arranged in quadrants in a partitioned matrix so as to represent complex multiplication with real valued vectors and matrices. This notation is essential for writing expressions for Gaussian distributions of general complex valued random variables.

Consider the image to be tested, $\underline{f}$, to consist of a pose-specific image $\underline{t}$ and a noise component $\underline{n}$,

$$ \underline{f} = \underline{t} + \underline{n}, $$

where $\underline{n}$ is an independent, identically distributed (i.i.d.) real, zero mean Gaussian distributed random variable. The probability density function of $\underline{n}$ is therefore

$$ p_{\underline{n}}(\underline{n}) = G\!\left(\underline{0},\; \sigma_n^2 \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}\right). $$

From this the probability density function of our test image, $\underline{f}$, is

$$ p_{\underline{f}}(\underline{f}) = p_{\underline{n}}(\underline{t} + \underline{n}) = G\!\left(\underline{t},\; \sigma_n^2 \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}\right). $$
The SWS for a given image are generated through the complex inner product of the image and the RBS [4], which can compactly be represented by the following linear matrix equation [7]:

$$ \underline{R} = \underline{\underline{A}}\, \underline{f}, $$

where $\underline{\underline{A}}$ is a linear combination of the reciprocal basis vectors associated with an RBS. The reader is referred to King [7] for a complete description.
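Given this linear equation for the LSD/DOA system and given a Gaussian input image, the following statistics of the SWS can easily be derived:

$$ p_{\underline{R}}(\underline{R}) = G\!\left(\underline{\underline{A}}\, \underline{t},\; \sigma_n^2\, \underline{\underline{A}} \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} \underline{\underline{A}}^T\right). $$

From the definition of the LSD/DOA system, $\underline{\underline{A}}\, \underline{t}$ produces samples of a complex exponential. Letting

$$ \underline{\underline{Q}} = \sigma_n^2\, \underline{\underline{A}} \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} \underline{\underline{A}}^T, $$

the distribution for the SWS reduces to

$$ p_{\underline{R}}(\underline{R}) = G\!\left(\exp(jN\theta),\; \underline{\underline{Q}}\right). $$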
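The stacked real/imaginary notation and the SWS covariance can be made concrete with a small numerical sketch. Everything below is illustrative: the 2-by-3 complex matrix `A`, the 3-pixel image `f`, and the noise variance are made-up values, and the helper names are hypothetical. The sketch assumes the usual real block form [[Re, -Im], [Im, Re]] for a complex matrix, which is one way to realize the partitioned representation described in the text.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def stack_complex_matrix(a):
    """Real block form [[Re, -Im], [Im, Re]] of a complex matrix."""
    re = [[z.real for z in row] for row in a]
    im = [[z.imag for z in row] for row in a]
    top = [r + [-x for x in i] for r, i in zip(re, im)]
    bottom = [i + r for r, i in zip(re, im)]
    return top + bottom

def stack_complex_vector(v):
    """Column vector with real parts stacked above imaginary parts."""
    return [[z.real] for z in v] + [[z.imag] for z in v]

def sws_covariance(a, sigma2):
    """Q = sigma2 * A [[I, 0], [0, 0]] A^T for real i.i.d. image noise."""
    a_s = stack_complex_matrix(a)
    n = len(a[0])  # number of image pixels
    p = [[1.0 if (i == j and i < n) else 0.0 for j in range(2 * n)]
         for i in range(2 * n)]
    q = matmul(matmul(a_s, p), transpose(a_s))
    return [[sigma2 * x for x in row] for row in q]

# Toy example: 2 SWS computed from a 3-pixel real image.
A = [[1 + 1j, 2 - 1j, 0.5j],
     [0.5 + 0j, -1j, 1 + 2j]]
f = [1.0 + 0j, 2.0 + 0j, -1.0 + 0j]

R_stacked = matmul(stack_complex_matrix(A), stack_complex_vector(f))
R_direct = [sum(a * x for a, x in zip(row, f)) for row in A]
Q = sws_covariance(A, sigma2=0.25)
```

In this toy case the stacked product reproduces the real and imaginary parts of the direct complex product, and Q is a 4-by-4 matrix of rank at most 3, because only the three real-noise columns of the stacked A contribute.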
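Recall that the GLRT utilizing the pose estimate from the LSD/DOA system was defined as

$$ g(\underline{R}) = \frac{p_{\underline{R}_1 \mid \hat{\theta}_1}(\underline{R}_1 \mid \hat{\theta}_1)}{p_{\underline{R}_0 \mid \hat{\theta}_0}(\underline{R}_0 \mid \hat{\theta}_0)} \gtrless 1; $$

therefore, the conditional distributions are

$$ p_{\underline{R}_0 \mid \hat{\theta}_0}(\underline{R}_0 \mid \hat{\theta}_0) = G\!\left(\exp(jN\hat{\theta}_0),\; \underline{\underline{Q}}_0\right) = \frac{1}{(2\pi)^{N/2}\, |\underline{\underline{Q}}_0|^{1/2}} \exp\!\left[-\tfrac{1}{2}\, (\underline{R}_0^T - \underline{m}_0^T)\, \underline{\underline{K}}_0\, (\underline{R}_0 - \underline{m}_0)\right], $$

$$ p_{\underline{R}_1 \mid \hat{\theta}_1}(\underline{R}_1 \mid \hat{\theta}_1) = G\!\left(\exp(jN\hat{\theta}_1),\; \underline{\underline{Q}}_1\right) = \frac{1}{(2\pi)^{N/2}\, |\underline{\underline{Q}}_1|^{1/2}} \exp\!\left[-\tfrac{1}{2}\, (\underline{R}_1^T - \underline{m}_1^T)\, \underline{\underline{K}}_1\, (\underline{R}_1 - \underline{m}_1)\right], $$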
where $\underline{m}_0 = \exp(jN\hat{\theta}_0)$ are the reconstructed SWS as generated by the pose estimate $\hat{\theta}_0$, and $\underline{\underline{K}}_0$ is the inverse of $\underline{\underline{Q}}_0$. Likewise for the $H_1$ hypothesis. The $\underline{\underline{Q}}$ matrices in the above equations are often singular owing to their construction and certain properties of the RBS and SWS formulation. In the case of a singular matrix, the pseudo-inverse of $\underline{\underline{Q}}$ can be used for $\underline{\underline{K}}$, and $|\underline{\underline{Q}}|$ can be interpreted as the product of all the non-zero singular values of $\underline{\underline{Q}}$ [6].
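One standard way to write the singular case explicitly is sketched below. The symbols $r$, $s_i$, and $\underline{u}_i$ are introduced here for illustration only and do not appear in the original development; the sketch assumes that the covariance matrix is symmetric and positive semidefinite with rank $r$, so that its singular values and eigenvalues coincide.

$$ \underline{\underline{Q}} = \sum_{i=1}^{r} s_i\, \underline{u}_i\, \underline{u}_i^T, \qquad s_1 \ge s_2 \ge \dots \ge s_r > 0, $$

$$ \underline{\underline{K}} = \underline{\underline{Q}}^{+} = \sum_{i=1}^{r} \frac{1}{s_i}\, \underline{u}_i\, \underline{u}_i^T, \qquad |\underline{\underline{Q}}| \;\rightarrow\; \prod_{i=1}^{r} s_i . $$

Under this reading, directions with zero variance contribute nothing to the quadratic form, and the determinant is replaced by the product of the non-zero singular values, as described above.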
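The GLRT is therefore

$$ g(\underline{R}) \triangleq \frac{p_{\underline{R}_1 \mid H_1}(\underline{R}_1 \mid H_1)}{p_{\underline{R}_0 \mid H_0}(\underline{R}_0 \mid H_0)} = \frac{|\underline{\underline{Q}}_0|^{1/2}}{|\underline{\underline{Q}}_1|^{1/2}} \cdot \frac{\exp\!\left[-\tfrac{1}{2}\, (\underline{R}_1^T - \underline{m}_1^T)\, \underline{\underline{K}}_1\, (\underline{R}_1 - \underline{m}_1)\right]}{\exp\!\left[-\tfrac{1}{2}\, (\underline{R}_0^T - \underline{m}_0^T)\, \underline{\underline{K}}_0\, (\underline{R}_0 - \underline{m}_0)\right]} \gtrless 1. $$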
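Taking the log of this equation, we obtain the following expression in terms of sufficient statistics:

$$ \tfrac{1}{2}\, (\underline{R}_0^T - \underline{m}_0^T)\, \underline{\underline{K}}_0\, (\underline{R}_0 - \underline{m}_0) - \tfrac{1}{2}\, (\underline{R}_1^T - \underline{m}_1^T)\, \underline{\underline{K}}_1\, (\underline{R}_1 - \underline{m}_1) \gtrless \tfrac{1}{2} \ln |\underline{\underline{Q}}_1| - \tfrac{1}{2} \ln |\underline{\underline{Q}}_0| . $$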
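For readers who want the intermediate step, the logarithm of the ratio can be written out before rearranging. This is only a restatement of the expressions above, using the fact that $\ln 1 = 0$ and that the common factor $(2\pi)^{N/2}$ cancels:

$$ \ln g(\underline{R}) = \tfrac{1}{2} \ln |\underline{\underline{Q}}_0| - \tfrac{1}{2} \ln |\underline{\underline{Q}}_1| - \tfrac{1}{2}\, (\underline{R}_1^T - \underline{m}_1^T)\, \underline{\underline{K}}_1\, (\underline{R}_1 - \underline{m}_1) + \tfrac{1}{2}\, (\underline{R}_0^T - \underline{m}_0^T)\, \underline{\underline{K}}_0\, (\underline{R}_0 - \underline{m}_0) \gtrless 0 . $$

Moving the two log-determinant terms to the right-hand side gives the inequality in terms of the two quadratic forms.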
2.1.2 Extension to M-ary decision
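The above GLRT is easily extended to an M-ary decision system. Combining the terms in the above equation we obtain

$$ \tfrac{1}{2}\, (\underline{R}_0^T - \underline{m}_0^T)\, \underline{\underline{K}}_0\, (\underline{R}_0 - \underline{m}_0) + \tfrac{1}{2} \ln |\underline{\underline{Q}}_0| \gtrless \tfrac{1}{2}\, (\underline{R}_1^T - \underline{m}_1^T)\, \underline{\underline{K}}_1\, (\underline{R}_1 - \underline{m}_1) + \tfrac{1}{2} \ln |\underline{\underline{Q}}_1|; $$

thus for this application the GLRT reduces to the strategy of choosing the signal with the greatest probability. A compact representation follows:

$$ p(\underline{R}_1 \mid H_1) \gtrless p(\underline{R}_0 \mid H_0). $$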
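Thus if a third target is introduced, the following decisions may be performed:

$$ p(\underline{R}_1 \mid H_1) \gtrless p(\underline{R}_0 \mid H_0), \qquad p(\underline{R}_2 \mid H_2) \gtrless p(\underline{R}_0 \mid H_0), \qquad p(\underline{R}_2 \mid H_2) \gtrless p(\underline{R}_1 \mid H_1). $$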
Again, the signal with the greatest probability is chosen. Therefore for the LSD/DOA system, the probability $p(\underline{R}_i \mid H_i)$ must be evaluated for each hypothesis $H_i$, and the largest resulting probability will indicate the maximum likelihood hypothesis. A block diagram of this process can be found in Figure 2.
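The M-ary rule can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes that, for each hypothesis, the stacked SWS vector, the reconstructed SWS, the matrix K, and the value of ln|Q| have already been computed, and all numerical values and function names are hypothetical. Choosing the largest likelihood is equivalent to choosing the smallest value of the quadratic form plus log-determinant term, since the common constant drops out.

```python
import math

def quad_form(r, m, k):
    """(R - m)^T K (R - m) for real stacked vectors and a list-of-rows K."""
    d = [ri - mi for ri, mi in zip(r, m)]
    return sum(d[i] * k[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))

def neg_log_likelihood(r, m, k, log_det_q):
    """0.5*(R-m)^T K (R-m) + 0.5*ln|Q|; the (2*pi)^(N/2) constant is
    common to all hypotheses and is omitted."""
    return 0.5 * quad_form(r, m, k) + 0.5 * log_det_q

def classify(hypotheses):
    """hypotheses: list of (R_i, m_i, K_i, ln|Q_i|) tuples.
    Returns (index of the maximum likelihood hypothesis, all scores)."""
    scores = [neg_log_likelihood(*h) for h in hypotheses]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores

identity = [[1.0, 0.0], [0.0, 1.0]]
hypotheses = [
    ([1.0, 0.2], [0.4, 0.9], identity, 0.0),   # H0: poor fit
    ([0.9, 0.1], [1.0, 0.0], identity, 0.0),   # H1: measured SWS close to model
    ([0.5, 0.5], [-0.5, 0.5], identity, 0.0),  # H2: poor fit
]
best, scores = classify(hypotheses)
```

Note that each hypothesis carries its own measurement vector, because each RBS produces its own SWS from the same acquired image.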
3 Multi-Target Discrimination Tests
Using the multiple-target GLRT developed above, the LSD/DOA algorithm can be tested to determine its ability to distinguish between a number of targets. In these tests, only the classification stage of ATR is performed; thus, no null hypothesis is processed.
3.1 SAR Target and Background Images
In the LSD/DOA algorithm a set of target exemplars is reduced to a compact target model, the RBS. In order to generate the RBS a set of target images, known as a training suite, is required. Also, since multiple targets are to be tested, comparable suites for various targets must be generated. The basis for our target generation was a set of spotlight SAR phase history files provided by Wright Laboratories, Wright-Patterson AFB. The tests described in the following are based on data for a T72 tank, an M1 tank, a firetruck, and a schoolbus. The target exemplars were generated from L band data, a 10 degree elevation angle, and HH and VV polarization data which were used to form a single, polarimetrically whitened image. From this data a sequence of 318 images was generated, representative of monostatic illumination and 318 uniformly spaced viewpoint orientations over 360 degrees of azimuthal orientation. The SAR target images that were reconstructed were downsampled so as to achieve approximately a 1 ft. by 1 ft. range and cross-range resolution. Figure 3 shows 8 of the 318 training images generated for each of the four targets. The first row shows examples from the H0 suite (the T72 tank), followed by examples of H1 (M1 tank), H2 (firetruck), and finally, H3 (schoolbus).
The background used for the following tests was SAR clutter image data obtained from Lincoln Laboratories. The images were constructed as polarimetrically whitened images and depict terrain in Stockbridge NY, such as fields, trees, and roads, in 1 ft. by 1 ft. resolution. The targets were overlaid onto the clutter background by masking out a region of the clutter corresponding to the convex hull of the brightest target pixels and inserting the target image into the masked area.
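The overlay procedure can be sketched as follows. This is a hedged illustration rather than the procedure actually used: the text does not say how the "brightest" pixels are selected, so a fixed `threshold` is assumed here, the target chip and clutter chip are assumed to have the same size, and all function names are hypothetical. The convex hull is computed with Andrew's monotone chain algorithm.

```python
def cross(o, a, b):
    """2-D cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, p):
    """True if p is inside or on the boundary of a convex polygon.
    Degenerate hulls (fewer than 3 vertices) match only their vertices."""
    n = len(hull)
    if n < 3:
        return p in hull
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

def overlay_target(clutter, target, threshold):
    """Replace clutter pixels inside the convex hull of the bright target
    pixels (value >= threshold, an assumed selection rule) with target pixels."""
    rows, cols = len(target), len(target[0])
    bright = [(r, c) for r in range(rows) for c in range(cols)
              if target[r][c] >= threshold]
    hull = convex_hull(bright)
    out = [row[:] for row in clutter]
    for r in range(rows):
        for c in range(cols):
            if inside_hull(hull, (r, c)):
                out[r][c] = target[r][c]
    return out

# Toy 5x5 example: four bright corner pixels and one dim interior pixel.
target = [[0.0] * 5 for _ in range(5)]
for r, c in [(1, 1), (1, 3), (3, 1), (3, 3)]:
    target[r][c] = 9.0
target[2][2] = 1.0
clutter = [[7.0] * 5 for _ in range(5)]
composite = overlay_target(clutter, target, threshold=5.0)
```

In the toy example the dim pixel at the center is also copied from the target chip, because it lies inside the hull of the bright pixels, while clutter outside the hull is left untouched.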
3.2 Corruption with Speckle Noise
To complete the generation of realistic test cases, each target image was corrupted by the addition of speckle-like noise. As these images are logarithmic intensity SAR images, the originally multiplicative SAR speckle noise process can be introduced to the processed images as an additive noise process. In the case of a single polarization image, the noise process must have a log-gamma probability density. However, in the case of polarimetrically whitened images the noise distribution is dependent on the cross correlation of the various polarizations (HH, VV, HV). This correlation is usually unknown. However, the noise can be approximated by independent Gaussian noise owing to the small inter-sample correlation and the relatively Gaussian character of the distribution that results from combining several log-gamma distributed random variables. Thus in the generation of the target images, Gaussian noise with a specific mean and variance representing polarimetrically whitened speckle is added to each log magnitude target pixel. An example of the test images used to test the multi-target discrimination system can be found in Figure 4. These images represent realistic SAR targets overlaid on man-made discrete rich clutter through the process described above.

Figure 2: LSD/DOA Multi-Target GLRT Block Diagram. (Block labels: Acquired Image; for each hypothesis H0, H1, H2, ...: RBS, Direct Linear Signal Projection producing R0, R1, R2, ..., D.O.A. producing θ0, θ1, θ2, ..., Signal Model producing m0, m1, m2, ...; Multi-Target GLRT; Chosen Hypothesis.)

Figure 3: Examples of SAR Training Images Used for RBS Construction. Targets shown are a T72 tank, an M1 tank, a firetruck, and a schoolbus, respectively.
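The speckle corruption step amounts to adding Gaussian noise to each log-magnitude pixel. The sketch below illustrates this under stated assumptions: the mean and standard deviation are placeholders (the text does not give the values used), the image is a plain list of rows, and the function name `add_log_speckle` is hypothetical.

```python
import math
import random

def add_log_speckle(log_image, mean, std, seed=0):
    """Add independent Gaussian noise to every log-magnitude pixel.
    mean and std are placeholders for the speckle statistics."""
    rng = random.Random(seed)
    return [[pix + rng.gauss(mean, std) for pix in row] for row in log_image]

# Multiplicative speckle in the linear domain becomes additive after the log:
intensity, speckle = 3.0, 0.5
log_of_product = math.log(intensity * speckle)
sum_of_logs = math.log(intensity) + math.log(speckle)

log_chip = [[1.0, 2.0], [3.0, 4.0]]
noisy_chip = add_log_speckle(log_chip, mean=0.0, std=0.1)
```

The identity checked by `log_of_product` and `sum_of_logs` is the reason an additive noise model is appropriate for logarithmic intensity images.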
4 Multi-Object Discrimination Results
The ability of the LSD/DOA ATR system to correctly classify a number of targets was evaluated through the classification of a series of test images. Both the correct and incorrect classifications were tabulated and are presented in the form of a confusion matrix. In a confusion matrix the rows represent the targets being tested, and the columns represent the classification of the target. Thus, the matrix shows the percentage of the cases in which a given target was classified as each possible target hypothesis. For example, a perfect confusion matrix would have 100 percent on the diagonal, showing that for 100 percent of the cases target H1 was classified as H1, target H2 was classified as H2, etc.
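A confusion matrix of this kind can be tabulated as in the following sketch, where rows index the true target and columns index the chosen hypothesis. The label lists are made-up toy data and the function name is hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Row i, column j: percentage of class-i test cases classified as j."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    percent = []
    for row in counts:
        total = sum(row)
        percent.append([100.0 * c / total if total else 0.0 for c in row])
    return percent

# Toy data: four test cases per target, one class-0 case misclassified.
truth = [0, 0, 0, 0, 1, 1, 1, 1]
chosen = [0, 0, 0, 1, 1, 1, 1, 1]
cm = confusion_matrix(truth, chosen, n_classes=2)
```

Each row sums to 100 percent, so the diagonal entry of a row is the correct classification rate for that target.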
4.1 LSD/DOA Results
The Reciprocal Basis Sets (RBS) which constitute the image models or linear filters in the LSD/DOA ATR system were generated using a number of recently developed optimized methods. In the generation of the RBS, only 53 of the 318 model images (as generated above) were used, in order to increase noise immunity. Also, for all of the RBS generated, a pixel-usage weighting scheme was used to tailor the distribution of energy in the RBS functions so as to reduce the influence of target pixels on the periphery of the target. In the case of symmetric target models, in particular the schoolbus, a set of 4 RBS were formed, each modeling only 90 degrees of rotation of the target model. This division of the target suite produced RBS models which were more discriminatory of the targets [6].

Figure 4: SAR Test Images. T72 tank, M1 tank, firetruck, and schoolbus, respectively.

Table 1: Confusion Matrix Results (percent of test images)

                      Classification
Target       H0          H1          H2           H3
H0        99.7519      0           0.0194969    0.2286031
H1         0          98.2906      0.155346     1.554054
H2         0           0          93.7642       6.2358
H3         0           0           0.2433      99.7567
The LSD/DOA algorithm implemented for this series of tests used the multi-object GLRT which is specifically optimized for the target filters being applied, as developed above. In addition, the recently developed non-stationary noise optimized Kay-Estimator DOA algorithm [7] was used as the DOA component of the LSD/DOA system. For the test, each of the 318 speckled target images was overlaid onto 1000 different background images (as described previously) to create a total of 318,000 test images for each target. Since there were four possible targets, a total of 1,272,000 images were tested.
Each image was tested using the system shown in Figure 2. The results were stored, and the confusion matrix was generated. In this test, hypothesis H0 was the T72 tank, H1 was the M1 tank, H2 was the firetruck, and H3 was the schoolbus. Table 1 shows the confusion matrix for the described test. Correct classification occurred for more than 93% of the instances of each of the four targets; the overall percentage of correct classification was 97.9%. One should note that since the multi-target GLRT is highly dependent on the pose estimate, $\hat{\theta}$, these results also demonstrate accurate pose estimation.
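The summary figures quoted above can be checked directly from the entries of Table 1 and the stated test sizes. The short sketch below does this arithmetic; because every target was tested on the same number of images, the overall rate is taken as the plain average of the diagonal entries. The variable names are illustrative.

```python
# Rows: true target H0..H3; columns: classification H0..H3 (percent), Table 1.
TABLE_1 = [
    [99.7519, 0.0, 0.0194969, 0.2286031],
    [0.0, 98.2906, 0.155346, 1.554054],
    [0.0, 0.0, 93.7642, 6.2358],
    [0.0, 0.0, 0.2433, 99.7567],
]

images_per_target = 318 * 1000          # 318 views on 1000 backgrounds
total_images = 4 * images_per_target    # four targets

per_target_correct = [TABLE_1[i][i] for i in range(4)]
# Equal test counts per target, so the overall rate is the mean of the diagonal.
overall_correct = sum(per_target_correct) / len(per_target_correct)
worst_target_rate = min(per_target_correct)
```

The lowest per-target rate is the firetruck row (H2), whose errors fall entirely in the schoolbus column.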
5 Conclusions
The LSD/DOA ATR system has been shown to provide useful classification of SAR images in speckle with modest storage and processing requirements. Four targets with similar radar signatures (a T72 tank, an M1 tank, a schoolbus, and a firetruck) were used to form compact reciprocal basis set object models of the targets at arbitrary azimuthal angles. A generalized likelihood ratio test was developed to provide a multiple-hypothesis decision system for the LSD/DOA technique. Speckle-corrupted SAR images of these targets were then used to test the discriminatory power of a target classifier using this technique. Targets were correctly classified in 97.9% of the trials.
6 REFERENCES
[1] Ben-Arie, J., and Z.A. Meiri, "3-D Object Recognition by Optimal Matching Search of Multinary Relations Graphs," Computer Vision, Graphics, Image Processing, vol. 37, pp. 345-361, March 1987.
[2] Chang, W.T., D. Casasent, and D. Fetterly, "SDF Control of Correlation Plane Structure for 3-D Object Representation and Recognition," SPIE Vol. 507, Processing and Display of Three-Dimensional Data II, 1984.
[3] Cyganski, D., B. King, R.F. Vaz, and J.A. Orr, "ROC Analysis of ATR from SAR images using a Model-Based Recognizer Incorporating Pose Information," SPIE 1995 Symposium on OE/Aerospace Sensing and Dual Use Photonics, April 1995, Orlando, Florida. Viewable on the WWW, http://xfactor.wpi.edu/~Works/Papers.html.
[4] Cyganski, D., R.F. Vaz, and C.R. Wright, "Model-Based 3-D Object Pose Estimation from linear image decomposition and direction of arrival techniques," Proc. SPIE, Conference on Model-Based Vision, vol. 1827, November 1992, Boston. Viewable on the WWW, http://xfactor.wpi.edu/~Works/Papers.html.
[5] Dyer, C.R., and S.B. Ho, "Medial-Axis-Based Shape Smoothing," Proc. Seventh ICPR, pp. 333-335, July 30, 1984.
[6] Hill, B.K., to appear in M.S. Thesis, Worcester Polytechnic Institute, 1996.
[7] King, B., "Optimization of the LSD/DOA ATR Technique for Non-Stationary Correlated Gaussian Noise," M.S. Thesis, Worcester Polytechnic Institute, 1995. Viewable on the WWW, http://xfactor.wpi.edu/~Works/Ouvres.html.
[8] Reeves, A.P., R.J. Prokop, S.E. Andrews, and F.P. Kuhl, "Three-Dimensional Shape Analysis Using Moments and Fourier Descriptors," IEEE Trans. on PAMI, vol. PAMI-10, pp. 937-943, 1988.
[9] VanderLugt, A.B., "Signal Detection by Complex Matched Spatial Filtering," IEEE Trans. Inf. Theory, vol. IT-10, p. 139, 1964.
[10] Vaz, R.F., D. Cyganski, B. King, "An ROC Comparison of Pose-Invariant and Pose-Dependant Model Based ATR," 4th ATR Systems and Technology Symposium, November 1994, Monterey, California. Viewable on the WWW, http://xfactor.wpi.edu/~Works/Papers.html.
[11] Verbout, S.H., W.W. Irving, A.S. Hanes, "Improving a Template-Based Classifier in a SAR Automatic Target Recognition System by Using 3-D Target Information," MIT Lincoln Laboratory Journal, vol. 6, no. 1, pp. 53-71, Spring 1993.