Aspect Dependent Drivers for Multi-Perspective Target Classification

Michele Vespe, Chris J. Baker, and Hugh D. Griffiths
Department of Electronic and Electrical Engineering, University College London, London, United Kingdom
Email:
[email protected]
Abstract— In this paper, a 2-D classifier using Radial Basis Function Neural Networks (RBFNNs) is implemented, combining two images collected from different locations to demonstrate the improvement in classification rates given by aspect diversity. Principal Components Analysis (PCA) is applied to features extracted from a masked version of the SAR image that retains only the target's backscattering and shadow information. The classification performance, examined in terms of Receiver Operating Characteristic (ROC) curves, is presented using MSTAR data for a population formed by six classes plus two unknown and two independent targets. The resulting performance shows a reduction in the probability of false alarm, together with an improvement in the probabilities of declaration and correct classification, in comparison with the traditional single-aspect case.
I. INTRODUCTION

Automatic Target Recognition (ATR) is traditionally attempted using radar signatures from a single point of observation. A possible improvement in classification performance and reliability is directly connected to the capability of describing the target by exploiting higher spatial resolution [1] or with the aid of polarimetric information [2]. Despite the quality increase of advanced radar systems, the major sources of misclassification are connected to the perspective from which the object is sensed. Although some of the causes of range profile variability can be overcome by averaging a number of images, other phenomena (e.g. occlusion, clutter returns and multi-path) persist and can be averaged out only by sensing the target from a distinct perspective. Furthermore, particular target orientations offer signatures dominated by a few scattering centres that mask the point scatterer returns, making the images similar for every class of the training set, i.e. the information necessary for recognition is obscured. However, there has been little research examining the utility of angular diversity for improved classification performance [3]-[6], although it is intuitively clear that there is additional information in multiple perspectives and it is relatively easy for existing systems to collect such data (e.g. an aircraft flying past the target acquires signatures from a number of viewing angles depending on its trajectory).

ATR can be based either on 1-D signatures, i.e. features from range profiles collected by the system that form the patterns used to train and test the classifier, or on 2-D images, which are the outcome of processing a sequence of 1-D signatures. The former is often used for its simplicity in terms of implementation and signal processing, but it gives lower classification rates than the latter, which guarantees a more detailed representation of the target backscattering [7], although it requires more elaborate signal processing and places particular requirements on the movement of the object. In this paper, the recognition process is based on features extracted from 2-D signatures and is investigated using one and two distinct signatures displaced by a variable aperture, showing the benefits brought by multiple perspectives.
After a brief description of the MSTAR dataset and the sub-population used, a section introducing the feature extraction algorithm is presented. This includes PCA applied to a masked version of the original image, derived after discriminating the target backscattering region and its shadowed area from the clutter. This is followed by a description of the single-aspect classifier based on RBFNNs and of its multi-perspective implementation. The two-perspective benefits are subsequently illustrated in terms of correct classification rates and ROC curves. Conclusions are then outlined in the final section.

II. DATA DESCRIPTION
The population of targets consists of ten military ground vehicles from the MSTAR database [8]. The images are formed in spotlight mode covering 360º, achieving a resolution of approximately 30 cm in both slant and cross range. A set of images collected at 17º depression is used to train the classifier, while the testing set is formed by 15º depression images. Although scattering centre migration starts to be pronounced, classification can be
This work was funded by the Electro-Magnetic Remote Sensing (EMRS) Defence Technology Centre established by the UK Ministry of Defence.
0-7803-9497-6/06/$20.00 © 2006 IEEE.
successfully attempted, simulating a real-world scenario where the template database does not usually describe all the possible azimuth and elevation deviations relative to the target. Six targets are selected to form the set of training classes (T72, BTR70, BMP2, C2S1, T62, and ZIL131). Two independent targets (T72ind and BMP2ind), each having a representative class in the template set but not used to train the classifier, measure the classification performance for targets presenting different features from those used to represent the class they belong to. Finally, two unknown targets (BTR60 and D7), without representation in the training set, give a gauge of the classification rates for objects not depicted by the classifier.

III. FEATURE EXTRACTION
As can be observed from Fig. 1, the result of this discrimination is the detection of the shadow area, where only receiver measurement noise is present, and of the target area, described by neighbourhoods of pixels having higher mean and standard deviation than the clutter return. An image mask is eventually obtained and applied to filter the original image: the shadow area is represented by constant negative-valued pixels and the clutter by zero-valued pixels, while the target backscattering intensities are preserved as useful information.

B. Principal Component Analysis

The reduction of the data dimension, especially in image recognition problems, is often necessary to avoid the intrinsic redundancy within the data and the large computational burden during the classification process. We may thus consider SAR images as providers of feature vectors that are to be separated. After discrimination and segmentation, the radar image contains only those pixel values describing the target backscattering and shadow. By reshaping the 2-D image row by row into a 1-D vector, the number of elements representing the original matrix of intensities can be reduced with an information loss that can be assumed negligible. Furthermore, dimensional reduction of this type also tends to emphasise the differences between patterns and hence enhance classification performance. Principal Components Analysis (PCA) [10] is a statistical method that represents the data in a different vector basis, removing similarities (which therefore do not contribute to the classification process) in the form of the directions presenting the smallest variance in the dataset. After subtracting the mean vector t̄ of the training set T from each available template, the zero-mean dataset is produced and the covariance matrix C can be calculated as follows:
In a typical pattern recognition problem, it is often necessary to reduce the data dimension of the classifier input and to increase the separation between targets. The procedure of extracting patterns from a signature or image for the purpose of identifying and interpreting meaningful characteristics of the imaged objects is usually referred to as Feature Extraction. In this paper, the target backscattering and shadow areas are first detected in the image. The dimension of the classifier input vector is then reduced by applying PCA to the processed image.

A. Target Discrimination

The need to discriminate the target radar backscattering from the clutter return in SAR images has been treated exhaustively, especially for datasets presenting the same target location in both the training and testing sets, which also implies the same clutter configuration [9]. For this reason, the feature vector is formed after clutter cancellation. The information processed to form the feature vector is extracted from the images after isolating the target area and its shadow using an adaptive procedure that applies two distinct thresholds to the image on the basis of the clutter statistical parameters.
C = E[(tn − t̄)(tn − t̄)^T]    (1)

where E is the expected value. After calculating the eigenvectors of the covariance matrix, the K most significant eigenvectors, i.e. those with the largest eigenvalues, are selected to form the new basis W. The test and training feature vectors can then be transformed as follows:

t'n = W^T (tn − t̄)    (2)
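As an illustration, equations (1) and (2) can be implemented directly with NumPy. This is a minimal sketch: the variable names and the eigendecomposition route are choices of this example, not prescribed by the paper.

```python
import numpy as np

def pca_fit(T, K):
    """Fit PCA on a training set T (n_templates x M pixels).
    Returns the mean template and the K most significant eigenvectors W."""
    t_bar = T.mean(axis=0)                    # mean template over the training set
    X = T - t_bar                             # zero-mean dataset
    C = (X.T @ X) / X.shape[0]                # covariance matrix, eq. (1)
    eigvals, eigvecs = np.linalg.eigh(C)      # eigendecomposition (C is symmetric)
    order = np.argsort(eigvals)[::-1]         # sort by decreasing eigenvalue
    W = eigvecs[:, order[:K]]                 # keep the K largest-eigenvalue vectors
    return t_bar, W

def pca_transform(t_n, t_bar, W):
    """Project a feature vector onto the principal components, eq. (2)."""
    return W.T @ (t_n - t_bar)
```

With the figures given later in the text (M = 3570 pixels, K = 50), each masked image reduces to a 50-element feature vector.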
The number of principal components K is chosen as a function of the correct classification rate achieved. This usually becomes stable once the PCs necessary to fully describe the data have been selected. After testing the classifiers, their mean correct classification rates are plotted versus the number of principal components representing the feature vectors. For the data considered, the original MSTAR image is 128×128 pixels. Since the target is centred, the images are further reduced to 42×85 pixels. The second dimension is along the slant-range direction; it is greater than the cross-range one because of the small depression
Figure 1: Two different thresholds based on the clutter statistical parameters are applied to smoothed versions of the original images (a.1, a.2). The target and shadow areas are then detected to form the filtered images (b.1, b.2).
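The two-threshold masking of Figure 1 might be sketched as follows; the smoothing window, the ±kσ threshold rule and the constant shadow value are illustrative assumptions rather than the exact settings used in the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def discriminate(img, clutter_region, k=3.0, shadow_value=-1.0, win=5):
    """Mask a SAR intensity image: keep the target backscattering,
    set shadow pixels to a constant negative value and clutter to zero."""
    smoothed = uniform_filter(img, size=win)      # local-mean smoothing
    mu_c = clutter_region.mean()                  # clutter statistical parameters
    sigma_c = clutter_region.std()
    target = smoothed > mu_c + k * sigma_c        # high threshold: backscattering
    shadow = smoothed < mu_c - k * sigma_c        # low threshold: shadow (noise floor)
    out = np.zeros_like(img, dtype=float)         # clutter -> zero-valued pixels
    out[target] = img[target]                     # preserve target intensities
    out[shadow] = shadow_value                    # shadow -> constant negative value
    return out
```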
where the element ∆φi,j = φi − φj represents the angular displacement between the i-th and j-th nodes. Hence, in a two-perspective environment, after fixing ∆φ1,2 = φ, the two-perspective classifier is tested with all possible pairs of signatures displaced by φ, covering every possible target heading. Although the target has already been assumed to have been detected and tracked, the angular displacements of the nodes are unknown to the network: the network topology Φ, i.e. the angular aperture between the nodes, is not information processed by the multi-perspective classifier. Whilst this is only one of the possible approaches to multiple views of the target, it is relatively simple to implement. Furthermore, the computational burden is reduced to a minimum.
angle, which largely extends the shadow region. After applying the discrimination algorithm, the image is reshaped into an M-dimensional vector, where M = 3570 is the number of pixels selected. PCA shows that only 50 principal components are sufficient to stabilise the correct classification rate.

IV. SINGLE-ASPECT CLASSIFICATION

The classification approach chosen uses RBFNNs because of their flexibility and the speed of both the learning and execution phases. These particular feed-forward neural networks have only three layers (input, hidden, output), and only the neurons belonging to the hidden layer show a non-linear response: the Radial Basis Function (RBF). In this paper a Gaussian RBF has been used. Each RBF is centred on a small cluster that represents a subclass.
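A minimal Gaussian-RBF network of this three-layer form can be sketched as below. The centre selection (here taken as given cluster centres) and the least-squares training of the output layer are illustrative choices of this sketch, since the exact training procedure is not detailed here.

```python
import numpy as np

class GaussianRBFNet:
    """Three-layer RBF network: input -> Gaussian hidden layer -> linear output."""

    def __init__(self, centres, sigma):
        self.centres = np.asarray(centres, dtype=float)  # one centre per subclass cluster
        self.sigma = sigma                               # common Gaussian width
        self.W = None                                    # hidden-to-output weights

    def _hidden(self, X):
        # Gaussian responses exp(-||x - c||^2 / (2 sigma^2)) for each centre c
        d2 = ((X[:, None, :] - self.centres[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def fit(self, X, Y):
        # Train the linear output layer by least squares (Y: one-hot class labels)
        H = self._hidden(X)
        self.W, *_ = np.linalg.lstsq(H, Y, rcond=None)
        return self

    def predict(self, X):
        # Per-class scores; the decision is the largest output component
        return self._hidden(X) @ self.W
```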
A. Classifier Structure

The two-perspective classifier has been implemented preserving the structure of multi-perspective classifiers [6]: a first parallel single-perspective processing of the signatures is followed by a further processing of the partial outcomes from the single-perspective classifiers and, eventually, by the final decision stage. This structure allows for parallel implementation of the procedure and therefore reduces the execution time.
Table I shows the results in a forced decision environment (i.e. the classifier must declare a target class for any input presented): the correct classification rates of the RBFNN classifier are given for each target. Since the classifier is forced to declare, the "Unknown" class is not included. The correct classification rate achieved is CCR = 93.57%, a measure of the classifier's capability of recognising objects it was trained with. As can be observed, the classification performance on the independent targets is well below the classification rates of the corresponding targets used to identify a class. This is mainly due to different features exhibited by the independent targets, such as different loads or turret positions. This effect makes the classification task still unreliable in many applications. As will be shown, this loss of generalisation is reduced when another perspective is used to describe the target backscattering.
The output of the single-perspective classifier can be thought of as a measure of the likelihood that the input vector has been observed from a particular class. Therefore, assuming the signatures related to different perspectives to be statistically independent, we can consider each single-perspective contribution as a weight for the multi-perspective decision.

B. ROC Curves

The classification process typically requires a high probability of correct classification Pcc, i.e. the probability of recognising a known target [11]. As previously mentioned, this is calculated in terms of the Correct Classification Rate (CCR), which is the Pcc estimated over a finite number of measurements. By defining the probability of declaration Pd as the probability that the classifier makes a decision for one of the known targets, the Pcc becomes the probability of correct classification given that a declaration has been made.
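Under this independence assumption, a simple product-rule fusion of the single-perspective outputs can be sketched as follows; the rejection threshold and the renormalisation step are illustrative choices of this sketch.

```python
import numpy as np

def fuse_and_decide(outputs, threshold=0.5):
    """Fuse single-perspective outputs (each a length-N confidence vector)
    by the product rule, then declare the most likely class or reject.
    Returns (class index or None for "unknown", fused confidence vector)."""
    fused = np.ones_like(outputs[0], dtype=float)
    for out in outputs:
        fused *= np.clip(out, 1e-12, None)   # product of per-perspective weights
    fused /= fused.sum()                     # renormalise to a confidence vector
    k = int(np.argmax(fused))                # most confident class
    if fused[k] < threshold:
        return None, fused                   # below threshold: declare "unknown"
    return k, fused
```

When the two views agree, the fused maximum sharpens; when they disagree, the fused confidence drops below the threshold and the input is rejected.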
TABLE I. CLASSIFICATION RESULTS IN A FORCED DECISION ENVIRONMENT (Pd = 1 AND Pfa = 1)

             T72     BTR70   BMP2    C2S1    T62     ZIL131
T72          94.9    0.5     2.0     0.5     2.0     –
T72ind       71.3    2.6     8.7     2.6     14.9    –
BTR70        0.5     98.5    –       0.5     0.5     0.5
BMP2         3.6     1.5     93.3    1.0     –       –
BMP2ind      14.8    10.2    64.8    4.1     5.6     0.5
C2S1         3.3     9.5     1.5     80.3    5.1     0.4
T62          5.9     0.7     4.4     1.5     87.2    0.4
ZIL131       –       –       3.7     1.1     –       95.3
BTR60 (unk)  –       0.4     11.3    20.8    31.0    36.5
D7 (unk)     6.3     49.6    33.2    4.3     1.2     5.5
The declaration properties of a classifier are strictly connected to the specified unknown threshold, which determines the rejection of unknown targets. The output of the non-linear mapping of the RBFNN classifier lies in the N-dimensional unipolar cube [0,1]^N, where N is the number of classes forming the population. The decision is made by locating the greatest component of the output vector, which can therefore be considered a degree of confidence in the decision. By specifying different uncertainty thresholds, the desired Pd is obtained. The probability of false alarm Pfa can be defined as the probability of making a declaration when an unknown target is presented as input to the classifier.
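The dependence of Pd and Pfa on the uncertainty threshold can be sketched as a simple sweep over the greatest output component; the confidence values in the usage below are synthetic placeholders, not MSTAR results.

```python
import numpy as np

def declaration_rates(known_conf, unknown_conf, thresholds):
    """Estimate Pd and Pfa for each uncertainty threshold.
    known_conf / unknown_conf: greatest output component for each test
    image of a known / unknown target. A declaration is made whenever
    that component reaches the threshold."""
    known_conf = np.asarray(known_conf, dtype=float)
    unknown_conf = np.asarray(unknown_conf, dtype=float)
    pd = np.array([(known_conf >= th).mean() for th in thresholds])
    pfa = np.array([(unknown_conf >= th).mean() for th in thresholds])
    return pd, pfa
```

A zero threshold reproduces the forced decision environment (Pd = 1, Pfa = 1).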
V. MULTI-PERSPECTIVE CLASSIFICATION

In an N-perspective scenario, the parameter that characterises the perspective topology is the vector:

Φ = {∆φi,j : i, j = 1, ..., N}    (3)
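For illustration, the topology vector of equation (3) can be computed from the node aspect angles; this is a small hypothetical helper, with angles in degrees wrapped to [0, 360).

```python
import numpy as np

def topology(phis):
    """Matrix of pairwise angular displacements between N node aspect
    angles phi_i, i.e. the elements delta_phi[i, j] = phi_i - phi_j
    of the topology vector in eq. (3)."""
    phis = np.asarray(phis, dtype=float)
    return (phis[:, None] - phis[None, :]) % 360.0
```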
Therefore, if the classifier is forced to declare (Pd = 1), the uncertainty threshold is set to zero and the false alarm rate is maximum (Pfa = 1).
TABLE II. CONFUSION MATRIX AND PROBABILITIES

          Class A   Class B   Class C   Unknown
Class A   Pcc,a     Pm-i,a    Pm-i,a    Pms,a
Class B   Pm-i,b    Pcc,b     Pm-i,b    Pms,b
Class C   Pm-i,c    Pm-i,c    Pcc,c     Pms,c
Unknown   Pfa,a     Pfa,b     Pfa,c     1−ΣPfa
As can be deduced from Table II, where Pms,i is the probability of not declaring the known target belonging to class i, and Pm-i,j is the probability of misidentifying a known target as belonging to class j, the probability of false alarm is the probability of declaring an unknown target.
VI. RESULTS
In Fig. 4, the correct classification rates are plotted versus the angular displacement of the nodes for a forced decision environment. The green line is the result of a two-perspective classifier with zero degrees of separation (i.e. the single-perspective case). The two-perspective (blue) CCRs emphasise that as soon as the two perspectives decorrelate, the information content of the two signatures increases, aiding the classification task. The average (red) demonstrates the overall efficiency of the second perspective's contribution compared to the single-perspective case (green). The CCR measured by the single-perspective classifier can be observed when the two sensors are co-located (i.e. when ∆φ1,2 = 0º). As the angular displacement increases, the two perspectives start to decorrelate, yielding a better description of the radar backscattering and a performance improvement.
The ROC curve "b" gives a measure of the percentage of declarations that are effectively correct, illustrating the relationship between Pd and Pcc; since the probability of declaration is an upper bound on the probability of correct classification, the bisector represents the ideal classifier, where every declaration is correctly made for the relevant class. The ROC curves show an overall improvement with respect to the single-perspective case, although with a slight dependence on the node locations. As regards ROC curve "b", the benefits of perspective diversification are valuable even for small angular displacements between the perspectives, while the multi-aspect effectiveness on the relationship between Pfa and Pd requires a greater perspective separation. To understand the classification improvements better, the two curves need to be related to each other: after fixing the tolerable false alarm rate, the corresponding probability of declaration is used to find the CCR related to that parameter. For example, if Pfa = 0.6 is fixed, the single-perspective classifier guarantees Pd = 0.92 and, from curve "b", Pcc = 0.87. By using a second perspective, the probability of declaration becomes Pd = 0.97, leading to Pcc = 0.96, an increase in correct classification of about 9%.
Figure 3: Pcc versus Pd for a single-perspective classifier (green), the mean two-perspective performance achieved over 360º (red) and the two-perspective performance at different angular displacements (5º, 20º, 55º).
In Figs. 2 and 3, the ROC curves are shown: in particular, the mean value (red) over all the possible angular displacements between the two radar locations over 360 degrees, the single-perspective case (green) and the performance at other perspective separations (5º, 20º, 50º and 90º).
The ROC curves take into account the relationships between the three parameters Pcc, Pd and Pfa. Depending on the application, the declaration probability is configured and, as a consequence, the false alarm and correct classification rates are adjusted.
Figure 2: Pd versus Pfa for a single-perspective classifier (green), the mean two-perspective performance achieved over 360º (red) and the two-perspective performance at different angular displacements (5º, 20º, 55º).

The ROC curve "a" shows the relationship between Pfa and Pd: a random classifier would present a linear proportionality between false alarm and declaration rates, whereas for a robust classifier the probability of declaration is greater than the false alarm rate.
In particular, the independent targets show a very marked overall CCR increase over the single-perspective classifier (from 71.3% to 84.8% for the T72ind and from 64.8% to 78.9% for the BMP2ind).
VII. CONCLUSION
The results of employing multiple signatures sensing a radar target have been presented using 2-D images of a six-class population set. The ROC curves have been described, showing the remarkable benefits offered by multiple perspectives. Furthermore, the effectiveness of perspective diversification on the relationship between the false alarm rate and the probability of declaration requires a greater perspective separation than that between declaration and correct classification. The independent targets also show a valuable increase in correct classification rates compared with the results obtained using a traditional single view of the target.
Figure 4: Correct classification rates of a two-perspective classifier at different angular displacements between the nodes in a forced decision environment.
When the two perspectives are separated by π/2, the correct classification rates drop as a consequence of the likelihood of presenting to the classifier two signatures both dominated by specular reflections, i.e. the low-level scatterers, useful in terms of classification, are obscured. The CCR also decreases when the two perspectives are separated by π. In addition to the same effect described for π/2, in this case the geometric symmetries of man-made objects make a number of signatures with headings separated by π look similar. Nevertheless, the average CCR obtained over 360º (red) is 97.65%, an increase of approximately 6%. In Fig. 5, the CCRs are presented for the individual targets. Each curve represents the behaviour of a particular class at different angular displacements. The frequency and amplitude of the CCR fluctuations reveal a strong dependency of the backscattering on the target's geometrical properties when examined in a multiple-perspective environment.
ACKNOWLEDGMENT

The work reported in this paper was funded by the Electro-Magnetic Remote Sensing (EMRS) Defence Technology Centre (DTC), established by the UK Ministry of Defence and run by a consortium of Selex, Thales Defence, Roke Manor Research and Filtronic.

REFERENCES

[1] L. M. Novak, S. D. Halversen, G. Owirka and M. Hiett, "Effects of polarization and resolution on SAR ATR", IEEE Transactions on Aerospace and Electronic Systems, vol. 33, Issue 1, Jan. 1997, pp. 102-116.
[2] F. Sadjadi, "Improved target classification using optimum polarimetric SAR signatures", IEEE Transactions on Aerospace and Electronic Systems, vol. 38, Issue 1, Jan. 2002, pp. 38-49.
[3] S. Ji, X. Liao and L. Carin, "Adaptive multiaspect target classification and detection with hidden Markov models", IEEE Sensors Journal, vol. 5, Issue 5, Oct. 2005, pp. 1035-1042.
[4] P. R. Runkle, P. K. Bharadwaj, L. Couchman and L. Carin, "Hidden Markov models for multiaspect target classification", IEEE Transactions on Signal Processing, vol. 47, Issue 7, July 1999, pp. 2035-2040.
[5] M. Vespe, C. J. Baker and H. D. Griffiths, "Node Location for Netted Radar Target Classification", 2nd Annual IEEE Waveform Diversity Conference, in press.
[6] M. Vespe, C. J. Baker and H. D. Griffiths, "Multi-Perspective Target Classification", 2005 IEEE International Radar Conference, pp. 877-882.
[7] L. M. Novak, "A Comparison of 1-D and 2-D Algorithms for Radar Target Classification", IEEE International Conference on Systems Engineering, 1991, pp. 612.
[8] Air Force Research Laboratory, MSTAR dataset, www.sdms.afrl.af.mil/datasets/mstar/
[9] R. Schumacher and J. Schiller, "Non-cooperative target identification of battlefield targets - classification results based on SAR images", 2005 IEEE International Radar Conference, pp. 167-172.
[10] R. O. Duda, P. E. Hart and D. G. Stork, "Pattern Classification", John Wiley and Sons, second edition, 2001.
[11] R. A. Mitchell and J. J. Westerkamp, "Robust statistical feature based aircraft identification", IEEE Transactions on Aerospace and Electronic Systems, vol. 35, Issue 3, July 1999, pp. 1077-1095.
Figure 5: Correct classification rates for the individual targets (T72, T72ind, BTR70, BMP2, BMP2ind, C2S1, T62, ZIL131, BTR60, D7) of a two-perspective classifier at different angular displacements ∆φ1,2 between the nodes.