Proc. of EuroFusion99, T. Windeatt and J.O’Brien, eds., Stratford-upon-Avon, UK, Oct., 1999
Detection of anti-personnel land-mines using sensor-fusion techniques
Frank Cremer, John Schavemaker, Eric den Breejen and Klamer Schutte
TNO Physics and Electronics Laboratory
P.O. Box 96864, NL-2509 JG The Hague, The Netherlands
Phone: +31 70 374 0795, Fax: +31 70 374 0654
Email: [email protected]

Abstract
In this paper we present the sensor-fusion results based on the measurements obtained within the European research project GEODE (Ground Explosive Ordnance DEtection system) that strives for the realisation of a vehicle-mounted, multisensor, anti-personnel land-mine detection system for humanitarian demining. The applied decision-level sensor-fusion techniques are Bayesian approaches, application of Dempster-Shafer theory, fuzzy probabilities, rules, and voting techniques. For the evaluation of the performance of sensor fusion, we introduce a novel algorithm that provides a less biased estimate of the performance measured in probability of detection and probability of false alarm. The evaluation method differs from common performance evaluation methods in the sense that it takes into account the number of false alarms as well as the area of false alarms. Furthermore, application of this measure of false alarms leads to intuitive receiver-operating characteristic curves.

Keywords: Anti-personnel land-mine detection, sensor-fusion, Bayes, Dempster-Shafer theory, fuzzy probabilities, rules, voting.
Footnote (author affiliation): Also affiliated with the Pattern Recognition Group and the Section of Applied Geophysics, Delft University of Technology, The Netherlands.

1 Introduction
The existence of (abandoned) anti-personnel land-mines in a large number of post-war areas forms a major threat to human lives in these areas. Manual prodding and clearing of land-mines is not sufficient to stop the increase in the number of active mines worldwide. As such, the detection of land-mines by any (technical) means is an important research issue. Our current research [1, 2] focuses on the use of multiple sensors for land-mine detection, as opposed to approaches which use a single sensor. The use of one sensor is generally believed to be insufficient for land-mine detection meeting the requirements of humanitarian demining, because a single sensor has either a false-alarm rate which is too high or a detection rate which is too low. The goals of sensor-fusion are to reduce the probability of false alarms ($P_{fa}$), to increase the probability of detection ($P_d$), or to improve a combination of both.
Related work on mine detection applying sensor-fusion techniques is reported by a large number of authors [3–16]. Most of this work, however, lacks a sufficiently described experimental section. Hence, it is difficult to draw conclusions on a system's performance or to compare it with other systems. Furthermore, the sample sizes of the experiments are small, and in a number of experiments no distinction is made between a training set and a test set of examples, which makes the obtained detection rates optimistically biased. The work described in this paper uses the methods presented at EuroFusion98 [2] and applies them to real data. Since our SPIE contribution [1], new infrared processing has been developed, a new fusion method (voting) has been included, and the methods have been evaluated with independent training and evaluation sets using a leave-one-out evaluation method.
Object            x [m]   y [m]   Metal   Size    Depth
MAUS 1            0.50     0.25   low     large   surface
Mle 72A           0.25     1.75   low     small   surface
PFM 1             0.75     2.88   high    large   surface
Cartridge case    0.50     4.50   high    small   buried
Piquet 62         0.75     6.25   no      small   surface
MD 82B (M14)      0.25     7.63   low     small   surface
Trip wire         0.75     9.50   high    small   surface
Mle 51            0.75    11.50   no      small   surface
TS 50             0.50    13.50   low     large   buried
PMN               0.75    15.50   high    large   surface
Stone             0.75    17.25   no      large   buried
Trip wire         0.75    18.13   high    small   surface
VS 2.2            0.75    19.25   low     large   buried
Mle 51            0.25    20.38   no      small   buried
Gum paper         0.25    21.75   low     small   buried
Mle 59            0.25    23.38   no      small   buried

Mle 59            0.75     1.13   no      small   buried
VS 1.6            0.50     2.63   low     large   buried
Foot print        0.25     3.88   no      large   surface
MK 2              0.25     5.38   high    small   buried
Mortar 60         0.50     7.25   high    large   buried
Cyl. print        0.50     8.50   no      large   surface
PRB 409           0.50    10.38   low     large   buried
Mle 72A           0.25    12.50   low     small   surface
Mle 59            0.25    14.50   no      small   surface
TS 50             0.50    16.50   low     large   buried
Can               0.25    17.75   high    large   buried
VS 69             0.25    18.50   high    large   surface
HEC3A1            0.50    19.50   low     small   surface
Mle 51            0.75    21.25   no      small   buried
BLU 62            0.75    22.50   high    small   buried
PMN               0.75    24.25   high    large   surface

Table 1: All mines and false-alarm objects and their location (x, y) in the test lane.
2 Sensors
Our system has three types of sensors: a dual-frequency metal detector developed by Förster, 3 µm-5 µm and 8 µm-12 µm infrared cameras, and ground penetrating radars (GPR) of ELTA (high ground clearance) and Emrad (low ground clearance). In the following sections we describe how sensor data is acquired and processed in such a way that it is suitable for sensor-fusion.

2.1 Test lane
Within the GEODE project, THOMSON-CSF DETEXIS has constructed a test lane near Paris, France (see also [1]). This test lane consists of an area of 25 square metres (25 m × 1 m) which contains 26 mine objects that are buried or laid on the surface. Additionally, the lane contains six false-alarm objects. Table 1 gives details on the mines and false-alarm objects.

The test lane is divided into three parts representing different types of terrain. The first part is bare agricultural ground, the second part is a vegetation area, and the third part is bare sand. The agricultural part is 15 metres long; the vegetation part and the sand part are each five metres long. For the measurements, the different sensors were attached to a trolley and moved over the test lane.

2.2 Sensor-data acquisition
The use of different kinds of sensors suggests a decision-level fusion process, which is computationally efficient and easier than data-level or feature-level fusion [17]. To perform decision-level sensor-fusion, the raw sensor data must be processed and mapped to obtain decision-level data on a reference grid. Within the GEODE consortium, it was agreed that the sensor processing will produce confidence values on a grid with grid cells of 2.5 cm × 2.5 cm. This ensures detection of even the smallest land-mines.

A confidence value at a certain grid cell expresses a confidence or belief in the presence of a mine at that position. A confidence value has a high correlation with the associated probability of detection but has no statistical meaning. The confidence values are used to indicate an ordering in the probability of detection of an object for a given sensor. This means that a higher confidence value implies a higher probability of a mine, but the two do not have to scale linearly.
2.3 Sensor-data processing
The GPR of ELTA is a high ground clearance GPR. An object detected by this sensor is represented on the grid by a discrete disk with a diameter of 40 cm and a certain confidence value. Two additional rings with lower confidence values and a radius of 15 cm surround the disk to represent uncertainty in the position of detection. The confidence values for these rings are set to 35% and 5% of the confidence value of the disk centre.

The low ground clearance GPR system of Emrad is based upon a four-channel radar with a bandwidth comparable to the expected radar cross-section of anti-personnel mines. Radar data was collected on 5 cm × 5 cm grid cells. The spacing of the antenna elements in the radar is, however, approximately 15 cm. Successive scans were therefore taken with the antennas positioned in such a way that the test bed was measured with the required resolution.

The Förster dual-frequency, continuous-wave Minex metal detector is sensitive to very small quantities of metal. However, the sensor-data processing algorithm does not detect small metal objects which are close to a large metal object, because the weak signals of the small objects are superimposed on the strong signal of the large object. In areas where a strong signal of a large metal object is present, the detector is blind to some degree to other metal objects. This results in a non-zero confidence value for these areas.

Images of two infrared cameras in the wavelength bands 3 µm-5 µm and 8 µm-12 µm were acquired by Marconi. The 3 µm-5 µm camera has a better detection performance than the 8 µm-12 µm camera. This is probably the result of a different SNR in the images of the two cameras (the 3 µm-5 µm camera is of a newer generation). In this paper only the results of the 3 µm-5 µm camera are presented.
3 Performance evaluation
For the evaluation of the performance of the individual sensors and sensor-fusion, we introduce a novel algorithm, denoted SCOOP (Split Clusters On Oversized Patches), which provides a less biased estimate of the performance measured in probability of detection ($P_d$) and probability of false alarm ($P_{fa}$). The evaluation method differs from common performance evaluation methods in the sense that it takes into account the number of false alarms as well as the area of false alarms.

Common approaches either count each misclassified grid point as one false alarm, which yields a pessimistic probability of false alarm, or count the area of the misclassified grid points, which yields an optimistic probability of false alarm because it does not take into account the spatial arrangement of grid points. In other words, a large cluster (connected component) of misclassified grid points yields the same probability of false alarm as the same points distributed over the testbed.

To fuse these two approaches, we propose to measure false alarms by counting the number of false-alarm clusters smaller than a certain size plus the area of the remaining (large) clusters divided by this particular size. The number of resulting false alarms corresponds with the number of times one has to dig (with a scoop of a certain size) in the soil and discover a false alarm. Furthermore, application of this measure of false alarms leads to intuitive receiver-operating characteristic (ROC) curves, whereas common approaches may show counter-intuitive effects because they do not assume a spatial arrangement of the false alarms.

Figure 1 illustrates these effects for a random sensor. The conventional implementation counts the area of the false alarms and marks a mine as detected when it falls within a cluster of grid points classified as mine. For high detection rates (i.e. the realised probability of detection) the number of false alarms decreases because almost all grid points are classified as mine. The SCOOP implementation does not decrease at these high detection rates.

[Figure 1 plot, titled "Random Sensor". x-axis: Number of false alarms [m-2] (0 to 20); y-axis: Detection rate (0 to 1); curves: Conventional ROC implementation, SCOOP ROC implementation.]
Figure 1: Conventional ROC curve for a random sensor and the ROC curve with SCOOP implementation.
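The false-alarm measure proposed in this section can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of 4-connectivity for clusters, the treatment of a cluster exactly equal to the scoop size as "small", and the use of a fractional (unrounded) quotient for large clusters are all assumptions made for illustration.

```python
from collections import deque


def label_clusters(mask):
    """Return the area (in grid cells) of every 4-connected cluster of truthy cells."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill of one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def scoop_false_alarms(false_alarm_mask, scoop_cells):
    """Small clusters count as one false alarm; large clusters count as
    their area divided by the scoop size (assumed: no rounding)."""
    count = 0.0
    for area in label_clusters(false_alarm_mask):
        if area <= scoop_cells:
            count += 1.0
        else:
            count += area / scoop_cells
    return count


# Hypothetical false-alarm mask: clusters of 1, 8 and 2 cells.
demo = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0],
]
print(scoop_false_alarms(demo, scoop_cells=4))  # 4.0 (1 + 8/4 + 1)
```

For the same mask, counting every misclassified grid point as one false alarm would give 11, so the sketch shows how the cluster-based measure sits between the two conventional extremes.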
4 Sensor fusion
Each sensor produces confidence values on a grid covering the testbed. These confidence grids are the input for the sensor-fusion. A confidence value expresses a confidence or belief in a mine detection at a certain position; it is not necessarily a probability, as discussed in Section 2.2. The confidence grid provides a co-registration of the different sensors. The applied sensor-fusion methods are Bayesian approaches, applications of Dempster-Shafer theory, fuzzy probabilities, rules and voting techniques. Because some of the fusion methods require (conditional) probabilities (Bayes) or probability masses (Dempster-Shafer) as input, we introduce appropriate mappings from confidence values to the input type required.
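As an illustration of decision-level fusion on co-registered confidence grids, the voting method summarised in Section 4.1 (threshold each sensor, sum the votes, divide by the number of sensors) might be sketched as follows. The function name, the nested-list grid representation and the use of `>=` for casting a vote are assumptions, not details taken from the paper.

```python
def vote_fusion(grids, thresholds):
    """Fuse co-registered confidence grids by thresholded voting.

    grids: list of 2-D lists, one per sensor, all with the same shape.
    thresholds: one tunable threshold per sensor.
    Returns a grid of fused confidence levels in [0, 1].
    """
    n_sensors = len(grids)
    rows, cols = len(grids[0]), len(grids[0][0])
    votes = [[0] * cols for _ in range(rows)]
    for grid, threshold in zip(grids, thresholds):
        for r in range(rows):
            for c in range(cols):
                # Assumption: a confidence equal to the threshold casts a vote.
                if grid[r][c] >= threshold:
                    votes[r][c] += 1
    # Normalise the vote count by the number of sensors.
    return [[v / n_sensors for v in row] for row in votes]


# Hypothetical 1 x 3 grids for three sensors.
gpr = [[0.1, 0.9, 0.8]]
metal = [[0.0, 0.2, 0.7]]
infrared = [[0.3, 0.1, 0.9]]
fused = vote_fusion([gpr, metal, infrared], thresholds=[0.5, 0.5, 0.5])
print(fused)  # fused levels: no votes, one vote of three, three votes of three
```

A final threshold on the fused grid would then separate mines from background, as the paper notes for all methods except the rule-based one.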
4.1 Fusion methods
The methods are already described in [1, 2], but a short description is given here for completeness.

Our Bayesian approach assumes independence of the sensor outputs. The joint conditional probability of a mine given the sensor values is given by the product of the conditional probabilities per sensor, each of which is assumed to be a linear mapping of the sensor's confidence value:

(1)

with a factor per sensor representing the uncertainty or influence of that sensor. With this mapping it is assured that a higher confidence level leads to a higher a posteriori probability. The uncertainty level can be set by using a training set of examples.

For Dempster-Shafer, the unassigned probability mass is used as a tunable parameter in a similar way as with Bayes. The remaining probability mass is divided over the mass assigned to a mine and the mass assigned to background, with a ratio depending on the sensor's confidence value.

The kernel size of fuzzy probabilities has an effect on the influence of each sensor. The larger the kernel, the less influence the sensor has on the fusion results. So the kernel size is a suitable parameter for adapting the influence of each sensor. The resulting fused confidence grid is the fuzzy conjunction (and-function) of these fuzzy membership functions.

For voting, the threshold for each sensor value is a tunable parameter. The votes are summed and divided by the number of sensors (three in this case) to acquire a grid of fused confidence levels.

The rule-based method is the only method we have evaluated which has more than one parameter per sensor. Each subrule consists of a conjunction of three clauses, each representing a threshold on one sensor. Subrules are combined using the or-function, so only one of the subrules has to be true. The thresholds are determined using a suboptimal heuristic method to limit the number of evaluations.

Except for the rule-based method, all methods produce a fused confidence level map, so a final threshold is needed to discriminate between background and mines.

5 Experiments
We carried out experiments with the sensor-fusion techniques on sensor data obtained from measurements at the GEODE consortium testbed near Paris. The results of the different fusion techniques are compared and evaluated with the above-mentioned SCOOP algorithm.

5.1 Performance of individual sensors
Figure 2 shows the detection results (expressed in confidence values) of the four different sensors on the test lane and the location of the mines. Because the precise location and size of the mines are not known, a virtual box of ... cm is used to represent each mine. When a sensor produces a detection within this virtual box, the mine is counted as detected.
Figure 2: Detection results of the four different sensors on the test lane; from top to bottom: the high ground clearance GPR, the low ground clearance GPR, the ground truth, the metal detector and the infrared camera. The confidence values are encoded in grey: darker shades of grey imply higher confidences.

From the results presented in Figure 2 we can obtain the ROC curves for the different sensors by thresholding the confidence values with different thresholds. Each threshold results in a certain detection rate and a number of false alarms for each sensor. For the calculation of the number of false alarms we use the SCOOP evaluation method, as described in Section 3. Figure 3 presents the ROC curves of the individual sensors.
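The threshold sweep described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name and the box representation are hypothetical, and for brevity the false alarms are counted per misclassified grid point (the pessimistic conventional measure from Section 3) rather than with SCOOP, which the paper actually uses.

```python
def roc_points(confidence, mine_boxes, thresholds, lane_area_m2):
    """Return one (false alarms per m^2, detection rate) pair per threshold.

    confidence: 2-D list of confidence values on the reference grid.
    mine_boxes: list of (row0, row1, col0, col1) virtual boxes, end-exclusive.
    """
    rows, cols = len(confidence), len(confidence[0])
    # Mark every grid cell that lies inside some virtual mine box.
    in_box = [[False] * cols for _ in range(rows)]
    for r0, r1, c0, c1 in mine_boxes:
        for r in range(r0, r1):
            for c in range(c0, c1):
                in_box[r][c] = True

    points = []
    for t in thresholds:
        # A mine is detected when any cell in its box reaches the threshold.
        detected = sum(
            any(confidence[r][c] >= t
                for r in range(r0, r1) for c in range(c0, c1))
            for r0, r1, c0, c1 in mine_boxes
        )
        # Simplification: each above-threshold cell outside all boxes is
        # counted as one false alarm; SCOOP would replace this count.
        fa_cells = sum(
            1 for r in range(rows) for c in range(cols)
            if confidence[r][c] >= t and not in_box[r][c]
        )
        points.append((fa_cells / lane_area_m2, detected / len(mine_boxes)))
    return points


# Hypothetical 2 x 4 confidence grid with two virtual mine boxes.
grid = [[0.9, 0.2, 0.1, 0.6],
        [0.0, 0.0, 0.3, 0.0]]
boxes = [(0, 1, 0, 2), (1, 2, 0, 2)]
print(roc_points(grid, boxes, thresholds=[0.5, 0.25], lane_area_m2=2.0))
# [(0.5, 0.5), (1.0, 0.5)]
```

Lowering the threshold can only add detections and false alarms, so the points trace the ROC curve from left to right as the threshold decreases.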
[Figure 3 plot, titled "Sensor performance". x-axis: Number of false alarms [m-2] (0 to 10); y-axis: Detection rate [-] (0 to 1); curves: High ground clearance GPR, Low ground clearance GPR, Metal detector, Infrared camera.]
Figure 3: ROC curves of the confidence values of the four sensors.

This figure shows that the choice of the best sensor depends on the required maximum number of false alarms. The infrared sensor has the highest detection rates for a wide range of numbers of false alarms, with the exception of an area around 1 false alarm per m². The metal detector has the highest detection rate for just above 1 false alarm per m²; it detects all metal-containing mines (about 2/3) and all metal-containing false alarms. The low ground clearance GPR reaches a 100% detection rate at just above 8 false alarms per m². The high ground clearance GPR only detects large objects, but it has ROC points below 1 false alarm per m², which both the low ground clearance GPR and the metal detector lack.

The sensor processing has been optimised on the whole test lane for each sensor independently. The ROC curves of the sensors may therefore not be reproducible on an independent test set, and they must not be seen as absolute performance measures. The confidence grids of the sensors can only be used to compare the relative performance of the sensors and to compare sensor-fusion methods.

5.2 Sensor-fusion experiments
Training and evaluation of the fusion methods is first performed on the same set of data. As a result, the estimates of the system's performance may be optimistic and the system may not be robust. However, the experiments give us insight into the capability of a fusion method to adapt to a specific data set. Its robustness when other independent data sets are considered will be evaluated further on. For each method its corresponding mapping function and the final threshold are optimised on the complete data set.

[Figure 4 plots. Panel (a), titled "Emrad Foerster Marconi 5 µm"; panel (b), titled "Elta Foerster Marconi 5 µm". x-axis: Number of false alarms [m-2] (0 to 10); y-axis: Detection rate [-] (0 to 1); curves: Best sensor, Dempster-Shafer, Bayes, Rules, Fuzzy probabilities, Voting.]
Figure 4: ROC curves for the different sensor-fusion methods for (a) the sensor combination low ground clearance GPR, metal detector, and infrared, and (b) the sensor combination high ground clearance GPR, metal detector, and infrared.

The results in Figure 4 show that all the sensor-fusion methods lead to better results than individual sensors. No results are encountered in which the performance of an individual sensor is better than any sensor-fusion result. The rule-based method gave the best results. This could be due to the fact that this method has a large number of free parameters, leading to specific adaptation to the sample instead of adaptation to generic mine features. The other fusion methods perform almost equally well; the Bayes implementation seems slightly worse, but this difference may not be significant. This ranking differs from our previous findings with different infrared processing, see [1], where fuzzy probabilities was the best method after rule-based fusion. This means that the ranking of fusion methods, apart from rule-based fusion, is sensor-data dependent. The detection performance of each method depends on how accurately it can separate mines from background, i.e. on its discriminant function. For different sensor data, different fusion methods may produce better discriminant functions. Further research on discriminant functions will help to select the best sensor-fusion method.
The newly implemented voting fusion method performs comparably to the other four sensor-fusion methods.

5.3 Leave-one-out sensor-fusion results
The results of the previous section were acquired by training and evaluating fusion methods on the same data set. In general, this gives biased results, but this is not a major disadvantage since our focus is on comparing sensor-fusion methods on their performance. However, there may be differences between methods in the capability of representing the training set. For instance, the rule-based method has more free parameters than the other methods, so it can possibly represent the training set better. For an unbiased comparison of these methods, with the limited data set we have, a leave-one-out evaluation method is used, see [18]. In the leave-one-out evaluation method, the parameters for each method are acquired on a training set, which contains all but one sample (a region containing one mine and on average 1 m²). The acquired parameters from the training set are tested on the single sample left out (the evaluation set). This is repeated with each mine and its surrounding region as evaluation set. The results from the training and evaluation sets are summed and normalised to acquire a detection rate and a number of false alarms per m².

The leave-one-out evaluation method only makes sense for 100% detection. For lower detection rates different methods will leave out different mines. This leads to jumps in the ROC curve for all methods, which makes comparison difficult. So the leave-one-out evaluation will only give accurate results for 100% detection on the training set, leading to only one point on the ROC curve.

To acquire more points on this ROC curve, the data set should be reduced in a controlled manner. After the first leave-one-out loop, resulting in one ROC point, one mine is completely removed from the data set. The mine that is removed is the mine indicated by the best sensor implementation (with this mine removed the false-alarm rate would decrease the most). Each time a mine is removed, a new and smaller data set is created, which is evaluated using the leave-one-out implementation, leading to a new point on the ROC curve with a lower detection rate.
More mines are removed until only 10 mines are present in the final data set. This specific order of mine removal was chosen because our goal was to make a comparison between sensor-fusion methods and the best sensor implementation. This best sensor implementation can be considered a base-line sensor-fusion method.

[Figure 5 plots. Panel (a), titled "Training set"; panel (b), titled "Evaluation set". x-axis: False alarms [m-2] (0 to 10); y-axis: Detection rate (0 to 1); curves: Best sensor, Bayes, Dempster-Shafer, Rules, Fuzzy Probabilities, Voting.]
Figure 5: ROC curves for the different sensor-fusion methods evaluated with leave-one-out for the sensor combination low ground clearance GPR, metal detector, and infrared. The training set results are given in (a) and the evaluation set results are given in (b).

The results of this leave-one-out evaluation in Figure 5 show that some of the sensor-fusion methods perform better than the best sensor on the evaluation set. Furthermore, the method with the most parameters, the rule-based method, performs best on the training set but has the worst performance on the evaluation set. This performance loss confirms what we already expected, see [1]. The Dempster-Shafer implementation seems to be the most robust, while fuzzy probabilities gives very unpredictable results.
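The leave-one-out procedure with controlled mine removal described in this section can be outlined in Python. This is a sketch only: `train`, `evaluate` and `pick_mine_to_remove` are hypothetical callbacks standing in for the method-specific parameter optimisation, the evaluation on a held-out region, and the best-sensor removal criterion. Normalising the detection rate by the original number of mines (so that removed mines count as missed) is an assumption about how the lower detection rates arise.

```python
def leave_one_out(samples, train, evaluate, n_total=None):
    """One leave-one-out pass over the samples (one region per mine).

    train(training_samples) -> parameters
    evaluate(parameters, sample) -> (detected, false_alarms, area_m2)
    Returns (detection rate, false alarms per m^2) on the held-out samples.
    """
    n_total = len(samples) if n_total is None else n_total
    detected_total = 0
    false_alarm_total = 0.0
    area_total = 0.0
    for i, held_out in enumerate(samples):
        training = samples[:i] + samples[i + 1:]  # all but one sample
        params = train(training)
        detected, false_alarms, area_m2 = evaluate(params, held_out)
        detected_total += int(detected)
        false_alarm_total += false_alarms
        area_total += area_m2
    return detected_total / n_total, false_alarm_total / area_total


def roc_by_mine_removal(samples, train, evaluate, pick_mine_to_remove,
                        min_mines=10):
    """Repeat leave-one-out on shrinking data sets to get several ROC points."""
    remaining = list(samples)
    points = []
    while True:
        points.append(
            leave_one_out(remaining, train, evaluate, n_total=len(samples)))
        if len(remaining) <= min_mines:
            break
        remaining.remove(pick_mine_to_remove(remaining))
    return points


# Toy usage: a sample is (mine confidence, clutter confidences), area 1 m^2.
def train(training):
    # Lowest threshold that still detects every training mine.
    return min(mine for mine, _ in training)


def evaluate(threshold, sample):
    mine, clutter = sample
    return mine >= threshold, sum(c >= threshold for c in clutter), 1.0


toy = [(0.9, [0.2]), (0.8, [0.85]), (0.3, [0.1, 0.5])]
weakest = lambda remaining: min(remaining, key=lambda s: s[0])
print(roc_by_mine_removal(toy, train, evaluate, weakest, min_mines=2))
```

In the toy run each pass yields one ROC point, and removing the weakest mine lowers both the detection rate and the false-alarm density of the next point.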
6 Conclusions
For the same training and evaluation set, all sensor-fusion methods perform better than the best sensor; the rule-based method has the best performance. The results of independent training and evaluation sets, obtained by using a leave-one-out method, show that the Dempster-Shafer implementation still performs better than the best sensor. The decrease in performance of the rule-based method is the largest, which is in line with our expectations. This method is clearly overtrained due to its many optimisation parameters. The fuzzy probabilities method gives very unpredictable results. The newly implemented voting fusion method performs similarly to the other methods. The actual performance of these methods (if a very large data set is used) will be somewhere between the results of the training and evaluation sets.
Acknowledgements
This research is partly funded by the European Union as ESPRIT project GEODE, number 26337. The consortium consisted of the following companies: Emrad Limited, United Kingdom; ELTA Electronics Industries Ltd, Israel; Institute Dr. Förster, Germany; Marconi Communications, Italy; THOMSON-CSF DETEXIS, France; and TNO Physics and Electronics Laboratory, The Netherlands.
References
[1] E. den Breejen, K. Schutte, and F. Cremer, “Sensor fusion for anti personnel landmine detection, a case study,” in Dubey and Harvey [19].
[2] F. Cremer, E. den Breejen, and K. Schutte, “Sensor data fusion for anti-personnel land-mine detection,” in Bedworth and O’Brien [20], pp. 55–60.
[3] M. G. J. Breuers, P. B. W. Schwering, and S. P. van den Broek, “Sensor fusion algorithms for the detection of land mines,” in Dubey and Harvey [19].
[4] B. A. Baertlein and A. Gunatilaka, “Improving detection of land mines through sensor fusion,” in Proc. SPIE Vol. 3392: Detection and Remediation Technologies for Mines and Minelike Targets III (A. C. Dubey, J. F. Harvey, and J. T. Broach, eds.), (Orlando (FL), USA), Apr. 1998.
[5] J. E. Baker, “Adaptive multi-sensor integration for mine detection,” in Dubey and Barnard [21], pp. 452–466.
[6] B. Bargel et al., “Model-based sensor fusion for minefield detection,” in Dubey et al. [22], pp. 509–518.
[7] G. A. Clark et al., “Data fusion for the detection of buried land mines,” in Proceedings of the International Symposium on Substance Identification Technologies, (Innsbruck, Austria), Oct. 1993.
[8] M. Fritzsche and O. Löhlein, “Multisensor fusion for the detection of buried landmines,” in Bedworth and O’Brien [20], pp. 93–100.
[9] R. Garriott et al., “A multisensor system for mine detection,” in Dubey et al. [23], pp. 259–268.
[10] T. J. Gorman, “IGMMDT: A multisensor approach to mine detection,” in Dubey et al. [23], pp. 269–274.
[11] T. Hanshaw, “Multi-sensor application for mines and mine-like target detection in the operational environment,” in Dubey et al. [23], pp. 249–258.
[12] T. Hanshaw, “Multi-sensor fusion for the detection of mines and mine like targets,” in Dubey et al. [22], pp. 152–158.
[13] P. Jägerbro et al., “Combination of GPR and metal detector for mine detection,” in MD98 [24], pp. 177–181.
[14] D. W. McMichael, “Data fusion for vehicle-borne mine detection,” in MD98 [24], pp. 167–171.
[15] X. Miao et al., “Detection of mines and minelike targets using principal component and neural-network methods,” IEEE Transactions on Neural Networks, vol. 9, May 1998.
[16] K. Scheerer, “Airborne multisensor system for the autonomous detection of land mines,” in Dubey and Barnard [21], pp. 478–486.
[17] E. Waltz and J. Llinas, Multisensor data fusion. Artech House, 1990.
[18] S. Weiss and C. Kulikowski, Computer systems that learn. Morgan Kaufmann Publishers, 1991.
[19] A. C. Dubey and J. F. Harvey, eds., Proc. SPIE Vol. 3710: Detection and Remediation Technologies for Mines and Minelike Targets IV, (Orlando (FL), USA), Apr. 1999.
[20] M. Bedworth and J. O’Brien, eds., Proceedings of the International Conference on Data Fusion (EuroFusion98), (Great Malvern, UK), Oct. 1998.
[21] A. C. Dubey and R. L. Barnard, eds., Proc. SPIE Vol. 3079: Detection and Remediation Technologies for Mines and Minelike Targets II, (Orlando (FL), USA), Apr. 1997.
[22] A. C. Dubey, I. Cindrich, and J. M. Ralston, eds., Proc. SPIE Vol. 2496: Detection Technologies for Mines and Minelike Targets, (Orlando (FL), USA), Apr. 1995.
[23] A. C. Dubey, R. L. Barnard, and C. J. Lowe, eds., Proc. SPIE Vol. 2765: Detection and Remediation Technologies for Mines and Minelike Targets, (Orlando (FL), USA), Apr. 1996.
[24] Proceedings of the Second International Conference on Detection of Abandoned Land Mines, (Edinburgh, UK), Oct. 1998.