Subsurface Sensing Technologies and Applications Vol. 3, No. 3, July 2002 (2002)
J. K. Paik, C. P. Lee, and M. A. Abidi, "Image Processing-Based Mine Detection Techniques Using Multiple Sensors: A Review." Subsurface Sensing Technologies and Applications: An International Journal, Vol. 3, No. 3, pp. 153-202, July 2002.
Image Processing-Based Mine Detection Techniques: A Review

Joonki Paik,* Cheolha P. Lee, and Mongi A. Abidi

Imaging, Robotics, and Intelligent Systems Laboratory, Department of Electrical and Computer Engineering, The University of Tennessee, Knoxville

Received October 2, 2001; revised February 8, 2002

Various mine detection techniques are reviewed with particular emphasis on signal and image processing methods. Based on the target, mines are classified into two types: anti-tank mines (ATM) and anti-personnel mines (APM). Because of the variety of mine types, current mine detection techniques are diversified. The assumption is made that most mine detection techniques consist of sensor, signal processing, and decision processes. For the sensor part, ground penetrating radar (GPR), infrared (IR), and ultrasound (US) sensors are reviewed and their characteristics are summarized for the corresponding output signals. For the signal processing and decision parts, a set of image processing techniques including filtering, enhancement, feature extraction, and segmentation is surveyed. Segmentation is used to extract the mine signal from various competing signals. For most image processing techniques covered by this paper, mine detection related experimental results are included or reproduced from existing works.

Key Words. Mine detection, anti-personnel mine, anti-tank mine, unexploded ordnance, ground penetrating radar, infrared sensor, ultrasound sensor, image processing, filtering, segmentation, enhancement.
*To whom all correspondence should be addressed. Post: 331 Ferris Hall, 1508 Middle Drive, The University of Tennessee, Knoxville, TN 37996-2100; fax: (865) 974-5459; e-mail: [email protected]

1566-0184/02/0700-0153/0 2002 Plenum Publishing Corporation

1. Introduction

More than 26,000 people are killed or maimed by mines every year, which is equivalent to one victim every 20 min. For example, in Cambodia one out of every 236 people is a landmine amputee. The casualty ratio rises to one out of every 140 people in Angola, which has more mines than people.

Table 1. Worldwide Landmine Distribution and Clearance Status

Country       Mines (million)     Cleared       Mined area   Cleared area   Casualties (c)
              UN (a)   USSD (b)   mines         (km2)        (km2)
Afghanistan   10       7          158,000       550~780      202            300~360/month
Angola        15       15         10,000        Unknown      2.4            120~200/month
Bosnia        3        1          49,010        300          84             50/month
Cambodia      6        6          83,000        3000         73.3           38,786 or 100/month
Croatia       3        0.4        8000          11,910       30             677
Egypt         23       22.5       11,000,000    3,910        924            8301
Eritrea       1        1          Unknown       Unknown      2.48           2000
Iran          16       16         200,000       40,000       0              6000
Iraq          20       10         37,000        Unknown      1.25           6715
Laos          NA       NA         251           43,098       Unknown        10,649 or 16~18/month
Mozambique    3        1          58,000        Unknown      28             1759
Somalia       1        1          32,511        Unknown      127            4500
Sudan         1        1          Unknown       800,000      0              700,000
Vietnam       3.5      3.5        58,747        Unknown      65             180/month

(a) UN Landmine Database 1997 [1].
(b) US State Department Report "Hidden Killer 1998. The Global Landmine Crisis" [2].
(c) Casualty reporting varies drastically among countries; estimates provided by UN or the host government.

In addition to fatal casualties and enormous financial losses, mines
ruin large areas of fertile farmland and waterways. In Cambodia, approximately 40% of the rice fields have been mined and abandoned [1]. Most tragic is that many victims are children, and most mine-afflicted countries are poverty stricken as well. Worldwide landmine distribution and clearance status are summarized in Table 1.

Because of the potentially catastrophic results of unintentional mine encounters, the process of detecting and removing mines, called demining, is particularly important. Manual demining is extremely dangerous; one deminer has been killed for every 2,000 mines removed, with even more civilian victims. The cost to purchase and lay a typical antipersonnel mine ranges from $3 to $30, while the cost to remove a single mine ranges from $300 to $1000. The European Commission and the United States have invested 138 million dollars in demining actions during the last two years, but the mines cleared so far are just the tip of the iceberg [1]. In 1994, approximately 200,000 mines were removed, while two million new mines were planted. Many experts believe that it would take more than ten centuries to remove every mine in the world at the current clearance rate, even if no additional mines were planted [3].

Because mines can be made of both metallic and nonmetallic materials, detection using only conventional metal detectors cannot give a promising result. Reports also indicate that metal detectors are subject to many false
alarms in former battlefields due to the presence of small fragments of munitions. Although manual detection, called probing, works well for a wide variety of mines, the high labor cost and slow pace involved are encouraging the development of other techniques.

Although some military demining equipment was developed and used by the US Army during the Gulf War, civilian-related demining, called humanitarian demining, is quite different from the military work. The object of humanitarian demining is to find and remove abandoned landmines without any hazard to the environment. These landmines were intended for military use when they were planted, but their duty has expired. Furthermore, humanitarian demining equipment is required to be more accurate than military-purpose equipment because the military can afford a certain degree of casualty risk. The UN requires a detection probability of 99.96% for a 4 cm radius object at a 10 cm depth, with localization to within a 0.5 m radius [1].

To meet the strict requirements for humanitarian demining, various techniques in the areas of sensor physics, signal processing, and robotics have been studied during the last decade. This paper surveys a wide variety of demining technologies with emphasis on various sensors and related signal processing techniques. The scope of this paper is limited to humanitarian demining rather than the military approach.

Most mine detection techniques consist of sensor, signal processing, and decision processes. For the sensor part, ground penetrating radar (GPR), infrared (IR), and ultrasound (US) sensors are reviewed and their characteristics are summarized for the corresponding output signals. For the signal processing and decision parts, a commonly used set of image processing techniques including filtering, enhancement, feature extraction, and segmentation is surveyed. Segmentation is used to extract a mine signal from various competing signals.
Two sets of minefield data collected from test minefields were used to show how the mine detection algorithms work with an inhomogeneous background [4,5].

This paper is organized as follows. In Section 2 various types of mines are classified based on the target. Section 3 summarizes three different sensor technologies: GPR, IR, and US sensors. In Section 4 signal and image processing techniques for mine detection are reviewed with a comprehensive set of experimental results. Section 5 provides several examples, which show how general image processing algorithms can be applied to a specific mine detection case, and Section 6 concludes the paper.

2. Classification of Mines

Various types of mines have been manufactured and laid. According to the potential target, mines can be classified into antitank mines (ATM) and antipersonnel mines (APM). Typical specifications for these two types of mines, together with unexploded ordnance (UXO), are summarized in Table 2. Generally, UXO represents misfired shells or unexploded bombs that still remain for some reason. UXOs are usually found beneath former battlefields. Since UXO has a collective meaning including various types of mines, we will present details for only ATMs and APMs in this section.

Table 2. Typical Specifications of Three Different Types of Mines [6]

Type                  UXO (a)                ATM                APM
Target                Unspecified, general   Vehicle            Human
Weight                Various                Heavy (6~11 kg)    Light (0.1~4 kg)
Size (in diameter)    Various                Large (13~40 cm)   Small (6~15 cm)
Case material         Mostly metal           Metal, plastic     Plastic
Detonation pressure   Unpredictable          120 kg (b)         0.5 kg (b)

(a) Some references define UXO in a more general category, including all kinds of mines.
(b) The values present the minimum pressure to detonate the most sensitive mine in each category.
2.1. Anti-Tank Mine (ATM)

Most ATMs are made of metallic material, and they are larger than APMs, as indicated in Table 2. Since they have been designed to destroy vehicles, their detonation pressure is very high and they generate large metallic splinters on explosion. Two typical ATMs are shown in Figure 1. The TM-62M is a larger-sized metallic case mine with a diameter of 31 cm [6]. This device's detonator is so insensitive that a human can approach it without causing an explosion. The TMA-2 is a different type of ATM built in a plastic case. Table 3 summarizes specifications for the TM-62M and TMA-2.

Figure 1. Two typical ATMs: (a) TM-62M and (b) TMA-2 [6].

Table 3. Specifications for Two Different ATMs (TM-62M and TMA-2)

Model No.          TM-62M                           TMA-2
Dimensions         Height 112 mm, diameter 316 mm   Height 140 mm, width 260 × 200 mm
Weight             8.47 kg                          7.5 kg
Case               Steel                            Plastic
Sensitivity        200 kg                           120 kg
Manufact. nation   Former Soviet Union              Former Yugoslavia
2.2. Anti-Personnel Mine (APM)

APMs are the most difficult type of mine to find and remove, and most civilian victims have been injured by this type of mine. Most APMs are made of nonmetallic material, and they are much smaller than ATMs. APMs' detonators are so sensitive that less than 10 kg of pressure can make them explode. APMs can be divided into three types: (i) blasting, (ii) bounding fragmentation, and (iii) directional fragmentation [6,7].

The blasting type mines are the most common targets for humanitarian demining work. A blasting mine is smaller and lighter than the other types of mine. Blasting mines are usually buried underground, but some models can be scattered by an airplane or floated on a river. For this reason, they can be found on the surface, underground, and at the riverside. Because of its simple mechanism and low material cost, small military groups can easily manufacture this type of mine. Such haphazard manufacturing and deployment of the blasting-type APM has resulted in serious mine problems, especially for poorer countries that cannot afford to invest in demining work.

The bounding fragmentation type mines are larger than the blasting type. This type of mine can destroy a larger area, while the blasting type mines can damage only a target within a limited distance. Bounding fragmentation mines are either buried underground or deployed on the surface. Direct pressure or a trip wire activates their detonators. Once the trigger is activated, they bounce up to a given altitude and explode, with their lethal fragments spreading over an area of up to 30 m radius.

Most directional fragmentation type mines are deployed on the surface, and during explosion they spread their fragments in a specific direction. Some models' lethal range reaches over 200 m. Since they are detonated by manual operation as well as by a trip wire, this type of mine is sometimes considered an active weapon.

Some notable APMs are shown in Figure 2, and Table 4 summarizes their specifications. Both the PRB-M35 and PMN are blasting-type mines, which can be detonated by 8 kg of pressure. The PRB-M35 is one of the smallest mines, with a diameter of approximately 6 cm, which is as small as the diameter of a Coke can. If these mines are buried or scattered on ground covered with vegetation, they are very difficult to find and eliminate. Even lighter mines can be spread by floating on water, and their distribution is unpredictable after heavy rains or flooding. The PMN is another example of a cheap, nonmetallic mine, with a cover made of rubber plate. The Valmara-69 is a bounding fragmentation type mine. Once detonated, the device is propelled upward and explodes, with over 2,000 fragments spread over an area 27 m in radius. The MON-100 is a directional fragmentation type mine. Its lethal range reaches over 100 m, covering a 9.5 m arc.

Figure 2. Typical APMs: (a) PRB-M35, (b) PMN, (c) VALMARA-69, and (d) MON-100 [6].
Table 4. Specifications for APMs in Figure 2 [4]

Model No.          PRB-M35    PMN                   VALMARA-69                                  MON-100
Type               Blasting   Blasting              Bounding fragment                           Directional fragment
Height             58 mm      56 mm                 105 mm                                      82 mm
Diameter           64 mm      112 mm                130 mm                                      236 mm
Weight             158 g      600 g                 3.3 kg                                      5.0 kg
Case               Plastic    Rubber                Plastic                                     Steel
Sensitivity        8 kg       8 kg                  10.8 kg directly, 6 kg through trip wire    Depends on fuses
Lethal range       NA         NA                    Radius 27 m                                 100 m by 9.5 m arc
Manufact. nation   Belgium    Former Soviet Union   Italy                                       Former Soviet Union
3. Sensor Technology

Since World War II, various kinds of sensors have been employed for detecting mines. In this section we introduce three different types of sensors that have made major contributions in the mine detection field. Brief specifications and the low-level, hardware-related signal processing techniques needed to operate each sensor are also provided. Hardware-independent, high-level signal and image processing techniques are discussed in the next section.

3.1. Ground Penetrating Radar (GPR)

GPR is an active sensor that emits electromagnetic (EM) waves through a wideband antenna and collects signals reflected from its surroundings. The principle of GPR is almost the same as that of a seismic wave measurement system except for the carrier signal. The EM waves commonly used for GPR lie in the frequency band between 100 MHz and 100 GHz [7]. This band is wide enough to carry the necessary information.

Reflection occurs when the emitted signal encounters a surface between two electrically different materials. The direction and intensity of the reflection depend on the roughness of the surface and the electrical properties of the medium [7]. A rough surface reflects the incident wave in a diffuse manner, while a smooth surface tends to reflect the wave in one direction, where the angle between the surface normal and the reflected wave is equal to the angle between the surface normal and the incident wave. The electrical properties of the medium determine the amount of refraction and absorption of the EM waves and subsequently affect the direction and intensity of the reflection.

The penetration depth of the wave into soil usually depends on two factors, the humidity of the soil and the wavelength of the EM wave [7]. Water in the soil significantly reduces the penetration depth of waves with relatively shorter wavelengths. Based on the reflection and penetration properties, GPR works best with low-frequency EM waves in dry sand.
Low-frequency signals, however, tend to produce low-resolution maps of data, which decreases the accuracy of mine detection. Since the EM waves cannot penetrate water, GPR cannot detect underwater mines, which are common in many countries [9].

GPR provides information on both the existence and location of mines. The presence of an object is detected by checking for interruption along the round-trip path of the signal. The distance between the sensor and an object is measured by using the time delay, $\Delta t$, between the emitting and receiving moments of the signal as

$$R = \frac{v}{2}\,\Delta t \tag{1}$$

where $v$ represents the velocity of the EM wave in the medium, and $R$ the distance of the object from the sensor [7]. Since many parameters of the EM waves, including the velocity, vary according to the content of the soil, soil parameters should be estimated prior to taking the measurement [10].

3.1.1. A, B, and C-Scan

GPR data can be represented in three different forms, A, B, and C-scans, according to the scanning dimension. Figure 3 shows the 3D coordinate system defined on a section of earth, where the xy-plane represents the ground surface and the z-axis represents the direction into the ground.

Figure 3. The 3D coordinate system defined on a section of ground.

The A-scan signal is obtained by a stationary measurement after placing an antenna above a specific position, such as $(x', y')$ in Figure 3. The collected signal is presented as signal strength versus time delay. Figure 4 shows an example of an A-scanned signal acquired using an ultra-wide band (UWB) GPR under laboratory conditions. The horizontal axis of the one-dimensional (1D) graph in Figure 4 corresponds to the direction of the z-axis with origin at $(x', y')$, which is depicted by the downward arrow originating from $(x', y')$ in Figure 3. A PMN APM, as shown in Figure 2b, of 112 mm diameter and 56 mm height, was buried at 5 cm depth in a sandbox of 50 × 50 cm dimensions for the measurement. Measurements were repeated 2500 times at intervals of 1 cm in both the x and y directions. Five hundred data points were sampled at intervals of 10 psec per measurement [11].

Figure 4. An example of an A-scan signal (1 × 500) [11].

As shown in Figure 4, there are two peaks in the range between data points 50 and 200. They indicate interruptions along the downward path. The positions of these peaks correspond to the distances between the antenna and the various reflecting surfaces. The first peak represents the air-to-ground reflection, and the second peak represents the target mine. An A-scanned signal measured at the position $(x', y')$ is a 1D signal and can be mathematically represented as

$$f_A(z) = f(x, y, z)\big|_{x = x',\, y = y'} \tag{2}$$
where $z$ varies from 1 to $N$, the total number of data samples [9].

The B-scan signal is obtained as the horizontal collection from the ensemble of A-scans. The collected signal is presented as intensity on the plane of scanned width versus time delay. Therefore, the B-scanned signal measured at $y = y'$ can be considered as a 2D signal and can be represented as

$$f_B(x, z) = f(x, y, z)\big|_{y = y'} \tag{3}$$

where $x$, the horizontal position of measurement, varies from 1 to $L$, the maximum width of the antenna locus, and $z$ varies from 1 to $N$, the number of data samples at each measurement [9]. A 2D B-scanned signal is depicted by the vertical plane containing multiple A-scans in Figure 3.

Figure 5 shows the B-scanned signal for the same object as the A-scan shown in Figure 4. One B-scan consists of 50 A-scans. The vertical axis corresponds to the horizontal axis of the A-scan shown in Figure 4, and the horizontal axis represents the scanned width, which is the number of A-scans. The intensity or color of each pixel indicates the signal strength, and corresponds to the vertical axis of Figure 4. The horizontal line at data point 100 on the vertical axis corresponds to the air-to-ground surface, and the hyperbola-shaped object at data point 150 on the vertical axis corresponds to the target mine. The A-scan could detect only the existence of the two objects in Figure 4, but the B-scan can distinguish a mine-like target from the air-to-ground surface and can give more information about the position of the object, as shown in Figure 5.

Figure 5. An example of a B-scan (500 × 50) [11].
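As a rough illustration of how Eq. (1) turns the sample positions discussed above into a distance, the following sketch converts the gap between the surface reflection (data point 100) and the target reflection (data point 150) into a depth estimate. The relative permittivity of 4 for dry sand, the low-loss relation $v = c/\sqrt{\varepsilon_r}$, and the function names are assumptions introduced here for illustration only; they are not taken from [11].

```python
import math

C0 = 299_792_458.0          # speed of light in vacuum [m/s]
SAMPLE_INTERVAL_S = 10e-12  # 10 psec per data point, as stated for the A-scan


def wave_velocity(rel_permittivity: float) -> float:
    """EM velocity in a low-loss, nonmagnetic medium (assumed model)."""
    return C0 / math.sqrt(rel_permittivity)


def range_from_delay(delay_s: float, velocity: float) -> float:
    """Eq. (1): one-way distance from a round-trip time delay."""
    return velocity * delay_s / 2.0


# Data points of the two reflections read from the B-scan description.
surface_idx, target_idx = 100, 150

# Round-trip delay spent inside the soil, between the two reflections.
delay = (target_idx - surface_idx) * SAMPLE_INTERVAL_S

# Hypothetical relative permittivity of dry sand (assumption, not measured).
depth_m = range_from_delay(delay, wave_velocity(4.0))

print(f"delay = {delay * 1e9:.2f} ns, depth below surface = {depth_m * 100:.1f} cm")
```

Under these assumptions the estimate comes out at a few centimeters. The result scales directly with the assumed velocity, which is one reason the soil parameters should be estimated before the measurement.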
The C-scan signal is obtained from the ensemble of B-scans, measured by repeated line scans along the plane. The collected C-scanned signal forms a 3D signal, which is depicted by the hexahedron shown in Figure 3. In the 3D coordinate system, the x and y axes respectively represent the horizontal and the vertical positions of the target, and the z-axis represents the depth of the target. A 3D C-scan signal can be represented as

$$f_C(x, y, z) \tag{4}$$

where $x$ and $y$ vary from 1 to $L$ and 1 to $M$, respectively, and $z$ varies from 1 to $N$ [9]. $L$ and $M$ represent the planar size of the scanned area, and $N$ represents the total number of data samples taken at each measurement. Since visualization of three-dimensional data is not easy, a C-scan is usually represented by a collection of horizontal slices, that is, xy-planes at specific positions on the z-axis. Each slice corresponds to a certain depth level, which is equivalent to a position on the vertical axis of the B-scan.

Figure 6 shows the C-scan of the same object used to obtain Figure 4 and Figure 5. Figure 4 was acquired at the point (25, 25) heading down. Figure 5 was acquired along the line $y = 25$, also heading down. Figure 6a was acquired at 1.33 nsec after the signal emission, and the subsequent images were acquired at intervals of 0.03 nsec. Although we cannot clearly define the air-to-ground boundary from these images, we can distinguish the target mine from its background and can roughly figure out the shape of the target. Figure 6d shows the top of the PMN APM, consisting of a small detonating cap and a large cylindrical case. Figure 6i shows the bottom part of the target. The measured size of the target is larger than its real size.
Figure 6. An example of a C-scan (50 × 50) [11], consisting of horizontal slices (a) through (l) at data points 133 to 166 at intervals of 3 depth points.
The diameter of a PMN is 112 mm, and the size of the scanned area is 50 × 50 cm. This magnification distortion in the C-scan can be reduced by fusing the corresponding B-scan data [10].
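The relation between the three scan types in Eqs. (2) to (4) can be sketched as array slicing. The example below is a minimal illustration that uses a random placeholder cube in place of measured data; the array layout `(x, y, z)`, the variable names, and the use of NumPy are assumptions made here, while the sizes (50 × 50 positions, 500 samples at 10 psec) follow the measurement described above.

```python
import numpy as np

L, M, N = 50, 50, 500        # scan positions in x and y, samples per A-scan
SAMPLE_INTERVAL_S = 10e-12   # 10 psec per data point

# Placeholder cube standing in for the measured C-scan f_C(x, y, z).
rng = np.random.default_rng(0)
c_scan = rng.standard_normal((L, M, N))

# The text uses 1-based positions; NumPy indices are 0-based.
x0, y0 = 25 - 1, 25 - 1

a_scan = c_scan[x0, y0, :]   # Eq. (2): 1D signal at (x', y')
b_scan = c_scan[:, y0, :]    # Eq. (3): 2D signal along the line y = y'

# Horizontal slices at data points 133, 136, ..., 166 (12 slices).
depth_points = np.arange(133, 167, 3)
slices = [c_scan[:, :, k - 1] for k in depth_points]

# Time delay of each slice after signal emission, in nanoseconds.
slice_times_ns = depth_points * SAMPLE_INTERVAL_S * 1e9

print(a_scan.shape, b_scan.shape, len(slices), slices[0].shape)
```

With this layout, `b_scan.T` gives the 500 × 50 orientation used for display, with the time delay on the vertical axis.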
3.1.2. Preprocessing

In this subsection low-level signal processing techniques related to GPR are presented. As Figure 5 indicates, a target tends to show a hyperbolic shape in the B-scan because the EM wave propagates in an omnidirectional manner. Figure 7 shows the B-scan acquisition process. An antenna moves along a line parallel to the surface and acquires the reflected signal from the object at regular intervals. Each vertical line indicates an A-scan, and the black dots represent the position of the impulse, which indicates the existence of the object. Since the object is closest to position A, the time delay measured at A is as short as $d$, while the time delay measured at B or C is equal to $d + \Delta t$. The time delay $\Delta t$ gives information about the position of the object. This delay can also vary with the local soil conditions because the electrical properties of the medium affect the velocity of the EM wave.

Figure 7. A schematic diagram of the B-scan acquisition process [10].

Figure 8. Clutter removal: (a) original GPR image and (b) image with background clutter removed [14].

The curvature of the hyperbola gives information about the existence and position of the object and the soil condition. Hyperbolas can be detected by using the Hough transformation. In detecting hyperbolas, each hyperbola must be separated from the background clutter [10]. Although the Hough transformation may detect the existence of hyperbolas, it does
not reduce the magnification distortion inherent in the C-scan. Migration is a technique used to provide the exact physical location and shape of the reflectors in the subsurface. The goal of migration is to recombine scattered measuring points into one position [10], which means recombining the black dots shown in Figure 7 around position A.

One critical problem in GPR data processing is how to remove the air-to-ground reflection, which appears as the dominant horizontal lines in Figure 8a. Background clutter, caused by inhomogeneous soil content, poses another problem. The primary sources of clutter, which are sensor-dependent, include man-made metal objects, natural rocks, vegetation, and rough terrain, to name a few. Clutter removal is a daunting task in mine detection, and a sensor calibration-based clutter estimation technique, for example, was proposed in [12]. Some experimental results with natural rocks and vegetation terrain are provided in the following section. Although those results are related to minefield IR data, they give an idea, by analogy, of image processing techniques for clutter removal with GPR data. The GPR was studied as a sensing tool for the subsurface environment in [13].

3.2. Infrared (IR) Sensor

IR radiation is the portion of the EM spectrum lying between the visible and microwave regions, with wavelengths between 0.75 µm and 1 mm [7]. Although all EM radiation produces heat, IR radiation can be more readily detected in the form of heat. Heated materials provide good
sources of infrared radiation. For this reason, IR radiation is also referred to as thermal radiation. Since visualization is easier than with other sensors, IR has been widely used for mine detection. Another advantage of IR is that its data do not need as much preprocessing as GPR data. However, the performance of IR is highly dependent on the environment at the moment of measurement. There are two different methods for sensing IR waves. The passive IR system senses only natural radiation from the object, while the active IR system provides an extra heat source and receives the artificial radiation created by that heat source [15].
3.2.1. Detectors

The IR detector is a transducer that converts the energy of EM radiation into an electrical signal. There are two types of IR detectors, the photon and the thermal detectors [16]. The photon detector, or counter, essentially measures the rate of quantum absorption, whereas the thermal detector measures the rate of energy absorption [16]. Therefore, the photon detector is a selective IR detector, responding only to those photons with sufficiently short wavelengths. The response at other wavelength ranges is proportional to the rate at which photons of that wavelength are absorbed. Thermal detectors respond only to the intensity of absorbed radiant power regardless of the spectral content [16]. Thus, they respond equally well to radiant energy of all wavelengths.
3.2.2. Dynamic Thermography

The general concept of using IR thermography for mine detection is based on the fact that mines may have different thermal properties from the surrounding material. If the response is due to an energy flux that varies with time, the objects will follow a temperature curve that does not coincide with that of the soil. When this contrast is produced by alteration of the heat flow due to the presence of the buried mine, it is called the volume effect [17]. On the other hand, when the contrast results from the disturbed soil layer created by the burying operation, it is called the surface effect [17]. The surface effect is detectable for only a limited time after burial. During this detectable period the thermal contrast is quite distinctive. The two effects are shown in Figure 9.

Figure 9. Thermal effects: (a) volume effect and (b) surface effect.

Once a sequence of images has been acquired, various processing techniques can be applied to enhance the contrast between the potential targets and the background. This is called dynamic thermography [18]. In order to obtain sample data, seven different types of mines were laid on or under an approximately 20 × 20 cm test area. Specific types of mines and their locations (ground truth) are summarized in Table 5.

Table 5. Summary of Various Types of Mines and their Placement [4]

Type of mine   Type of placement   Ground truth (a)
M15            Underground         67.50
M19            Underground         139.42
PGMDM          Surface             191.42
RAAM           Surface             222.42
FFV028         Underground         75.109
TN62           Underground         131.109
VS16           Underground         204.112

(a) This parameter represents the position on a 256 × 256 digitized grid.

Figure 10. The contrast-enhanced first image frame of a 256 × 256 IR image sequence for detecting mines as described in Table 5 [4].

Figure 11. The time-varying images of the area of interest (222 × 140), as indicated by the dark outline in Figure 10.

Figure 10 shows
the first image frame of the dynamic thermography sequence for detecting mines as described in Table 5. The data were collected at the test minefield in Fort Belvoir, Virginia by using an E-OIR, Amber Galileo, LWR sensor, which can detect in the 3 to 5 µm band. The sensor was located on a tripod in a remote location. From 3 pm to 11 pm, data were captured every 15 min [4]. Six sampled images out of the ensemble are shown in Figure 11. In the acquired images, contrast is very low and noise dominates the signal. Therefore, post-processing techniques for contrast enhancement and noise removal are necessary to analyze the IR data. Such signal and image processing techniques are discussed in Section 4. More IR-related research results can be found in [19–21].

3.3. Ultrasound Sensor

The audio frequency range is between 20 and 20,000 Hz. Ultrasound waves occupy the frequency band above this audible range. The principle of ultrasound sensing systems is very similar to that of GPR, except that ultrasound uses much lower frequency waves than the GPR system. The ultrasound system emits ultrasound signals and collects the signals reflected from the surroundings. Note that a sound wave propagates as a mechanical disturbance of molecules in the form of waves [9], while a radar signal makes no physical disturbance in the medium. When a sound wave propagates through a medium, the wave consists of the molecules of the medium oscillating around their equilibrium positions.
Table 6. Speed of Sound in Different Media [9]

Material      Speed of sound [m/s]
Steel         5000
Lead          1300
Water         1460
Soft tissue   1500
Bones         2500–4900
The speed of sound depends on the physical properties of the medium, namely its density and elasticity. The speed of sound propagation, denoted by $c$, is given as

$$c = f \cdot \lambda \quad [\mathrm{m/sec}] \tag{5}$$
where $\lambda$ represents the wavelength of the wave, and $f$ the frequency. Sometimes $c$ is called a material constant because it is constant for a given material [9]. In a uniform homogeneous medium, the ultrasound wave propagates along a straight line, and it is reflected and refracted when it encounters a boundary between two different media. At the boundary, the speed of the wave and the density of the medium affect the behavior of propagation. In mine detection, the frequency of the ultrasound wave determines the penetration depth, as is also true for GPR. A lower frequency wave tends to penetrate better than a high frequency wave [9]. The ultrasound wave propagates well in humid or underwater conditions, but it is significantly attenuated in air, while the EM wave of GPR behaves oppositely under the same conditions [7]. Table 6 summarizes the speed of sound propagation in different materials. The speed generally increases with the stiffness of the material, although density also plays a role; lead, for example, is denser than steel but has a lower speed in Table 6. The ultrasound signal can be visualized using A, B, and C-scans, as with GPR.
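Equation (5) can be rearranged to give the wavelength for a chosen operating frequency, which is a quick way to compare the media in Table 6. The sketch below is only an illustration: the 40 kHz operating frequency is a hypothetical value chosen here, the bone entry uses the lower bound of the listed range, and the function name is not from the source.

```python
# Speeds of sound from Table 6 [m/s]; bone uses the lower bound of its range.
SPEED_OF_SOUND = {
    "steel": 5000.0,
    "lead": 1300.0,
    "water": 1460.0,
    "soft tissue": 1500.0,
    "bone (lower bound)": 2500.0,
}


def wavelength_m(speed_m_per_s: float, frequency_hz: float) -> float:
    """Eq. (5) rearranged: lambda = c / f."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_per_s / frequency_hz


# Hypothetical operating frequency, above the 20 kHz audible limit.
FREQUENCY_HZ = 40_000.0

wavelengths = {
    material: wavelength_m(speed, FREQUENCY_HZ)
    for material, speed in SPEED_OF_SOUND.items()
}

for material, lam in wavelengths.items():
    print(f"{material:>20s}: {lam * 100:.2f} cm")
```

For a fixed frequency the wavelength grows with the speed of sound, so the same transducer gives a different wavelength in each medium.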
4. Signal and Image Processing Techniques

In general mine detection processes, 2D information on mine location is the most important. Independent of the sensor used, the sensor output can be represented in the form of 2D data, which can be considered an image. A slice in the ensemble of C-scanned GPR data shown in Figure 6, for example, can be considered as a 2D image, where local contrast in pixel intensity provides a clue to the potential existence and location of mines. 2D data from the sensors are highly subject to degradation due to various factors, such as: (i) noise due to the unpredictable combination of soil contents, (ii) low resolution due to the limited performance of the sensor, and (iii) low contrast due to the limited dynamic range of the sensor output. For this reason, the data must be enhanced by using various signal and image processing techniques.
In this section notable hardware or sensor-independent signal and image processing algorithms are summarized and their applications to mine detection are also described.
4.1. Filtering Noise is unavoidable in the output of most sensors. In this subsection two filtering techniques used for noise removal, the Wiener filter and the alternating sequential filter, are reviewed. The Wiener filter is also known as the minimum mean square error (MMSE) filter in the image processing area [21,23]. The alternating sequential filter is based on gray-scale morphology [16].
4.1.1. Wiener Filter The Wiener filter is a signal-dependent filter that restores the original signal by minimizing the mean square error between the estimated and the original signals. Let $f(m, n)$ and $g(m, n)$ be arbitrary, zero mean, random sequences with sizes $M_1 \times M_2$ and $N_1 \times N_2$, respectively. If we assume that $g(m, n)$ is the output of the deterministic linear system with impulse response $h(m, n)$ and additive noise $\eta(m, n)$, then
$$g(m, n) = \sum_{i} \sum_{j} h(m, n; i, j)\, f(i, j) + \eta(m, n) \tag{6}$$
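This equation can be rewritten in the matrix-vector expression as
$$\mathbf{g} = \mathbf{H}\mathbf{f} + \boldsymbol{\eta} \tag{7}$$
where $\mathbf{g}$ and $\boldsymbol{\eta}$ represent $N_1 N_2 \times 1$ vectors, $\mathbf{f}$ an $M_1 M_2 \times 1$ vector, and $\mathbf{H}$ an $N_1 N_2 \times M_1 M_2$ block matrix [23]. Consider the linear estimation problem, for which the original undegraded image, $\mathbf{f}$, is to be estimated, given the noisy degraded observation, $\mathbf{g}$. Intuitively, we may think that the following is the solution of the linear equation in (7).
$$\mathbf{f} = \mathbf{H}^{-1}(\mathbf{g} - \boldsymbol{\eta}) \tag{8}$$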
Whether it is possible to compute $\mathbf{H}^{-1}$ or not, the estimate given in Eq. (8) is meaningless because $\mathbf{g}$ and $\boldsymbol{\eta}$ represent only one sample of the corresponding random sequence. In other words, in order to estimate a random sequence, it is reasonable to use statistical characteristics of given sequences as well as the information of the given sample sequence. The most popular way to estimate a random sequence is to compute the best linear estimate $\hat{\mathbf{f}}$ from
$$\hat{\mathbf{f}} = \mathbf{G}\mathbf{g} \tag{9}$$
which minimizes the average mean square error,
$$\frac{1}{M_1 M_2}\, E\left[(\mathbf{f} - \hat{\mathbf{f}})^T (\mathbf{f} - \hat{\mathbf{f}})\right] \tag{10}$$
In Eq. (9), $\mathbf{G}$ represents an $M_1 M_2 \times N_1 N_2$ block matrix, and $E[\cdot]$, in Eq. (10), represents the averaging operation of the corresponding random sequence. By the orthogonality property of the random sequences, the estimate $\hat{\mathbf{f}}$ that minimizes Eq. (10) must satisfy
$$E\{(\mathbf{f} - \hat{\mathbf{f}})\,\mathbf{g}^T\} = \mathbf{0} \tag{11}$$
Substituting Eq. (9) into Eq. (11) for $\hat{\mathbf{f}}$ gives
$$\mathbf{R}_{fg} - \mathbf{G}\mathbf{R}_{gg} = \mathbf{0} \tag{12}$$
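where $\mathbf{R}_{fg}$ and $\mathbf{R}_{gg}$, respectively, represent the cross covariance matrix of $\mathbf{f}$ and $\mathbf{g}$ and the auto-covariance matrix of $\mathbf{g}$, such that
$$\mathbf{R}_{fg} = E\{\mathbf{f}\mathbf{g}^T\} \quad \text{and} \quad \mathbf{R}_{gg} = E\{\mathbf{g}\mathbf{g}^T\} \tag{13}$$
Using the relationships in (7) and (13), and assuming that $\mathbf{f}$ is uncorrelated with $\boldsymbol{\eta}$, we can obtain the linearly estimated matrix, $\mathbf{G}$, from (12) as
$$\mathbf{G} = \mathbf{R}_{fg}\mathbf{R}_{gg}^{-1} = \mathbf{R}_{ff}\mathbf{H}^T(\mathbf{H}\mathbf{R}_{ff}\mathbf{H}^T + \mathbf{R}_{\eta\eta})^{-1} \tag{14}$$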
which is called the Wiener filter. If the degradation operation occurs in a space-invariant manner, $\mathbf{H}$ becomes a doubly block Toeplitz matrix. According to their definitions, both $\mathbf{R}_{ff}$ and $\mathbf{R}_{\eta\eta}$ are also doubly block Toeplitz. If we assume a doubly block circulant approximation for each doubly block Toeplitz matrix and assume that $M_1 = M_2 = N_1 = N_2 = N$, Eq. (14) can be diagonalized by the two-dimensional DFT. Let $\mathbf{F}$ be the $N \times N$ DFT matrix, then the diagonalization process can be described as
$$\mathbf{D}_G = \mathbf{D}_{ff}\mathbf{D}_H^{*}(\mathbf{D}_H\mathbf{D}_{ff}\mathbf{D}_H^{*} + \mathbf{D}_{\eta\eta})^{-1} \tag{15}$$
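where $\mathbf{D}_G = \mathbf{F}\mathbf{G}\mathbf{F}^{*}$, $\mathbf{D}_{ff} = \mathbf{F}\mathbf{R}_{ff}\mathbf{F}^{*}$, $\mathbf{D}_H = \mathbf{F}\mathbf{H}\mathbf{F}^{*}$, and $\mathbf{D}_{\eta\eta} = \mathbf{F}\mathbf{R}_{\eta\eta}\mathbf{F}^{*}$ [23]. The $(k, l)$th diagonal element of $\mathbf{D}_G$ in (15) can be obtained as
$$\tilde{G}(k, l) = \frac{S_{ff}(k, l)\,\tilde{H}^{*}(k, l)}{|\tilde{H}(k, l)|^{2}\, S_{ff}(k, l) + S_{\eta\eta}(k, l)} = \frac{\tilde{H}^{*}(k, l)}{|\tilde{H}(k, l)|^{2} + \dfrac{S_{\eta\eta}(k, l)}{S_{ff}(k, l)}} \tag{16}$$
where $\tilde{G}$ and $\tilde{H}$, respectively, represent the two-dimensional DFTs of the impulse responses of the Wiener filter and the degradation system. $S_{ff}$ and $S_{\eta\eta}$ represent spectral density functions of $f$ and $\eta$, which are the two-dimensional DFTs of $\mathbf{R}_{ff}$ and $\mathbf{R}_{\eta\eta}$, respectively. In many applications with degradation due to only noise, we can assume that the impulse response of the linear system is the unit impulse as
$$h(m, n) = \delta(m, n) \tag{17}$$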
or equivalently, its Fourier transform is unity as
$$\tilde{H}(k, l) = 1 \quad \text{for all } (k, l) \tag{18}$$
If (18) is satisfied, the Wiener filter derived in (16) can be simplified as
$$\tilde{G}(k, l) = \frac{S_{ff}(k, l)}{S_{ff}(k, l) + S_{\eta\eta}(k, l)} \tag{19}$$
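The per-frequency gains of Eqs. (16) and (19) can be evaluated directly once the spectral densities are known or estimated. The following is a minimal sketch under that assumption; the function names `wiener_gain` and `denoise_gain` are hypothetical, and estimating $S_{ff}$ and $S_{\eta\eta}$ from data is outside its scope.

```python
def wiener_gain(h, s_ff, s_nn):
    """Per-frequency Wiener gain of Eq. (16).

    h    : DFT coefficient of the degradation at (k, l), may be complex
    s_ff : signal spectral density at (k, l), assumed non-negative
    s_nn : noise spectral density at (k, l), assumed non-negative
    The denominator is assumed to be non-zero.
    """
    h = complex(h)
    return h.conjugate() * s_ff / (abs(h) ** 2 * s_ff + s_nn)


def denoise_gain(s_ff, s_nn):
    """Noise-only special case of Eq. (19), i.e. H(k, l) = 1 everywhere."""
    return wiener_gain(1.0, s_ff, s_nn).real


# Illustrative values: a bin dominated by signal is passed almost unchanged,
# while a bin dominated by noise is strongly attenuated.
print(denoise_gain(9.0, 1.0))
print(denoise_gain(1.0, 9.0))
```

Multiplying the DFT of the observation by this gain at every $(k, l)$ and inverting the DFT gives the estimate; the gain approaches 1 where the signal dominates and 0 where the noise dominates.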
The two-dimensional DFT of the estimate is obtained as
$$\mathbf{F}\hat{\mathbf{f}} = \mathbf{F}\mathbf{G}\mathbf{g} = \mathbf{F}\mathbf{G}\mathbf{F}^{*}\mathbf{F}\mathbf{g} = \mathbf{D}_G\,\tilde{\mathbf{g}} \tag{20}$$
where $\tilde{\mathbf{g}} = \mathbf{F}\mathbf{g}$ denotes the DFT of the observation. Because $\mathbf{D}_G$ is diagonal, this requires only $N^2$ complex multiplications. Therefore, a random sequence can be estimated from the space-invariant degraded observation by using the frequency-domain Wiener filter.
4.1.2. Gray-Scale Morphology Mathematical morphology has been applied primarily for binary image processing. Basic or first level functions, such as dilation and erosion, are performed by structuring elements with various shapes and sizes. Combinations of the basic functions form the second level functions, such as opening and closing. By appropriately combining those operations, region-based processing, such as boundary extraction, region filling, and thinning, can be realized. Gray-scale morphology can provide more complicated processing, such as gradient extraction, contrast enhancement, and region-based segmentation (watershed algorithm), as well as noise removal and smoothing, which are typical applications of binary morphology. In order to explain morphological operators, the structuring element should be defined first. A structuring element can be considered as a simple matrix or a small window that represents a certain local property of the whole image. A structuring element defines the region of support around the origin, and it adds an offset value to each pixel on the defined region of support [24].
Figure 12. Structuring elements for morphological operations; (a) 5×5 octagonal window with uniform offset, (b) 5×5 diamond neighborhood with uniform offset, and (c) 5×5 rectangular neighborhood with pyramidal offset.
Figure 12a shows an octagonally shaped structuring element. The octagonal structuring element is widely used for mine detection because the octagon most resembles the round shape of mines. The origin has 5×5 neighboring pixels except at the four corners. Figure 12b shows a diamond-shaped structuring element. It has relatively fewer neighboring pixels than the octagonal element. Figure 12c shows a rectangular-shaped structuring element with pyramidal offset distribution. The origin has the highest offset value while boundary pixels have relatively lower offsets. If a structuring element has uniform offset values, it is called a flattop filter. The basic operators, dilation, erosion, opening, and closing, will be introduced in the following section.
A. Morphological Operators. Here we define some morphological operators by comparing with simple, linear 2D filtering [21]. The 2D discrete convolution of an $M \times N$ image $f$ and the impulse response of a filter $h$ is defined as
$$h(m, n) * f(m, n) = \frac{1}{MN} \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} h(m-k, n-l)\, f(k, l) \tag{21}$$
for $m = 0, \ldots, M-1$, and $n = 0, \ldots, N-1$. On the other hand, dilation of an image $f$ by a structuring element $b$ is defined as
$$\delta_b(f) = (f \oplus b)(m, n) = \max\{\, f(m-k, n-l) + b(k, l) \mid (m-k, n-l) \in D_f;\ (k, l) \in D_b \,\} \tag{22}$$
where $D_f$ and $D_b$ represent the domains of $f$ and $b$, respectively. In (22), the displacement parameter condition, $(m-k, n-l) \in D_f$, implies that the structuring element should completely be contained by the set being dilated. This operation can be compared with the 2D convolution given in (21), where the max operation corresponds to summations in the convolution and addition of $f$ and $b$ corresponds to the multiplication of $f$ and $h$ in the convolution. $f(m-k, n-l)$ represents $f(k, l)$ flipped with respect to the origin and then shifted by $(m, n)$. Since dilation is based on choosing the maximum value of $f + b$ in a neighborhood defined by the specific structuring element, it has two effects: (i) the output image tends to be brighter than the input if all offset values are positive, and (ii) the dark details of the input image are either reduced or eliminated if the structuring element is larger than the dark area. Erosion of an image $f$ by a structuring element $b$ is defined as
$$\varepsilon_b(f) = (f \ominus b)(m, n) = \min\{\, f(m+k, n+l) - b(k, l) \mid (m+k, n+l) \in D_f;\ (k, l) \in D_b \,\} \tag{23}$$
where $D_f$ and $D_b$ are the domains of $f$ and $b$, respectively. In Eqs. (22) and (23) the dilation and erosion functions are dual under the same condition. The function $f(m+k, n+l)$ represents $f(m, n)$ shifted by $(-k, -l)$. Since erosion is based on choosing the minimum value of $f - b$ in the neighborhood defined by the structuring element, its effects are opposite to those of dilation. In other words, if all the offset values of the structuring element are positive, the output image tends to be darker than the input, and the bright details of the input image are either reduced or eliminated if the size of the bright area is smaller than the structuring element. The opening of an image function $f$ by the structuring element $b$ is defined as
$$\gamma_b(f) = f \circ b = (f \ominus b) \oplus b \tag{24}$$
which is equivalent to the erosion of f by b followed by the dilation by b. The opening operation is used to remove small bright details, while keeping the overall gray levels unchanged and relatively larger bright features undisturbed. The initial erosion removes small bright details, and at the same time makes the image darker. The subsequent dilation increases the brightness of the image without reintroducing the bright details, which have been removed by the previous erosion. The closing of an image function f by the structuring element b is defined as
$$\varphi_b(f) = f \bullet b = (f \oplus b) \ominus b \tag{25}$$
which is equivalent to the dilation of f by b followed by the erosion by b. The closing operation is dual with respect to the opening operation. The closing effects are opposite to the combined effects of dilation and erosion.
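The four operators of Eqs. (22) to (25) can be sketched in a few lines. The version below is a minimal, unoptimized illustration that assumes the image is a list of rows and the structuring element is a dictionary mapping offsets $(k, l)$ to offset values $b(k, l)$; positions falling outside $D_f$ are skipped. The cross-shaped element and the toy image are hypothetical examples, not data from this review.

```python
def dilate(f, b):
    """Gray-scale dilation, Eq. (22): max over (k, l) of f(m-k, n-l) + b(k, l)."""
    rows, cols = len(f), len(f[0])
    return [[max(f[m - k][n - l] + offset
                 for (k, l), offset in b.items()
                 if 0 <= m - k < rows and 0 <= n - l < cols)
             for n in range(cols)]
            for m in range(rows)]


def erode(f, b):
    """Gray-scale erosion, Eq. (23): min over (k, l) of f(m+k, n+l) - b(k, l)."""
    rows, cols = len(f), len(f[0])
    return [[min(f[m + k][n + l] - offset
                 for (k, l), offset in b.items()
                 if 0 <= m + k < rows and 0 <= n + l < cols)
             for n in range(cols)]
            for m in range(rows)]


def opening(f, b):
    """Eq. (24): erosion followed by dilation."""
    return dilate(erode(f, b), b)


def closing(f, b):
    """Eq. (25): dilation followed by erosion."""
    return erode(dilate(f, b), b)


# Hypothetical flat (zero-offset) cross-shaped structuring element. It must
# contain the origin (0, 0) so that every pixel has at least one neighbor.
cross = {(0, 0): 0, (-1, 0): 0, (1, 0): 0, (0, -1): 0, (0, 1): 0}

# Toy 5x5 image: dark background with one isolated bright pixel.
img = [[0] * 5 for _ in range(5)]
img[2][2] = 9

opened = opening(img, cross)  # the isolated bright detail is removed
closed = closing(img, cross)  # there is no dark detail to fill
```

In this toy case the opening returns an all-zero image, while the closing returns the input unchanged, which matches the described behavior of removing small bright details and small dark details, respectively.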
Figure 13. Experimental results of gray-scale morphological operations; (a) original image, (b) dilated image, (c) eroded image, (d) opened image, and (e) closed image.
Closing is generally used to remove small dark details, while keeping the overall gray levels unchanged and relatively larger, dark features undisturbed. The initial dilation removes small dark details and at the same time makes the image brighter. The subsequent erosion decreases the brightness of the image without reintroducing the dark details, which have been removed by the previous dilation. Figure 13 shows simulation results of morphological operations using a 5×5 flat octagonal structuring element. Brief statistics of the intensity values are also summarized in Table 7. In Figure 13b and e, bright details are enhanced, and dark areas are shrunk due to the removal of dark pixels. The average intensity value is significantly increased in (b), but not in (e). In (c) and (d), dark details are enhanced, and bright areas are shrunk due to the removal of bright pixels. The average intensity value is significantly decreased in (c), but not in (d).
B. Morphological Gradient. The main goal of the morphological gradient transformation is to highlight gray level contours. When the 2D image function $f$ is continuously differentiable, its gradient can be obtained as
$$g(f) = \sqrt{\left(\frac{\partial f}{\partial x}\right)^{2} + \left(\frac{\partial f}{\partial y}\right)^{2}} \tag{26}$$
Table 7. Statistics for Intensity Values from Figure 13. For 8-bit Gray-scale Mapping, the Maximum Gray Value of a Pixel is Equal to 255 and the Minimum 0
Image   Average   Minimum   Maximum
(a)     98.7      3         238
(b)     120.7     9         238
(c)     78.3      3         213
(d)     92.2      3         213
(e)     105.3     9         238
Figure 14. Morphological gradient; (a) the original image and (b) the corresponding gradient image.
One simple way to approximate the gradient is to calculate the difference between the highest and the lowest pixel intensity values within a prespecified window, centered at the point of interest, say $(x, y)$ [25]. In other words, it is the difference between the dilated function $\delta(f)$ and the eroded function $\varepsilon(f)$, expressed as
$$g(f) = \delta(f) - \varepsilon(f) \tag{27}$$
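For a flat structuring element, Eq. (27) reduces to the local maximum minus the local minimum. The sketch below assumes a flat square window of half-width `radius` (a hypothetical parameter) that is truncated at the image border; it is an illustration, not the implementation used to produce Figure 14.

```python
def morphological_gradient(f, radius=1):
    """Eq. (27) with a flat square window: local max minus local min."""
    rows, cols = len(f), len(f[0])
    out = []
    for m in range(rows):
        row = []
        for n in range(cols):
            window = [f[i][j]
                      for i in range(max(0, m - radius), min(rows, m + radius + 1))
                      for j in range(max(0, n - radius), min(cols, n + radius + 1))]
            row.append(max(window) - min(window))
        out.append(row)
    return out


# Toy image with a vertical step edge between columns 2 and 3.
step = [[10, 10, 10, 50, 50, 50] for _ in range(4)]
edges = morphological_gradient(step)
print(edges[0])
```

The output is zero in the homogeneous regions and equal to the step height in the two columns adjacent to the edge, so the gradient image highlights the contour.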
Morphological gradient obtained by simulation is shown in Figure 14.
C. Smoothing and Noise Reduction Using the Alternating Sequential Filter. The combination of opening and closing operations can remove noise and smooth the texture in an image. This is called the alternating sequential filter (ASF). Usually, the ASF performs better when it is applied repeatedly than as a single operation. There are two different types of ASFs. The first type, known as the white ASF, is defined as
$$\Phi_n(f) = \varphi_1 \gamma_1 \varphi_2 \gamma_2 \varphi_3 \gamma_3 \cdots \varphi_n \gamma_n \tag{28}$$
where the operators are applied in order from left to right, $\varphi$ denotes the opening operation, $\gamma$ denotes the closing operation, and each subscript represents the size of the corresponding structuring element [16]. Note that in Eqs. (28) to (31) the roles of the two symbols are reversed with respect to Eqs. (24) and (25), where $\gamma$ denotes the opening and $\varphi$ the closing. Equation (28) can be rewritten as
$$\Phi_n(f) = \gamma_n(\varphi_n \cdots (\gamma_2(\varphi_2(\gamma_1(\varphi_1(f)))))) \tag{29}$$
The white ASF first performs an opening and then a closing on the image with the smallest structuring element. The filter then performs another opening and closing with the next larger structuring element, and this process keeps repeating.
The black ASF performs the dual operation of the white ASF. Every step is the same as the white ASF except that the order of the opening and closing operations is switched [16]. The black ASF is defined as
$$\Psi_n(f) = \gamma_1 \varphi_1 \gamma_2 \varphi_2 \gamma_3 \varphi_3 \cdots \gamma_n \varphi_n \tag{30}$$
which can be rewritten as
$$\Psi_n(f) = \varphi_n(\gamma_n \cdots (\varphi_2(\gamma_2(\varphi_1(\gamma_1(f)))))) \tag{31}$$
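The alternation of openings and closings with growing structuring elements can be sketched on a one-dimensional intensity profile. The functions below are written in terms of the operation names rather than the symbols, and they assume a flat window of half-width `r` truncated at the ends of the profile; `white_asf`, `black_asf`, and the sample profile are hypothetical illustrations.

```python
def _window(signal, r, pick):
    """Apply pick (min or max) over a flat window of half-width r."""
    n = len(signal)
    return [pick(signal[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


def opening(signal, r):
    return _window(_window(signal, r, min), r, max)  # erosion, then dilation


def closing(signal, r):
    return _window(_window(signal, r, max), r, min)  # dilation, then erosion


def white_asf(signal, max_r):
    """Opening followed by closing, for half-widths 1, 2, ..., max_r."""
    out = list(signal)
    for r in range(1, max_r + 1):
        out = closing(opening(out, r), r)
    return out


def black_asf(signal, max_r):
    """Closing followed by opening, for half-widths 1, 2, ..., max_r."""
    out = list(signal)
    for r in range(1, max_r + 1):
        out = opening(closing(out, r), r)
    return out


# Profile with a one-sample peak, a one-sample valley, and a wide plateau.
profile = [5, 5, 9, 5, 5, 5, 1, 5, 5, 8, 8, 8, 8, 8, 5, 5]
print(white_asf(profile, 1))
```

With a three-sample window (`max_r = 1`) the narrow peak and valley are removed while the five-sample plateau is preserved, which mirrors the size-selective smoothing described for the ASF.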
The goal of the ASF is to remove noise or to smooth an image while preserving the major components of the image. The performance of the ASF highly depends on the maximum size of the structuring element, or equivalently on the size of the last structuring element with repetition. To preserve details in the image, relatively smaller structuring elements should be used. Figure 15 shows simulation results for the white ASF. The original image shown in Figure 15a is a processed image from an IR sequence in a test minefield [5]. Figure 15b is obtained by using the white ASF of (29), where (a) is used as the input image $f$ and $n = 7$. Figure 15c is the result of application of a white ASF with $n = 15$, and (d) likewise, with $n = 23$. Only the odd numbered structuring elements have been used for symmetrical operations. In Figure 15 the large white circle located in the lower left corner of each image is suspected to be a mine, but the other black and white dots or small circles are negligible. The ASFs with relatively larger structuring elements have efficiently removed those negligibly small dots and circles, as shown in Figure 15c and d. At the following segmentation step, Figure 15a will cause over-segmentation while (d) can be a reasonably conditioned input for segmentation. Figure 16a–d show the graphs of intensity values on the corresponding black lines in Figure 15a–d, respectively. The original image has many small peaks and valleys as shown in Figure 16a. At each combination of opening and closing operations the ASF smooths every object that is smaller than the corresponding structuring element. When the ASF is used with a 7×7 structuring element, there remain a number of small peaks and valleys, which are circled in Figure 16b. Those circled peaks and valleys disappear when the ASF is used with a 15×15 structuring element, as shown in Figure 16c. Even if the size of the structuring element is increased to 23×23, the desired large white circle still remains, as shown in Figure 16d. As shown in Table 8, the average gray value of the filtered image has not significantly changed, while the dynamic range, which can be considered as the difference between the maximum and minimum intensity values, has been reduced somewhat.
Figure 15. Experimental results of the white ASF; (a) the original image [5], (b) the image filtered by a 7×7 ASF, (c) the image filtered by a 15×15 ASF, and (d) the image filtered by a 23×23 ASF.
Figure 16. Intensity value on the black line in Figure 15.
Table 8. Statistics for Intensity Values in Image of Figure 16
Image   Average   Minimum   Maximum
(a)     144.3     0         255
(b)     143.8     20        252
(c)     143.0     57        245
(d)     141.5     68        239
The primary advantage of the ASF is that this filter can select the size of the object to be detected by determining the maximum size of structural elements. Another important property of the ASF is that it does not affect the overall statistics of the image.
4.2. Feature Extraction Both GPR and IR sensors produce huge amounts of data for C-scanned image sequences and dynamic thermography, respectively. Combination of two or more different types of sensors results in multiple, heterogeneous data. Extraction of the desired features from the large-scale, heterogeneous data is a daunting task. It is well known that orthogonal transformations can serve as a tool for removing redundant data and analyzing the desired property in large-scale data. The discrete cosine transform (DCT) used in image and video compression is one example of this type of transformation, and the discrete Fourier transform and singular value decomposition are others. In this subsection two orthogonal transformations are presented that have produced promising feature extraction results in the mine detection area.
4.2.1. Karhunen–Loeve (KL) Transformation The principle behind the KL transform is a series expansion of the continuous random process. The discrete counterpart, also known as the Hotelling transform, was studied by Hotelling, who established the theory of the orthogonal transform for a discrete random vector. Given a real random vector, the orthonormalized eigenvectors of the vector's autocorrelation matrix serve as the basis vectors of the KL transform. According to matrix theory, the mean square values of the KL transform coefficients are equal to the eigenvalues of the autocorrelation matrix. Since most energy of the input random signal concentrates on the first few coefficients, the KL transform is also called the principal component analysis [21,23]. If the KL transform is extended for 2D random images, a reduced set of basis images can represent the input random image with minimized representation error. Since the input image of the KL transform is assumed to be a 2D random process, the KL transform is suitable for analysis of time-varying image sequences produced by mine detecting sensors. Kempen et al. adopted the KL transform to analyze dynamic infrared image sequences for antipersonnel mine detection [18].
Consider a dynamic image sequence denoted by $f_{mn}$, for $m = 1, \ldots, M$, and $n = 1, \ldots, N$, where $M$ and $N$ respectively represent the number of pixels in an image and the number of images in a sequence. Figure 17 shows a typical image sequence with $N$ images and $M$ pixels, where a vector corresponding to one pixel position along an image sequence is called a dynamic pixel or a dixel. A dixel represents the dynamic thermal evolution of a point in $N$ dimensional space [18]. Note that dixels originating from the same object tend to form a cluster. In the KL transform domain, the basis vectors corresponding to the major transform coefficients represent the directions that maximize the distinction between clusters [25]. The dixel vector is defined as $\mathbf{d}_m = [f_{m1}\ f_{m2}\ \cdots\ f_{mN}]^T$ in the $N$ dimensional Euclidean space, where the subscript $m$ represents the position of the pixel in each image. For example, if there are only two images in the input image sequence, each dixel is represented by a 2D vector, as $\mathbf{d}_m = [f_{m1}\ f_{m2}]^T$, $m = 1, \ldots, M$. The $M$ different dixel vectors form the dixel cloud, as shown in Figure 18. The normalized image sequence is obtained as
$$\hat{f}_{mn} = f_{mn} - \mu_n \quad \text{where} \quad \mu_n = E[f_{mn}] = \frac{1}{M} \sum_{m=1}^{M} f_{mn} \tag{32}$$
Figure 17. An image sequence with M pixels and N images.
Figure 18. A dixel-cloud in the 2D dixel space [16].
A unit vector $\mathbf{u} = [u(1)\ u(2)\ \cdots\ u(N)]^T$, $\|\mathbf{u}\| = 1$, can convert the dixel vector $\mathbf{d}_m$ into a scalar quantity $r_m$, expressed as
$$r_m = \mathbf{d}_m^T \mathbf{u} \tag{33}$$
Note that the mean value of $r_m$ is equal to zero due to the normalization given in (32). The goal of the KL transform is to find the optimum $\mathbf{u}$, which maximizes the variance of $r_m$, that is
$$E[r_m^2] = E[\mathbf{u}^T \mathbf{d}_m \mathbf{d}_m^T \mathbf{u}] = \mathbf{u}^T \mathbf{C} \mathbf{u} \tag{34}$$
where $\mathbf{C}$ denotes the covariance matrix of $\mathbf{d}_m$, as $\mathbf{C} = E[\mathbf{d}_m \mathbf{d}_m^T]$. Intuitively, the variance of $r_m$ can be considered as the degree of spread of dixels, and $\mathbf{u}$ as the corresponding direction. A 2D dixel space is shown in Figure 18, where $\mathbf{u}_1$ represents the optimum unit vector corresponding to the maximum variance of $r_1$. As shown in the figure, the samples are most widely distributed in the direction of $\mathbf{u}_1$, and then in the direction of $\mathbf{u}_2$. To find the optimum direction, the constrained optimization problem must be solved as
$$\text{maximize } E[r_m^2] \quad \text{subject to } \mathbf{u}^T \mathbf{u} = 1 \tag{35}$$
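The solution of (35) is obtained by solving the following equation with the Lagrange multiplier $\lambda$ as
$$\nabla h(\mathbf{u}) - \lambda \nabla g(\mathbf{u}) = 0 \tag{36}$$
where
$$h(\mathbf{u}) = \mathbf{u}^T \mathbf{C} \mathbf{u} \tag{37}$$
and
$$g(\mathbf{u}) = \mathbf{u}^T \mathbf{u} - 1 \tag{38}$$
According to (37), $\nabla h(\mathbf{u})$ can be computed as
$$\nabla h(\mathbf{u}) = \sum_{i=1}^{N} \delta u(i) \left( \frac{\partial}{\partial u(i)} h \right) \tag{39}$$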
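Since $\mathbf{C}$ is a symmetric matrix
$$\frac{\partial}{\partial u(i)} h = \left[ \frac{\partial}{\partial u(i)} \mathbf{u}^T \right] \mathbf{C}\mathbf{u} + \mathbf{u}^T \mathbf{C} \left[ \frac{\partial}{\partial u(i)} \mathbf{u} \right] = \mathbf{e}_i^T \mathbf{C}\mathbf{u} + \mathbf{u}^T \mathbf{C} \mathbf{e}_i = 2\,\mathbf{e}_i^T \mathbf{C}\mathbf{u} \tag{40}$$
where $\mathbf{e}_i$ represents the $i$th unit vector. According to (38), $\nabla g(\mathbf{u})$ in (36) can be computed as
$$\frac{\partial}{\partial u(i)} g = \left[ \frac{\partial}{\partial u(i)} \mathbf{u}^T \right] \mathbf{u} + \mathbf{u}^T \left[ \frac{\partial}{\partial u(i)} \mathbf{u} \right] = \mathbf{e}_i^T \mathbf{u} + \mathbf{u}^T \mathbf{e}_i = 2\,\mathbf{e}_i^T \mathbf{u} \tag{41}$$
From (40) and (41), (36) is rewritten as
$$\nabla h(\mathbf{u}) - \lambda \nabla g(\mathbf{u}) = 2 \sum_{i=1}^{N} \left[ \delta u(i)\, \mathbf{e}_i^T (\mathbf{C}\mathbf{u} - \lambda \mathbf{u}) \right] = 2\,[\delta u(1)\ \delta u(2)\ \cdots\ \delta u(N)]\,[\mathbf{C}\mathbf{u} - \lambda \mathbf{u}] = 0 \tag{42}$$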
which is only possible if $\mathbf{C}\mathbf{u} = \lambda \mathbf{u}$. This results in an eigen analysis problem, where $\lambda$ is the eigenvalue and $\mathbf{u}$ is the corresponding eigenvector. $\mathbf{u}_i$, $i = 1, \ldots, N$, represents $N$ possible solutions. Since $\mathbf{C}\mathbf{u}_i = \lambda_i \mathbf{u}_i$, (37) can be reduced to
$$h(\mathbf{u}_i) = \mathbf{u}_i^T \mathbf{C} \mathbf{u}_i = \mathbf{u}_i^T \lambda_i \mathbf{u}_i = \lambda_i \tag{43}$$
Equation (43) indicates that the eigenvector $\mathbf{u}_i$, corresponding to the largest eigenvalue $\lambda_i$, represents the direction for which the quadratic moment is maximized [18]. The parameter for the new orthogonal set of axes $r_i$, where $i = 1, \ldots, N$, can be computed as
$$r_i = \mathbf{d}_m^T \mathbf{u}_i \tag{44}$$
Figure 19. An IR image sequence of a minefield [5]; images taken at (a) noon, (b) afternoon, and (c) evening.
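The eigen analysis of Eqs. (32) to (44) can be sketched for a small dixel cloud. The version below finds only the first axis $\mathbf{u}_1$ by power iteration, which is a technique swapped in here for brevity rather than the method of [18]; the function name `kl_first_axis`, the starting vector, and the sample dixels are assumptions made for illustration.

```python
def kl_first_axis(dixels, iterations=200):
    """Return (u1, lambda1, r) for a list of M dixels of dimension N."""
    m_count, n_dim = len(dixels), len(dixels[0])
    # Eq. (32): subtract the mean of each image from the sequence.
    mu = [sum(d[n] for d in dixels) / m_count for n in range(n_dim)]
    centered = [[d[n] - mu[n] for n in range(n_dim)] for d in dixels]
    # Covariance matrix C = E[d d^T] of the normalized dixels.
    cov = [[sum(c[i] * c[j] for c in centered) / m_count
            for j in range(n_dim)] for i in range(n_dim)]
    # Power iteration for the dominant eigenvector of C. It can stall if the
    # start vector happens to be orthogonal to u1 (assumed not to occur here).
    u = [1.0 / (i + 1) for i in range(n_dim)]
    for _ in range(iterations):
        v = [sum(cov[i][j] * u[j] for j in range(n_dim)) for i in range(n_dim)]
        norm = sum(x * x for x in v) ** 0.5
        if norm == 0.0:
            break
        u = [x / norm for x in v]
    # Eq. (43): h(u1) = u1^T C u1 = lambda1.
    lam = sum(u[i] * cov[i][j] * u[j]
              for i in range(n_dim) for j in range(n_dim))
    # Eq. (44): projection of every normalized dixel onto the first axis.
    r = [sum(c[n] * u[n] for n in range(n_dim)) for c in centered]
    return u, lam, r


# Hypothetical two-image sequence with four pixels (2D dixels).
dixels = [(0.0, 0.2), (1.0, 0.8), (2.0, 2.2), (3.0, 2.8)]
u1, lam1, r1 = kl_first_axis(dixels)
```

The variance of the projections `r1` equals `lam1`, in agreement with Eqs. (34) and (43); reshaping `r1` back to the image grid would give the first transformed image.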
As shown in Figure 18, the KL transform extracts two dixel axes $\mathbf{u}_1$ and $\mathbf{u}_2$ on the 2D dixel space, $(x_1, x_2)$. In the case of actual mine detection, the data dimension is usually greater than two. The GPR C-scanned image shown in Figure 6 has 500 images, and the dynamic IR image shown in Figure 10 has 94 images. By projecting dixels onto the $r_1$ axis, feature extraction of an image sequence can be performed. Usually, the first orthonormal axis is considered as the feature direction, but sometimes multiple directions can be considered to obtain the optimal result. Figure 19 shows a set of sample images for the same position taken at different times with an infrared camera, AGEMA, with wavelength ranging from 3 to 5 µm. In Figure 19, (a) was captured at noon, (b) at 5 p.m., and (c) at 10 p.m. The complete set includes 49 images captured during a 24 hr period. Since the data transformed by the KLT are not the pixel values of a gray image but the relative differences between pixels, contrast enhancement is required in the post processing. Figure 20 shows the result of the KLT applied to the images of Figure 19. In Figure 20, (a) represents the first transformed image, (b) the second transformed image, and (c) the eighth transformed image. Since the first transformed image is expected to have the most discriminative features, this image has the highest contrast. Unless a priori knowledge is given, the first transformed image is, in general, used for the feature data.
Figure 20. Transformed images from Figure 19 by KLT; (a) the 1st transformed image, (b) the 2nd transformed image, and (c) the 8th transformed image. For all images, contrast is enhanced by linear stretching.
4.2.2. Kittler–Young Transformation (KYT) Because the KLT treats all classes as a single scattergram, the KLT chooses the main axes by considering the minimal representation error rather than the maximum discrimination ability. If the noise component is prominent in the entire sequence, noise may be considered as an important factor in selecting the main axes. The Kittler–Young Transformation (KYT) compensates for the weak discrimination ability of the KLT by normalizing the variance within the classes [18]. In this case, the total covariance matrix $\mathbf{C}$ can be decomposed as
$$\mathbf{C} = \boldsymbol{\sigma} + \boldsymbol{\mu} \tag{45}$$
where $\boldsymbol{\sigma}$ represents the covariance matrix within the classes and $\boldsymbol{\mu}$ represents the covariance matrix between the class averages. The solution can be achieved by solving the following eigenvalue problem
$$\boldsymbol{\mu}\mathbf{u} = \lambda \boldsymbol{\sigma} \mathbf{u} \tag{46}$$
where the eigenvector ui , corresponding to the largest eigenvalue λ i , provides the direction for which the distinction between the classes is at its maximum. Figure 21 shows a typical KYT process. When two dixel classes are given as shown in (a), KYT rotates the original dixel classes as shown in (b). Then, the variance is normalized within the classes as shown in (c), and KLT applied to find the direction of the main axes as shown in (d). Finally, the classes are transformed into the original scattergram as shown in (e). The arrows KY1 and KY2 indicate the first and second transformed images by KYT and KLT, respectively.
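The generalized eigenvalue problem of Eq. (46) can be sketched for 2D dixels with manually labeled classes. The code below is an illustration only: it assumes each class is weighted by its relative size, inverts the 2×2 within-class matrix in closed form, and uses power iteration on $\boldsymbol{\sigma}^{-1}\boldsymbol{\mu}$; the function names and the sample classes are hypothetical and do not come from [18].

```python
def mean(points):
    return [sum(p[i] for p in points) / len(points) for i in range(2)]


def covariance(points, center):
    return [[sum((p[i] - center[i]) * (p[j] - center[j]) for p in points)
             / len(points) for j in range(2)] for i in range(2)]


def kyt_first_axis(classes, iterations=100):
    """Return (u, lambda) for the largest eigenvalue of mu u = lambda sigma u."""
    all_points = [p for cls in classes for p in cls]
    m_all = mean(all_points)
    within = [[0.0, 0.0], [0.0, 0.0]]   # sigma in Eq. (45)
    between = [[0.0, 0.0], [0.0, 0.0]]  # mu in Eq. (45)
    for cls in classes:
        weight = len(cls) / len(all_points)  # assumed relative class weight
        m_c = mean(cls)
        s_c = covariance(cls, m_c)
        for i in range(2):
            for j in range(2):
                within[i][j] += weight * s_c[i][j]
                between[i][j] += weight * (m_c[i] - m_all[i]) * (m_c[j] - m_all[j])
    det = within[0][0] * within[1][1] - within[0][1] * within[1][0]
    if det == 0.0:
        raise ValueError("within-class covariance is singular")
    inv = [[within[1][1] / det, -within[0][1] / det],
           [-within[1][0] / det, within[0][0] / det]]
    a = [[sum(inv[i][k] * between[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    u = [1.0, 0.5]  # arbitrary start vector for the power iteration
    for _ in range(iterations):
        v = [a[i][0] * u[0] + a[i][1] * u[1] for i in range(2)]
        norm = (v[0] ** 2 + v[1] ** 2) ** 0.5
        if norm == 0.0:
            break
        u = [v[0] / norm, v[1] / norm]

    def quad(mat):
        return sum(u[i] * mat[i][j] * u[j] for i in range(2) for j in range(2))

    return u, quad(between) / quad(within)


# Two hypothetical classes separated along x, with a large spread along y.
class_a = [(-1.5, -4.0), (-0.5, 4.0), (-1.5, 4.0), (-0.5, -4.0)]
class_b = [(0.5, -4.0), (1.5, 4.0), (0.5, 4.0), (1.5, -4.0)]
u_ky, lam_ky = kyt_first_axis([class_a, class_b])
```

In this toy case the total variance is largest along y, so a KLT would select the y axis, whereas the sketch above returns the x axis, along which the two classes are actually separated.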
Figure 21. A typical KYT Process; (a) two dixel classes, (b) rotated dixel classes, (c) rotated dixel classes with variance normalized, (d) main axes obtained by the KLT, and (e) original scattergram with the main axes by inverse rotation [18].
Although some concepts of KYT are similar to KLT in the sense of the eigenvalue problem, some additional information should be determined to perform this transformation. Such additional information includes, for example, the class average, variance, and relative weight. These classes are determined by delimiting the dixel clouds manually.
4.3. Contrast Enhancement Since the contrast between the background and the mine target is not usually high enough, the raw sensor image can hardly give satisfactory information. The purpose of contrast enhancement is to enhance the difference between the mine target and the background to distinguish between them. Two methods, morphological contrast enhancement and histogram equalization, are introduced in this subsection.
4.3.1. Morphological Contrast Enhancement Morphological filtering for enhancing mine images has been proposed by Ederra in [16]. The first step of Ederra's algorithm is to find peaks and valleys from the original image. Peaks represent the brighter parts of a gray-scale image, while valleys represent the darker parts. Peaks are obtained by subtracting the morphologically opened image from the original image, and valleys by subtracting the original image from the morphologically closed image as
$$p(f) = f - \gamma(f) \quad \text{and} \quad v(f) = \varphi(f) - f \tag{47}$$
where $f$ represents the original, $p(f)$ the peaks, $v(f)$ the valleys, $\gamma(f)$ the opened, and $\varphi(f)$ the closed images. Contrast can be enhanced by multiplying the peaks and valleys by constants as
$$p'(f) = c_1\, p(f) \tag{48}$$
where
$$c_1 = \frac{|\max(f) - \max(I)|}{\max[p(f)]}$$
and $I$ indicates the dynamic range of the gray-scale image. For example, an 8-bit gray level image has a dynamic range [0, 255], where $\max(I) = 255$ and $\min(I) = 0$. Then
$$v'(f) = c_2\, v(f) \tag{49}$$
where
$$c_2 = \frac{|\min(f) - \min(I)|}{\max[v(f)]}$$
The contrast-enhanced image is obtained by the summation of the original, the peak, and the negative valley images as
$$f' = f + p'(f) - v'(f) \tag{50}$$
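Eqs. (47) to (50) can be followed step by step on a one-dimensional intensity profile. The sketch below assumes a flat window of half-width `r` for the opening and closing and an 8-bit dynamic range; the function name `enhance_contrast`, the guards against division by zero, and the sample profile are assumptions added for illustration.

```python
def _window(signal, r, pick):
    """Apply pick (min or max) over a flat window of half-width r."""
    n = len(signal)
    return [pick(signal[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


def opening(signal, r):
    return _window(_window(signal, r, min), r, max)


def closing(signal, r):
    return _window(_window(signal, r, max), r, min)


def enhance_contrast(f, r=1, i_min=0, i_max=255):
    """Morphological contrast enhancement following Eqs. (47) to (50)."""
    peaks = [a - b for a, b in zip(f, opening(f, r))]    # p(f), Eq. (47)
    valleys = [b - a for a, b in zip(f, closing(f, r))]  # v(f), Eq. (47)
    # Scaling constants of Eqs. (48) and (49); zero if no peak or valley exists.
    c1 = abs(max(f) - i_max) / max(peaks) if max(peaks) > 0 else 0.0
    c2 = abs(min(f) - i_min) / max(valleys) if max(valleys) > 0 else 0.0
    # Eq. (50): original plus scaled peaks minus scaled valleys.
    return [a + c1 * p - c2 * v for a, p, v in zip(f, peaks, valleys)]


# Low-contrast profile confined to the range 150 to 200.
profile = [160, 160, 200, 160, 160, 150, 160, 160]
enhanced = enhance_contrast(profile)
print(enhanced)
```

The narrow peak is pushed to the top of the dynamic range and the narrow valley to the bottom, while the background level is left unchanged.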
An example of a morphological contrast enhancement is shown in Figure 22.
4.3.2. Histogram Equalization The probability of gray level $f_k$ in an image $f$ of dynamic range $L$ can be described as
$$p_f(f_k) = \frac{n_k}{m} \tag{51}$$
where $f_k \in [0, L-1]$, $n_k$ represents the number of pixels with gray level $f_k$, and $m$ the total number of pixels in the image. A plot of $p_f(f_k)$ versus $f_k$ is called the histogram. The goal of histogram equalization is to obtain an image with a uniform histogram, which can be achieved by
$$g_k = T(f_k) = \sum_{j=0}^{k} \frac{n_j}{m} = \sum_{j=0}^{k} p_f(f_j) \tag{52}$$
under the assumption that $T(f_k)$ is a single-valued, monotonically increasing function, $T(f_k) \in [0, L-1]$, for $f_k \in [0, L-1]$ [21]. The histogram-equalized image $g$ has the uniform gray level probability as
$$p_g(g_k) = \frac{n_k}{m} = c \tag{53}$$
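The mapping of Eq. (52) amounts to a cumulative histogram. In the sketch below the cumulative sum is multiplied by $L - 1$ and rounded so that the output lies in $[0, L-1]$; this scaling is an assumption made here to match the stated range of $T(f_k)$, and the function name `equalize` and the sample pixel values are hypothetical.

```python
def equalize(pixels, levels=256):
    """Histogram equalization of a flat list of integer gray levels."""
    m = len(pixels)
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1          # n_k in Eq. (51)
    mapping, cumulative = [], 0
    for k in range(levels):
        cumulative += counts[k]     # running sum of n_j, Eq. (52)
        # Assumed scaling of the cumulative probability to [0, levels - 1].
        mapping.append(round((levels - 1) * cumulative / m))
    return [mapping[value] for value in pixels]


# Low-contrast pixel values confined to the range 150 to 200.
pixels = [150, 150, 160, 170, 170, 170, 200, 200]
print(equalize(pixels))
```

The narrow input range is spread over most of the 8-bit range, and the mapping is monotonically non-decreasing, so the ordering of gray levels is preserved.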
Here, $c$ is a constant over the entire gray level range $f_k \in [0, L-1]$. Figure 22 presents examples of various contrast enhancement methods. The first column represents either the original or a contrast-enhanced image, the second column represents the intensity profile on the black line of the image in the first column, and the third column represents the global histogram of the image in the first column. The original image shown in Figure 22a is the same as shown in Figure 19a, which was captured with an IR camera from a test minefield. A possible mine target can be identified in the lower left corner of the image. The gray level of the original image (a) is limited within a range of 150 to 200. With such poor contrast the target cannot successfully be identified. The linearly stretched image (b) looks slightly better than the image shown in (a), but most pixels are still distributed in the upper half of the gray level range, and the target is still difficult to identify. The morphologically contrast-enhanced image shown in (c) was obtained by Eq. (50) using a 7×7 octagonal structuring element, but the gray level was not sufficiently enhanced. Small peaks and valleys can be easily removed by ASF, but eventually a wider range of gray level is desired. The histogram-equalized image given in (d) shows the best result. This histogram shows almost uniform distribution except for critically high or low levels. ASF can easily remove small peaks and valleys, making the target area readily identifiable.
Figure 22. Different sets of an image, intensity profile on the black line of the image, and the corresponding histogram (a) original image, (b) linearly contrast stretched image, (c) morphologically contrast-enhanced image using an octagonal structuring element, and (d) histogram equalized image [5].
4.4. Segmentation Using Watershed There are two different approaches to image segmentation. The first is the boundary-based approach, which detects local changes. The second is the region-based approach, which searches for pixel and region similarities. The watershed algorithm falls in the latter, region-based approach, and is used when edge information is not good enough to segment the image. Although this algorithm's concept originated from geology, it has also been introduced in the context of mathematical morphology. Image data can be interpreted as a topographic surface where the gray levels represent altitudes. A catchment basin is defined as a region in which water from all points flows down and converges. The high-altitude regions correspond to the watersheds, and the low-altitude regions correspond to the catchment basins. If we consider a local region where all rainwater flows to a single location, this might not seem to be applicable to intensity-based images, but it makes sense if the object is a gradient magnitude image. In this case, the catchment basins correspond to the homogeneous gray-level regions, and the watersheds correspond to the high-gradient regions.
4.4.1. Basic Concept There are two different approaches to watershed image segmentation. The first approach starts with finding a downstream path from each pixel of the image to the regional minimum.
The regional minimum is defined as a point that does not have a descending path in its neighborhood. We can define a point on a digital surface $S$ as $s(x, f(x))$, $s \in Z^2 \times Z$, where $x \in Z^2$ represents the 2D location of the point and $f(x) \in Z$ the altitude of the point. A path on surface $S$ can be defined as a sequence of points $\{s_i(x_i, f(x_i))\}$. If two points $s_i$ and $s_j$ are on a descending path, then
$$f(x_i) \le f(x_j) \quad \text{for } i \ge j \tag{54}$$
is always true.
In other words, a point $s \in S$ belongs to a minimum if no downstream path starts from $s$. A catchment basin is defined as a set of pixels for which all the downstream paths end up at the same minimum. Each catchment basin represents a region of the segmented image. No general rule uniquely defines the downstream paths on a digital surface, whereas on a continuous surface they are well determined by the local gradients. The second approach is dual to the first. Instead of identifying the downstream paths, the catchment basins are filled from the bottom [26,27]. It is assumed that there is a hole in each local minimum, and the topographic surface is immersed in water step by step. If two catchment basins would merge as a result of further immersion, a dam is built all the way to the highest surface altitude. When the flooding reaches the highest level, only the dams remain, and they form the watershed lines.
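The immersion procedure described above can be sketched in plain Python as follows. This is only an illustrative sketch, not the implementation of [26,27]: the function name `watershed_by_flooding`, the 4-connected neighborhood, the label values, and the ring-by-ring growth rule are assumptions chosen to keep the example short.

```python
def neighbors4(r, c, rows, cols):
    """Yield the 4-connected neighbours of (r, c) that lie inside the image."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            yield rr, cc


DRY, DAM = -1, 0  # not yet flooded / dam (watershed) pixel; basins are 1, 2, ...


def watershed_by_flooding(f):
    """Label the catchment basins of the integer image f by level-wise immersion."""
    rows, cols = len(f), len(f[0])
    label = [[DRY] * cols for _ in range(rows)]
    n_basins = 0
    for h in sorted({v for row in f for v in row}):
        level = [(r, c) for r in range(rows) for c in range(cols) if f[r][c] == h]
        # Step 1: let the existing basins rise into the pixels at altitude h,
        # one ring of pixels at a time so that competing basins advance equally.
        pending = set(level)
        while pending:
            decided = {}
            for r, c in pending:
                basins = {label[rr][cc] for rr, cc in neighbors4(r, c, rows, cols)
                          if label[rr][cc] > 0}
                if len(basins) == 1:
                    decided[(r, c)] = basins.pop()
                elif len(basins) > 1:
                    decided[(r, c)] = DAM  # two basins would merge here
            if not decided:
                break
            for (r, c), lab in decided.items():
                label[r][c] = lab
            pending.difference_update(decided)
        # Step 2: pixels at altitude h that no basin reached form new minima.
        # Simplification: a pixel cut off only by dam pixels also ends up here.
        for r0, c0 in level:
            if label[r0][c0] != DRY:
                continue
            n_basins += 1
            label[r0][c0] = n_basins
            stack = [(r0, c0)]
            while stack:
                r, c = stack.pop()
                for rr, cc in neighbors4(r, c, rows, cols):
                    if f[rr][cc] == h and label[rr][cc] == DRY:
                        label[rr][cc] = n_basins
                        stack.append((rr, cc))
    return label


profile = [3, 1, 2, 5, 2, 0, 4]          # a valley, a ridge, a second valley
image = [profile[:] for _ in range(3)]
labels = watershed_by_flooding(image)
```

On this small test image each row of `labels` is `[2, 2, 2, 0, 1, 1, 1]`: the two valleys become separate basins and the ridge column is marked as the dam.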
4.4.2. Geodesic Functions
A digital gray-tone image can be represented by a function $f: Z^2 \to Z$. The points of the space $Z^2$ can be considered as the vertices of a rectangular or hexagonal grid, and $f(x)$ as the gray value of the image at point $x$. From now on, all spaces are assumed to be $Z^2$ unless otherwise stated. The sections of $f$ at level $i$ are defined as
$$X_i(f) = \{x : f(x) \ge i\} \quad \text{and} \quad Z_i(f) = \{x : f(x) \le i\} \tag{55}$$
Their complementary relationship is given as
$$X_i(f) = Z_{i-1}^C(f) \tag{56}$$
The distance between a point $y$ in a region $Y$ and the nearest point of $Y^C$ is defined as
$$d(y) = \mathrm{dist}(y, Y^C) \quad \text{for } y \in Y \tag{57}$$
where $Y^C$ represents the complementary set of $Y$. A section of $d$ at level $i$ is given as
$$X_i(d) = \{y : d(y) \ge i\} = Y \ominus B_i \tag{58}$$
where $B_i$ is a disk of radius $i$, and $\ominus$ represents morphological erosion [27,28].
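The relationship in Eq. (58) can be checked numerically on a small binary image. The sketch below rests on assumptions that are not in the original text: the city-block metric is used for the distance, and $B_i$ is taken as the open disk of that metric (offsets with $|dr| + |dc| < i$) so that the equality holds exactly on the integer grid. The function names are hypothetical.

```python
from collections import deque


def distance_function(Y):
    """City-block distance from each pixel of Y (value 1) to the nearest 0 pixel."""
    rows, cols = len(Y), len(Y[0])
    d = [[0 if Y[r][c] == 0 else None for c in range(cols)] for r in range(rows)]
    queue = deque((r, c) for r in range(rows) for c in range(cols) if Y[r][c] == 0)
    while queue:  # breadth-first propagation outward from the complement
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and d[rr][cc] is None:
                d[rr][cc] = d[r][c] + 1
                queue.append((rr, cc))
    return d


def erode(Y, i):
    """Erosion of Y by the open city-block disk B_i = {(dr, dc): |dr| + |dc| < i}."""
    rows, cols = len(Y), len(Y[0])
    disk = [(dr, dc) for dr in range(-i, i + 1) for dc in range(-i, i + 1)
            if abs(dr) + abs(dc) < i]
    # Positions outside the image are treated as belonging to the complement.
    return [[int(all(0 <= r + dr < rows and 0 <= c + dc < cols
                     and Y[r + dr][c + dc] == 1 for dr, dc in disk))
             for c in range(cols)] for r in range(rows)]


def section(d, i):
    """X_i(d): pixels whose distance value is at least i."""
    return [[int(v >= i) for v in row] for row in d]


# A 5x5 square of ones surrounded by a one-pixel border of zeros.
Y = [[1 if 1 <= r <= 5 and 1 <= c <= 5 else 0 for c in range(7)] for r in range(7)]
d = distance_function(Y)
sections_match = all(section(d, i) == erode(Y, i) for i in range(1, 5))
```

For this square the distance function peaks at the center pixel, `d[3][3] == 3`, and every section of `d` coincides with the corresponding erosion of `Y`.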
Figure 23 shows an example of a distance function. A set of points $Y$ and the complementary set $Y^C$ are given as the white and black areas in (a), respectively. The distance function of every point of $Y$ to $Y^C$ is shown in (b). The brightest area indicates the pixels with the maximum distance to the complementary set.

Figure 23. An example of a distance function; (a) a binary image and (b) its distance function.

The geodesic distance is the distance between two points measured within the set to which both points belong. The geodesic distance function, $d_X(x, y)$, is defined as the length of the shortest path between $x$ and $y$ that lies entirely in the set $X$. Figure 24 shows an example of a geodesic distance function. There is a point $x$ in the set $X$ in (a). The black dot represents the pixel $x$, and the white H-shaped region represents the set $X$. The geodesic distance function from the point $x$ to an arbitrary point $y$ in the set $X$ is represented as a gray level, as shown in (b). Brighter values represent longer distances. The dotted line marks points at the same Euclidean distance from $x$. Since the paths toward the right-hand part of the H-shaped region have to take a detour, the geodesic distance to the upper left-hand part of the H-shaped region is shorter than to the right-hand part, although the Euclidean distance is the same.

Figure 24. An example of a geodesic distance function; (a) the black dot represents a point x and the white H-shaped region represents a set X, and (b) geodesic distance function from x within X; brightness is proportional to the geodesic distance.
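A geodesic distance function of this kind can be computed with a breadth-first search that is only allowed to step through pixels of $X$. The sketch below is an illustration under stated assumptions: path length is counted in 4-connected steps (a city-block approximation rather than the Euclidean length), and the small H-shaped set and the function name `geodesic_distance` are hypothetical.

```python
from collections import deque


def geodesic_distance(X, start):
    """Shortest 4-connected path length from start to each pixel, staying inside X."""
    rows, cols = len(X), len(X[0])
    dist = [[None] * cols for _ in range(rows)]  # None: outside X or unreachable
    dist[start[0]][start[1]] = 0
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if (0 <= rr < rows and 0 <= cc < cols
                    and X[rr][cc] == 1 and dist[rr][cc] is None):
                dist[rr][cc] = dist[r][c] + 1
                queue.append((rr, cc))
    return dist


# H-shaped set: two vertical bars joined by a horizontal bar.
X = [
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]
x = (4, 0)                      # bottom of the left-hand bar
geo = geodesic_distance(X, x)
```

The top of the left bar, `(0, 0)`, and the bottom of the right bar, `(4, 4)`, are both four grid steps away from `x` when the set is ignored, but `geo[0][0]` is 4 while `geo[4][4]` is 8 because the path to the right-hand bar has to pass through the crossbar.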
4.4.3. Reconstruction
Let $Y$ be any set included in $X$. The set of all the points in $X$ at a finite geodesic distance from $Y$ can be computed as
$$R_X(Y) = \{x \in X : \exists y \in Y,\ d_X(x, y) \ne \infty\} \tag{59}$$
$R_X(Y)$ is called the $X$-reconstructed set by the marker set $Y$ [28,29]. This set consists of all the connected components of $X$ that contain at least one point of $Y$. Two gray image functions $f$ and $g$ are considered in the same way under the condition $f \le g$. The corresponding sections of these two functions at level $i$ are $X_i(g)$ and $X_i(f)$. Since $f \le g$, $X_i(f)$ is included in $X_i(g)$. For every level $i$, a new set can be obtained by reconstructing $X_i(g)$ using $X_i(f)$ as a marker. The new sets, $R_{X_i(g)}(X_i(f))$, define a group of embedded sections of a new function, which is called the reconstruction of $g$ by $f$ and is denoted as $R_g(f)$. The dual reconstruction of $g$ by $f$, under the condition $f \ge g$, is denoted as $R^*_g(f)$. It is obtained by reconstructing the sections $Z_i(g)$ using $Z_i(f)$ as a marker. $X_i(f)$ and $Z_i(f)$ are complementary to each other, as indicated in (56). The reconstruction and its dual extract the regional maxima and minima, respectively. In order to find the regional maxima, the functions $f$ and $f-1$ are overlapped. Figure 25a shows a vertical slice of the overlapped functions. Then, the reconstruction of $f$ using $f-1$ as a marker is obtained as $R_f(f-1)$. This is the white area shown in Figure 25b. Since the profile shown in Figure 25 is a slice of a two-dimensional gray level image, the actual shape of $R_f(f-1)$ has a volume. The set of regional maxima $M(f)$ can be found as the difference between the function $f$ and $R_f(f-1)$:
$$M(f) = f - R_f(f-1) \tag{60}$$
$M(f)$ can be considered as the dark gray area shown in Figure 25b, and has the following relationship:
Figure 25. Finding regional maxima and minima by reconstruction; (a) functions $f$ and $f-1$, (b) reconstruction $R_f(f-1)$ and regional maxima $k_{M(f)}$, (c) functions $f$, $f+1$, and regional minima $k_{m(f)}$, and (d) reconstruction $R^*_f(f+1)$.
$$k_{M(f)}(x) = \begin{cases} 0, & x \in M(f) \\ 1, & x \notin M(f) \end{cases} \tag{61}$$
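As a small worked illustration of Eq. (60), consider a one-dimensional profile. The values are hypothetical and chosen only for this example, and each sample is assumed to have its left and right samples as neighbors. The reconstruction $R_f(f-1)$ clips the isolated peak and the two-sample plateau by one gray level and leaves everything else unchanged, so the difference is nonzero exactly on the regional maxima:

```latex
\begin{align*}
f            &= (1,\ 2,\ 3,\ 2,\ 1,\ 4,\ 4,\ 1) \\
f - 1        &= (0,\ 1,\ 2,\ 1,\ 0,\ 3,\ 3,\ 0) \\
R_f(f-1)     &= (1,\ 2,\ 2,\ 2,\ 1,\ 3,\ 3,\ 1) \\
f - R_f(f-1) &= (0,\ 0,\ 1,\ 0,\ 0,\ 1,\ 1,\ 0)
\end{align*}
```

The third line follows from the section-wise definition: at level 3, for instance, the components of $X_3(f)$ are the peak and the plateau, but only the plateau contains a point of $X_3(f-1)$, so only the plateau keeps the value 3.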
For the regional minimum case, the functions $f$ and $f+1$ are overlapped as shown in Figure 25c. The dual reconstruction of $f$ using $f+1$ as a marker is obtained as $R^*_f(f+1)$, which is represented by the gray area in Figure 25d. The set of regional minima $m(f)$ can be found as the difference between $R^*_f(f+1)$ and $f$:
$$m(f) = R^*_f(f+1) - f \tag{62}$$
$m(f)$ is presented as a set of binary data in Figure 25c, in the same way as $M(f)$. Then
$$k_{m(f)}(x) = \begin{cases} 0, & x \in m(f) \\ 1, & x \notin m(f) \end{cases} \tag{63}$$
These sets of regional maxima and minima will be used as markers in the marker-based watershed algorithm.
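Equations (60) and (62) can be tried out on a small two-dimensional image with the sketch below. It is not the authors' implementation: the reconstruction is computed by naive iteration of a clipped dilation (or erosion, for the dual) until nothing changes, a 4-connected neighborhood is assumed, and all function names are hypothetical. In the resulting difference images, a value of 1 marks a regional extremum.

```python
def window(g, r, c):
    """Values of g at (r, c) and at its 4-connected neighbours inside the image."""
    rows, cols = len(g), len(g[0])
    vals = [g[r][c]]
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            vals.append(g[rr][cc])
    return vals


def reconstruct(mask, marker, grow, clip):
    """Iterate marker <- clip(mask, grow(neighbourhood of marker)) until stable.

    grow=max, clip=min gives the reconstruction (requires marker <= mask);
    grow=min, clip=max gives the dual reconstruction (requires marker >= mask).
    """
    cur = [row[:] for row in marker]
    while True:
        nxt = [[clip(mask[r][c], grow(window(cur, r, c)))
                for c in range(len(cur[0]))] for r in range(len(cur))]
        if nxt == cur:
            return cur
        cur = nxt


def shift(f, k):
    """Add the constant k to every gray value."""
    return [[v + k for v in row] for row in f]


def difference(a, b):
    """Pixel-wise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


f = [[2, 2, 2, 2, 2],
     [2, 5, 2, 0, 2],
     [2, 2, 2, 2, 2]]
maxima = difference(f, reconstruct(f, shift(f, -1), max, min))  # f - R_f(f - 1)
minima = difference(reconstruct(f, shift(f, +1), min, max), f)  # R*_f(f + 1) - f
```

Here `maxima` is 1 only at the bright pixel `(1, 1)` and `minima` is 1 only at the dark pixel `(1, 3)`; the surrounding plateau of value 2 is neither, because it touches both a higher and a lower pixel.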
Let $Y$ be composed of $n$ connected components $Y_i$. Then, the geodesic zone of influence of $Y_i$ is the set of points of $X$ that are at a finite geodesic distance from $Y_i$ and are closer to $Y_i$ than to any other $Y_j$. The geodesic zone of influence of $Y_i$ is denoted as $z_X(Y_i)$ [27]. Then
$$z_X(Y_i) = \{x \in X : d_X(x, Y_i) \ne \infty,\ \forall j \ne i,\ d_X(x, Y_i) < d_X(x, Y_j)\} \tag{64}$$
The entire set of zones of influence of $Y$ in $X$, $IZ_X(Y)$, is defined as
$$IZ_X(Y) = \bigcup_i z_X(Y_i) \tag{65}$$
The geodesic skeleton by zones of influence of $Y$ in $X$ is obtained as the boundaries of the zones $z_X(Y_i)$ in the set $X$, and is denoted as $SKIZ_X(Y)$ [27]. This is defined as
$$SKIZ_X(Y) = X \setminus IZ_X(Y) \tag{66}$$
where "\" represents the set difference. In Figure 26, the light gray regions are the zones of influence $z_X(Y_i)$ of $Y$ in $X$. The narrow region that lies in the upper component of $X$ but in neither $z_X(Y_1)$ nor $z_X(Y_2)$ is the SKIZ for the upper area, and the region that lies in the lower component of $X$ but in neither $z_X(Y_3)$ nor $z_X(Y_4)$ is the SKIZ for the lower area.

Figure 26. Geodesic SKIZ of a set Y included in X.

The watershed transformation by flooding may be directly transposed into a method using the sections of the function $f$. Figure 27 is the topological interpretation of Figure 26. There is a section $Z_i(f)$ of $f$ at level $i$, and the flood has reached level $i$ in Figure 27a. In the next step, the flooding of $Z_{i+1}(f)$ is performed in the zones of influence of the connected components of $Z_i(f)$. The SKIZ, which is included in $Z_{i+1}(f)$ but not in any component of $Z_i(f)$, remains as a result of the flooding, as shown in (b). Connected components of $Z_{i+1}(f)$ that have not been reached by the flood are defined as minima at level $i+1$. This is the white area in (a). These minima should be added to the flooded area, as in (c).

Figure 27. Watershed construction using a geodesic SKIZ. Flooding is performed on only two levels, from $i$ to $i+1$, for convenience; (a) the flood has reached the level $i$, (b) SKIZ remained, and (c) the minimum at level $i+1$ is added to the flooded area.

The section at level $i+1$ of the catchment basins of $f$ is obtained by
$$W_{i+1}(f) = IZ_{Z_{i+1}(f)}(Z_i(f)) \cup m_{i+1}(f) \tag{67}$$
where $m_i(f)$ denotes the minima of the function at level $i$ [27]. For Figure 27, $IZ_{Z_{i+1}(f)}(Z_i(f))$ is the gray area in (b) excluding the SKIZ. The minima at level $i+1$ are given by
$$m_{i+1}(f) = Z_{i+1}(f) \setminus R_{Z_{i+1}(f)}(Z_i(f)) \tag{68}$$
where $R_{Z_{i+1}(f)}(Z_i(f))$ is the reconstruction of $Z_{i+1}(f)$ using $Z_i(f)$ as a marker. For Figure 27, $W_{i+1}(f)$ is the gray area in (c) excluding the boundary and the SKIZ. This iterative algorithm is initiated with $W_{-1}(f) = \emptyset$. At the end of the process, the watershed line $DL(f)$ is equal to the complementary set of the highest section of the catchment basins [27], and is defined as
$$DL(f) = W_N^C(f) \tag{69}$$
where $\max(f) = N$. The watershed line in Figure 27 is the boundary line of (c) including the SKIZ.

5. Experimental Results
The AGEMA IR sensor, operating in the 3–5 µm band, was used to detect two buried mines under a gravel surface. Table 9 and Table 10 profile the site and data specifications. The data set consists of 48 images of size 256×256, taken at 30 min intervals during a 24 hr period. The cell-shaped texture comes from the gravel terrain.

Table 9. Site Specification of Meerdael Test Minefield [5]
Collector    Minefield location    Soil condition    Sensor type
RMA          Meerdael, Belgium     Sand              AGEMA (3–5 µm)

Table 10. Data Specification Acquired with AGEMA Sensor at a Gravel Field [5,30]
No. of targets    Date and time                               No. of frames
2                 April 2, 11:50 to April 3, 11:30, 1998      48 (1 per 30 min)
5.1. Static Thermography
For the static thermography analysis, one sample image was taken to obtain feature data. The contrast is enhanced using gray-scale morphology, as shown in Figure 28a [16]. The image enhanced by the ASF is shown in Figure 28b [16]. The filtered image is segmented by the marker-based watershed algorithm, as shown in Figure 28c [26–29]. Comparing the segmented result with the ground truth data, one notices that the target on the right-hand side has not been detected. Also, there are a few false alarms in the middle and on the right-hand side of the image.
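The comparison between a segmented image and the ground truth can be made explicit with a simple counting rule. The sketch below is only one possible convention and is not taken from the paper: it assumes that a target counts as detected when any segment overlaps it and that a segment overlapping no target counts as one false alarm. The function name `score_detection` and the small label images are hypothetical.

```python
def score_detection(segments, truth):
    """Count detected targets, missed targets and false alarms.

    segments, truth: label images of equal size (0 = background,
    k > 0 = k-th segmented region or k-th ground-truth target).
    Assumption: any overlap between a segment and a target is a detection.
    """
    segment_ids, target_ids = set(), set()
    hit_targets, matched_segments = set(), set()
    for seg_row, truth_row in zip(segments, truth):
        for s, t in zip(seg_row, truth_row):
            if s:
                segment_ids.add(s)
            if t:
                target_ids.add(t)
            if s and t:  # this pixel belongs to both a segment and a target
                hit_targets.add(t)
                matched_segments.add(s)
    return {
        "detected": len(hit_targets),
        "missed": len(target_ids - hit_targets),
        "false_alarms": len(segment_ids - matched_segments),
    }


truth = [[0, 0, 0, 0, 0, 0, 0, 0],
         [0, 1, 1, 0, 0, 0, 2, 0],
         [0, 1, 1, 0, 0, 0, 2, 0],
         [0, 0, 0, 0, 0, 0, 0, 0]]
segments = [[0, 0, 0, 0, 3, 0, 0, 0],
            [0, 1, 1, 0, 3, 0, 0, 0],
            [0, 1, 0, 0, 0, 0, 0, 0],
            [0, 0, 0, 2, 0, 0, 0, 0]]
score = score_detection(segments, truth)
```

In this toy case the left-hand target is found, the right-hand target is missed, and two segments are false alarms, which mirrors the kind of outcome described for the static case.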
Figure 28. Static thermography [5]; (a) contrast enhanced sample image, (b) filtered, (c) segmented, and (d) ground truth [30].

5.2. Dynamic Thermography with Single Feature
The goal of this experiment is to find every possible mine target. A feature image is extracted by KLT from the image sequence [18]. This image is the projection of the pixels in the dixel space onto the first dominant axis. Figure 29a shows the contrast enhanced feature image. This image clearly shows better-discriminated features than in the static thermography case. After the same filtering and segmentation process as in the previous case, two targets are successfully detected, as shown in the segmented image, Figure 29c, but there are still four false alarms even though the large segmented set has been disregarded.

Figure 29. Dynamic thermography with single feature; (a) contrast enhanced feature image, (b) filtered, (c) segmented, and (d) ground truth [30].

5.3. Dynamic Thermography with Multiple Features
The goal of this experiment is to separate the possible mine targets from the background and to discriminate the actual mine targets from false alarms. The same feature extraction method as in the previous case is used, but two feature images are extracted this time, as shown in Figure 30a and Figure 31a. These images are the projections of the pixels in the dixel space onto the first and second dominant axes. Figure 29a and Figure 30a are identical. After the same filtering and segmentation process as in the previous cases, two segmented images are obtained, as shown in Figure 30c and Figure 31c. The properties of mine targets in an IR image can be assumed as: (i) mine targets are usually round in shape, and (ii) if there is more than one mine within a region, two or more mine targets cannot be connected to each other in a feature image. In other words, two mine targets cannot be neighbors in a segmented set. Considering these properties, two sets of candidates are obtained, as shown in Figure 30d and Figure 31d. Considering only candidates appearing twice as mine targets, three objects are selected as mine targets, as shown in Figure 32a. Two actual mine targets are successfully found, and the number of false alarms is reduced to one. A problem still remains, however. The number of appearances in the feature images will be an ambiguous parameter if more than two feature images are considered. The relationship between the number of appearances in the feature images and the probability of the object being a mine target should be clarified based on previous experimental experience.

Figure 30. Dynamic thermography with multiple features; (a) contrast enhanced feature image 1, (b) filtered, (c) segmented, and (d) candidate set 1.

Figure 31. Dynamic thermography with multiple features; (a) contrast enhanced feature image 2, (b) filtered, (c) segmented, and (d) candidate set 2.

Figure 32. Result of proposed application; (a) selected targets, (b) actual mine targets, (c) ground truth [30].

6. Conclusions
Sensor and image processing technologies have been studied for the purpose of mine detection. Because of the variety of mine types and deployment methods, mine detection requires a full gamut of state-of-the-art technologies, including sensors, signal and image processing, real-time hardware, and numerical optimization techniques, to name a few. As image processing techniques have received more attention in the related application areas, various image processing methods have been proposed for mine detection. This paper reviewed and summarized the up-to-date signal and image processing techniques that have been applied to the mine detection area.

Mines can be classified into (i) anti-personnel and (ii) anti-tank mines, based on the target. Together with these two major types of mine, three different sensors, GPR, IR, and US sensors, have been introduced and summarized. After the sensing process, a final decision on mine existence is made with the help of image processing. Various sensors give different signals. The sensor output is considered as a two- or higher-dimensional signal, and image processing techniques are applied to enhance and identify the shape of a target mine.

Image processing techniques for mine detection have been classified into filtering, feature extraction, and contrast enhancement categories. For removing noise and undesired components in the sensor image, two different filters were introduced. The Wiener filter is a signal-dependent filter that restores the original signal by minimizing the mean square error between the estimated and the original signals. The morphological filter can efficiently remove noise by combining multiple morphological operations, and can also provide more complicated processing such as gradient extraction, contrast enhancement, and segmentation.

Extracting mine-like shapes from the sensor image is a crucial task in the mine detection process. The Karhunen–Loève transformation takes dynamic infrared image sequences and represents the input image using a reduced set of basis images with minimized representation error.
The Kittler–Young transformation compensates for the weak noise discrimination ability of KLT by normalizing the variance within the classes.

IR data were chosen not because IR is more popular or important than EM, but because IR data make it easier to explain how general image processing techniques are applied. Although we provided experimental results with emphasis on IR data, it is straightforward to extend similar image processing techniques to EM or US data.

The most serious problem in mine-detection applications is the ambiguity of the target signal due to low contrast. In order to enhance contrast, morphological contrast enhancement and histogram equalization methods have been surveyed. Although these two methods are used in general image processing applications, specific mine signals have been used to evaluate the performance of the two methods.
Many research groups have developed new detection devices with multiple sensors, together with the corresponding technology, called sensor fusion, which combines the outputs from multiple sensors. This survey will serve as a signal and image processing background to aid in understanding existing technologies and in developing new technologies for mine detection.
References
1. The United Nations Mine Action Services, http://www.un.org/Depts/dpko/mine
2. U.S. Department of State, 1998, Hidden killers 1998: The global landmine crisis, Bureau of Political–Military Affairs, Office of Humanitarian Demining Programs.
3. Sieber, A., 1995, Localization and identification of anti-personnel mines, European Commission Joint Research Center International Workshop.
4. E-OIR of USA, 1998, Fort Belvoir Minefield in Virginia.
5. The Royal Military Academy of Belgium, 1998, Meerdaal test minefield in Belgium.
6. Landmine database of the Norwegian peoples aid mine actions in Angola; http://www.angola.npaid.org/
7. Machler, P., 1995, Detection technologies for anti-personnel mines, Proc. Symposium on Autonomous Vehicles in Mine Countermeasures, v. 6, p. 150–54.
8. Kempen, L., 1997, Physical principles for anti-personnel mine detection: A survey of three sensing principles: Technical Report, IRIS-TR-0047, Department of Electronics and Information Processing, Vrije Universiteit Brussel.
9. Ekstein, R., 1997, Anti-personnel mine detection signal processing and detection principles, MS Thesis, Department of Electronics and Information Processing, Vrije Universiteit Brussel.
10. Kempen, L. and Sahli, H., 1999, Ground penetrating radar data processing: A selective survey of the state of the art literature: Technical Report, IRIS-TR-0060, Department of Electronics and Information Processing, Vrije Universiteit Brussel.
11. UWBGPR measurement at the Royal Military Academy, 1999, Belgium.
12. Brooks, J., Kempen, L., and Sahli, H., 1999, Ground penetration radar data processing: Clutter characterization and removal: Technical Report, IRIS-TR-0059, Department of Electronics and Information Processing, Vrije Universiteit Brussel.
13. Peters Jr., L., Daniels, J., and Young, J., 1994, Ground penetrating radar as subsurface environmental sensing tools, Proc. IEEE International Conference, v. 82, no. 12, p. 1802–1822.
14. Acheroy, M., Piette, M., Baudoin, Y., and Salmon, J., 2000, Belgian project on Humanitarian Demining (HUDEM) Sensor Design and Signal Processing Aspects.
15. Kempen, L., Katarzin, A., Pizurion, Y., Corneli, C., and Sahli, H., 1999, Digital signal/image processing for mine detection, Part 2: Ground based approach, Proc. Euro Conference on Sensor Systems and Signal Processing Techniques applied to the Detection of Mines and Unexploded Ordnance, p. 54–59.
16. Ederra, G., 1999, Mathematical morphology techniques applied to anti-personnel mine detection, MS Thesis, Department of Electronics and Information Processing, Vrije Universiteit Brussel.
17. Bruschini, C. and Gros, B., 1997, A Survey of current sensor technology research for the detection of landmines, Proc. International Workshop on Sustainable Humanitarian Demining, v. 6, p. 18–27.
18. Kempen, L., Kaczmarec, M., Sahli, H., and Cornelis, J., 1998, Dynamic infrared image sequence analysis for anti-personnel mine detection, Proc. IEEE Benelux Signal Processing Chapter, Signal Processing Symposium, p. 215–218.
19. Russell, K., McFee, J., and Sirovyak, W., 1997, Remote performance prediction for infrared imaging of buried mines, Proc. SPIE Detection and Remediation Technologies for Mines and Minelike Targets II, v. 3079, p. 762–769.
20. Thermal neutron analysis, Ancore Inc., http://www.ancore.com
21. Schachne, M., Kempen, L., Milojevic, D., Sahli, H., Ham, Ph., Acheroy, M., and Cornelis, J., 1998, Mine detection by means of dynamic thermography: Simulation and experiments, Proc. IEE 2nd International Conference on the Detection of Abandoned Landmines, p. 124–128.
22. Gonzalez, R. and Woods, R., 1992, Digital image processing, Addison-Wesley.
23. Jain, A. K., 1989, Fundamentals of digital image processing, Prentice-Hall.
24. Heijmans, H., 1994, Morphological image operators, Academic Press.
25. Theodoridis, S. and Koutroumbas, K., 1998, Pattern recognition, Academic Press.
26. Beucher, S. and Lantuejoul, C., 1979, Use of watershed in contour detection, Proc. International Workshop on Image Processing: Real Time Edge and Motion Detection and Estimation.
27. Beucher, S., 1991, The watershed transformation applied to image segmentation, Proc. 10th Conference on Signal and Image Processing in Microscopy and Microanalysis.
28. Dougherty, E., 1992, Mathematical morphology in image processing, Marcel Dekker.
29. Roerdink, J. and Meijster, A., 2000, The watershed transform: Definitions, algorithms, and parallel strategies: Fundamenta Informaticae, v. 41, p. 187–228.
30. Verlinde, P., Acheroy, M., and Baudoin, Y., 2001, The Belgian Humanitarian Demining Project (HUDEM) and the European Research Context, Proc. Chiba University Workshop on Humanitarian Demining.