Hardware Efficient Underwater Mine Detection and ...

Hardware Efficient Underwater Mine Detection and Classification

Neetika Bansal1, Karan Shetti2, Timo Bretschneider2, and Konstantinos Siantidis3

1 Nanyang Technological University, [email protected]
2 EADS Innovation Works, {karan-rajendra.shetti, timo.bretschneider}@eads.net
3 ATLAS ELEKTRONIK GmbH, [email protected]

Abstract— Detection and classification of mine-like objects in side-scan sonar images need to compensate for the variability of objects, noise and background signatures. The unsupervised algorithm presented in this paper improves on previous work and focuses on object and shadow detection based on morphological operators. Feature extraction from the detected objects and their classification into two classes, namely mine-like and non-mine-like objects, are described. A row-wise processing technique is applied to decrease computational cost and memory usage and to allow easy porting of the algorithm to an embedded architecture. The performance of the algorithms is measured against a manually derived ground-truth.

Index Terms—Row-wise Processing, Shadow Detection, Statistical Features, Top-hat Transform

PROCEEDINGS OF SYMPOL-2011

1. Introduction

Traditionally, human operators have been used to identify underwater mines in side-scan sonar images. However, this process can be time consuming and results may be inconsistent. Therefore there has been a significant push to automate the entire process, while the growing usage of Autonomous Underwater Vehicles (AUV) for complex missions has further increased the need for such algorithms.

An Automatic Detection and Classification (ADAC) solution is only practical if it can perform the task in real-time during a mission. With an operational AUV speed of 4 kn and 0.1 m along-track resolution, all computations must be carried out within 48 ms/row to attain real-time capability. Besides high accuracy and a low false alarm rate, additional constraints such as computational complexity, the small form factor of the target architecture and low power consumption also need to be considered.

This paper is an extension of the work outlined in [1] and proposes an alternative approach by using the top-hat transform for detection of potential mine-like objects. A row-wise approach, i.e. stream processing, is adopted for enabling an efficient real-time hardware implementation. Two image data sets were available from two different side-scan systems, i.e. Klein 2000 and EdgeTech. The precision and recall figures for detection of mine-like objects and shadow regions for the two data sets were calculated with respect to a manually generated ground-truth.

The developed algorithm achieves high precision and satisfactory recall figures for both sets of data. The incorporated classification approach successfully eliminates any remaining false alarms after the detection stage.

The paper is organized as follows: Section 2 provides a literature review, while Section 3 presents the utilized data sets and selected morphological preliminaries. Section 4 documents the processing strategy including object detection, shadow detection, feature extraction and final classification. The assessment of the obtained results is carried out in Section 5. Section 6 provides a conclusion to the paper.
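As a quick check of the 48 ms/row budget quoted in the introduction, the time available per row is the along-track resolution divided by the vehicle speed. The conversion factor of roughly 0.5144 m/s per knot is a standard value and is not taken from the paper:

```latex
v = 4\,\mathrm{kn} \times 0.5144\,\frac{\mathrm{m/s}}{\mathrm{kn}} \approx 2.06\,\mathrm{m/s},
\qquad
t_{\mathrm{row}} = \frac{0.1\,\mathrm{m}}{2.06\,\mathrm{m/s}} \approx 48.6\,\mathrm{ms}
```

The result agrees with the stated budget once it is rounded down to 48 ms/row.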

2. Literature Review

Automatic detection and classification of underwater mines is a non-trivial problem because of the varying characteristics of the background seafloor and the different shapes, sizes and orientations of actual mines as well as mine-like objects. The literature review in [1] summarized a number of different methods, for instance the Markov Random Field model, the Active Contour model and Canonical Correlation and Coordinate Analysis.

Aung et al. [1] proposed an approach for real-time processing of side-scan sonar data onboard. An implementation on a suitable, low-power hardware architecture was provided to meet the limitations of space and electrical power resources. A row-wise processing technique was followed to reduce the dimensionality of the problem and, hence, to reduce processing time. The continuous wavelet transform was used at different scales for enabling a certain level of size invariance in the detection. However, further analysis of the obtained precision and recall figures showed the presence of a shift between the original and the detected object. Offset computation and readjustment of the detected image accordingly was considered as a possible solution, but a further analysis showed that the magnitude and direction of the shift are inconsistent.

Grayscale morphological operators were identified as powerful and robust tools for sea mine detection by Lange and Vincent [2]. The so-called Alternating Sequential Filter (ASF) was applied on available images for noise reduction, followed by top-hat (white) and anti-top-hat (black) transforms for detecting mine-bodies and mine-shadows, respectively.

Batman and Goutsias [3] presented an unsupervised detection of sea mines using mathematical morphology. They laid emphasis on the simplicity, efficiency and speed of the design with a 0.95 probability of detection. The highlight detection was done by means of the so-called opening-by-reconstruction top-hat operator, which detects intensity peaks in an image. Shadow detection was carried out by applying an opening operation followed by a variation of the morphological gradient operator. Features like mean, standard deviation, maximum amplitude, area, height and width of highlights and shadows were extracted to obtain a feature vector. Finally, the classification module was designed using variants of Gaussian density related discriminant functions.

Swartzman and Kooiman [4] presented an algorithm for fast identification of mine-like objects in side-scan images. The image was divided into equal-sized bins and an opening operation with a non-linear structuring element was applied. Afterwards the result was thresholded for finding highlights and shadows. A connected components algorithm was used for locating consecutive pixels with similar intensities, and their sizes and positions were noted. The matching of a mine-like object with its shadow was carried out based on their proximity. The high density regions for such objects were located by k-means clustering based on extracted statistical features.

Siantidis and Hölscher-Höbing [5] presented an approach in which the images were pre-processed for reduction of noise and a segmentation method based on Markov Random Fields was used for the detection of highlights. Textural features were considered suitable for mine classification and 14 statistical features were extracted before reducing the dimensionality by applying a principal component analysis. Finally, a fuzzy classifier classified the detected objects in two classes, i.e. mine and non-mine like objects, achieving a probability of 1.0 for the detection stage while resulting in a false alarm rate of 0.39.

To conform to row-wise processing, this paper investigates the use of linear structuring elements at multiple scales. An additional de-noising step is introduced to reduce the number of false alarms. Also, the use of the anti-top-hat transform for shadow detection is found to be unsuitable for the available dataset because of randomly cluttered low intensity pixels present in the background. Hence, a novel approach for shadow detection is proposed.

Neetika Bansal et al.: Hardware Efficient Underwater Mine Detection and Classification

3. Preliminaries

3.1 Available Data Set

A total of ten side-scan sonar images, acquired using the Klein 2000 side-scan sonar system with a sound frequency of 500 kHz, were used for the development and testing of the proposed processing strategy. The images have a pixel resolution of 0.1 m × 0.1 m. Some images contain one underwater mine each while some do not have any mines. The characteristics of the seafloor vary sharply between images as well as within given images. In order to demonstrate the universality of the developed solution, another data set obtained using the EdgeTech side-scan sonar system was used.

3.2 Top-hat Transform

A morphological operator can be used to extract information about a region in an image. It uses a structuring element to define the region and a set operation to characterize the type of operation. The result quantifies how the structuring element, selected based on the expected geometric properties of the objects, fits the image. Dilation (region growing) and erosion (region shrinking) are the most common morphological operators, which form the basis of more complex operations.

The top-hat (white) transform is a technique based on image morphology for extracting higher intensity patterns from images while attenuating noise. It can be defined as the grayscale opening of an image with an appropriate structuring element subtracted from the original image. The grayscale opening operation is an erosion operation followed by a dilation operation, used to select certain intensity patterns and remove others based on the size of the structuring element. The top-hat operation on a single row with row number r for a linear structuring element α is defined as

$$T_w(r, \alpha) = X(r) - \bigl(X(r) \circ \alpha\bigr), \tag{1}$$

where X(r) ∘ α represents an opening operation performed on a row X(r) with a linear structuring element α.
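The row-wise top-hat of Eq. (1) can be illustrated with a minimal pure-Python sketch. It assumes a flat (all-ones) linear structuring element of odd length and clips the sliding window at the row borders; both choices, as well as the function names, are illustrative assumptions and not details given in the paper.

```python
def erode_row(row, length):
    """Grayscale erosion: sliding minimum over a flat linear element of odd length."""
    half = length // 2
    n = len(row)
    return [min(row[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def dilate_row(row, length):
    """Grayscale dilation: sliding maximum over the same element."""
    half = length // 2
    n = len(row)
    return [max(row[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def open_row(row, length):
    """Opening = erosion followed by dilation."""
    return dilate_row(erode_row(row, length), length)


def top_hat_row(row, length):
    """White top-hat: the row minus its opening, as in Eq. (1)."""
    return [x - o for x, o in zip(row, open_row(row, length))]


# A two-pixel highlight on a flat background is narrower than the
# three-pixel element, so the opening removes it and the top-hat keeps it.
example = top_hat_row([10, 10, 10, 50, 50, 10, 10, 10, 10], 3)
```

Because each output pixel depends only on pixels of the same row, the operation needs no image buffer, which is the property the row-wise strategy relies on.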

4. Algorithm Design

4.1 Object Detection

This paper proposes an algorithm based on the top-hat transform for detection of potential mine-like objects in images obtained from side-scan sonar. These images are subjected to a pre-processing stage as outlined by Aung et al. [1]. The resultant images are ground-range projected and compensated for attenuation of sound waves in water. Since the detection method needs to account for a complex, heterogeneous seafloor and also for objects of varying shapes and sizes, a modified approach of using the top-hat transform at different scales is followed. This is congruent with the fact that the top-hat transform extracts objects of size equal to or smaller than the size of the structuring element used. To maintain the objective of using a row-by-row processing strategy, a linear structuring element is used for applying the top-hat transform on a single row. The scales chosen for α through experimentation are given by the row vector $I_{1,(2n+1)}$ of length 2n+1 with n={1,2,3,4}.

The top-hat transform of a higher order scale can be derived directly using different structuring elements. However, this approach requires a significant amount of resources when implemented in hardware. Another approach is to recursively use the result of the previous scale and a constant structuring element. For instance, the erosion operation with a structuring element of scale $I_{1,5}$ can be obtained by eroding twice with a structuring element of size $I_{1,3}$. The same holds true for the dilation operation. While this approach introduces dependencies and reduces the degree of parallelism between different scales, it significantly lowers the amount of required hardware resources and allows more parallelism within a single row. For the given hardware, i.e. a Stretch S6105 software-configurable processor [6] in this work, the chosen alternative improves the overall execution time of the entire process. This approach is depicted in Figure 1. The results of the top-hat transform for a row r at the four different scales are added up to give the final result

$$T_{final}(r) = \sum_{n=1}^{4} T_w\bigl(r, I_{1,2n+1}\bigr), \tag{2}$$

which is then subjected to adaptive thresholding. Adaptive or dynamic thresholding is applied to compensate for the varying characteristics of the seafloor. In particular, a first-in-first-out sliding buffer that stores the ‘row history’ for the current row is used. The ‘row history’ is defined as the standard deviation of pixel intensities for the previous five mine-free rows (0.5 m). Five rows are used to model the variation in seafloor statistics, ensuring that spatially only the most relevant ‘row history’ is used. The final threshold θn is defined as the mean of the standard deviations of the five rows multiplied by a parameter β setting the level of sensitivity:

$$\theta_n = \beta \cdot \mu\bigl(\sigma(X(r-1)), \ldots, \sigma(X(r-5))\bigr). \tag{3}$$
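The multi-scale sum of Eq. (2) and the adaptive threshold of Eq. (3) can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the paper: the recursive scheme is realised by repeated length-3 erosions and dilations with windows clipped at the borders, the population standard deviation is used, a row counts as mine-free when none of its pixels exceeds the threshold, and no detection is attempted until the history holds at least one row. The names `RowHistory` and `detect_row` and the value of β in the usage lines are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def erode3(row):
    """Erosion with a flat element of length 3 (window clipped at the borders)."""
    n = len(row)
    return [min(row[max(0, i - 1):min(n, i + 2)]) for i in range(n)]


def dilate3(row):
    """Dilation with a flat element of length 3 (window clipped at the borders)."""
    n = len(row)
    return [max(row[max(0, i - 1):min(n, i + 2)]) for i in range(n)]


def multiscale_top_hat(row, scales=4):
    """Sum of top-hat responses for element lengths 3, 5, ..., 2*scales+1."""
    total = [0] * len(row)
    eroded = list(row)
    for n in range(1, scales + 1):
        eroded = erode3(eroded)      # the n-th erosion reuses the previous scale
        opened = eroded
        for _ in range(n):           # n dilations complete the opening at scale n
            opened = dilate3(opened)
        total = [t + x - o for t, x, o in zip(total, row, opened)]
    return total


class RowHistory:
    """FIFO buffer of standard deviations of the last mine-free rows."""

    def __init__(self, beta, depth=5):
        self.beta = beta
        self.sigmas = deque(maxlen=depth)

    def threshold(self):
        return self.beta * mean(self.sigmas)


def detect_row(row, history):
    """Return a binary mask for one row and update the history if it is mine-free."""
    if history.sigmas:
        theta = history.threshold()
        mask = [1 if v > theta else 0 for v in multiscale_top_hat(row)]
    else:
        mask = [0] * len(row)        # assumption: no detection without history
    if not any(mask):
        history.sigmas.append(pstdev(row))
    return mask


# Usage with a hypothetical sensitivity beta = 10.
history = RowHistory(beta=10)
for _ in range(5):
    detect_row([10, 12, 10, 12, 10, 12, 10, 12], history)
mask = detect_row([10, 12, 10, 60, 60, 12, 10, 12], history)
```

Reusing the eroded row from the previous scale mirrors the recursive scheme described in the text: one additional length-3 erosion per scale replaces a separate erosion with a longer element.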

Figure 1. Schematic representation of top-hat operation using the direct and indirect approach

The pixels that exceed the threshold are labeled as regions of interest (ROI) and the resulting binary image is subjected to further verification. The next step is the de-noising of the binary image to reduce the number of false alarms, i.e. heavily clustered high-intensity noise pixels that do not fulfill the size expectation of mine-like objects. Firstly, erosion followed by dilation with a linear structuring element $I_{1,5}$ eliminates insignificant ROIs. Secondly, a vertical size threshold, for instance 5 pixels (0.5 m), is enforced based on prior knowledge of the minimum size of potential mine-like objects. Accordingly, five rows are buffered and summed column-wise. In the resulting sliding sum vector, if any pixel exceeds the size threshold, the particular ROI, described by its location (xmin,xmax,ymin,ymax), is kept for further analysis.
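The two de-noising steps can be sketched as below. The horizontal step is written as a run-length filter, which for a binary row gives the same result as an opening with a flat element of length five when pixels outside the row are treated as zero. The vertical step keeps a five-row buffer and sums it column-wise. The sketch treats a column as passing when its sum reaches the number of buffered rows; this is an interpretation of the size threshold, and the names are hypothetical.

```python
from collections import deque


def binary_open_row(mask, length=5):
    """Keep only horizontal runs of ones that are at least `length` pixels long."""
    out = [0] * len(mask)
    start = None
    for i, v in enumerate(list(mask) + [0]):   # trailing 0 closes a final run
        if v and start is None:
            start = i
        elif not v and start is not None:
            if i - start >= length:
                out[start:i] = [1] * (i - start)
            start = None
    return out


class VerticalSizeFilter:
    """Sliding column-wise sum over the most recent binary rows."""

    def __init__(self, min_rows=5):
        self.min_rows = min_rows
        self.buffer = deque(maxlen=min_rows)

    def push(self, mask):
        """Add one row; return per-column flags for columns set in all buffered rows."""
        self.buffer.append(list(mask))
        sums = [sum(column) for column in zip(*self.buffer)]
        return [s >= self.min_rows for s in sums]


# Usage: a five-pixel-wide ROI must persist for five rows to be kept.
size_filter = VerticalSizeFilter()
row = binary_open_row([1, 1, 0, 1, 1, 1, 1, 1, 0, 1])
for _ in range(5):
    flags = size_filter.push(row)
```

Only the last five binary rows are held in memory, so the check fits the streaming model used by the rest of the detection stage.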

4.2 Shadow Detection

A shadow can be defined as a region of homogeneous near-zero intensity pixels and can be used to validate an adjacent mine-like object, i.e. for disqualifying false alarms. For shadow detection, row-wise median filtering is applied on the normalized image, reducing the effect of speckle noise while preserving the edges. In the subsequent thresholding step, a pixel in a row is marked as a shadow pixel if its intensity P(x,y) is less than or equal to γ times the mean of the intensities of the pixels in that row, i.e.

$$P(x, y) \le \gamma \cdot \mu\bigl(P(x, 0), \ldots, P(x, N_{col})\bigr). \tag{4}$$
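The median filtering and the row-wise threshold of Eq. (4) can be sketched as follows. The window length, the value of γ, the clipping of the window at the row borders and the choice to compute the row mean on the filtered row are all assumptions made for illustration; the paper does not specify them.

```python
from statistics import mean, median


def median_filter_row(row, window=3):
    """Row-wise median filter with the window clipped at the borders."""
    half = window // 2
    n = len(row)
    return [median(row[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def shadow_mask_row(row, gamma=0.5, window=3):
    """Mark pixels whose filtered intensity is at most gamma times the row mean."""
    smoothed = median_filter_row(row, window)
    limit = gamma * mean(smoothed)
    return [1 if p <= limit else 0 for p in smoothed]


# Usage: a bright speckle (90) next to a dark run is removed by the median
# filter, and the dark run is marked as shadow.
shadow = shadow_mask_row([20, 22, 21, 90, 2, 1, 3, 2, 1, 20, 21, 22])
```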

Lastly, shadow localization is carried out, in which the detected object is matched with its corresponding shadow region. Two schemes are proposed in the following.

4.2.1 One-dimensional Scheme. Given the functional principle of side-scan sonar, the vertical extent of the shadow (ymax–ymin) is approximately equal to that of the mine-like object. The horizontal extent of the shadow (xmax–xmin) is found by utilizing this fact, and detected mine objects and shadows are matched. However, the presence of falsely detected shadow regions can affect the performance of the algorithm negatively.

4.2.2 Two-dimensional Scheme. In the second approach, the connected components of the binary image obtained after thresholding are found. Pixels are grouped into components based on pixel connectivity, which represents their spatial adjacency. A connectivity of four or higher is found to be suitable for the data set. The result is an image with a unique label for every connected component. Since the shadow region is homogeneous, it is marked as one connected component in the binary image. In general, the size of this region is larger than that of a component representing cluttered noise pixels, since noise regions lack homogeneity. Hence, the largest connected component within the approximate vertical extent of the mine-like object is marked as the shadow region. This approach gives better results since it disqualifies detected noise regions on the basis of both their proximity to detected objects and their size.
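The two-dimensional scheme can be sketched with a breadth-first 4-connected labelling followed by a size comparison. The sketch assumes that a component qualifies when at least one of its pixels lies in the rows spanned by the object; the paper only speaks of the approximate vertical extent, so this criterion and the function names are illustrative.

```python
from collections import deque


def label_components(mask):
    """Return the 4-connected components of a binary image as lists of (y, x)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue = deque([(r, c)])
                pixels = []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(pixels)
    return components


def largest_shadow(mask, y_min, y_max):
    """Largest component with at least one pixel in rows y_min..y_max, or None."""
    candidates = [p for p in label_components(mask)
                  if any(y_min <= y <= y_max for y, _ in p)]
    return max(candidates, key=len) if candidates else None


# Usage: the 2x3 block in rows 1-2 is chosen over isolated noise pixels;
# the large component in row 4 lies outside the object's vertical extent.
shadow_mask = [
    [0, 0, 0, 0, 0, 1],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]
region = largest_shadow(shadow_mask, y_min=1, y_max=2)
```

Unlike the detection stage, this step needs the binary rows covering the object's vertical extent to be available at once, so it operates on a small buffered window rather than a single row.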

4.3 Feature Extraction

As suggested in previous work [3,5], statistical texture features are used to represent the data obtained from side-scan sonar. Two first order statistical features, mean and variance, are found to be sufficient to represent the three classes: mine, shadow and background. These features are advantageous as they are computationally inexpensive to implement in hardware. In order to adapt to the changes in the background, a sample of the previous 50 rows from ymin is chosen for each object. This ensures that the most relevant background information is used for extracting features. Second order statistical features, e.g. correlation, contrast and homogeneity, were also extracted and analyzed using co-occurrence matrices. These features, however, did not significantly improve the result for the available data. They are also computationally more expensive; therefore only first order features were used in the classification.
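The first order features and the background sampling can be sketched as below. The population variance and the restriction of the background sample to the columns of the object are assumptions; the paper only states that the 50 rows preceding ymin are used.

```python
from statistics import mean, pvariance


def first_order_features(pixels):
    """Mean and (population) variance of a list of pixel intensities."""
    return mean(pixels), pvariance(pixels)


def background_sample(image, y_min, x_min, x_max, n_rows=50):
    """Pixels from up to n_rows rows preceding y_min, within columns x_min..x_max."""
    start = max(0, y_min - n_rows)
    return [image[y][x] for y in range(start, y_min) for x in range(x_min, x_max + 1)]


# Usage on a small synthetic image (6 rows, 4 columns).
image = [[10 * r + c for c in range(4)] for r in range(6)]
bg = background_sample(image, y_min=3, x_min=1, x_max=2, n_rows=2)
bg_mean, bg_var = first_order_features(bg)
```

Both features can be accumulated with running sums, which is consistent with the low hardware cost attributed to them in the text.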

4.4 Classification

The purpose of this stage is to classify detected objects as mine-like or non-mine-like objects, which to a certain extent was already performed by the matching of potential mine-like objects with adjacent shadow regions. Secondly, a k-means classifier with k=2 using the Euclidean distance is used for further elimination of false alarms. For real-time implementation, the training was carried out offline. The result of the training stage, containing the positions of the cluster centers, is then utilized on the hardware for finding the Euclidean distance of the extracted features from the respective center. The object under consideration is classified under the nearest cluster center.

5. Results and Analysis

Even at a pixel resolution of 0.1 m × 0.1 m, some pixels exhibit an ambiguous behavior in terms of class association, i.e. mixed classes occur. Thus, six types of classes were used for the quantitative analysis, namely the pure classes mine, shadow and background as well as the mixed classes mine-shadow, mine-background and shadow-background. This methodology provides a practical perspective on the quantitative measurements in the presence of ambiguous classes. The ground-truth was derived manually for each image in the dataset. The precision PC and recall RC for a class C can be computed using

$$P_C = \frac{|C_p \cap C_d| + |C_m \cap C_d|}{|C_d|} \quad \text{and} \quad R_C = \frac{|C_p \cap C_d| + |C_m \cap C_d|}{|C_p| + |C_m \cap C_d|}, \tag{5}$$

where the operator |·| represents the number of contained pixels of that particular operand. The variables Cp, Cm and Cd represent the pixels of the pure, mixed and detected class C, respectively. A visual representation in the form of a Venn diagram is shown in Figure 2.

Figure 2. Venn diagram for calculating precision and recall figures (sets Cp, Cm and Cd)

The focus of the CWT-based algorithm [1] was attaining optimal segment-level performance. The segment-level performance of the algorithm is defined such that an object is considered successfully detected if at least 25% of the object in the binary detection image overlaps with the ground-truth. For both CWT and top-hat, the segment-level performance is 100%.

The precision and recall figures for the proposed top-hat algorithm using Klein 2000 data are given in Table 1. The same figures for CWT [1], re-computed based on the new method of measurement, are also tabulated for comparison with top-hat. Both precision and recall figures show improvement for the top-hat transform since the shift problem present in the CWT was addressed. With the introduction of a de-noising step, the number of false alarms was reduced to almost zero and hence the precision figures are near unity for most images. For image 1302, the presence of highly cluttered noise raises a false alarm and reduces the precision greatly. Perfect recall is not achieved since most mines comprise pixels of varying intensities, some of which do not cross the threshold. Reducing the threshold would decrease the precision and hence an intermediate level is chosen which maintains very high precision and satisfactory recall figures.

Table 1. Precision (P) and recall (R) figures for CWT and top-hat for Klein 2000 data (images grouped by port and starboard channel)

Image   CWT P   CWT R   Top-hat P   Top-hat R
1000    0.08    0.23    1.00        0.40
1048    0.10    0.26    0.88        0.59
1122    0.13    0.20    0.97        0.24
1301    0.11    0.29    0.95        0.22
1302    0.05    0.49    0.09        0.84
1334    0.12    0.27    0.94        0.43
1356    0.06    0.43    0.70        0.54
1033    0.07    0.10    1.00        0.15
1135    0.11    0.30    1.00        0.52
1341    0.05    0.51    0.68        0.83

Table 2 lists the precision and recall figures for Klein 2000 data for both proposed shadow localization schemes. It can be seen that the recall figures obtained from the two schemes are comparable, while generally the precision was improved by the two-dimensional connected component approach. For images that have a significant level of speckle noise in the background leading to visually indistinct shadows, smoothing fails and the shadows of the mines are not detected at all.

Table 2. Precision (P) and recall (R) figures for shadow detection using one-dimensional and two-dimensional schemes for Klein 2000 data (images grouped by port and starboard channel)

Image   One-dimensional P   One-dimensional R   Two-dimensional P   Two-dimensional R
1000    0.54                0.58                0.93                0.58
1048    Not detected                            Not detected
1122    Not detected                            Not detected
1301    0.61                0.47                0.69                0.59
1302    0.23                0.24                1.00                0.25
1334    0.68                0.87                0.85                0.85
1356    0.94                0.74                0.95                0.85
1033    0.82                0.96                0.88                0.98
1135    1.00                0.08                1.00                0.07
1341    Not detected                            Not detected
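The pixel-level precision and recall of Eq. (5) can be sketched with Python sets of pixel coordinates. The sketch assumes that the pure and mixed ground-truth sets are disjoint and that the recall denominator counts all pure pixels plus the detected mixed pixels, so that undetected mixed pixels are not penalised; this reading of the denominator is an assumption. The function name is hypothetical.

```python
def precision_recall(pure, mixed, detected):
    """Precision and recall for one class from sets of (y, x) pixel coordinates."""
    hits = len(pure & detected) + len(mixed & detected)
    precision = hits / len(detected) if detected else 0.0
    denominator = len(pure) + len(mixed & detected)
    recall = hits / denominator if denominator else 0.0
    return precision, recall


# Usage: four pure pixels, two mixed pixels, four detected pixels of which
# two are pure, one is mixed and one is a false alarm.
pure = {(0, 0), (0, 1), (1, 0), (1, 1)}
mixed = {(0, 2), (1, 2)}
detected = {(0, 1), (1, 1), (0, 2), (5, 5)}
p, r = precision_recall(pure, mixed, detected)
```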

The results obtained for the EdgeTech data are presented in Table 3. From the limited data available, i.e. only two images contain actual mines, it can be concluded that the results obtained are comparable to the ones achieved for Klein 2000 data, providing some evidence for the universality of the proposed algorithm. The false alarms that still remain after the de-noising step are successfully disqualified in the classification stage. The mine-shadow matching step preliminarily disqualifies false alarms. The k-means classifier then classifies the detected objects as mine or non-mine to increase confidence in the obtained result. The results for a sample image at different stages of the processing scheme are presented in Figure 3.

Table 3. Precision (P) and recall (R) figures for object detection using top-hat transform and shadow detection using two-dimensional scheme for EdgeTech data

Image   Top-hat P   Top-hat R   Two-dimensional P   Two-dimensional R
A1      0.51        0.32        0.60                0.78
A8      0.70        0.58        0.43                0.86

Figure 3. (a) Original image, (b) object detection using top-hat, (c) subsequent de-noised image, (d) median filtered image, (e) detected shadow region, and (f) matched mine-like object with shadow region
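The on-device part of the classification stage reduces to a nearest-center lookup, which can be sketched as follows. The cluster centers shown are hypothetical placeholders for the output of the offline k-means training, and the feature vector is assumed to be the (mean, variance) pair from the feature extraction stage.

```python
import math


def classify(features, centers):
    """Return the label of the cluster center nearest to the feature vector."""
    return min(centers, key=lambda label: math.dist(features, centers[label]))


# Hypothetical centers; real values would come from offline training.
centers = {"mine": (180.0, 900.0), "non-mine": (60.0, 200.0)}
label = classify((170.0, 850.0), centers)
```

With k=2 only two distances are evaluated per object, so the cost of this step is negligible compared with the row-wise detection.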

6. Conclusion

The proposed approach for detection and classification of potential mine-like objects contributes mainly in terms of its simplicity, robustness, ease of hardware implementation and accuracy. The use of linear structuring elements for the morphological operators ensures that the row-wise processing approach is maintained, which aids a real-time capable implementation of the developed algorithms. The use of the top-hat transform successfully eliminates the shift problem present in CWT and greatly improves the precision and recall figures. The object detection algorithm was further supplemented with a de-noising step to provide high accuracy of detection and reduce the number of false alarms. The developed shadow localization techniques enabled the extraction of suitable statistical features and proved to be sufficient to disqualify any remaining false alarms after the detection stage. The embedded implementation for object detection on the Stretch S6105 with the presented optimizations gave a timing of 0.7 ms/row. Since the maximum tolerated time for the entire processing strategy is 48 ms/row and object detection forms a major portion of that, this approach evidently achieves real-time capability.

References

[1] M. Aung, T. Bretschneider, K. Lee, K. Siantidis, “Real-time embedded underwater mine detection in side-scan sonar data”, Proceedings of the European Conference on Underwater Acoustics, 2010.
[2] H. Lange, L. Vincent, “Advanced gray-scale morphological filters for the detection of sea mines in side-scan sonar imagery”, Proceedings of the SPIE 4038, pp. 362-372, 2000.
[3] S. Batman, J. Goutsias, “Robust morphological detection of sea mines in side-scan sonar images”, Proceedings of the SPIE 4394, pp. 1103-1115, 2001.
[4] G.L. Swartzman, W.C. Kooiman, “Morphological image processing for locating minelike objects from side scan sonar images”, Proceedings of the SPIE 3710, pp. 536-542, 1999.
[5] K. Siantidis, U. Hölscher-Höbing, “A system for automatic detection and classification for a mine countermeasure AUV”, Proceedings of the Int. Conference and Exhibition on Underwater Acoustic Measurements, 2009.
[6] Anonymous, Stretch, last accessed May 2011, http://www.stretchinc.com/products/s6000.php
