
IEEE JOURNAL OF OCEANIC ENGINEERING, VOL. 27, NO. 1, JANUARY 2002

A Hierarchical Classification System for Object Recognition in Underwater Environments Gian Luca Foresti, Senior Member, IEEE, and Stefania Gentili

Abstract—In this paper, a hierarchical system, in which each level is composed of a neural-based classifier, is proposed to recognize objects in underwater images. The system has been designed to help an autonomous underwater vehicle in sea-bottom survey operations, such as pipeline inspection. The input image is divided into square regions (macro-pixels) and a neural tree is used to classify each region into different object classes (pipeline, sea bottom, or anodes). Each macro-pixel is then analyzed according to some geometric and environment constraints: macro-pixels with uncertain classification are divided into four parts and re-classified. The process is iterated until the desired accuracy is reached. Experimental results on a large set of real underwater images acquired in different sea environments demonstrate the robustness and the accuracy of the proposed system.

Index Terms—Autonomous underwater vehicles, hierarchical classifier, image classification, neural trees, underwater images.

I. INTRODUCTION

In recent years, several studies have focused on the problem of understanding images for autonomous underwater vehicle (AUV) navigation. In fact, recent market analyses have underlined that the cost of using AUVs will in the coming years be lower than the current cost of remotely operated vehicles (ROVs) controlled by human operators. AUVs are designed to autonomously perform operations such as navigation and obstacle avoidance [1]–[4], visual inspection of man-made structures like pipelines [5] and cables [6], inspection and docking of off-shore structures [7], detection of submerged objects (e.g., mines [8]), object manipulation [9], etc. Tascini et al. [5] developed a real-time imaging system for detecting and tracking a pipeline. The system is based on the integration of pipeline edge positions coming from six horizontal strips in the observed image. Most existing methods for detecting the position of pipeline edges are based on the Hough transform [10], [11] or on classical edge detection methods [12]. However, if longitudinal strips appear along the pipeline due to sand, seaweed, or particular pipeline characteristics, these methods produce wrong classifications of pipeline edges. An improvement in the detection of underwater pipelines has been proposed in [13]. The method uses a multilayer perceptron (MLP) network with a back-propagation learning algorithm to recognize pipeline edges in complex underwater environments. It is able to find the right pipeline borders even if they are partially covered by sand or seaweed, and in the presence of longitudinal strips. The main drawback of this method is that the region classified as pipeline border is about 30–50 pixels wide on standard video images (768 × 576 pixels), so that the estimation of the edge position and orientation is only approximate.

Manuscript received September 26, 1999; revised September 1, 2000. The authors are with the Department of Mathematics and Computer Science (DIMI), University of Udine, 206, 33100 Udine, Italy (e-mail: [email protected]; [email protected]). Publisher Item Identifier S 0364-9059(02)00658-1.

In this paper, a hierarchical neural-tree classifier (HNTC) is proposed to recognize objects in underwater images. It has been developed to help an AUV in complex sea-bottom survey operations, like pipeline inspections. The proposed HNTC reaches a high classification accuracy without losing either the great generalization capability of artificial neural networks or the ability of Hough-based methods [10], [11] to recognize straight object borders. The input of the proposed system is an underwater image compensated in intensity [14], [15] and transformed into a two-dimensional (2-D) top-view representation. Fig. 1(a) represents an example of a pipeline image, while Fig. 1(b) and (c) shows the same image after luminosity compensation and 2-D top-view transformation, respectively. The input image is divided into square regions (called macro-pixels) and a neural tree (NT) [16]–[19] is used to classify each region into different object classes (e.g., pipeline, sea bottom, and anodes). Each macro-pixel is then analyzed according to geometric constraints: macro-pixels with uncertain classification are divided into four parts and re-classified. The process is iterated until the desired approximation is reached. Moreover, the system detects and recognizes some landmarks placed on the pipeline (i.e., anodes) whose position is a priori known and can be used to infer the AUV position along the pipeline [14]. The main advantage of using a NT instead of a classical neural network (e.g., MLP, feedforward, etc.) is that the NT structure evolves autonomously according to the distribution of the patterns which compose the training set.
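As a rough illustration of this growth behavior, the following sketch builds a tree whose structure emerges from the training data. It is a simplified stand-in (single-feature threshold splits rather than the trained perceptron nodes of [19]), and all names are hypothetical:

```python
# Illustrative neural-tree growth: the structure below emerges from the
# data rather than being fixed in advance. Splitting uses a naive
# one-feature threshold rule instead of trained perceptrons.
class Leaf:
    def __init__(self, label):
        self.label = label

class Node:
    def __init__(self, feature, threshold, children):
        self.feature, self.threshold, self.children = feature, threshold, children

def grow(patterns, labels, depth=0):
    """Recursively split the training set until each subset is pure."""
    if len(set(labels)) == 1:                # subset entirely one class -> leaf
        return Leaf(labels[0])
    f = depth % len(patterns[0])             # naive feature choice (illustrative)
    t = sum(p[f] for p in patterns) / len(patterns)
    left = [(p, l) for p, l in zip(patterns, labels) if p[f] < t]
    right = [(p, l) for p, l in zip(patterns, labels) if p[f] >= t]
    if not left or not right:                # cannot split further -> majority leaf
        return Leaf(max(set(labels), key=labels.count))
    return Node(f, t, [grow(*zip(*left), depth + 1), grow(*zip(*right), depth + 1)])

def classify(tree, pattern):
    """Top-down descent: at each node follow the winning branch."""
    while isinstance(tree, Node):
        tree = tree.children[0] if pattern[tree.feature] < tree.threshold else tree.children[1]
    return tree.label
```

The point of the sketch is only that no architecture (depth, node count) is specified in advance; it is determined by how the training subsets split.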
A priori information about the network structure (number of internal nodes, number of hidden neurons, number of hidden layers, connectivity between layers, etc.) is not required [19].

The paper is organized as follows. Section II gives a general description of the proposed system and, in particular, of the hierarchical classification method. In Section III, the decision strategy adopted to judge the correctness of the classification of some macro-pixels is presented. In Section IV, the pipeline characteristics and the features used by the classifier, for every type of image, are described. Finally, Section V shows some experimental results on real underwater images.

II. HIERARCHICAL CLASSIFIER

The general architecture of the HNTC is presented in Fig. 2. The input image, after luminance compensation and 2-D top-view transformation, is divided into square regions of n × n pixels (32 × 32 at the first level), called macro-pixels. Appropriate features are automatically extracted from each macro-pixel and a neural tree is used to classify the macro-pixel as belonging to a given object class (pipeline, sea bottom, or anode). Classification errors or inaccuracies, which are generally concentrated on macro-pixels lying on the object borders, are detected by geometrical reasoning [14] and the corresponding

Fig. 1. Example of a real pipeline image: (a) Original image. (b) Image after compensation in luminosity. (c) Image after 2-D top-view transformation.

Fig. 2. General scheme of the hierarchical classification system.

0364–9059/02$17.00 © 2002 IEEE

FORESTI AND GENTILI: HIERARCHICAL CLASSIFICATION SYSTEM FOR OBJECT RECOGNITION


Fig. 3. Intermediate results of the hierarchical classification process (dark grey levels correspond to pipeline, grey levels to sea bottom, and light grey levels to uncertain pixels).

macro-pixels are considered "uncertainly classified." In order to improve the final classification, uncertain macro-pixels are re-classified with greater accuracy by a multiscale method [20]. Fig. 3(a) shows the output of the first classification level, where sea-bottom regions are represented in grey and pipeline regions in dark grey, respectively. Macro-pixels characterized by uncertain classification are labeled in a different way [with a light grey label in Fig. 3(b)]. The input image is then divided into smaller non-overlapping macro-pixels [Fig. 3(c)] and the new classification is performed only on the uncertain macro-pixels [Fig. 3(d)]. As shown in Fig. 3(d), the more detailed classification of discontinuity regions makes it possible to estimate the position of pipeline edges with great accuracy. It is worth noting that some macro-pixels lying on discontinuity regions are still not correctly classified. Again, uncertain macro-pixels [represented in light grey in Fig. 3(e)] are localized, divided into four regions, and re-classified until a desired level of accuracy has been reached or the macro-pixel side becomes equal to 2. This method has the advantage of allowing a very fast classification of the inner parts of pipeline or sea-bottom regions (at the first level) with respect to a pixel-by-pixel classification of the whole image. On the other hand, discontinuity regions, where high accuracy is necessary, are classified at the last level. Moreover, inside the object regions, at the first level, the method reduces possible misclassification errors due to the presence of noise (which is typically distributed in small local areas of the input image). When the whole image has been classified at the last level, the algorithm calculates the position of the object borders in the original image coordinate system.
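The coarse-to-fine loop just described can be sketched as follows. Here classify_block is a hypothetical stand-in for the neural-tree classifier, and the block sizes are illustrative:

```python
# Sketch of the coarse-to-fine classification: classify large macro-pixels
# first, then split only the uncertain ones into four quadrants and
# re-classify, down to a minimum side of 2 pixels.
def hierarchical_classify(image, classify_block, side=32, min_side=2):
    """Return {(x, y, side): label} for every classified square region."""
    result = {}
    # start with non-overlapping side x side macro-pixels
    pending = [(x, y, side)
               for y in range(0, len(image), side)
               for x in range(0, len(image[0]), side)]
    while pending:
        nxt = []
        for (x, y, s) in pending:
            label = classify_block(image, x, y, s)
            if label != "uncertain" or s <= min_side:
                result[(x, y, s)] = label
            else:                      # split the uncertain macro-pixel into four
                h = s // 2
                nxt += [(x, y, h), (x + h, y, h), (x, y + h, h), (x + h, y + h, h)]
        pending = nxt
    return result
```

Only uncertain regions are refined, which is what makes the method faster than a pixel-by-pixel classification of the whole image.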
In the case of pipelines, due to the presence of seaweed and encrustations, and to variations of the diameter, the pipeline border is not a straight line and the diameter of the structure is not constant. To supply unique information on border positions, the number of pixels classified as pipeline in every column of the output image is evaluated; a column is considered to belong to the pipeline if and only if this number is greater than a fixed threshold. Experimental tests suggest fixing the value of the threshold to one

Fig. 4. Result of pipeline classification on the original image in Fig. 2 showing the detected pipeline borders.

half of the column height. Then, the pipeline borders are deduced as the straight lines before the first pipeline column and after the last pipeline column (from left to right). In Fig. 4, the output of the processing (pipeline classification and pipeline border estimation) on the image in Fig. 2 is shown.

A. NT Classifier

A NT is a hybrid concept whose creation was motivated by combining the advantages of neural networks (NNs) [21] and decision trees (DTs) [22]. The proposed NT (for a more detailed description see [19]) is composed of two different types of nodes: (a) internal nodes, which are represented by perceptrons with a sigmoidal activation function, and (b) leaf nodes, which are simple class nodes (they assign to the current pattern


the class to which they belong). Each internal node takes n inputs, which represent the selected object features, and generates c outputs, called activation values, one for each class. Let x be the vector of input patterns and TS a convenient training set. The NT grows during the learning phase. At the beginning, the training set is processed by the root node, which tries to subdivide it into simpler subsets, and a new level of nodes (children nodes) is added to the tree. If one or more subsets are correctly classified (entirely assigned to the correct class), the corresponding children nodes become leaf nodes. The other subsets are used to train other perceptrons that try to divide them into further subsets. The algorithm ends when all the current nodes are leaves. Once the NT has been successfully trained, it can be used to classify new patterns. A new pattern is presented to the root of the tree, and the classification procedure moves through the tree in a top-down way, following at each node the path determined by the highest activation value. When a leaf node is reached, the pattern is labeled with the classification provided by that node.

III. UNCERTAIN PIXEL DETERMINATION

In this section, the strategy adopted to estimate which pixels are uncertain is presented. The macro-pixels to be analyzed (the output of the classification) can assume two different values: 0 for pixels classified as sea bottom (class S) and 1 for pixels classified as pipeline or anode (class P). Let I_k be the image showing the classification of the original image at the k-th level, and let I_k(i, j) be the pixel in position (i, j) of the image obtained after the k-th classification; I_k(i, j) corresponds to a macro-pixel of the original image centered in (i, j). In order to evaluate whether the classification of a macro-pixel is correct, the classification obtained by the neighboring macro-pixels is considered.
The mean classification value m(i, j) of the 8 macro-pixels surrounding I_k(i, j) is evaluated as follows:

m(i, j) = (1/8) Σ_(p,q)∈N(i,j) I_k(p, q)   (1)

and compared with two thresholds Th1 and Th2. The following value is assigned to the classified pixel:
• if m(i, j) < Th1, the value 0, corresponding to the class S;
• if m(i, j) > Th2, the value 1, corresponding to the class P;
• if Th1 ≤ m(i, j) ≤ Th2, a value corresponding to a new class U, i.e., uncertain pixel.

This method makes it possible both to correct classification errors and to single out uncertain pixels to be classified again. The main idea of using geometric information is based on the fact that three possible situations can occur:
• a pixel is classified as sea bottom but it is surrounded by pixels classified as pipeline;
• a pixel is classified as pipeline but it is surrounded by pixels classified as sea bottom;
• the neighborhood is partially composed of pixels classified as sea bottom and pixels classified as object.
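A minimal sketch of this decision rule, assuming the 0/1 label grid described above; the threshold values are illustrative, not the paper's:

```python
# Sketch of the uncertainty test: compare each macro-pixel's label
# (0 = sea bottom, 1 = pipeline/anode) with the mean of its 8 neighbours
# and relabel using two thresholds.
def relabel(grid, th1=0.25, th2=0.75):
    """Return a new grid with values 0, 1, or 'uncertain' (borders untouched)."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neigh = [grid[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                     if (dy, dx) != (0, 0)]
            m = sum(neigh) / 8.0          # mean classification value, as in (1)
            if m < th1:
                out[y][x] = 0             # surrounded by sea bottom: force class S
            elif m > th2:
                out[y][x] = 1             # surrounded by pipeline: force class P
            else:
                out[y][x] = "uncertain"   # mixed neighbourhood: re-classify later
    return out
```

An isolated misclassified pixel is silently corrected, while pixels near a border fall between the two thresholds and are marked for re-classification.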

Fig. 5. Examples of real underwater images containing pipelines belonging to the four different classes: (a) Red. (b) Black. (c) Spots. (d) Gravel.

In the first two cases, as the probability of wrong classification of each macro-pixel is independent of the probability of wrong classification of the other ones, it is likely that the pixel classification is wrong, and the algorithm automatically corrects the classification of the pixel. In the third case, if the surrounding pixels are partially classified as pipeline and partially as sea bottom, two possibilities can occur:
• the considered macro-pixel is close to the pipeline border;
• there are classification problems in the whole neighborhood.

In both cases, a more detailed classification of the macro-pixel is necessary. In order to obtain the desired behavior described above, the thresholds Th1 and Th2 must be chosen accordingly.

IV. UNDERWATER IMAGE ANALYSIS AND FEATURE EXTRACTION

An evaluation of the pipeline characteristics has been performed on a set of about 2000 images, acquired in very different environments and under different light conditions (the pipeline and sea-bottom characteristics change remarkably from one image to another). Four different pipeline typologies, named red, black, spots, and gravel, have been considered, and, due to the extreme differences among them, it was necessary to train four different NTs. A pre-processing algorithm automatically evaluates, on the basis of the main image characteristics, to which type each image belongs [14]. In the following, the four pipeline typologies and the selected features are described.

A. "Red" Pipeline

Red pipelines represent the most common kind of pipeline in the test set (about 82%). Fig. 5(a) presents an example of an image belonging to the class "Red" pipeline. From a comparative study of the Red and Blue components of the image, it emerged that the pipeline presents higher pixel values in the red image, while the sea bottom presents higher values in the blue one. For this reason, the first element of the feature vector has been obtained by subtracting the median value of the red component from the median value of the blue component, both computed on all the elements composing the macro-pixel:

f1 = med(B_1, ..., B_N) − med(R_1, ..., R_N)   (2)

where N is the total number of pixels inside the macro-pixel (e.g., N = 1024, 256, and 64, respectively, for 32 × 32, 16 × 16, and 8 × 8 macro-pixels), R_i is the red component of the i-th pixel, and B_i the blue component of the same pixel. The remaining features have been selected from a representative set of pixels belonging to the considered macro-pixel in the gray-scale image. Fig. 6 shows the mask that is superimposed on the macro-pixel (represented as a square) in order to select the representative pixel positions (represented by black dots). The feature vector is composed of 18 elements:

F = (f1, Gr1, ..., Gr17)   (3)

where Gr1, ..., Gr17 are the values of the pixels selected as features in the grey-scale image. In Fig. 7(a), the distribution of the features is presented in the (Blue−Red) versus grey-level plane. The black dots correspond to pipeline features, while the grey ones to sea-bottom features.

Fig. 6. Representation of the positions inside the macro-pixel of the selected pixels used by the classification process.

B. "Black" Pipeline

A smaller set (about 4.5%) of test images contains a black pipeline on a brighter sea bottom. Fig. 5(b) presents an image belonging to the class "Black" pipeline. In this case, the subtraction of the red component of the image from the blue one generates an image in which pipeline pixels have positive values decidedly lower than sea-bottom ones. For the Black type, a new NT with the same features as the Red type has been developed. However, the problem is slightly more complicated by the fact

Fig. 7. Feature distribution in the (Blue−Red) versus Grey plane. (a) Red pipelines (black points represent pipeline features, grey points sea-bottom and anode ones). (b) Black pipelines (black points represent pipeline features, dark grey points sea-bottom ones, light grey points anode ones).

that, in several images, the anode response is more similar to the sea bottom than to the pipeline object. In Fig. 7(b), the feature distribution in the (Blue−Red) versus grey plane is presented. As in Fig. 7(a), the black dots correspond to pipeline features, while the grey ones to sea-bottom features; moreover, the features corresponding to the anode class are presented as light grey dots. For this reason, it is necessary to consider the anode object as a different class and to distinguish anodes from both pipeline and sea bottom at the same time. A post-processing step is added to merge pixels belonging to the pipeline and anode classes.

C. "Spot" Pipeline

This is an intermediate class between the Red and the Black ones, and corresponds to approximately 10% of the whole test set. Fig. 5(c) presents an example of an image belonging to the class "Spot" pipeline. Spot pipelines do not present definite characteristics except a great variability of the intensity (the spots) on the pipeline, both in the gray-scale image and in the difference image between the red and blue components. Therefore, the features used were the variance of the red component minus the blue component of the image and the variance of the gray-scale image:

σ²_X = (1/N) Σ_{i=1..N} (X_i − X̄)²,  X ∈ {R − B, Gr}   (4)

where X̄ represents the mean value of X. The feature distribution in the (σ²_{R−B}, σ²_{Gr}) plane is shown in Fig. 8(a).
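The color-based features of (2)-(4) can be sketched as follows; the inputs are flat per-pixel value lists for one macro-pixel, the 17-pixel mask sampling of Fig. 6 is omitted, and the function names are hypothetical:

```python
# Sketch of the macro-pixel color features: the median blue-minus-red
# difference used for the Red/Black typologies and the two variances
# used for the Spot typology.
from statistics import median, pvariance

def blue_red_feature(blue, red):
    """f1 = med(B) - med(R), as in (2): large over sea bottom, small over pipeline."""
    return median(blue) - median(red)

def spot_features(red, blue, grey):
    """Variance of (R - B) and variance of the grey image, as in (4)."""
    diff = [r - b for r, b in zip(red, blue)]
    return pvariance(diff), pvariance(grey)
```

High spot_features values indicate the strong intensity variability ("spots") that characterizes this pipeline typology.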


D. "Gravel" Pipeline

This type represents very complex images containing pipelines on gravel or seaweed. The whole test set contains about 3.5% of these images. Fig. 5(d) presents an image belonging to the class "Gravel" pipeline. This pipeline presents two lateral bands of intensity brighter than the sea bottom. No significant texture characteristic of the pipeline or the sea bottom has been found to discriminate the images belonging to this class. Moreover, high-order moments do not allow discriminating between the pipeline and the sea bottom. However, as this kind of pipeline presents two lateral strips, the classification is focused on the detection of these strips (Fig. 9). The features extracted are the mean value of the macro-pixel on the gray-scale image and seventeen pixels in the same positions as in Fig. 6:

F = (mean(Gr), Gr1, ..., Gr17)   (5)

In Fig. 8(b), the mean on the gray-scale image versus the mean of the features Gr1, ..., Gr17 is shown.

Fig. 8. Feature distributions. (a) Spot pipelines: variance in the (Blue−Red) image versus variance in the grey image (black points represent pipeline features, grey points sea-bottom and anode ones). (b) Gravel pipelines: the mean on the grey-scale image versus the mean of the values of the features Gr1, ..., Gr17 (black points represent pipeline features, grey points sea-bottom and anode ones). (c) Anode: mean of the image φ versus mean of the selected pixels of the image φ (black points represent anode features, grey points pipeline ones).

E. Anode Characteristics

From an accurate study of the image characteristics, it emerges that, for 87% of the images containing anodes in the test set, the anode presents characteristics very similar to the pipeline, so that a NT trained to distinguish pipeline from sea bottom recognizes anodes as pipeline. For all the pipeline typologies except the black one (where the anode position is evaluated by using the same tree applied to find pipelines), extensive experiments demonstrate that the characteristics of an image φ, computed from the red, green, and blue components on the macro-pixels, vary between pipeline and anode regions: (6)

where R, G, and B are the red, green, and blue components of the image. However, as the illumination conditions of the image can change suddenly, the image φ has been selected to extract the features. The term ξ is the average value of the pipeline intensity (the position of the pipeline is evaluated thanks to the classification into the two classes "pipeline" and "sea bottom" performed before the anode classification), computed in a small area placed at the bottom of the image, where the effects of water absorption are limited (Fig. 10). From the image φ, 18 features have been extracted, containing the mean of the image and 17 pixel values, as for the gravel type. In Fig. 8(c), the feature distribution in the plane given by the mean of the image φ versus the mean of the 17 selected pixels of φ is shown. Once the pixels have been classified, a mathematical morphology operator (an expand-shrink filter with a 3 × 3 structuring element) [23] is applied to remove from the classified image isolated pixels that can be wrongly classified as anode due to the presence of particularly luminous small objects (e.g., seaweed, white encrustations, small fishes, and fluctuating objects between the camera and the pipeline that appear over-illuminated). Finally, the y position of the center of mass of the anode is computed as the median value of the


Fig. 9. Gravel pipeline typology classification and applied strategy for image reconstruction.

Fig. 10. Area of the input image in which the parameter ξ is evaluated.

y coordinates of the pixels classified as anode, and then transformed into the original image coordinate system (7), where y is the y-coordinate in the classified image, N_a is the number of pixels classified as anode, M is the maximum size of the macro-pixel, and k is the level at which the anode classification is performed. To evaluate the x position of the center of mass of the anode, the a priori knowledge that the anode covers the whole pipeline is used: the x coordinate is taken as the mean of the left and right edge positions, e_l and e_r, in the original image coordinate system. The choice of considering the x coordinate as the mean of the pipeline border positions eliminates the problems connected with partial detection of anodes due to sand covering the anode or bad illumination conditions. Fig. 11(a) and (b) shows an image containing an anode and the classified image with the obtained position of the center of mass of the anode displayed, respectively.

V. EXPERIMENTAL RESULTS

In this section, experimental results obtained on different underwater images, containing pipelines belonging to the four typologies with and without anodes, are presented.

Three classes have been selected to represent objects in the considered scene: 1) pipeline; 2) sea bottom; and 3) anodes. A training set composed of 640 patterns (200 each for the pipeline and sea-bottom classes and 240 for the anode class), taken from about 500 images acquired in different underwater environments and in different water and light conditions, has been considered. In the following, some examples of the performance of the algorithm on the different types of pipeline are presented. In Fig. 12(a), an image containing a pipeline of the red type and a partially occluded anode is shown. In Fig. 12(b), the final classified image is presented. The anode (dark grey) is well detected only in the right part of the image, while the pipeline (light grey) is well detected everywhere. However, the system correctly evaluates (on the basis of the pipeline border positions) the center of mass of the anode at (176, 88). The x-coordinates of the borders are also correctly detected. Fig. 12(c) shows the original image with the detected edges superimposed. Fig. 13(a) presents another example of a red pipeline. In this image, the pipeline texture is different from the previous one, due to different seaweed on the pipeline. As far as image color is concerned, the sea bottom is less blue than in the previous image and there is some red seaweed on the sea bottom on the left side. Moreover, the right side of the anode is slightly more blue than usual. However, also in this case, the estimated pipeline borders are very close to the real ones. The center of mass of the anode is found at (196, 140).

Fig. 11. An example of anode detection: (a) original image and (b) classified one. Grey pixels correspond to sea bottom, light grey pixels to pipeline, and dark grey ones to anode; the sign x represents the position of the center of mass of the detected anode.

Fig. 12. Classification results on a real image containing a pipeline of the Red typology.

Fig. 13. Classification results on a real image containing a pipeline of the Red typology.

In Fig. 14(a), an original image belonging to the black type and containing an anode is shown. In this case, the anode is more similar to the sea bottom than to the pipeline. Fig. 14(b) shows the final output of the appropriate NT (with three classes); the evaluated position of the center of mass of the anode has been

correctly detected at (212, 288). Fig. 14(c) shows the detected pipeline borders. As the anode is larger than the pipeline, the supplied right edge corresponds to the anode edge. An example of an anode on a black pipeline placed on a sea bottom composed of sand and gravel is presented in Fig. 15. As the sand covers part of the pipeline left border, the first classification with 32 × 32 macro-pixels presents several inaccuracies. Successive classifications progressively correct the result; only a few isolated pixels still remain in the final classification. The pipeline borders are correctly found and the center of mass of the anode is found at (200, 101). In Fig. 16(a), a Spot-type image containing a small part of an anode is presented. Fig. 16(b) shows the classified image, where both pipeline and anode are correctly detected. A white piece of rubbish in the bottom-right part of the image is wrongly recognized as anode, but it does not affect the anode position evaluation.

Fig. 14. Classification results on a real image containing a pipeline of the Black typology.

Fig. 15. Classification results on a real image containing a pipeline of the Black typology.

Fig. 16. Classification results on a real image containing a pipeline of the Spot typology.

The center of mass of the anode is correctly detected at (160, 207). Fig. 16(c) shows the detected pipeline borders. The left border is slightly translated, due to small irregularities of the pipeline regions close to the borders and to bad illumination. Fig. 17 presents another example of a spot pipeline. In this case, no anode is present in the image. The detected edges are slightly more accurate than in the previous case. As no anode is present on the gravel-class pipelines,


Fig. 17. Classification results on a real image containing a pipeline of the Spot typology.

Fig. 18. Classification results on a real image containing a pipeline of the Gravel typology.

Fig. 19. Classification results on a real image containing a pipeline of the Gravel typology.

no anode detection is performed. As described before, for this type, it was not possible to find features able to discriminate the pipeline from the sea bottom, due to the extreme variability of the pipeline characteristics. However, the detection of the lateral bands allows a correct evaluation of the position of the pipeline

borders. For the images in Figs. 18 and 19, the detected edges are correct in the first case, while, in the second, there is a small shift due to the presence of sand very similar to the pipeline and very close to it.

TABLE I. MEAN ERROR IN PIPELINE BORDER DETERMINATION ON A REPRESENTATIVE IMAGE SET.

TABLE II. P[A | ¬B] AND P[¬A | B] ON A REPRESENTATIVE IMAGE SET.

A. Statistics on Pipeline Detection

TABLE III. MEAN ERROR IN THE DETERMINATION OF THE CENTER OF MASS OF THE ANODE ON THE IMAGE SET IN TABLE II.

To evaluate the system performance on pipeline detection, the mean error has been evaluated on the position of the detected edges on a representative set:

e = (1/N) Σ_{i=1..N} |x_i − x*_i|   (8)

where x_i is the obtained abscissa of the i-th edge position, x*_i is the right one (estimated by a human operator), and N is the number of analyzed edges. The values of the mean error calculated for every type on a representative set of 100 images are presented in Table I. The highest error is obtained for the Spot class because, in some cases, the pipeline region closest to the borders tends to have a smaller variance with respect to the pipeline itself and, consequently, can be confused with the sea bottom, while, in other cases, rubbish accumulated near the pipeline borders may have a high variance and may be confused with the pipeline. The best performances are obtained on black-type images, where the pipeline is easily detectable, and on gravel ones, where, even though the classification is poorer due to the high pipeline variability, the only well-detected regions are those near the borders.

B. Statistics on Anode Detection

To evaluate the system performance on anode classification, the probability of finding the anode when it is not present in the image and the probability of not finding it when it is in the image have been computed. Let A be the event "the anode has been found" and B the event "the anode is inside the image"; the probabilities P[A | ¬B] and P[¬A | B] are evaluated for a representative set of about 100 images. The obtained results are presented in Table II. The erroneous indications of anode presence in the Spot and Red types are mainly due to segments of pipeline with strips painted on them (Fig. 20) or with white letters and numbers (Fig. 21). In particular, Fig. 20 presents a particularly complex image in which one edge is partially confused with the sea bottom and a painted strip, similar to an anode, is present in the central part of the pipeline. For pipelines with a longitudinal central white strip, a post-processing procedure ignoring data coming from the center of the image could be used. This solution, however, could limit the ability of the algorithm to find real anodes.
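The border-error metric in (8) amounts to a mean absolute deviation between detected and operator-estimated edge abscissae; as a sketch (names are illustrative):

```python
# Sketch of the mean border-position error in (8): detected edge abscissae
# are compared against reference positions estimated by a human operator.
def mean_border_error(detected, reference):
    """e = (1/N) * sum_i |x_i - x*_i| over the N analyzed edges."""
    if len(detected) != len(reference) or not detected:
        raise ValueError("need one reference position per detected edge")
    return sum(abs(d - r) for d, r in zip(detected, reference)) / len(detected)
```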
Another problem that can cause erroneous identification of anodes is the asymmetric illumination produced by artificial lights placed on the AUV, which can generate a lateral spot of luminosity and, consequently, a false anode detection. This problem may be solved with more accurate illumination or with an ad hoc pre-processing algorithm. As concerns the probability of not finding anodes, the main problem is the presence of sand or seaweed, which can cover the pipeline and make the anode look identical to the pipeline. The only cue that would allow even a human operator to find it is the change in diameter. The NT classifies those anodes as "pipeline." This problem can be solved only in the context of a tracking procedure, recognizing sudden diameter changes as possible anodes. No data are presented in Table II for the gravel type, because no images of anodes on this type of pipeline are present in the test set. Finally, the mean error e_a (in pixels) in the determination of the center of mass of the anode is computed:

e_a = (1/N_a) Σ_{i=1..N_a} sqrt((x_i − x*_i)² + (y_i − y*_i)²)   (9)

where (x_i, y_i) are the coordinates of the estimated position of the center of mass of the anode, (x*_i, y*_i) are the correct ones, and N_a is the number of anodes in the test sequence. The obtained mean errors for the four different typologies are presented in Table III for the same set as Table II.

C. Robustness of the Method

In normal operative conditions, the navigation system of the AUV, thanks to the integration of data coming from different sensors (CCD camera, forward-looking sonar, odometer), is able to maintain the vehicle at a constant height and with the same orientation as the pipeline [13]. The vehicle is maneuvered by using several flaps located in the vehicle aft and the thrusters contained in the fore and aft.


Fig. 20. An example of a complex image where a white strip is painted on the pipeline.

Fig. 21. An example of a complex image where numbers and letters are painted on the pipeline.

Fig. 22. Classification results on a real image containing a pipeline slightly off vertical.

However, in order to demonstrate the robustness of the proposed method, some tests have been performed on images acquired in the presence of strong currents, and consequently with the pipeline rotated with respect to the optical axis of the camera. The classification performance remains unchanged in the final images; however, the method is slightly slower, due to the larger number of pixels to be classified near the borders because of geometrical effects. Fig. 22 presents an example of a pipeline of the Red class rotated by about 10 degrees. In Fig. 22(a), the step before the last classification is shown. Because the pipeline edges are not vertical, the geometrical approach forces the algorithm to re-classify a large number of pixels near the pipeline edges. However, as

the classification performance of the neural tree does not depend on geometrical aspects, the final classification [Fig. 22(b)] leads to a correct segmentation of the pipeline [Fig. 22(c)].

VI. CONCLUSION

In this paper, a neural tree-based hierarchical classifier is proposed to help an AUV in sea-bottom survey operations. The main objective of the system is to recognize objects in underwater images. A neural tree classifier has been used, instead of a classical neural network, because it does not require any a priori information about the network structure (number of internal nodes, number of hidden neurons, number of hidden layers, connectivity between layers, etc.) for the training phase. Extensive


experimental results on a large set of real underwater images, acquired in different environments and under different conditions, demonstrate that the proposed system reaches a high classification accuracy, greater than that of classical methods, without losing either the generalization capability of artificial neural networks or the ability of Hough-based methods to recognize straight object borders.
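The coarse-to-fine strategy summarized above (classify a macro-pixel, split doubtful ones into four sub-regions, repeat until the desired accuracy is reached) can be sketched as a simple recursive refinement. The interface is assumed, not the authors' implementation: `classify` stands in for the neural-tree classifier and is assumed to return a label together with a confidence score.

```python
def refine(x, y, size, classify, min_size=4, threshold=0.9):
    """Classify a square macro-pixel; if the classification is doubtful
    (confidence below threshold), split the region into four quadrants
    and recurse until it reaches min_size. Returns (x, y, size, label)
    tuples covering the original region."""
    label, confidence = classify(x, y, size)
    if confidence >= threshold or size <= min_size:
        return [(x, y, size, label)]
    half = size // 2
    regions = []
    for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
        regions += refine(x + dx, y + dy, half, classify,
                          min_size, threshold)
    return regions

# Toy classifier: doubtful on large regions, confident on small ones
def toy_classify(x, y, size):
    return ("pipeline", 1.0 if size <= 8 else 0.5)

print(len(refine(0, 0, 16, toy_classify)))  # → 4
```

The recursion mirrors the paper's quadtree-like subdivision: confident macro-pixels are accepted at coarse scale, so only ambiguous areas (e.g., near pipeline borders) pay the cost of fine-scale classification.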

ACKNOWLEDGMENT

The authors would like to thank TECNOMARE, Venezia (IT), and CEOM, Palermo (IT), for providing underwater images, and Prof. G. G. Pieroni for his valuable discussions and suggestions.

REFERENCES

[1] R. L. Marks, S. M. Rock, and M. J. Lee, "Real-time video mosaicking of the ocean floor," IEEE J. Oceanic Eng., vol. 20, pp. 229–241, May 1995.
[2] N. Gracias and J. Santos-Victor, "Underwater video mosaics as visual navigation maps," Comput. Vis. Image Under., vol. 79, no. 1, pp. 66–91, 2000.
[3] S. Negahdaripour and A. Khamene, "Motion-based compression of underwater video imagery for the operations of unmanned submersible vehicles," Comput. Vis. Image Under., vol. 79, no. 1, pp. 162–183, 2000.
[4] M. A. Hasan and M. R. Azimisadjadi, "A modified block FTF adaptive algorithm with applications to underwater target detection," IEEE Trans. Signal Processing, vol. 44, pp. 2172–2185, Sept. 1996.
[5] G. Tascini, P. Zingaretti, and G. P. Conte, "Real-time inspection by submarine images," J. Electron. Imag., vol. 5, pp. 432–442, 1996.
[6] A. Grau, J. Climent, and J. Aranda, "Real-time architecture for cable tracking using texture descriptors," in Proc. Oceans'98, Nice, France, Sept. 29–Oct. 1, 1998, pp. 1496–1500.
[7] D. Brutzmann, M. Burns, M. Campbell, D. Davis, T. Healey, M. Holden, B. Leonhardt, D. Marco, D. McLarin, B. McGhee, and R. W. NPS, "Phoenix AUV software integration and in-water testing," in Proc. IEEE Autonomous Underwater Vehicles Conf., 1996, pp. 99–108.
[8] S. D. Fleischer and S. M. Rock, "Experimental validation of a real-time vision sensor and navigation system for intelligent underwater vehicles," in Proc. IEEE Autonomous Underwater Vehicles Conf., 1998.
[9] S. McMillan, D. E. Orin, and R. B. McGhee, "Efficient dynamic simulation of an underwater vehicle with a robotic manipulator," IEEE Trans. Syst., Man, Cybern., vol. 25, pp. 1194–1206, Aug. 1995.
[10] J. Illingworth and J. Kittler, "A survey of the Hough transform," Comput. Vis., Graph. Image Process., vol. 44, pp. 87–116, 1988.
[11] M. Atiquzzaman, "Multiresolution Hough transform—An efficient method of detecting patterns in images," IEEE Trans. Pattern Anal. Machine Intell., vol. 14, pp. 1090–1095, Nov. 1992.
[12] M. Boldt, R. Weiss, and E. Riseman, "Token-based extraction of straight lines," IEEE Trans. Syst., Man, Cybern., vol. 19, pp. 1581–1595, June 1989.
[13] G. L. Foresti, S. Gentili, and M. Zampato, "A vision-based system for autonomous underwater vehicle navigation," in Proc. Oceans'98, Nice, France, Sept. 29–Oct. 1, 1998, pp. 195–204.
[14] G. L. Foresti and S. Gentili, "A vision based system for object detection in underwater images," Int. J. Pattern Recogn. Artif. Intell., vol. 14, no. 2, pp. 167–188, 2000.
[15] M. J. Buckingham, B. V. Berkhout, and S. A. L. Glegg, "Imaging the ocean with ambient noise," Nature, vol. 356, pp. 327–329, 1992.
[16] P. E. Utgoff, "Perceptron trees: A case study in hybrid concept representation," in Proc. 7th Nat. Conf. Artificial Intelligence, 1988, pp. 601–605.
[17] J. Sirat and J. Nadal, "Neural trees: A new tool for classification," Network, vol. 1, pp. 423–438, 1990.
[18] A. Sankar and R. J. Mammone, "Growing and pruning neural tree networks," IEEE Trans. Comput., vol. 42, no. 3, pp. 291–299, 1993.
[19] G. L. Foresti and G. G. Pieroni, "Exploiting neural trees in range image understanding," Pattern Recog. Lett., vol. 19, no. 9, pp. 869–878, 1998.
[20] M. K. Schneider, P. W. Fieguth, W. C. Karl, and A. S. Willsky, "Multiscale methods for the segmentation and reconstruction of signals and images," IEEE Trans. Image Processing, vol. 9, no. 3, pp. 456–468, 2000.
[21] B. D. Ripley, Pattern Recognition and Neural Networks. Cambridge, U.K.: Cambridge Univ. Press, 1996.
[22] S. R. Safavian and D. Landgrebe, "A survey of decision tree classifier methodology," IEEE Trans. Syst., Man, Cybern., vol. 21, May/June 1991.
[23] I. Pitas and A. N. Venetsanopoulos, Nonlinear Digital Filters: Principles and Applications. Norwell, MA: Kluwer, 1990.

Gian Luca Foresti (S'93–M'95–SM'02) was born in Savona, Italy, in 1965. He received the Laurea degree cum laude in electronic engineering and the Ph.D. degree in computer science from the University of Genoa, Genoa, Italy, in 1990 and 1994, respectively. In 1994, he was a Visiting Professor at the University of Trento, Trento, Italy. Since 1998, he has been Professor of Computer Science at the Department of Mathematics and Computer Science (DIMI), University of Udine, Udine, Italy, and Director of the Artificial Vision and Real-Time Systems Laboratory. His main interests involve artificial neural networks, multisensor data fusion, computer vision and image processing, and multimedia databases. The proposed techniques have found applications in the following fields: automatic video-based systems for surveillance and monitoring of outdoor environments, vision systems for autonomous vehicle driving and/or road traffic control, and 3-D scene interpretation and reconstruction.
Prof. Foresti is author or co-author of more than 100 papers published in international journals and refereed international conferences. He was general co-chair, chairman, and member of technical committees at several conferences, and has been co-organizer of several special sessions on video-based surveillance systems at international conferences. He has contributed to seven books in his area of interest and is co-author of the book Multimedia Video-Based Surveillance Systems (Norwell, MA: Kluwer, 2000). He has been Guest Editor of a Special Issue of the PROCEEDINGS OF THE IEEE on Video Communications, Processing and Understanding for Third Generation Surveillance Systems. He has served as a reviewer for several international journals and for the European Union in different research programs (MAST III, Long Term Research, Brite-CRAFT). He has been responsible at DIMI for several European and national research projects in the field of image processing and understanding. In February 2000, he was appointed Italian member of the Information Systems Technology (IST) panel of the NATO-RTO. He is a Senior Member of the IEEE and a member of the IAPR.

Stefania Gentili was born in Pisa, Italy, in 1969. She received the Laurea degree in physics from the University of Pisa, Pisa, Italy, in 1995, working on automatic astronomical image analysis for the search for supernova explosions within an international research project named SWIRT. She is currently on a fellowship from the Italian Ministry of University and Scientific Research, pursuing the Ph.D. degree in computer science at the Industrial Computer Science Laboratory, Udine University, Udine, Italy. In 1996, she obtained a fellowship of the Teramo Astronomical Observatory to work on image acquisition and analysis. From June 1997 until recently, she was a Research Associate with the Industrial Computer Science Laboratory, Udine University. In these years, she worked on the EC project Holomar, on the automatic classification of holographic images of plankton by neural networks, and on two projects financed by CEOM (CEntro Oceanologico Mediterraneo) concerning the visual part of an autonomous underwater vehicle driving system. Her studies involve theoretical aspects and applications of neural networks and invariant shape description. She was involved in the proposal of the EC project VENFLEX, for the automatic recognition by neural networks of parts of furs and other flexible materials. She is a referee for the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS and a member of the SAIt (Società Astronomica Italiana).
