Line Recognition Algorithm Using Constrained Delaunay Triangulation

Mohamed Naouai 1,2, Melki Narjess 1 and Atef Hamouda 1

1 Faculty of Science of Tunis, University Campus El Manar, DSI 2092 Tunis Belvédaire, Tunisia, Research Unit URPAH
2 Laboratory Image Ville Environnement, ERL 7230 CNRS, University of Strasbourg, 3 rue de l'Argonne, F-67000 Strasbourg
[email protected], [email protected], [email protected], [email protected]

Abstract—Vectorization, i.e. raster-to-vector conversion, is a central part of graphics recognition problems. Graphics recognition is concerned with the analysis of graphics-intensive documents, such as technical drawings, maps or schemas. In this paper we present a generic algorithm for line extraction based on image vectorization. A pre-processing step provides points that are judged to belong to linear forms. We use a constrained Delaunay triangulation and an edge filtering process to pass from the raster format to a set of polygons that make up the vector image. The final result is obtained by skeletonization of the resulting polygons. The algorithm is automatic, requiring very little user interaction, and gives very satisfying results when tested on road segments.

Keywords—Object recognition, vectorization, CDT, skeletonization

I. INTRODUCTION

Segmentation is a very important step in the feature recognition process, as it aims to cluster pixels into salient image regions that define the semantic content of the image. Reliable image segmentation thus leads to good results in object detection and recognition. Segmentation methods can generally be classified into methods that decompose the image into regions of uniform color or intensity and methods that use the boundaries of the objects to divide the image. However, structures in images manifest themselves through both regional uniformities and boundary discontinuities; moreover, edges are usually fragmented, so further processing is needed to gather the edges that belong to the same object. Several works have explored the use of triangulation as a first step in image segmentation [1, 2, 3]. However, the main focus of these approaches is to vectorize the raster image without further exploitation of the obtained polygonal format. In this paper, segmentation is realized through the use of a triangulation method. The post-processing is based on a set of well-known criteria for perceptual organization employed in human vision, which are applied to filter the set of edges resulting from the triangulation [4]. This process converts the original raster image into a vector representation, a commonly used technique in graphics recognition problems, as it puts the original image into a vector form that is more suitable for further analysis.

II. GENERAL APPROACH

The overall objective of this work is to pass from a raster representation of the image to a vector representation that facilitates the extraction of linear structures. The main algorithm is preceded by a pre-processing step that provides points judged to belong to lines; these points are introduced as constraints to the Delaunay triangulation.

Figure 1. General approach


III. PRE-PROCESSING

A. Correlation

The correlation coefficient, sometimes also called the cross-correlation coefficient, is a statistical measure used to evaluate the strength of the relationship between two variables [5]. This measure is applied to the original image to enhance the regions having a high spectral similarity with the reference pixel.

The correlation coefficient is defined as the covariance of the two variables normalized by the product of their standard deviations:

corr(p, r) = cov(p, r) / (σp σr)

Where:
p: current pixel
r: reference pixel

Figure 2. (a) The original image


Figure 4. (a) The arrangement of the 8-neighborhood into pairs

Figure 2. (b) Correlation distance image


The correlation distance image (Fig. 2.b) is scanned. Each pixel having a correlation coefficient higher than a threshold t is assigned the gray level of the reference pixel. When the current pixel has a low correlation coefficient, its spectral value is examined: if it has the same gray level as the reference pixel, the point is deleted. Thus the pixels that are highly correlated with the reference are emphasized (Fig. 3). The pseudocode of the computation of the high correlation image is as follows:

Input: the original image I, the reference pixel r
For every pixel p
    corrImg(p) = corr(p, r)
End
I = grayscale(I)
For every pixel p
    If corrImg(p) > t
        highCorr(p) = I(p)
    Else if I(p) == I(r)
        highCorr(p) = 0
    End
End

Figure 3. High correlation
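The Python sketch below is one possible reading of this pseudocode. The window size k, the threshold t and the use of Pearson correlation over k x k neighbourhoods are our assumptions, since the exact form of corr(p, r) is not spelled out above.

import numpy as np

# Minimal sketch of the high-correlation image computation, ASSUMING corr(p, r)
# is the Pearson correlation between the k x k window of p and that of r.
def high_correlation_image(img, ref, t=0.8, k=3):
    """img: 2-D grayscale array; ref: (row, col) of the reference pixel."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode='reflect')
    ref_win = padded[ref[0]:ref[0] + k, ref[1]:ref[1] + k].ravel()

    corr_img = np.zeros(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            win = padded[i:i + k, j:j + k].ravel()
            if win.std() == 0 or ref_win.std() == 0:
                continue  # constant window: correlation undefined, leave 0
            corr_img[i, j] = np.corrcoef(win, ref_win)[0, 1]

    high_corr = img.copy()
    low = corr_img <= t
    # low-correlation pixels that share the reference gray level are deleted
    high_corr[low & (img == img[ref])] = 0
    return high_corr, corr_img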


Figure 4. (b) Extended neighborhood


The next step consists in applying a line detection method based on a constrained gradient. The algorithm, described in [6], relies on the fact that gradient vectors located on opposite sides of a line point towards or away from each other, depending on the brightness of the line. The presence of a line can thus be detected by computing the scalar product between opposite gradient vectors around a pixel. First, a gradient filter is applied to the image. Then, for a given pixel p, the neighbors that are symmetric with respect to p in the immediate 8-neighborhood are arranged in pairs (ak, bk), as illustrated in Fig. 4(a). The pair providing the highest scalar product in absolute value, noted (am, bm), gives the direction of the line. Afterwards, the further neighborhood is investigated as long as the scalar product increases. In this step, three other pairs are considered, as illustrated in Fig. 4(b). The direction of the line is perpendicular to the pair giving the highest scalar product. To improve the line following, a non-maximum deletion algorithm is applied: at each pixel, the scalar product value is compared to the two adjacent pixels in the direction perpendicular to the line. If it is not the maximum, the point is deleted; otherwise it is stored in a raster maxGrad(i, j). The next step is line following. A line begins at a point of maxGrad(i, j) that is larger than a threshold ts. The algorithm then moves to the maximum point in the 8-neighborhood if it is larger than a threshold tc. The line is followed in the direction of this move, considering at each step the maximum of the three next pixels ci-1, ci and ci+1 (Fig. 5).
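A small sketch of the opposite-gradient test follows. It assumes a Sobel filter as the gradient operator (the text above only says a gradient filter is applied) and shows only the first, 8-neighborhood stage, without the extended-neighborhood refinement or the non-maximum deletion.

import numpy as np
from scipy import ndimage

def line_response(img):
    # gradient components (Sobel is our assumption)
    gy = ndimage.sobel(img.astype(float), axis=0)
    gx = ndimage.sobel(img.astype(float), axis=1)

    # the four symmetric pairs (ak, bk) of the 8-neighborhood, as (drow, dcol) offsets
    pairs = [((-1, 0), (1, 0)),     # vertical neighbors
             ((0, -1), (0, 1)),     # horizontal neighbors
             ((-1, -1), (1, 1)),    # one diagonal
             ((-1, 1), (1, -1))]    # the other diagonal

    h, w = img.shape
    best = np.zeros((h, w))                   # |scalar product| of the best pair
    direction = np.zeros((h, w), dtype=int)   # index of the best pair

    for i in range(1, h - 1):
        for j in range(1, w - 1):
            products = []
            for (da, db) in pairs:
                ga = np.array([gy[i + da[0], j + da[1]], gx[i + da[0], j + da[1]]])
                gb = np.array([gy[i + db[0], j + db[1]], gx[i + db[0], j + db[1]]])
                products.append(ga @ gb)      # scalar product of opposite gradients
            k = int(np.argmax(np.abs(products)))
            best[i, j] = abs(products[k])
            direction[i, j] = k               # the line direction is tied to this pair
    return best, direction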

Figure 5. Line following in the direction of the maximum gradient

A backward line is then followed using the opposite of the initial direction of the first move. The resulting image still contains areas with many scattered pixels. To remedy this problem, a non-linear form deletion method is used: for a given pixel, if the immediate neighborhood contains a majority of white pixels, the pixel is unlikely to belong to a line and is therefore deleted. The result of this process is shown in Figure 6.

Figure 6. Set of points resulting from the line following

IV. VECTORIZATION

A. CDT

In geometry, the Delaunay triangulation of a set P of points is a triangulation DT(P) that satisfies an "empty circle" property: for each edge, a circle can be found that contains the edge's endpoints but no other point of P. It maximizes the minimum angle over all angles of the triangles in the triangulation [7]. Figure 7 shows an example of a DT.

Figure 7. Delaunay triangulation of an arbitrary set of points

A constrained triangulation is a triangulation of a set of points that has to include among its edges a given set of segments joining the points. The corresponding edges are called constrained edges [8]. A constrained Delaunay triangulation (CDT) is a Delaunay triangulation in which the constraints are preserved, i.e. not split into smaller edges through the insertion of additional points. Figure 8 shows an example of a CDT.

Figure 8. (a) Delaunay triangulation; (b) constrained Delaunay triangulation. Constraints are shown in bold.


The presence of constraints in the CDT is ensured through local modifications where Delaunay edges are removed or flipped. Thus, a constrained Delaunay triangulation respects a set of segments that constrain the edges of the triangulation, while still maintaining most of the favorable properties of ordinary Delaunay triangulations (such as maximizing the minimum angle). CDTs solve the problem of enforcing boundary conformity—ensuring that triangulation edges cover the boundaries (both interior and exterior) of the domain being modeled. [9]
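For illustration, a CDT can be computed with the Python `triangle` package, a wrapper around Shewchuk's Triangle [7]. This is only an example of the data structures involved, not the implementation used in this paper, and the coordinates below are arbitrary.

import numpy as np
import triangle  # Python wrapper around Shewchuk's Triangle [7]

# Points standing in for contour points extracted from the image ...
vertices = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0],
                     [1.0, 1.0], [3.0, 2.0]])
# ... and one constrained segment joining vertices 4 and 5, standing in for a
# line detected in the pre-processing stage.
segments = np.array([[4, 5]])

# 'p' triangulates the planar straight line graph: a CDT whose constrained
# edges are kept and no additional (Steiner) points are inserted.
cdt = triangle.triangulate({'vertices': vertices, 'segments': segments}, 'p')

print(cdt['triangles'])  # vertex indices of each triangle
print(cdt['segments'])   # the constrained edges, preserved in the output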

Figure 9. Constrained Delaunay triangulation

The image vectorization starts with Canny edge detection [10] (see Figure 10), followed by a constrained Delaunay triangulation (CDT) in which the lines resulting from the pre-processing stage are used as constraints.

Figure 10. Canny edge detection

Figure 11. Result of the constrained Delaunay triangulation

B. Triangulation Edge Filtering

The generated triangle edges are then filtered. The constraints are kept, while the other edges are filtered out according to a pre-specified set of rules. We use a well-known set of criteria for perceptual organization employed in human vision: proximity, closure and contour completion. The proximity rule removes edges using a threshold filtering (Fig. 12). The closure rule deletes the edges bounded by the same constraints (Fig. 13). The contour completion rule states that, among the edges linking different contours, only the shortest edges are kept (Fig. 14).

Figure 12. The dashed edge is not the shortest edge and hence should be deleted, as it does not support proximity


Figure 13. The thin edges are removed as they are bounded by the same contour.


Figure 14. Only the thin triangle edges linking different contour edges are preserved as the shortest links between contour edges (in bold lines), while the dashed triangle edges are filtered out

The edge filtering algorithm is the following:
• Extraction of the edges that belong to two triangles.
• Calculation of edge properties (according to the set of rules detailed in Section IV.B), characterizing the relationships of edges with triangles, contours and constraints, in order to decide which adjacent triangles should be grouped by deleting their common edge to create a perceptually salient polygonal image segment.
• Grouping of the triangles sharing an edge deleted in the previous step.
• Consolidation of the triangles belonging to the same group into polygons, and assignment of a common color to the triangles of each group (the median spectral value of all triangles in the group).
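A compact sketch of this grouping step is given below, using a union-find structure. For simplicity only an edge-length (proximity) threshold decides removability; the closure and contour-completion rules and the color assignment are omitted, and all names, including the max_len parameter, are illustrative.

import numpy as np

def group_triangles(vertices, triangles, constrained, max_len):
    """vertices: (n,2) array; triangles: (m,3) vertex indices;
    constrained: set of frozenset edges that must be kept."""
    parent = list(range(len(triangles)))

    def find(i):                          # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # map each edge to the (at most two) triangles that contain it
    edge_owners = {}
    for t_idx, tri in enumerate(triangles):
        for a, b in [(0, 1), (1, 2), (2, 0)]:
            e = frozenset((tri[a], tri[b]))
            edge_owners.setdefault(e, []).append(t_idx)

    for e, owners in edge_owners.items():
        if len(owners) != 2 or e in constrained:
            continue                      # boundary or constrained edge: keep it
        i, j = tuple(e)
        length = np.linalg.norm(vertices[i] - vertices[j])
        if length <= max_len:             # removable edge: merge the two triangles
            union(*owners)

    groups = {}
    for t_idx in range(len(triangles)):
        groups.setdefault(find(t_idx), []).append(t_idx)
    return list(groups.values())          # each group becomes one polygon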

After the edge filtering, the image is segmented into a set of polygons. The result is a segmented image represented as a set of spectrally attributed polygonal patches: a vector image.

C. Skeletonization

The notion of skeleton was introduced by H. Blum [11] as the result of the Medial Axis Transform (MAT), also called the Symmetry Axis Transform (SAT). The MAT determines, for each point in an object, its closest boundary point(s). An inner point belongs to the skeleton if it has at least two closest boundary points. The skeleton obtained in this way must be as thin as possible, connected and centered [12]. The skeleton of a shape is a thin version of that shape that is equidistant from its boundaries. It retains the shape of the original polygon and emphasizes its geometrical and topological properties. Skeletonization [13, 14] is thus defined as the process of reducing the width of a pattern to a single pixel. This concept is shown in Figure 15.

Figure 15. The skeleton superimposed on the original pattern
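As an illustration of this step, scikit-image's thinning-based skeletonize can reduce a binary mask to a one-pixel-wide skeleton. We assume here that the polygons have already been rasterized into such a mask; this routine merely stands in for the MAT-style skeleton discussed above and is not necessarily what was used in this work.

import numpy as np
from skimage.morphology import skeletonize

mask = np.zeros((40, 40), dtype=bool)
mask[18:23, 5:35] = True               # a thick bar standing in for a road polygon

skeleton = skeletonize(mask)           # boolean image, one pixel wide
rows, cols = np.nonzero(skeleton)
print(list(zip(rows, cols))[:5])       # first few skeleton pixels (the extracted line)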

As discussed in [15], skeleton computation is a central part of most graphics recognition systems, as it provides a simple and compact representation of a shape that preserves many of the topological and size characteristics of the original shape.

The final step of our algorithm is to extract the skeleton of the obtained polygons. The skeleton represents the extracted lines (Fig. 16).

Figure 16. (a) The original image

Figure 16. (b) The extracted lines (skeletonization result)

V. ACCURACY ASSESSMENT

The presented algorithm (Fig. 17) is automatic, requiring very little interaction from the user. In this section, we evaluate it through several experiments on various real-world data. To evaluate the obtained results, we use the Precision and Recall measures [16]. Precision and Recall are set-based measures: they evaluate the quality of an unordered set of retrieved elements. Precision can be seen as a measure of exactness or fidelity, whereas Recall is a measure of completeness.


In an information retrieval context, Precision is defined as the number of relevant identified elements divided by the total number of retrieved elements, and Recall is defined as the number of relevant elements retrieved divided by the total number of existing relevant elements. Values closer to 1 indicate better results.

In order to evaluate the performance of our algorithm, Recall and Precision have been calculated using the following formulas:

Precision = |relevant ∩ retrieved| / |retrieved|
Recall = |relevant ∩ retrieved| / |relevant|
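A small sketch of how these measures can be computed is shown below, assuming the extracted lines and the reference lines are both available as binary pixel masks; the buffer tolerance often used in road-extraction evaluation is deliberately omitted.

import numpy as np

def precision_recall(extracted, reference):
    extracted = extracted.astype(bool)
    reference = reference.astype(bool)
    true_pos = np.logical_and(extracted, reference).sum()
    precision = true_pos / extracted.sum()   # exactness: correct / retrieved
    recall = true_pos / reference.sum()      # completeness: correct / relevant
    return precision, recall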

Figure 17. The major steps of our approach

Considering that the main scope of application of this work is road extraction from high-resolution satellite images, as roads are typically linear, we tested the algorithm on several high-resolution satellite images containing road segments. The example treated in this paper yielded a precision of 0.83 and a recall of 0.85.

VI. CONCLUSION

We have proposed in this paper an approach for automatic line detection using image vectorization. The algorithm provides good results in terms of both completeness and exactness, as indicated by the recall and precision values. This is due to the use of constrained edges in the triangulation process, since these edges are highly likely to belong to linear forms. Pre-processing is therefore a major factor in the performance of the method. However, the algorithm is costly in terms of complexity and memory, as it depends on the size of the CDT input set. A possible improvement is to reduce the number of entry points of the CDT by filtering the contour points resulting from the Canny algorithm.

REFERENCES

[1] Prasad, L., Skourikhine, A. N., 2006. Vectorized image segmentation via trixel agglomeration. Pattern Recognition, 39(4), pp. 501-514.
[2] Skurikhin, A. N., Volegov, P. L. Object-oriented hierarchical image vectorization.
[3] Swaminarayan, S., Prasad, L. RaveGrid: Raster to vector graphics for image data. SVG Open 2007.
[4] Wertheimer, M., 1958. Principles of perceptual organization. In: Readings in Perception, Beardslee, D., Wertheimer, M. (Eds.), D. Van Nostrand, Princeton, NJ, pp. 115-135.
[5] Chen, C.-C., Knoblock, C. A., Shahabi, C. Automatically conflating road vector data with orthoimagery. GeoInformatica (2006) 10, pp. 495-530.
[6] Lacroix, V., Acheroy, M. Feature extraction using the constrained gradient. ISPRS Journal of Photogrammetry & Remote Sensing 53 (1998), pp. 85-94.
[7] Shewchuk, J. R., 1996. Triangle: engineering a 2D quality mesh generator and Delaunay triangulator. Lecture Notes in Computer Science, 1148, pp. 203-222.
[8] Domiter, V. Constrained Delaunay triangulation using plane subdivision. CESCG (2004).
[9] Shewchuk, J. R. Constrained Delaunay tetrahedralizations and provably good boundary recovery. Eleventh International Meshing Roundtable, September 2002.
[10] Canny, J., 1986. A computational approach to edge detection. IEEE Trans. on Pattern Analysis and Machine Intelligence, 8(6), pp. 679-698.
[11] Blum, H. A transformation for extracting new descriptors of shape. In: Models for the Perception of Speech and Visual Form (W. Wathen-Dunn, ed.), MIT Press, Cambridge, Mass., pp. 362-380, 1967.
[12] Klette, G. Skeletons in digital image processing. CITR-TR-112, July 2002.
[13] Jain, A. K. Fundamentals of Digital Image Processing. Prentice-Hall, 1986.
[14] Gonzalez, R., Woods, R. E. Digital Image Processing. Prentice Hall, 2002.
[15] Tombre, K., Tabbone, S. Vectorization in graphics recognition: to thin or not to thin. Proceedings of the 15th International Conference on Pattern Recognition, Barcelona (Spain), volume 2, pp. 91-96, September 2000.
[16] van Rijsbergen, C. J. Information Retrieval, pp. 114-115.


