

Pattern Recognition Letters journal homepage: www.elsevier.com

Structure Preserving Binary Image Morphing using Delaunay Triangulation

Abbas Cheddad a,*

a Blekinge Institute of Technology (BTH), Karlskrona, SE-371 79, Sweden

ABSTRACT

Mathematical morphology has been of great significance to several scientific fields. Dilation, one of its fundamental operations, has relied heavily on common methods that are based on set theory and on specifically shaped structuring elements to morph binary blobs. We hypothesised that performing morphological dilation while exploiting the geometric relationships between dot patterns can offer some advantages. The Delaunay triangulation was our choice for examining the feasibility of this hypothesis owing to its favourable geometric properties. We compared our proposed algorithm to existing methods, and it becomes apparent that Delaunay-based dilation has the potential to emerge as a powerful tool in preserving object structure and elucidating the influence of noise. Additionally, the proposed method no longer requires a structuring element to be defined, and the dilation is adaptive to the topology of the dot patterns. We assessed the property of object structure preservation using common measurement metrics. We also demonstrated this property through handwritten digit classification using HOG descriptors extracted from images dilated by the different approaches and trained using Support Vector Machines. The confusion matrix shows that our algorithm has the best accuracy estimate in 80% of the cases. In both experiments, our approach shows a consistently improved performance over the other methods, which supports the suitability of the proposed method.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction and Background

Shape analysis is of paramount importance in different computational fields, spanning from computer vision and robotics to optical character recognition (OCR) and pattern recognition, to medical imaging and the modelling of organisms' growth, to industrial product inspection, etc. It has thus captured the interest of both the research and industrial worlds. Shape descriptors are most often associated with statistical and directional metrics that act on a binary image of a given object. Before attempting such analysis, non-binary images are normally pre-processed to segment the region of interest, although some related methods work directly on the grayscale intensity channel, such as those that use energy minimisation algorithms (e.g., active contours). The latter type is not the focus of this paper. A known discipline that relates directly to binary shape processing is mathematical morphology, which illustrates its fundamental operations using set theory. Its non-linear processing and its direct relation to shape descriptors set it apart from the convolution operations deployed in signal processing. As far as image processing is concerned, mathematical morphology offers a powerful and unified approach to tackling a number of problems [1]. Morphological operators are among the first image-based operators. Dilation is one of the two primitive operators in mathematical morphology; the other is its dual operator with respect to set complementation, known formally as erosion.

Dilation, which is the focus of this paper, is defined as the morphological transformation that combines two sets using vector addition of set elements, as Haralick et al. put it [2]. Historically, there were other definitions, and they all boil down to how dilation operates. For instance, spatial shifting of an image to yield dilation is in fact closely related to the concept of Minkowski addition. Dilation is implemented using binary kernels termed structuring elements (SE). They can be of any shape: square, disk, diamond, or any other arbitrary shape. Dilation provides isotropic expansion of binary shapes, which has recently been extended to grayscale objects; it can also bridge gaps in broken segments or smooth out shapes. Cuisenaire [3] defined a morphological operation as a variant of distance transformation algorithms using ball SEs that are locally adaptable. We shed more light on the modern notion of the distance transform in connection with dilation in Section 2. Shih [4] described the sweep morphology operation, which is rendered adaptive by changing the rotation angles and scaling factors of the SEs, which are defined with respect to the boundary of the curve of a given object (Ch. 11, p.344). In recent years, development around the topic of mathematical morphology has migrated to grayscale images and 3D objects. Thus, all of the fundamental morphological operations on binary images have been successfully extended to the grayscale spatial plane [5][6][7][8], homography, 3D mesh and hyperplane projective space [9][10][11]. Additionally, a current trend is the adaptation of classical morphological tools from signal processing to graph structures (e.g., 3D projections); a survey paper on graph-based morphology is available in [12]. Such research avenues are beyond the scope of this letter; therefore, for the sake of conciseness and clarity, we limit our discussion to binary image dilation.

* Corresponding author. Tel.: +46 4 55-385863; fax: +0-000-000-0000; e-mail: [email protected]

1.1. Example Applications

Binary Image Dilation

Stahlberg and Vogel [13] applied a dilation operation with an ellipse-shaped 3x3 kernel to increase the thickness of handwritten text lines. Analogous to that was the work of Fouladi and Araabi [14], who took the image of the main body of a glyph (an elemental symbol) and performed multiple dilation operations to thicken the body structure. Shaus et al. [15] dilated a bounding octagon in order to account for certain inaccuracies in the Hebrew texts of the Iron Age (First Temple period) in facsimiles (black and white images of ancient inscriptions). Zelenika et al. [16] dilated binary images of different Sobel filters by a rectangular structuring element. Desai [17] analysed the basic dilated shape of characters to build the feature set for a support vector machine classifier. Jamal [18] used a morphological reconstruction method by dilation, based on an estimated baseband from shape analysis, for the segmentation of Arabic handwritten texts. Khayyat et al. [19] proposed the detection of handwritten text lines by applying an adaptive mask to morphological dilation. In [20], binary handwriting samples were morphologically dilated using a 7-pixel-width diamond-shaped structuring element. The authors claimed that this step enhances the template matching outcome owing to the resulting different degrees of word shape “fuzziness”. On another frontier, Han et al. [21] demonstrated an eye detection system using morphological operations. They called it morphology-based eye-analogue segmentation. Its aim is to reduce the interference of the background.
Basically, they performed a closing operation, a type of morphology that uses dilation followed by erosion, and clipped different portions to find the candidate eye-analogue pixels.

Voronoi Diagram / Delaunay Triangulation

The use of these techniques in the literature is not scarce either. For instance, Cheddad et al. [22][23] extracted the outer vertices of the Delaunay triangulation (DT) to segment grayscale images by processing only a small set that contains no more than 255 points (i.e., equivalent to the total possible number of image histogram bins), which makes it a very fast segmentation algorithm. Xiao and Yan [24] and Xie and Lam [25] applied DT to model human faces in images for biometric applications (e.g., face recognition, simulation of facial expression, etc.). The aforementioned applications are but a few examples extracted from a larger pool of morphology applications, which we hope will rekindle interest in revisiting this central and vital field.

1.2. Research Statement and Contribution

The outcome of dilation depends on the structuring element used and the number of iterations, as well as on the characteristics of the shape being processed. If a mild effect is desired in each dilation pass, one can use a small SE, which translates to 5 neighbours for a flat disk SE object and to 4 neighbours for a flat square SE object. Repetitive passes quickly evolve into an unrecognisable thickened shape, especially for small binary objects. Shih [4] stated that “traditional mathematical morphology uses a designed structuring element that does not allow geometric variations during the operation to probe an image.” (Ch. 11, p.341). Although [3] and [4] have proposed adaptive dilation processes, both still utilise some sort of structuring element. Therefore, it is the intent of this paper to propose an approach that addresses this void. The contribution of this paper is thus our attempt to revive the area of mathematical morphology by suggesting that, in addition to set theory, the pure geometric association of dot patterns in the Euclidean space also be considered. Namely, we propose the use of the DT for binary image morphing. To the best of our knowledge, such a concept has not been proposed before, which reinforces its novelty. Our results reaffirm the usefulness of this notion and confirm the stability and the mild thickening effect that helps preserve shape properties, which is inevitably a good property for many applications, including pattern recognition algorithms. Fig. 1 shows that morphology is among those techniques that represent shapes as a non-linear transformation. Our ultimate objective in this work, illustrated as a dashed line, is to adapt the Delaunay triangulation topology to perform basic mathematical morphology (i.e., dilation in this case).

Fig. 1. Expanded taxonomy model of shape representation domain techniques and the unique connection this work makes (dotted arrow).

2. Common Dilation Methods

In order to make this paper self-contained, we briefly review the widely used dilation methods. A morphing procedure usually consists of two steps. The first step is to choose a proper flat structuring element (kernel) with a specified neighbourhood; the latter is a square matrix containing 1's and 0's (the pattern formed by the ones and zeros dictates the shape of the structuring element). The second step is to apply a morphological filtering method with the chosen matrix to achieve a morphology effect (probing and expanding the shapes contained in an input binary set). We can state this symbolically as follows. Let $I$ and $S$ be sets in the 2D space $N^2$ that correspond to the binary image and the structuring element, respectively, and let $i = \{i_1, i_2, \ldots, i_n\}$ and $s = \{s_1, s_2, \ldots, s_n\}$ be the elements of $I$ and $S$, respectively. The dilation of $I$ by $S$, denoted by $I \oplus S$, is defined as:

$$I \oplus S = \{\, d \in N^2 \mid d = i + s,\ i \in I,\ s \in S \,\} \tag{1}$$

which can be rewritten as the aggregated union of translations of $S_i$ (the structuring element centred at every pixel location $i$ in the foreground):

$$I \oplus S = S \oplus I = \bigcup_{i \in I} S_i \tag{2}$$

It is possible to achieve dilation by reflecting $S$ (a.k.a. the symmetric of $S$). Let $\hat{S}$ denote the reflected $S$; then the above equation becomes:

$$I \oplus S = \{\, n \in N^2 \mid \hat{S}_n \cap I \neq \emptyset \,\} \tag{3}$$
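As an illustration of the three equivalent formulations above, the following minimal Python sketch represents the image and the structuring element as sets of (row, column) coordinates. The function names, the two-pixel test image and the bounded candidate grid are assumptions made for this example only; the structuring element coordinates are used directly as offsets.

```python
def dilate(image, se):
    """Eq. (1): vector addition of every image element with every SE element."""
    return {(ir + sr, ic + sc) for (ir, ic) in image for (sr, sc) in se}


def dilate_by_translations(image, se):
    """Eq. (2): union of the SE translated to each foreground pixel."""
    result = set()
    for (r, c) in image:
        result |= {(r + sr, c + sc) for (sr, sc) in se}
    return result


def dilate_by_reflection(image, se, candidates):
    """Eq. (3): keep positions n where the reflected SE, shifted to n, hits the image."""
    reflected = {(-sr, -sc) for (sr, sc) in se}
    return {
        n for n in candidates
        if any((n[0] + r, n[1] + c) in image for (r, c) in reflected)
    }


# Hypothetical two-pixel image and the square SE of four elements.
I = {(2, 2), (2, 3)}
SE1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
grid = {(r, c) for r in range(6) for c in range(6)}  # assumed search window

print(sorted(dilate(I, SE1)))
```

All three functions return the same set for this input, provided the candidate grid is large enough to contain the dilated shape.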





There are dozens of standardised structuring elements, and a limitless number of them can be set arbitrarily. The structuring elements we use in this study are a square structure SE1 = {(0,0), (0,1), (1,0), (1,1)} and a disk structure SE2 = {(0,1), (1,0), (1,1), (1,2), (2,1)}, which are depicted graphically in Fig. 2.

Fig. 4. Voronoi diagram of seven dot patterns shown as thick lines in blue colour.

Fig. 2. The two morphological structuring elements used in this study. Square shape SE containing 4 neighbours (left) and disk shape SE containing 5 neighbours (right).

A modern method is the adaptation of the Euclidean distance map (EDM), also called the Euclidean distance transform, to mathematical morphology. The EDM is a fundamental geometrical operator with great applicability in a variety of computer vision applications, and it is also related to many other important entities such as the Voronoi diagram (VD), which we discuss in this paper [26], morphological dilation [27] and graphs [28]. The distance transform associates each point p of a set P with the distance from itself to the closest point in the complementary set of P [29]. The computation of morphological filters using the EDM is highlighted in [26, p.2:6] and [3]. Thresholding the EDM at level 1.1 generates the smallest possible dilation, as shown in Fig. 3 (on a unit pixel grid, this threshold admits the 4-connected neighbours at distance 1 but not the diagonal neighbours at distance $\sqrt{2} \approx 1.41$).
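The EDM thresholding idea can be sketched in a few lines of Python. This is a brute-force illustration rather than an efficient distance transform, and it assumes a non-empty foreground; the function names and the 5x5 test image are hypothetical.

```python
import math


def distance_map(image):
    """Distance from every pixel to the nearest foreground pixel (brute force).

    Foreground pixels get 0; this is the EDM of the inverted image.
    Assumes the image contains at least one foreground pixel.
    """
    rows, cols = len(image), len(image[0])
    fg = [(r, c) for r in range(rows) for c in range(cols) if image[r][c]]
    return [
        [min(math.hypot(r - fr, c - fc) for fr, fc in fg) for c in range(cols)]
        for r in range(rows)
    ]


def edm_dilate(image, level=1.1):
    """Dilate by keeping every pixel whose distance to the foreground is <= level."""
    return [[1 if d <= level else 0 for d in row] for row in distance_map(image)]


# Hypothetical 5x5 image with a single foreground pixel in the centre.
img = [[0] * 5 for _ in range(5)]
img[2][2] = 1
for row in edm_dilate(img):
    print(row)
```

With the threshold at 1.1, the single pixel grows into a cross of five pixels; raising the threshold above the diagonal distance would also admit the four diagonal neighbours.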

The intersections of the Voronoi regions for the set of points construct the Voronoi diagram. More formally, let $K = \{p_1, \ldots, p_n\}$ be a set of $n$ distinct points in the 2D plane ($\mathbb{R}^2$). The Voronoi cell $V(p_i)$ of a point $p_i \in K$ is defined as:

$$V(p_i) := \{\, q \in \mathbb{R}^2 : d(p_i, q) \le d(p_j, q),\ i \neq j \,\} \tag{5}$$
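On a discrete grid, the cell definition above amounts to labelling each grid point with its nearest site. The sketch below is a hypothetical brute-force illustration; the function name is an assumption, and equidistant points are arbitrarily assigned to the site with the lowest index, whereas in the continuous definition they lie on the shared cell boundary.

```python
import math


def voronoi_labels(sites, rows, cols):
    """Label each grid point with the index of its nearest site (Euclidean metric)."""
    labels = []
    for r in range(rows):
        row = []
        for c in range(cols):
            dists = [math.hypot(r - sr, c - sc) for sr, sc in sites]
            row.append(dists.index(min(dists)))  # ties go to the lowest index
        labels.append(row)
    return labels


# Two hypothetical sites on a single row of five grid points.
print(voronoi_labels([(0, 0), (0, 4)], rows=1, cols=5))
```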

The dual tessellation of the VD is known as the Delaunay triangulation (DT), as shown in Fig. 5. The duality comes about in the following way: vertices in the Voronoi diagram correspond to faces in the DT, while Voronoi cells correspond to vertices of the DT.

Fig. 5. Delaunay triangulation of the seven dot patterns shown as black lines.

Each triangle in the DT shown in Fig. 5 is created in the following manner. Once the VD for a set of points is constructed, it is a simple matter to produce the DT by connecting any two sites whose Voronoi polygons share an edge. More specifically, let P be a circle-free set. Three points p, q and r of P define a Delaunay triangle if there is no further point of P in the interior of the circle circumscribed to the triangle pqr, whose centre lies on a Voronoi vertex; Fig. 6 depicts this notion.

Fig. 3. Isotropic dilation using the Euclidean distance transform. (a) Original binary image, “cat32.pgm”, from the Binary Shape Databases, Brown University. (b) Dilation with a distance thresholded at level 1.1. (c) Colour-enhanced EDM of the inverted image in (a).

3. Delaunay Triangulation based Binary Image Morphing (DTBIM)

The proposed approach is based ultimately on the DT technique, which is the dual of the Voronoi diagram (VD), a versatile geometric structure [30][31][24]. We opt to address the issue of binary morphology geometrically and not spatially. In other words, we treat shapes as dense dot patterns, a field that DT is meant to handle. A VD is essentially a set of geometric elements derived from the distance relationship of geometric objects and has some useful properties for engineering applications. A VD can be generated using different algorithms. Perhaps the best-known implementations are Fortune's algorithm and the Quickhull algorithm. Given a set of 2D points, the Voronoi region for a point pi is defined as the set of all the points that are closer to pi than to any other point (see Fig. 4). The Euclidean distance is normally the distance metric of choice for VD construction; between two points p and q with (x, y) coordinates in the Euclidean space, it is defined as:

$$d(p, q) = \sqrt{(p_x - q_x)^2 + (p_y - q_y)^2} \tag{4}$$

Fig. 6. Delaunay construction. The triangle pqr defines a Delaunay triangle where the centre (C) of the circle circumscribed to pqr is a Voronoi vertex.
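The empty-circumcircle criterion depicted in Fig. 6 can be checked numerically. The following is a minimal sketch, not the paper's implementation: the function names and test points are hypothetical, the circumcentre uses the standard closed-form expression, and points lying exactly on the circle are treated as not being in its interior.

```python
def circumcentre(p, q, r):
    """Centre of the circle through p, q and r (raises if they are collinear)."""
    ax, ay = p
    bx, by = q
    cx, cy = r
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        raise ValueError("collinear points have no circumcircle")
    a2, b2, c2 = ax**2 + ay**2, bx**2 + by**2, cx**2 + cy**2
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy


def is_delaunay_triangle(p, q, r, points, eps=1e-12):
    """True if no other point lies strictly inside the circumcircle of pqr."""
    ux, uy = circumcentre(p, q, r)
    radius_sq = (p[0] - ux) ** 2 + (p[1] - uy) ** 2
    return all(
        (x - ux) ** 2 + (y - uy) ** 2 >= radius_sq - eps
        for (x, y) in points
        if (x, y) not in (p, q, r)
    )


p, q, r = (0, 0), (4, 0), (0, 4)
print(circumcentre(p, q, r))
print(is_delaunay_triangle(p, q, r, [p, q, r, (5, 5)]))
print(is_delaunay_triangle(p, q, r, [p, q, r, (1, 1)]))
```

The circumcentre returned for a valid Delaunay triangle is the point that Fig. 6 labels C, i.e. a Voronoi vertex of the site set.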

The key point in the preceding discussion and the above synthesised figures is that the geometry of the sites (dot patterns) dictates the construction of the DT via its dual topology, the VD. When it comes to digital images, blob pixels correspond to those dot patterns; however, the point coordinates (x, y) are discrete and are described by their two-dimensional locations on a square grid.

3.1. DTBIM: A constrained DT

As mentioned earlier, DTBIM tries to provide an alternative to the existing dilation methods (those using set theory or the distance transform). One of the main advantages of DTBIM is its small incremental expansion, which is, as we will see later, unreachable by the smallest SE used in the common methods. Thus, shape characteristics may survive longer chains of dilations before the shape is totally deformed. Such a feature can improve the performance of several algorithms for further processing (e.g., word/line segmentation in scanned text documents, text and shape based pattern recognition, etc.). The other characteristic of DTBIM is that it obviates the need for a specific SE. Its morphology is inherited from its inner triangle structure which, unlike an SE, is adaptive depending on the geometry formed by the object pixels in the Euclidean space. Moreover, DTBIM tends to be less prone to additive noise. Inner solid patches in a binary image are of no interest to DTBIM (i.e., no expansion occurs). Therefore, we pre-process the input binary image by extracting its contour with a predefined inward thickness level as follows (detailed steps are furnished in Algorithms 1&2):

$$\Gamma = 0 < B^t \le \tau, \quad \text{where} \quad B^t = \mathrm{Dist}(B^c) \tag{6}$$

Here, $B^c$ is the image complement of the original binary image, $B$, where the foreground is black, and the contour $\Gamma$ is obtained from its distance transformed image ($B^t$). The variable $\tau$ controls the inward thickness of the contour and is chosen to generate a contour that is sufficient to construct the DT (a thickness around $\tau = 3$, with the contour $\Gamma$ contained in $B$).
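The contour extraction step described above can be illustrated with a brute-force sketch. It assumes that the contour keeps those foreground pixels whose distance to the nearest background pixel is positive and at most τ; the exact inequalities, the function names and the test image are assumptions of this example and are not taken from Algorithms 1&2. The image must contain at least one background pixel.

```python
import math


def distance_to_background(image):
    """For each foreground pixel, the Euclidean distance to the nearest background pixel.

    Background pixels get 0. Assumes at least one background pixel exists.
    """
    rows, cols = len(image), len(image[0])
    bg = [(r, c) for r in range(rows) for c in range(cols) if not image[r][c]]
    return [
        [
            min(math.hypot(r - br, c - bc) for br, bc in bg) if image[r][c] else 0.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def extract_contour(image, tau=3):
    """Keep foreground pixels lying within an inward band of thickness tau."""
    bt = distance_to_background(image)
    return [[1 if 0 < d <= tau else 0 for d in row] for row in bt]


# Hypothetical 9x9 image holding a solid 7x7 square.
square = [[1 if 1 <= r <= 7 and 1 <= c <= 7 else 0 for c in range(9)] for r in range(9)]
for row in extract_contour(square, tau=3):
    print(row)
```

For this square, a band of thickness 3 removes only the central pixel, while a band of thickness 1 keeps just the outermost ring, which shows how τ trades contour density against the amount of solid interior that is discarded.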

// Filter out invalid triangles
for each triangle T_DT in DT do
    if (Area of T_DT … 0 && B^t …
