
Shape Based Image Retrieval Using Generic Fourier Descriptors

Dengsheng Zhang and Guojun Lu
Gippsland School of Computing and Info. Tech.
Monash University, Churchill, Victoria 3842
Fax: 61-3-99026842
dengsheng.zhang, [email protected]

Abstract

Shape description is one of the key parts of image content description for image retrieval. Most existing shape descriptors are either application dependent or non-robust, which makes them undesirable for generic shape description. In this paper, a generic Fourier descriptor (GFD) is proposed to overcome the drawbacks of existing shape representation techniques. The proposed shape descriptor is derived by applying the 2-D Fourier transform to a polar shape image. The acquired shape descriptor is application independent and robust. Experimental results show that the proposed GFD outperforms common contour-based and region-based shape descriptors.

Keywords: Fourier descriptors, shape, CBIR, retrieval.

1. Introduction

Due to the rapid increase of multimedia information, there is an urgent need for multimedia content description so that automatic searching is possible. The newly emerging multimedia standard MPEG-7 addresses this issue. In MPEG-7, shape is one of the key components for describing digital images, along with other features such as texture and color. MPEG-7 sets six criteria for shape description: good retrieval accuracy, compact features, general application, low computation complexity, robust retrieval performance and hierarchical representation.

Various shape descriptors exist in the literature. These descriptors are broadly categorized into two groups: contour-based shape descriptors and region-based shape descriptors.

Contour-based shape descriptors include Fourier descriptor (FD) [Granlund72, PF77, ZR72, KSP95, ZL01], wavelet descriptors [TB97, YLL98], curvature scale space [MM86, MAK96, DT97] and shape signatures [FS78, Davies97]. Since contour-based shape descriptors exploit only boundary information, they cannot capture shape interior content. Besides, these methods cannot deal with disjoint shapes where boundary information is not available. Consequently, they have limited applications.

In region-based techniques, shape descriptors are derived using all the pixel information within a shape region. Region-based shape descriptors can be applied to general applications. Common region-based methods use moment descriptors to describe shape [Hu62, TC91, TC88, LP96, Teague80, Niblack et al93]. These include geometric moments, Legendre moments, Zernike moments and pseudo Zernike moments. It has been shown in [TC88] that Zernike moments outperform other moment methods in terms of overall performance. Recently, several researchers have also used the grid method to describe shape [LS99, SSS00, CBPM00]. The grid-based method attracts interest for its simplicity in representation and conformance to intuition; however, its rotation normalization does not consider shape interior content. Other region-based shape descriptors have also been proposed, including the bounding box descriptor [ISO99] and the multi-layer eigenvector shape descriptor [KK00]. Most region-based shape descriptors are extracted from the spatial domain; as a result, they are sensitive to noise and shape variations.

In this paper, we propose a generic Fourier descriptor (GFD) which can be applied to general applications. The GFD is extracted from the spectral domain by applying the 2-D Fourier transform to a polar shape image. The rest of the paper is organized as follows. In Section 2, the background of related work and the GFD are described in detail. In Section 3, we give detailed experimental results on the proposed methods and compare GFD with other shape descriptors. Section 4 concludes the paper.

2. Generic Fourier Descriptor

In this section, we describe GFD in detail. First, we give some background information on related work in §2.1. We then introduce two polar Fourier transforms used to derive GFD in §2.2. The implementation details of GFD are described in §2.3 and §2.4.

2.1 Related Work

2.1.1 One Dimensional Fourier Descriptor

One dimensional FD has been successfully applied to many shape representation applications, especially to character recognition. The desirable characteristics of FD, such as simple derivation, simple normalization, simple matching, robustness to noise, perceptual meaningfulness, compactness and hierarchical coarse-to-fine representation, make it a popular shape descriptor [Granlund72, PF77, ZR72, Arbter et al90, Otterloo91, Rauber94, KSP95, MKL97, HH98, ZL01]. Generally, 1-D FD is obtained through Fourier transform (FT) on a shape signature function derived from shape boundary coordinates {(x(t), y(t)), t=0, 1, …, N-1}. A typical shape signature function is the centroid distance function, which is given by the distance of the boundary points from the centroid (xc, yc) of the shape

$$r(t) = \left([x(t) - x_c]^2 + [y(t) - y_c]^2\right)^{1/2}, \qquad t = 0, 1, \ldots, N-1$$

where

$$x_c = \frac{1}{N}\sum_{t=0}^{N-1} x(t), \qquad y_c = \frac{1}{N}\sum_{t=0}^{N-1} y(t)$$

An example of centroid distance function of an apple shape is shown in Figure 1.

Figure 1. (a) An apple shape; (b) the contour of (a); (c) centroid distance function of (a).
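The centroid distance function defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `centroid_distance` and the square test contour are hypothetical, and the boundary is assumed to be a non-empty list of (x, y) points already traced in order.

```python
import math

def centroid_distance(boundary):
    """Return r(t) for an ordered list of (x, y) boundary points.

    Assumption: the boundary is non-empty and already sampled to N points.
    """
    n = len(boundary)
    # Centroid (xc, yc) is the mean of the boundary coordinates.
    xc = sum(x for x, _ in boundary) / n
    yc = sum(y for _, y in boundary) / n
    # r(t) is the Euclidean distance of each boundary point from the centroid.
    return [math.hypot(x - xc, y - yc) for x, y in boundary]

# Hypothetical example: the four corners of a square centered at (1, 1).
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
signature = centroid_distance(square)
```

Every corner of the square lies at the same distance from the centroid, so the signature is constant; a less regular contour such as the apple in Figure 1 yields a varying signature.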

One dimensional FT is then applied on r(t) to obtain the Fourier transformed coefficients

$$a_n = \frac{1}{N}\sum_{t=0}^{N-1} r(t)\exp\left(\frac{-j2\pi nt}{N}\right), \qquad n = 0, 1, \ldots, N-1$$

The magnitudes of the coefficients an (n=0, 1, …, N-1), normalized by the magnitude of the first coefficient a0, are used as shape descriptors, called Fourier descriptors. The acquired FDs are translation, rotation and scale invariant. It has been shown that shape representation using Fourier descriptor (FD) outperforms many other contour shape descriptors [KSP95, ZL01-1]. However, all these methods assume knowledge of shape boundary information, which may not be available in general situations. For example, it is difficult to derive 1-D FD for the shape in Figure 2(a), because the contour of the shape is not available. Furthermore, 1-D FD cannot capture shape interior content, which is important for shape discrimination. For example, FD is not able to discriminate the shape in Figure 2(b) from the shape in Figure 2(c). These drawbacks limit the application of 1-D FD.
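The 1-D FD computation described above (a discrete Fourier transform of r(t), followed by normalizing the coefficient magnitudes by |a0|) can be sketched as follows. This is an illustrative sketch under stated assumptions: the name `fourier_descriptors` and the sample signature are hypothetical, a direct O(N²) sum is used instead of an FFT for clarity, and dropping the constant entry |a0|/|a0| = 1 from the output is a choice made here, not something the text specifies.

```python
import cmath
import math

def fourier_descriptors(r):
    """Return |a_n| / |a_0| for n = 1..N-1 of a shape signature r(t)."""
    n_pts = len(r)
    coeffs = []
    for n in range(n_pts):
        # a_n = (1/N) * sum_t r(t) * exp(-j*2*pi*n*t/N)
        a_n = sum(r[t] * cmath.exp(-2j * math.pi * n * t / n_pts)
                  for t in range(n_pts)) / n_pts
        coeffs.append(a_n)
    a0 = abs(coeffs[0])  # mean of r(t); assumed nonzero
    return [abs(a) / a0 for a in coeffs[1:]]

# Hypothetical signature, a circularly shifted copy (a different starting
# point on the contour, as a rotation would cause) and a scaled copy.
sig = [3.0, 1.0, 2.0, 5.0, 4.0, 1.5]
shifted = sig[2:] + sig[:2]
scaled = [2.0 * v for v in shifted]

fd_sig = fourier_descriptors(sig)
fd_scaled = fourier_descriptors(scaled)
```

Taking magnitudes discards the phase introduced by the shift, and dividing by |a0| cancels the scale factor, so `fd_sig` and `fd_scaled` agree up to floating-point error. Translation invariance comes from the centroid distance signature itself.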

Figure 2. (a) A shape without contour; (b)(c) two shapes with same contour but with different interior content.

2.1.2 Zernike Moments Descriptor

The application of Zernike moments to shape overcomes the aforementioned drawbacks of 1-D FD. The Zernike moment descriptor (ZMD) is obtained by using all the pixel information within a shape region. It does not assume shape boundary information. ZMD is one of the best of the existing shape descriptors. Many researchers report promising results for ZMD [TC88, KK00, ZL01-1]. It has been tested on MPEG-7 shape databases and adopted in MPEG-7 as the region-based shape descriptor. An examination of ZMD reveals that it is essentially a spectral descriptor, derived from a two dimensional spectral transform of the shape in polar space. The theory of ZMD is similar to that of FD. However, a more careful analysis reveals two shortcomings in ZMD. First, the bases of ZMD only contain angular frequency; they do not allow multi-resolution in the radial direction. Second, due to the repetition in each order of the basis, the number of angular frequencies reflected at each order is much reduced. In the following, we examine ZMD in detail.

The complex Zernike moments are derived from Zernike polynomials:

$$V_{nm}(x, y) = V_{nm}(r\cos\theta, r\sin\theta) = R_{nm}(r)\cdot\exp(jm\theta) \tag{2.1}$$

and

$$R_{nm}(r) = \sum_{s=0}^{(n-|m|)/2} (-1)^s \frac{(n-s)!}{s!\,\left(\frac{n+|m|}{2}-s\right)!\,\left(\frac{n-|m|}{2}-s\right)!}\; r^{\,n-2s} \tag{2.2}$$

where r is the radius from (x, y) to the shape centroid, θ is the angle between r and the x axis, and n and m are integers subject to n − |m| being even and |m| ≤ n. Zernike polynomials are a complete set of complex-valued functions orthogonal over the unit disk, i.e., x² + y² ≤ 1. The complex Zernike moments of order n with repetition m are then defined as:

$$A_{nm} = \frac{n+1}{\pi}\sum_{x}\sum_{y} f(x, y)\cdot V^{*}_{nm}(x, y) = \frac{n+1}{\pi}\sum_{r}\sum_{\theta} f(r\cos\theta, r\sin\theta)\cdot R_{nm}(r)\cdot\exp(-jm\theta), \qquad r \le 1 \tag{2.3}$$

A list of Zernike moments up to order 10 is given in Table 1 [KK00]. The magnitudes of the acquired Zernike moments normalized by the mass of the shape are used as shape descriptors.
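The radial polynomial and the discrete Zernike moment defined above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: the names `radial_poly` and `zernike_moment` are hypothetical, f is taken to be a non-empty binary image stored as a list of rows, and radii are mapped into the unit disk by dividing by the largest centroid-to-pixel distance, a normalization this section does not spell out. The conjugate basis exp(−jmθ) is used; since only magnitudes serve as descriptors, the sign of the exponent does not affect the result.

```python
import cmath
import math
from math import factorial

def radial_poly(n, m, r):
    """Zernike radial polynomial R_nm(r); requires n - |m| even, |m| <= n."""
    m = abs(m)
    total = 0.0
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = (factorial(s) * factorial((n + m) // 2 - s)
               * factorial((n - m) // 2 - s))
        total += num / den * r ** (n - 2 * s)
    return total

def zernike_moment(image, n, m):
    """A_nm of a binary image (list of rows), summed over nonzero pixels."""
    pts = [(x, y) for y, row in enumerate(image)
           for x, v in enumerate(row) if v]
    xc = sum(x for x, _ in pts) / len(pts)
    yc = sum(y for _, y in pts) / len(pts)
    # Assumption: scale radii so that the farthest pixel lies on r = 1.
    r_max = max(math.hypot(x - xc, y - yc) for x, y in pts) or 1.0
    acc = 0j
    for x, y in pts:
        r = math.hypot(x - xc, y - yc) / r_max
        theta = math.atan2(y - yc, x - xc)
        acc += radial_poly(n, m, r) * cmath.exp(-1j * m * theta)
    return (n + 1) / math.pi * acc

# Hypothetical L-like shape and the same shape rotated by 90 degrees.
shape = [[1, 1, 0, 0],
         [1, 0, 0, 0],
         [1, 0, 0, 0],
         [1, 1, 1, 0]]
rotated = [list(row) for row in zip(*shape[::-1])]
```

Rotating the shape multiplies each A_nm by a constant phase factor, so `abs(zernike_moment(shape, n, m))` and `abs(zernike_moment(rotated, n, m))` coincide up to floating-point error, which is why the magnitudes are suitable as rotation-invariant descriptors.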

It can be seen from (2.3) that the basis of Zernike moments, Rnm(r) exp(jmθ), only reflects angular frequency in its trigonometric harmonic. This indicates that the radial spectral features of the shape are not captured in ZMD. Furthermore, the repetition of m in each order n of the basis reduces the number of angular frequencies that each order of Zernike moment (or coefficient) captures. This indicates that the circular spectral features captured by ZMD are too coarse if the number of moments used is not sufficiently large. For example, the number of angular frequencies captured by the first 36 Zernike moments is 10. In other words, if 36 Zernike moments are used as the shape descriptor, then the descriptor only captures 10 circular features. More circular features could be captured if there were no repetition in each order of the basis. To demonstrate this, we propose a variation of ZMD (VZM) in (2.4). The variation extends Zernike moments by removing the repetition in each order. It is given by

$$VF(u) = \sum_{r}\sum_{\theta} f(r\cos\theta, r\sin\theta)\cdot r\cdot\exp(ju\theta) \tag{2.4}$$

where r and θ have the same meanings as those in (2.1). (2.4) has a simpler form than (2.3). Nevertheless, a given number of transformed coefficients VF(u) captures more circular features than the same number of Zernike moments Anm. The retrieval effectiveness of VF(u) will be shown in Section 3. Similar to Zernike moments, VF(u) does not capture radial features.
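A minimal sketch of the transform in (2.4) follows. The names `polar_pixels` and `vzm` are hypothetical, f is assumed to be a non-empty binary image stored as a list of rows, and radii are divided by the largest centroid-to-pixel distance, which is an assumption made here rather than something stated in this section.

```python
import cmath
import math

def polar_pixels(image):
    """Return (r, theta) of each nonzero pixel relative to the centroid."""
    pts = [(x, y) for y, row in enumerate(image)
           for x, v in enumerate(row) if v]
    xc = sum(x for x, _ in pts) / len(pts)
    yc = sum(y for _, y in pts) / len(pts)
    # Assumption: scale radii so that the farthest pixel lies on r = 1.
    r_max = max(math.hypot(x - xc, y - yc) for x, y in pts) or 1.0
    return [(math.hypot(x - xc, y - yc) / r_max,
             math.atan2(y - yc, x - xc)) for x, y in pts]

def vzm(image, u):
    """VF(u): sum over shape pixels of r * exp(j*u*theta)."""
    return sum(r * cmath.exp(1j * u * theta)
               for r, theta in polar_pixels(image))

# Hypothetical L-like shape and the same shape rotated by 90 degrees.
shape = [[1, 1, 0, 0],
         [1, 0, 0, 0],
         [1, 0, 0, 0],
         [1, 1, 1, 0]]
rotated = [list(row) for row in zip(*shape[::-1])]

# One coefficient per angular frequency u, with no repetition across orders.
magnitudes = [abs(vzm(shape, u)) for u in range(8)]
```

Each integer u contributes a distinct angular frequency, so eight coefficients cover eight angular frequencies, and rotating the shape changes only the phase of VF(u), not its magnitude.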

To summarize this section, we conclude that the Zernike polynomials Rnm(r) only add weight to the bases. They contribute little to the capturing of shape features. Furthermore, the polynomials create a number of repetitions in each order of the calculated moment. The repetitions are actually the weighted moments of the previous orders. In other words, of the Zernike moments calculated in each order, only one is essentially important; the others are repetitions of the moments of the previous orders. The repetitions in each order can instead be spent on capturing radial features. With this idea in mind, we attempt to use the 2-D polar FT (PFT) instead of Zernike moments. The 2-D polar FT allows multi-resolution in both radial and angular directions, and the same number of FT coefficients can capture more shape features than ZMD. In the next section, we introduce a PFT, derived from formal FT theory, that captures both radial and circular features.

Table 1. List of Zernike moments up to order 10

Order (n)   Zernike moments of order n with repetition m (Anm)   Number of moments in each order n
0           A0,0                                                   1
1           A1,1                                                   1
2           A2,0, A2,2                                             2
3           A3,1, A3,3                                             2
4           A4,0, A4,2, A4,4                                       3
5           A5,1, A5,3, A5,5                                       3
6           A6,0, A6,2, A6,4, A6,6                                 4
7           A7,1, A7,3, A7,5, A7,7                                 4
8           A8,0, A8,2, A8,4, A8,6, A8,8                           5
9           A9,1, A9,3, A9,5, A9,7, A9,9                           5
10          A10,0, A10,2, A10,4, A10,6, A10,8, A10,10              6

Total number of moments up to order 10: 36
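The counts in Table 1 follow directly from the index constraints (n − |m| even, |m| ≤ n) and can be checked with a short enumeration. The function name `zernike_indices` is hypothetical, and only non-negative m is listed, as in the table.

```python
def zernike_indices(max_order):
    """All (n, m) with 0 <= m <= n <= max_order and n - m even."""
    return [(n, m)
            for n in range(max_order + 1)
            for m in range(n + 1)
            if (n - m) % 2 == 0]

pairs = zernike_indices(10)

# Number of moments in each order n = 0..10, and the overall total.
per_order = [sum(1 for n, _ in pairs if n == k) for k in range(11)]
total = len(pairs)

# Distinct repetitions m, i.e. the angular frequencies that appear.
distinct_m = sorted({m for _, m in pairs})
nonzero_m = [m for m in distinct_m if m > 0]
```

The enumeration yields 36 index pairs whose repetitions m span 0 to 10. Reading the count of 10 angular frequencies quoted in the text as the nonzero values m = 1, …, 10 is an interpretation made here; m = 0 carries no angular variation.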

2.2 Polar Fourier Transform


The Fourier transform has been widely used for image processing and analysis. The advantage of analyzing an image in the spectral domain over analyzing it in the spatial domain is that the noise problem, which is common to digital images, is easier to overcome. Besides, the spectral features of an image are usually more concise than the features extracted from the spatial domain. One dimensional FT has been successfully applied to contour shape (which is usually represented by a shape signature derived from the shape boundary coordinates) to derive FD. The application of one dimensional FT to shape assumes knowledge of shape boundary information. There is no reported work on region-based FD. In this section we introduce the generic FD derived from 2-D PFT.

The continuous and discrete 2-D Fourier transform of a shape image f(x, y) (0≤x
