Pattern matching based on a generalized Fourier ... - Semantic Scholar

6 downloads 7240 Views 600KB Size Report
image or pattern and hence speed up the searching process. Also by ..... Registration", Proceedings of the 2nd International Conference on Information Fusion, ...
Pattern matching based on a generalized Fourier transform Dinesh Nair*, Ram Rajagopal, and Lothar Wenzel

National Instruments 1 1500 N. Mopac Expwy, Austin, TX 78759, USA ABSTRACT In a two-dimensional pattern matching problem, a known template image has be located in another image, irrespective of the template's position, orientation and size in the image. One way to accomplish invariance to the changes in the template is by forming a set of feature vectors that encompass all the variations in the template. Matching is then performed by finding the best similarity between the feature vector extracted from the image to the feature vectors in the template set. In this paper we introduce a new concept of a generalized Fourier transform. The generalized Fourier transform offers a relatively robust and extremely fast solution to the described matching problem. The application of the generalized Fourier transform to scale invariant pattern matching is shown here. Keywords: pattern matching, generalized Fourier transform, Walsh transform, machine vision

1. INTRODUCTION Pattern matching is the technique used in machine vision to quickly locate known reference or fiducial patterns in an image. With pattern matching you create a model or template that represents the object you are searching for. Then in your machine vision application the model is searched for in each acquired image. A score is calculated for each match. The score relates how close the model matches the matched pattern. Pattern matching is the key to many applications. Pattern matching can provide your application with information about the presence or absence, number, and location of the model within an image [1][2][3]. Figure 1 shows an example of typical pattern matching application.

(a) Figure 1: A pattern matching example. (a) Image of the template (pattern) to be located. (b) Matches found in an inspection image.

472

Advanced Signal Processing Algorithms, Architectures, and Implementations X, Franklin T. Luk, Editor, Proceedings of SPIE Vol. 4116 (2000) © 2000 SPIE · 0277-786X/00/$15.00

Traditional pattern matching techniques include normalized cross correlation, pyramidal matching, and scale-invariant matching. Normalized cross correlation is the most common way to find a template in an image. The following is the basic concept of correlation: Consider a sub-image w(x,y) of size K x L within an image f(x,y) of size M x N, where K Mand L N. The normalized correlation between w(x,y) andf(x,y) at a point (ij) is given by

(w(x, y) - w)(f(x + i, y + j)

- f(i, j))

x=O y=O

C(z,j) = [L-1K-1 Lx=0 y=o

where i = 0, 1 , ... M — 1 , j = 0,1 ... N — 1, W (calculated only once) is the average intensity value of the pixels in the template

w. The variable f(i , 1 ) is the average value offin the region coincident with the current location of w. The value of C lies in the range —1 to 1 and is independent of scale changes in the intensity values offand w.

Figure 2 illustrates the correlation procedure. Assume that the origin of the imagefis at the top left corner. Correlation is the process of moving the template or sub-image w around the image area and computing the value C in that area. This involves multiplying each pixel in the template by the image pixel that it overlaps and then summing the results over all the pixels of the template. The maximum value of C indicates the position where w best matchesf. At the borders of the image, correlation values are not accurate. Because the underlying mechanism for correlation is based on a series of multiplication operations, the correlation process is time consuming. With new technologies such as MMX, multiplications can be done in parallel, and the overall computation time can be reduced considerably. You can speed up the matching process by reducing the size of the image and restricting the region in the image where the matching is done. However, the basic normalized cross correlation operation does not meet speed requirements of many applications.

I_,

Ipfl.j

(Q,O)

I

-

I

------a

w')

(j)

f(xy) Figure 2: The correlation process.

Proc. SPIE Vol. 4116

473

You can improve the computation time by reducing the size of the image and the template. Once such technique is called pyramidal matching. In this method, both the image and the template are sub-sampled to smaller spatial resolutions. The image and the template can be reduced to one-fourth their original size. Matching is performed first on the reduced images. Because the images are smaller, matching is faster. Once the matching is complete, only areas with a high match need to be considered as matching areas in the original image. Normalized cross correlation is a good technique for finding patterns in an image as long as the patterns in the image are not scaled or rotated. Typically, cross correlation can detect patterns of the same size up to a rotation of 5°to 10°. Extending correlation to detect patterns that are invariant to scale changes and rotation is difficult. For scale-invariant matching, you must repeat the process of scaling or resizing the template and then perform the correlation operation. This adds a significant amount of computation to your matching process. Normalizing for rotation is even more difficult. If a clue regarding rotation can be extracted from the image, you can simply rotate the template and do the correlation. However, if the nature of rotation is unknown, looking for the best match will require exhaustive rotations of the template.

Many of the new commercially available pattern matching tools incorporate image understanding techniques to interpret the template information and use this information to find the template the image. Image understanding refers to image processing techniques that are used to understand the contents of an image. These include methods such as geometric modeling of images, efficient non-uniform sampling of images, extraction of template information that is constant irrespective of the rotation and size change of the template. These techniques reduce the amount of information needed to fully characterize an image or pattern and hence speed up the searching process. Also by extracting useful information from a template and removing the redundant and noisy information in the template, the search accuracy is improved. One such pattern matching algorithm uses a combination an intelligent image sampling technique and the edge information in the image. Most images contain a lot of redundant information. Using all the information present in the image to pattern matching makes the process time intensive and less accurate. Therefore by sampling the image and extracting a fewer number of points that represent overall content of the image, the speed as well as the accuracy of the pattern matching tool can be improved. Figure 3 shows an example of the sampling process.

(b)

Figure 3: Illustration of the intelligent sampling used in pattern matching (a) Reference pattern, (b) Sampled pattern. The pattern matching tool also utilizes the edge information in the image to provide information about the structure of the image. The edge image also contains a reduced amount of very significant data. Pattern matching then reduces to matching the edge and/or geometric information between the template and the image. Figure 4 illustrates the importance of edges and geometric modeling.

(a)

474

Proc. SPIE Vol. 4116

Figure 4: Illustration of edge detection used in pattern matching. (a) Reference pattern, (b) Edge information in the image.

The rest of the paper is organized as follows. Section 2 introduces the concept of a generalized Fourier transform. Section 3 describes a specific application of the generalized Fourier transform to scale invariant pattern matching. Finally section 4 gives a summary of this paper and talks about future directions in this research.

2. GENERALIZED FOURIER TRANSFORM (GFT) Given an arbitrary regular real valued matrix B, let B ' denote the column wise and cyclic shifted matrix B. The matrix A = BB' preserves the regularity property and satisfies the identity AM j where M is the dimension of B. To show this let x0, x1, AxM]= x0. This follows directly from AB =B'. We ..., XM] denote the different columns of B. Then Ax = x1, Ax = x2 have AMx0= xo. The same equation holds for all the other vectors x1 XM1. The matrix B is regular that is, the vectors x0, x1, . . ., XMJ are independent of each other. This proves the identity AM I. The eigenvalues of A are the M complex M th roots of 1 . This is a simple consequence of AM it follows AMx= 2Mx = x, i.e. 2M

Given x and A with Ax = Ax

Let C be the MxM matrix

oo..1

1o..o

c=o 1 . .0

00..0 The eigenvalues of C are 2 exp(2iti/M) and the conesponding eigenvectors form the known Fourier-matrix.

Lemma 1f2 is an eigenvalue of C and y is the corresponding eigenvector, then 2 is an eigenvalue ofA and By is the corresponding eigenvector of A.

Proof:

Let y=(yo, y y M-1 ). Then Cy=2cy implies yM1=Xyo, y0=xy1, ... , yM2=XyM1. The vector z = y0x0 + y1x1 + . . . + yMlxMl satisfies

Az = A(yoxo + y1x1 + ... + yM1xM1) = yoAxo + y1 Ax1 + ... + yM-1 AxM1 = y0x1 + y1x2 + ... + yM1xo. On the other hand, we have Xz = X(yoxo + y1x1 + . . . + yM1xM1 ) = yM1xo + Therefore, Az

y0x1 + . .

. + yM2xMl

= Az. The vectors z are different from 0 (the x-vectors are independent). In other words, the eigenvectors of A

are 21= exp(2711/M).

Because of Lemma 1 and according to classical matrix theory [3, Corollary 7.1.8], a regular matrix XB can be constructed that satisfies

XB1A = diag(2°, 2'

2M1) XB1 with 21= exp(2iti/M).

XB is the matrix consisting of all eigenvectors of A where the eigenvalues are the Mth

roots of 1.

Proc. SPIE Vol. 4116

475

In the special case B = I, A is a shifted version of I and the matrix XB is essentially the Fourier transform matrix. Obviously, there is a strong connection between the Fourier matrix and B ' . The latter can be interpreted as the cyclic shift operator. A matrix B consisting of an arbitrarily chosen vector and shifted versions of it produces the same Fourier matrix under weak conditions. More precisely, the matrix B must be regular. In other words, the Fourier transform is characterized by a family of matrices B where all columns are shifted versions of the first one. The canonical matrix producing the Fourier transform is (1 , 0, . . . , 0) but many others generate the same result.

The transform B — XB can be interpreted as a constructive scheme for generalized Fourier transforms where the classical Fourier transform is closely related to matrices B based on shifted versions of a given vector. If B follows a different construction scheme, a new generalized Fourier transform is generated. Different matrices B can produce the same generalized Fourier transform. It is an interesting endeavor to characterize the members of a family generating the same generalized Fourier transform. The most generic form of a generalized Fourier transform is based on a regular matrix B with arbitrarily chosen entries with i, j = 0, . . ., M- 1. The generalized Fourier transform can be regarded as the common basis of many other transforms. An interesting example is the construction of Walsh matrices based on the new construction scheme. The following example demonstrates this method. Example:

Let be given the following matrix

B=

b11

b12 b21 b22

b21

b22

b31

b32 b41 b42 b42 b31 b32

b41

b11

b12

with ( is very small)

(1+i b12

b22

b32

b42

(1-i

Lf31 +f41 (1-i (1+i

f31 +Lf41 (1+i (i-i

-fii +f21 (1-i (1+i fii f21 =+ S

= b31 The two free parameters b11 and b31 are chosen in such a manner that B is regular. It can be shown that

476

Proc. SPIE Vol. 4116

0 0

0

1+1

!—T

!+I

u=v co= !+T

0

!—T

!—T

0

1—I

!+T

0

0 o

XTJUUJ-qSJU

qaTqM SJflSOJ UT

TI x co=

I I T— I T— T—T— I— T— I

T

El

T

piiods nbTuqo1 o osn oq unoj wiojsui o uiuuop souuipuus uooiwioq UOAT sjUuTs OJM ouo Jo osoq poprus prn pTJTSSUp in oouUAp y IOSUIO idwxo ST OI1 UOuUUUflJOP JO UU UMOU)IUfloTIoJo JTqS uooMoq spus OMI I0!T0a1!PT SjUUTS 5 pu1 O'J S oq UAT UT OOuAp UTMOIIOJ IUODS ST oqUoTddU pui siojjo JA 2S1J Uo!:lnIos Ui1unss1 q2 S3OJJ Jo s!ou opq upWo :2auiI 1J

S

OP!M

u1o oq

J

(v) UI OOU1A1 .v)(j oindwoj

qJ

JO!JflOtT IUJOJSU1J1 Jo

.s

(rv) ouiuuoj oqi uoi2isod j jo oq wnuuxrujo

(H) UTJflp UJ!1unJ H) Cr indwoj

i

oq uiq-j jo oqi JoMod wnJl3ods jo

JoMod wnnoods

u ius j

siqj uoimndwo3 s

'sj osniooq 1cpoix uo uiq Jo

qi iunoj auojsuiq snw oq po2ndwoo oqj OJ Jo S!q OflA pui oq oudodd OflA jo oq .iounoj unojsui jo s SOAT oqi pausp uouiauojui noqi

()

oq osiqd :1J!qs uooM2oq 0142 OM sI1u!s

u

.1

u

uj 'jdouud noA sooqo £wpqi uiq jo oq iounoj auojsu jo aioiu sou oousisat U1j jj1 q2 ioqo suiq

wniuixiw jo oq JMod wnoods SJATIOP

pop si i pojpow wqwop uip.iooo o (y) pu flj siq ' 'x (2souip) JcJJqLrn i jo j, psi uo i znjiads uw o) uiopion(oui2sJp sq

oqj MOU owoqos psiq uo otp pzqiuou iounoj wiojsuin uo q iounoj uuojsu si pcjdo qTM OI4 MOU douoo ([) OJOqM 'UAT9 'UU J,S4 SJOOOA 'Ox 'TYJ

' TX JO OZTS J/4 ij31O

'aiounqinj q UOAT Txpou iiidUJ JO300A

'Ollffj

JA SJO13A 'Ox

.U0s0q3

ozjs

qom !x sq o q pouauJoop uj Joqo 'spJoM A Si 1 iSJOU UOTSJA jo i uiiioo J0200A-X sy pououuw'arojoq ut 1cuiiu 'sos1o

I2 S 'OX ' 1X S! UMOU) UI OU1A1pu UO U13 iflJOJO XjfljS OI UiiJOpUflOJ1T13flJSJO s!II1 .os oqj pZTpJUO JOJflOj wJojsuin uuopq 01 SJjjO i JjoAfl1joJ snqo pui sj uounjos o OIj1 poquosop

j

s

ioqo 'spJoM oq wquop si IIoJ!1u posq uo oq oijioods spoqui SJ posq UO UAJ j1EUTS pu pjUjS SUOTSJOA JO •1

suuuoo oq SJOOOA x siiiunoo uj OJfl43flJS JO 0q2 UAT SJOOOA X SJOqM Oq JOUflOJUUOjSU1iJ

uqoinu wjqoid

qi sdos J1 s1 :sMoJIoJ (J) U O3U1A1 ii) (T ondwoj 1X (ci) UJJfl oIu!unJ

(cj) onop

OUJUJJ2OU

ox()[X JOqM OI1 WJOj

() spuiis

JOJ IJ1 jDU000S MOJ JO [_X

()[X qj. -iomj uouudo si psiq uo Apoixo ouo xoidinoo jps onpod

xoidwoo oqmnu

(zu) ondwoD Otj OTJ V Proc. SPIE Vol. 4116

477

(D.3) Determine the nearest ,? to a. The index n represents the nearest x to the given noisy vector y.

The algorithm uses the first generalized frequency component beyond 0. If the size of the signal is a prime number, a better strategy starts with the highest generalized energy content of an XIj1 x0 component (increased noise radius). The prime number size avoids ambiguities during the process of determining the nearest neighbor. The step (D.3) is computationally inexpensive. The determination of the nearest neighbor can be based on phase information of the complex number a = Xj'2yfXj'2xo. XM1 were chosen Figure 5 depicts results of an implemented version of the algorithm (C) and (D). The M vectors x0 randonily. The object y was a slightly disturbed version of one of the vectors x0 XMJ. The computation of (Dl) is based on

exactly one complex scalar product. The steps (D.2) and (D.3) are computationally inexpensive. This is a significant improvement compared to the direct computation of all distances between the M vectors x0, . . ., XMJ and y. The latter can be extremely expensive, especially when the database (i.e., the set of vectors) is large.

xv Graph

I 2'- —"""

GFT result S

I .0 -

unit circle Mthroots

OS0.6 —

O4-

02-

'N t

2

'1t•

1

0.0-

f

-ft2-

O4-

: :: 1.0i2—

12

I I

k J¼%

-as

oo

3

05

1.01.2

Figure 5: The smaller points stand for the original set of 25 vectors. In the example the algorithm detects correctly the best match. The big dot is close to the element of the 25 roots of 1.

The algorithm was furthermore implemented and tested to solve a pattern matching problem. The learning phase (determination of the generalized Fourier transform XB1) is time consuming. The decision process itself is extremely fast. The images can be described as flattened vectors where significant reduction of the image content is still acceptable. As demonstrated in [1] and [2], even a small percentage of an image can still characterize the essential content.

3. APPLICATION OF THE GFT TO PATTERN MATCHING In a two-dimensional pattern matching problem, a known template image has be located in another image, irrespective of the template' 5 position, orientation and size in the image. One way to accomplish invariance to the changes in the template is by forming a set of feature vectors that encompass all the variations in the template. Matching is then performed by finding the best similarity between the feature vector extracted from the image to the feature vectors in the template set. Given the vector y extracted from the image and M template vectors, xo XM1 of size N each, pattern matching involves finding the vector x

478

Proc. SPIE Vol. 4116

that best matches y based on a distance metric (e.g., Euclidean distance). y usually is a noisy version of a certain x-vector. The set x0 XMJ 5 known in advance and the underlying structure of this set can be carefully studied. Scale invariant pattern matching is done as follows:

During the learning phase, which is usually done offline, the template points that represent the template are extracted from the template image, i.e., for each scaled version of the template image, a vector of pixel values is extracted to characterize the template at that scale. Let x0 XMJ represent the M vectors that are extracted from the scaled versions of the templates. The number of vectors extracted will depend on the scale variation of the template that you expect to see in the search image. Let B represent the matrix of these template vectors. B contains the vectors x as columns. Next, the generalized Fourier transform XB belonging to B is computed and inverse XB' is obtained. This is followed by determining XB12XO, where the term (2) stands for the second row of XB1.

During the matching phase, the traditional method of pattern matching as described in section 1 is followed. The template

image is moved pixel by pixel across the search image. At each position in the image, the pixels values in the image corresponding to the area corresponding to the smallest template region are extracted. Let these pixels values be represented by the vector y. Using the vector y, the scale of the template that matches closest to the vector y is determined using steps D.1

to D.3 as described in section 2. Once the best template scale is determined, the normalized correlation coefficient is computed between the image pixels and the pixels belonging to the template at the current scale. Figure 6 shows an example of the scale invariant pattern matching.

L4LhLrL

1

th::!! .

4 +

(a)

£

,

;;

e

=L (b)

(c)

Figure 6: Scale invariant pattern matching. (a) The original template. (b) Match found at a scale of 0.5. (c) Match found at a scale of 0.75.

4. SUMMARY In this paper the concept of a generalized Fourier transform is introduced. The generalized Fourier transform can be regarded as the common basis of many other transforms. As an example it was shown how the Walsh matrices could be based on this

scheme. However, the main focus of this paper was to illustrate the usefulness of the generalized Fourier transform to quickly determine the similarity between an unknown signal and a set of known signals. A method to do this was introduced in this paper and its application to scale invariant pattern matching was demonstrated. The concept of the generalized Fourier transform offers an interesting area of research. This paper has attempted to introduce the theory and explore a few potential applications.

Proc. SPIE Vol. 4116

479

5. REFERENCES 1. D. Nair and L. Wenzel, "Image Processing and Low-Discrepancy Sequences", Advanced Signal Processing Algorithms, Architectures, and Implementations IX, Proceedings of SPIE, Volume 3807, pp. 102-111, July 1999 2. D. Nair and L. Wenzel, "Application of Low Discrepancy Sequences and Classical Control Strategies for Image Registration", Proceedings of the 2nd International Conference on Information Fusion, SunnyVale CA, pp. 706-7 14, July 1999 3. IMAQ Vision for G Reference Manual, Chapter 14, National Instruments, 1999. 4. Gene H. Golub, Charles F. Van Loan, Matrix Computations, The Johns Hopkins University Press, 1991

480

Proc. SPIE Vol. 4116