Invariant recognition to position, rotation and scale

0 downloads 0 Views 1MB Size Report
30. 31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 47. 48. 49. 50. 51. 52 ... stuv w xyzaa ab ac ad aeaf ag ahai aj akal aman ao ap aqar as at au av.
Invariant recognition to position, rotation and scale considering vectorial signatures Jesús R. Lerma A.1,2, Josué Álvarez-Borrego3, José Ángel González-Fraga1 1

UABC, Facultad de Ciencias, Km. 103 Carretera Tijuana-Ensenada, Ensenada, B.C., México, C.P. 22870 2 UABC, Facultad de Ingeniería Ensenada, Km. 103 Carretera Tijuana-Ensenada, Ensenada, B.C., México, C.P. 22870 3 CICESE, División de Física Aplicada, Departamento de Óptica, Km. 107 Carretera Tijuana-Ensenada, Ensenada, B.C., México, C.P. 22860 ABSTRACT This work presents the development and utilization of vectorial signatures filters obtained from the application of properties of the scale and Fourier transform for images recognition. The filters were applied to different input scene, which consisted in the 26 letters of the alphabet. Each letter is an image of 256 X 256 pixels of black background with a centered white Arial letter. The image was rotated 360 degrees in increment of 1o and scaled from 70% to 130% in increment of 0.5%. In order to find a new invariant correlation digital system we obtained two unidimensional vector after to achieve different mathematical transformation in the target as well as the input scene. To recognize a target, signatures were compared, calculating the Euclidean distance between the target and the input scene; then, confidence levels are obtained. The results demonstrate that this system has a good performance to discriminate between letters. Keywords: Pattern recognition, scale transform, vectorial signature, euclidean distance.

1. INTRODUCTION Due to diversity of shapes and sizes that present both living organisms and inert objects, the necessity to look for automated system of identification, both in the industry and scientific research has arisen. Since 60’s decade, scientific community in optics area has used different types of mathematical transformation for pattern recognition, taking advantage of their different properties, for example: invariant to position, rotation, and scale. In the last years, they have been used as a tool in the digital processing of pattern. Recently several books have been published with an extensive overview of this topic [1-5] that show us a general view, with advances and developments of pattern recognition, as well as different tools for pre-accused images, extraction, selection and creation of characteristics, methods of classification, utilization of different types of transforms, etc. In this paper we will focus on examining the capabilities of vectorial signatures, as a method recognize objects, considering in this algorithm invariance to position, rotation and scale. Each letter is an image of 256 X 256 pixels of black background with a centered white Ariel letter. The images were rotated 360 degrees in increment of 1o and scaled from 70% to 130% in increments of 0.5%.To recognize a target in an input scene, signatures were compared using the Euclidean distance (dE) between the target and the input image. When an image is similar to another one, the dE has a minimum value. This paper is organized as follow: In section 2 the methodology is described. The results are presented in section 3. Finally, the concluding remarks are given in section 4. Further author information: J.L.A.: [email protected]; Tel.: +52 (646) 175 0700 ext. (131); fax: +52 (646) 174-59-25; uabc.com J.A.B.: [email protected], Tel.: +52 (646) 175 0500 ext. (25033); fax +52 (646) 175 0577; cicese.com J.A.F.: [email protected]; Tel.: +52 (646) 175 5935 ext. (113); fax: +52 (646) 175-4560; uabc.com

Applications of Digital Image Processing XXXI, edited by Andrew G. Tescher, Proc. of SPIE Vol. 7073, 707329, (2008) · 0277-786X/08/$18 · doi: 10.1117/12.795301

Proc. of SPIE Vol. 7073 707329-1

2. METHODOLOGY In this work an algorithm was constructed to make a new invariant digital system to position, rotation and scale using vectorial signatures. The aim is to obtain an automatic and simple system for pattern recognition with low computer cost. In this work we choose of the scale transform, developed by Cohen [6], which is a restriction of Fourier-Mellin transform[7], the main property of the scale transform is the scale invariant property. If we call c the scale variable, then the scale transform and its inverse; may be written as:

³ f ( x)

2S 2S

e (  jcInx ) x

0

f

1

f ( x)

f

1

D (c )

³ D (c )

e

[8]

(1)

dx

( jcInx )

dc; x t 0

x

f

(2)

The scale transform can be written as: f

1

D (c )

2S

³

f ( x) x

 jc  1

(3)

2 dx

0

which shows that it is the Mellin transform with the complex argument -jc + 1/2. A practical realization of the Mellin transform is given by a logarithmic mapping of the input scene followed by a Fourier transform. Since we are dealing with the Mellin transform of a complex argument, there exists a direct relationship with the Fourier transform. In particular if we define a signal. 1

f1 ( x )

x

(4)

f ( Inx)

then by substituting in Eq. (1) we have 1

Dl (c)

2S

f

³

(5)

f ( x )e  jcx dx

f

that is F(c)=Dl(c), where F is the symbol for Fourier transform. From this relation, one can see the scale by re sampling the samples uniformly distributed of x with a logarithmic function. A nonseparable direct scale transform D(cr ,cș) is used in this correlation process because it is invariant to size and rotational changes, and is given by: D(cr , cT )

1 2S

f 2S

³³

f ( r ,T ) r

 jcr  1

2 e (  jT cT

)drdT

(6)

0 0

and taking the logarithmic of the radial coordinate, that is Ȝ=ln(r), we get D(cO , cT )

1 2S

f 2S

³³

e ( O / 2) f (O ,T )e  j ( O cO T cT ) dOdT

(7)

0 0

Figure (1) shows the different steps used in the new algorithm proposed in this work. The target to be recognized is denoted by the function f(x,y), and the absolute value of its Fourier transform (F) is obtained (step1). A parabolic filter to

Proc. of SPIE Vol. 7073 707329-2

the modulus of F is applied, (step 2). Of this way, low frequencies are attenuated and high frequencies are enhanced in proportion to (wx)2, (wy)2 [8] , where w is angular frequency. The scale factor, r , (step 3) is applied to the results of step 2 with step 1( r is the radial spatial frequency). This process is what differentiates the scale transform from the Mellin transform. After these steps, we mapped the Cartesian coordinates to polar-logarithmic coordinates (step 4). In this step we introduced a bilinear interpolation of the first data of coordinate conversion. This is done to avoid the aliasing due to the log-polar sampling. A subimage is chosen from step 4 which contains more information of the target and a bilinear interpolation is done again in order to resize the matrix data to its original size (step 5). Steps from 1 to 5 can be seen in the right side of figure 1. From the subimage, we obtain two 1-D vectors by projecting M[log(r),ș], onto the two axes x and y. In other words, we compute the marginals, according with the equations: (8) M (t ) M [log(r ), t ], rotation vector, T

¦

log( r )

M log( r ) ( s )

¦ M (s,T ),

(9)

scale vector,

T

Figure 2 shows these two steps perfectly, (steps 6 and 7)

(Z

I

VN

%UI

(7)

-'p1 (5) (9)

(a) (b) Figure 1. (a)Block diagram of used procedure, (b)Example of the procedure (steps 1 to 5)

Proc. of SPIE Vol. 7073 707329-3



A modulus of the Fourier transform is calculated in order to obtain the rotation and scale signatures (equations 10 and 11), which will be the two unidimensional vectors for the target (steps 8 and 9). Figure 3 shows these two vectorial signatures. V1 (w t )

F[M T (t )],

V2 (w s )

F[ M log( r ) ( s )],

rotation signature,

(10)

scale signature,

(11)

To evaluate this algorithm, we select any of the letters from alphabet rotated or scaled, which follows the same process explained above. Once obtained both signatures. In order to determine the similarity between any two images Ii, Ij in the database, the simple Euclidean distance function between their signatures was used, and their average value was retained as the overall distance: dE = d i, j

1§ ¨ 2 ¨©

¦ V1i wt  V1 j wt  ¦ V2i ws  V2j ws

2

2

· ¸¸ ¹

Thus, the signatures of all the j images of the database can be compared with any i image to be recognized.

At

X AXIS (ROTATION)

(c} Figure 2. Tridimensional image of Fig. 1.(5’), (b) Vector of the summation on the rotation axis, (c) Vector of the summation on the scale axis.

Proc. of SPIE Vol. 7073 707329-4

(12)

3. RESULTS AND DISCUSSION 3.1 Analysis of the rotated images

In our work, images of different letters of the alphabet were used like input image. The image of letter E was used like target. xlO

45x10

4

35

tt) 2

= .

10

0

60

100

60

200

250

3

On

0

300

00

50

150

200

250

300

X AXIS (FREQUENCY)

V AXIS (FREQUENCY)

(b)

(b) Figure 3. Scale signature vector (a), rotation vector (b)

The behavior of the Euclidean distance (dE) for the target E versus itself with respect to rotation from 0 to 359 degrees was studied. Because the letter has distortions, when is rotated, due to the pixels, the curve representing the dE is cyclic and with symmetry of 180o (Fig. 4). In order to see if this methodology has a good performance to recognize the target when is compared with the other letters, a numerical calculation was realized. Figure 5 shows this result. We can see a very good separation of the dE for the target with respect to the calculation of the dE for the other letters. When an image is similar to another one, the dE has a minimum value. Statistic was realized and the mean value ± 2SE (two standard error) were calculated, we can see that, this algorithm has a 100% level of confidence for this case (Fig. 6). 7

1.0x10

6

Euclidean Distance

8.0x10

E

6

6.0x10

6

4.0x10

6

2.0x10

0.0 -50

0

50

100

150

200

250

300

350

400

angle of rotation

Figure 4. Behavior of the Euclidean distance where the target and the input scene is the letter E.

Proc. of SPIE Vol. 7073 707329-5

3.2 Analysis of the scaled images

To analyze the behavior of changes on scale, tests were realized doing changes on scale from 70% to 130% of the image of reference (Fig. 7), compared with all the letters of the alphabet. In figure 8 we can see a magnification of the zone from 90% to 110 % where we observed that the results have a better visualization. Target E 7

6.0x10

184 186 96 178 276 358 173 274 175 355 13456710 181 183 185 353 94 172 352 356 176 275 9 99 189 265 180 95 83 85 263 101 191 279 281 268 273 174 354 82 88 93 262 179 266 270 271 359 86 90 91 357 97 100 177 187 190 264 277 280 269 89 84 351 171 267 81 87 261 342 161 162 2 811 182 188 341 170 325 145 12 14 16 192 194 196 278 322 323 350 142 143 13 15 193 195 102 282 272 9298 55 235 160 169 326 340 305 307 311 150 146 115 125 126 127 131 19 104 199 284 22 25 202 205 80 52 53 232 233 260 336 345 348 349 156 165 168 306 308 330 331 292 295 151 112 128 129 17 18 103 106 197 198 283 286 71 72 251 252 60 240 337 339 343 344 347 157 159 163 164 167 309 312 314 332 296 299 152 116 132 134 105 109 119 289 26 206 61 62 29 41 56 209 221 236 241 242 333 334 302 153 154 147 130 122 285 21 66 70 201 246 250 63 75 243 255 258 259 78 79 36 38 212 216 218 155 166 324 327 328 335 346 301 303 310 313 291 148 144 111 113 117 121 123 133 20 107 108 200 287 288 23 24 27 64 73 74 203 204 207 244 253 254 77 31 32 35 37 39 42 44 51 211 215 217 219 222 224 231 320 321 329 297 298 149 140 141 110 114 118 290 293 294 67 247 28 57 58 208 257 50 54 230 234 237 238 338 158 304 124 69 249 59 76 256 33 40 43 213 220 223 239 65 245 30 34 47 210 214 227 315 317 319 300 135 137 139 120 68 248 49 229 45 46 225 226 316 318 136 138 48 228

7

5.5x10

jljjop bdef cp fz gb ya cq cr ckcn gw gc gd mu gv gy ms ct cf ci gu gx gf cs cv jq jrjt jz ce ch jc jdjfjgji jn gt dk gh ki mo mp mr mt kg kh kj kk cj di dj dm dl gz gg fq ffs rffu fv tfw ac ghijknopqrstuvwxazab cu db cg cm dg dh gs fy ke kf ag mq dd gn he gage gigm ak ac af cw dc df dn go gp gr ha hi kd klko alao ap ar dw dx hn hp mn cd de jbjejhjk jm jsju kc hj ku kvky hd aq asav aj ad ae cz da dv jx jka ykb kq hm hqht bk do ds gq hb hc hh mi ed ee kt lb lclflg km ai by cc ja kp hw ay bi bj dr es ew gl li llnlolq hg l ah du jv hu hv ilin mm ml aw ax bg bn bp dq eu ev ks tlw ek o fx gj am cb cy ea eb ec jw hx hy az ba bm br bu clco cx dp ip kn er et hk ez ffc bfd plrlsllu me mh jflffnfp ei ej kz la lx hf m dy kr bl bw dt fihiiijik bd bh ey gk mf mg fh fffk ifm eh kw el an bs bt bv bx iqiirsiiu tiiw vix hlho ia ib iid cieiig im bc be bf bo ex fa md mj mk fe ffg ef llmb yzmc em ep eq at bz ca iiz y jj hr hs hz au bb bq io lm eg ld ma eo dz lelhljlk kx ll lv en

7

5.0x10

7

4.5x10

Euclidean distance

7

4.0x10

MS MU FU MR E FT GC FW CQ CI JO MT B CK JG FV JIJL CN FZ C CH CR GD GA MQ JF JP FS EE JS LC DF G CS CU CJ CO GE MN JH JM JQ FP GB FX MO MP EC ED LA LB LF HI CT CV JR AS GG FQ FR EB EH JT KZ LH AV AX CP HQ EJ EA KY CG AQ AR AY CL HT HV JN GH EK L ILN AW HO JE JJ AP EI MM LG LQ CF HN HP AO HU HW FO ES EV DP KX LE LL LR LT LU LV LY A JK CD HM JD KN FYGF EL EN EO ET EW EX FA DZ EG DQ LJ LK LM CE JB KO BA BB BC BG BH HX AD EM EP EQ FB DS DW KQ DJ DL KU LO LX LZ MA MG MJ JC BJ CM CW HS HZ IA IE FC FI FL EZ MK ML DT DM DN KR LP LS MD HB JK JU KH KJ KK KL BD BE BK BL HY IF IH AE FF FM FN EU EY DU CX DH DK DO DR KS LW MB ME HC KF KI KM KP ANAUAZ BF IB IIC IIIJIIM AG GI FD FG DV DY DB DC DE DF DI KT KW MF MH MI HE HL JV KD KG BI BO D FH FJ FK JZ KA KC CY DD DG MC L AB AC AF AH AK FE KB CC GZ HD HF JW KE X AA Z BN BP GJ HA HI BM BT L DX EF ER CZ DA KV LD BW BZ GV GX GY M P V Y AM BQ IIN AI GK GN CA CB HKHR IGIK IR IX JX JY Q ST W AT BR AJ GO GT GU GW IIJA YZ R BU HG IO GQ GR BV BX BY IS HH N O U BS GM GP GS IP Q IIU TIV IW AL GL HJ

7

3.5x10

7

3.0x10

7

2.5x10

1

7

2.0x10

A a

7

1.5x10

7

1.0x10

6

5.0x10

0.0 -50

0

50

100

150

200

250

300

350

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

400

Angle of rotation Figure 5.Euclidean distance of the letter E compared with the letters. Target E 7E7

6E7

Euclidean distance

5E7

4E7

3E7

2E7

1E7

0 A B C D E F G H I J K L M N O P Q R S T U VWX Y Z Letters

Figure 6. Statistical distance behavior.

Proc. of SPIE Vol. 7073 707329-6

Mean ±SE ±2*SE

Figure 9 shows a box plotted of the mean value ± 2SE of the dE of the data shows in Fig. 7 and we can see that this algorithm has a very good performance in to recognize the E target when is compared with the others letters of the alphabet. 3.3 Analysis of the confidence levels

In 3.1 and 3.2 the particular case of letter E, used as reference was analyzed, but a complete analysis was realized to each letter of the alphabet, and the confidence level from each one was obtained (Tables 1 and 2). There is no symmetry in tables, so if we take the letter E like a target, we have a 100% confidence level. With letter H exists a confidence level less than 80% with letters E and F. Similar results can be found with targets L and T in three cases.

Euclidean distance

Target E 85 80 75 70 65 60 55 50 45 40 35 30 25 20 15 10 5 0

t s v u

r

q

p

o

16 15 w z y x 1918 17 ab aa 21 20 af ae ad ac 23 22 ai ahag 27 2625 24 28 32 31 3029 aj 34 33 ak 36 al 37 35 anam 3938

n

14

m

l

1312

i

k j

9

h 8

g

f e

d c

4 6 5

b a

3 2 1

7

1110

ap ao 41 as ar aq 43 42 40 au at av 4645 44 axaw ay 50 6160 5958 5756 bc 49 4847 55 az 54 51 53 bb ba 52 bi bhbg bf bebd

BIBH BGBFBE ML K ON BDBC BBBA RQP AZAY AXAW T S Z Y AV AT VU XW AU ASARAQ AKAJ AIAHAGAFAEADACABAA APAOANAMAL

70

80

90

100

110

J I

120

H

D GF E

C B A

1 A a

130

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Scale Figure 7. Changes of distance vs. scale for all letters

Table 3 shows the confidence level when the variation in scale is analyzed. Three different ranges in scale are presented. The worst case scenario is the range of 70%-130% in scale.

4. CONCLUSIONS This paper describes the development of a system for pattern recognition with the utilization of vectorial signatures, based on the well know relation between scale and Fourier transform, and has been developed to be practical and accurate. Realizing diverse numerical tests for 26 letters of the alphabet, with variations in the scale and rotation and using the Euclidean distance, we found that it is possible to identify them with a high level of confidence

Proc. of SPIE Vol. 7073 707329-7

Euclidean distance

Target E 2.6x10

7

2.4x10

7

2.2x10

7

2.0x10

7

1.8x10

7

1.6x10

7

1.4x10

7

1.2x10

7

1.0x10

7

8.0x10

6

6.0x10

6

4.0x10

6

2.0x10

6

E F H L P T

0.0 88

90

92

94

96

98

100 102 104 106 108 110

112

Scale Figure 8. Changes of distance vs. scale.

Target E 7E7

6E7

Euclidean distance

5E7

4E7

3E7

2E7

1E7

0 A B C D E F G H I J K L M N O P Q R S T U VWX Y Z Scale

Figure 9. Statistical distance behavior.

Proc. of SPIE Vol. 7073 707329-8

Mean ±SE ±2*SE

Table 1. Confidence level.

INPUT SCENE

TARGET A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 85.56 100 100 100 100

B 100 100 100 77.78 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

C 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

D 100 81.11 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

E 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

F 100 100 100 100 100 100 100 100 100 100 100 97.78 100 100 100 100 100 100 100 50 100 100 100 100 100 100

G 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

H 100 100 100 100 57.78 78.89 100 100 100 100 100 86.67 100 100 100 100 100 100 100 85.56 100 100 100 100 100 100

I 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

J 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

K 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

L 100 100 100 100 100 81.67 100 100 100 100 100 100 100 100 100 100 100 100 100 65.83 100 100 100 100 100 100

M 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

Y 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

Z 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

Table 2. Confidence level.

INPUT SCENE

TARGET A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

N 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 80.56 100 100 100 100

O 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

P 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

Q 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

R 100 100 100 89.17 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

S 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

T 100 100 100 100 100 49.72 100 100 100 100 100 33.33 100 100 100 100 100 100 100 100 100 100 100 100 100 100

U 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 95.56 100 100 100 100 100 100 100 100 100 100

Proc. of SPIE Vol. 7073 707329-9

V 86.94 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

W 100 100 100 100 100 100 100 100 100 100 100 100 100 93.33 100 100 100 100 100 100 100 100 100 100 100 100

X 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

Table 3. Confidence level for changes in scale of letter E

A B C D E F G H I J K L M

Scale 90-110%

Scale 80-120%

Scale 70-130%

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

70.73

39.34

100

100

100

90.48

63.41

42.62

100

100

100

100

100

100

100

100

100

100

90.24

57.38

100

100

100

Scale 90-110%

Scale 80-120%

Scale 70-130%

100

100

100

100

100

100

100

100

90.16

100

100

100

100

100

100

N O P Q R S T U V W X Y Z

100

100

100

100

82.93

47.54

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

100

REFERENCES [1] [2] [3] [4] [5] [6] [7] [8]

Pratt, W. K., [Digital image processing], chapters 6-19, John Wiley & Sons, Inc., Hoboken, NJ, (2007) Gonzalez, R.. and Woods, R., [Digital Image Processing], chapters 9-12, Pearson Prentice Hall, Upper Saddle, NJ, (2008) Gonzalez, Rafael C., Woods, Richard E., Eddins, Steven L., [Digital Image Processing Using MATLAB®], Pearson Prentice Hall, chapters 9-12, Upper Saddle, NJ, (2004) Cheriet, M., Kharma, N., Liu, Ch. and Suen, Ch., [Character recognition systems : a guide for students and practioners], chapters 6-19, John Wiley & Sons, Inc., Hoboken, NJ, (2007) Obinata, G. and Morris, G., [ Vision systems, segmentation and pattern recognition] , I-Tech Education and Publishing (2007) Cohen, L., “The scale representation”, IEEE Trans. on signal processing 41, 3275-3291 (1993) Casasent, D. and Psaltis, D., “Position, rotation, and scale invariant optical correlation”, Appl. Opt. 15, 1795-1799 (1976) Pech-Pacheco, J., Álvarez-Borrego, J., Cristóbal, G. and J., Keil, M., “Automatic object identification irrespective of geometric changes”, Optical Engineering 42, 551- 559 (2003)

Proc. of SPIE Vol. 7073 707329-10