Trademark Retrieval Based on Block Feature Index Code

Day-Fann Shen, Li Jin, Hsuan T. Chang and Hsien Huang P. Wu
National Yunlin University of Science and Technology, Department of Electrical Engineering, Douliou City, Taiwan 640
E-mail: [email protected]

ABSTRACT

The current trademark classification procedure adopts the Vienna approach developed by the World Intellectual Property Organization (WIPO). It has the following drawbacks:
1. A tremendous amount of human effort is required to classify and annotate each trademark submitted for registration, and the time required to complete a registration increases as the number of registered trademarks grows.
2. The same trademark may be classified into different classes by two human classifiers.
3. Trademarks with abstract content are difficult for a human classifier to classify.
For these reasons, it is important to improve the existing trademark classification system by replacing the human classifier with effective and precise computer algorithms. Our review of previous research on trademark indexing and retrieval shows that shape information, together with the profile and contents of a trademark, plays an important role in determining trademark similarity. In this paper, we develop an effective feature for trademark images based on the Block Feature Index (BFI), a concept borrowed from vector quantization. For performance evaluation, 3000 trademark images from MPEG-7 were classified into 10 categories for experiments. We compare the recall rate and precision of the proposed BFIC algorithm with those of two well-known methods, Hu’s 7 Moments (H7M) and Zernike Moments (ZKM); the proposed BFIC method outperforms both. The performance can be improved further by cascading the proposed BFIC and ZKM.

Keywords: Trademark database, image retrieval, Block Feature Index Code (BFIC).

1. INTRODUCTION

A trademark is mainly an image, but may also consist of text, images, and even media such as sounds or odors [1]. It represents the reputation, products, or services of an enterprise or company; a trademark is therefore an important piece of intellectual property that should be registered and properly protected against unlawful misuse.
0-7803-9134-9/05/$20.00 ©2005 IEEE

Most practical trademark retrieval systems [13-16] coexist with the current Vienna-based classification methods adopted by trademark administration agencies. The Vienna system was developed by WIPO (World Intellectual Property Organization) for trademark registration and protection. It is designed to ensure that similar trademark patterns are classified into the same class. However, Vienna-based methods require tremendous human effort and suffer from three major problems [1]:
1. A tremendous amount of human effort is required to classify and annotate each trademark submitted for registration, and the time required to complete a registration increases as the number of registered trademarks grows.
2. The same trademark may be classified into different classes by two different human classifiers.
3. Trademarks with abstract content are difficult for a human classifier to classify.
In addition, the current trademark registration procedure is quite complicated and very time-consuming. It is highly desirable to develop a computer-aided trademark indexing and retrieval system to help resolve the above problems.

CBIR (Content Based Image Retrieval) techniques have been widely applied in image database retrieval [3-8] and are very promising for trademark retrieval [1,6,10,11], for the following reasons:
1. Trademark images are large in quantity and complexity, which makes them a typical application area for image indexing and retrieval techniques.
2. Errors due to human factors are avoided. Abstract trademarks in particular are difficult for humans to classify but relatively easy to classify by image features.
3. Computer algorithms are more efficient and can save significant human effort and cost.

In this paper, we propose an image feature designed for trademark images called the Block Feature Index Code (BFIC), an idea borrowed from vector quantization in image compression [2]. 3000 trademark images in MPEG-7 format are classified into 10 categories for experiments and performance evaluation.
We compare the recall rate and precision of the proposed BFIC-based algorithm with those of two well-known algorithms, Hu’s 7 Moments (H7M) [8] and the modified Zernike Moments (ZKM) [9]; the proposed BFIC-based algorithm outperforms both. The performance can be improved further by combining the proposed BFIC and ZKM.
This paper is organized as follows. Section 2 discusses the characteristics of trademark images and reviews work on shape feature extraction. Section 3 proposes the BFIC algorithm. Section 4 presents experimental results, performance evaluation, and comparisons with other well-known algorithms. Section 5 concludes the paper with a discussion of how the proposed BFIC-based algorithm can be used in a practical trademark registration and retrieval system.
2. REVIEWS

2.1 Features of Trademark Images

The image feature is the key element of any CBIR-based technique. Among image features, shape is the most important; for example, a trademark whose shape is similar to that of a registered one is still considered an infringement even if its color and texture differ. Therefore, black-and-white binary images provide the maximum trademark protection. For this reason, the 3000 trademarks provided in MPEG-7 are all binary images.

2.2 Review of Shape Features

Shape features such as lines, boundaries, aspect ratio, and circularity are commonly used in CBIR. MPEG-7 has two kinds of 2D shape descriptors [4]: area-based and contour-based. Contour-based descriptors include the Fourier descriptor [5], the boundary direction histogram [6], and the convex factor [7]. Area-based descriptors include the invariant moments from Hu’s 1962 work [8], which are invariant under translation, rotation, and scaling but insensitive to local changes. Zernike moments (ZKM) [6][9-10] are another popular area-based descriptor; they are robust to noise and invariant to translation, rotation, and scaling. This descriptor works well for a wide range of images, particularly geometrically symmetric shapes. However, it does not perform well for highly irregular shapes, and it requires extensive computation. Leung [11] proposed a skeleton-based descriptor and claimed better performance than Hu’s 7 moments [8] and the boundary direction histogram [6].

3. TRADEMARK RETRIEVAL BASED ON BLOCK FEATURE INDEX

Both contents and contours should be considered when classifying trademarks. Two trademark images are considered similar if either their contents or their contours are similar. In this paper, our database consists of the 3000 binary trademark images provided in the MPEG-7 S8 database [12]; we classify these images into 10 categories for evaluation. Some researchers also include shifted, scaled, and rotated versions to enlarge the database; this approach may bias the performance evaluation and is therefore not recommended.

Trademark images are characterized by (1) binary format and (2) small size (110*110 as provided in MPEG-7 format). Based on these characteristics, we propose a feature for trademark images called the Block Feature Index Code (BFIC).

3.1 The Block Feature Index Code (BFIC) and the Codebook

Each N*N trademark image is divided into small blocks (or vectors) of dimension n*n, giving a total of $(N/n)^2$ blocks (or vectors) per image. For each n*n block, we find the most similar code vector in the Block Feature (BF) codebook and take the index of that code vector as the Block Feature Index (BFI). The BFIC of the
trademark is formed by the corresponding $(N/n)^2$ BFIs. We randomly select 2/3 of the trademark images from each of the 10 categories as the training set and derive training vectors of dimension n*n from them. The remaining 1000 trademark images are used as the test set. The BFIC is extracted for all images in the current trademark database for later query and retrieval. The initial Block Feature Index Codebook (BFIC codebook) of size (length) L is generated using the fast PNN [14,15]. The final codebook is obtained by applying the well-known LBG algorithm [2] (the generalized Lloyd algorithm) of vector quantization.

3.2 BFIC Based Trademark Query and Retrieval

For a query trademark image X, we extract its BFIC and compute its distance (degree of dissimilarity) to each image in the trademark database. Let $f_Q(i)$ and $f_D(i)$ be the $i$-th BFI of the query image and of an image in the database, respectively, where $i = 1, 2, \ldots, (N/n)^2$. The distance between the two images is defined as

$$\text{Distance} = \sum_{i=1}^{(N/n)^2} \delta\bigl(f_Q(i) - f_D(i)\bigr), \qquad \delta(x) = \begin{cases} 1, & \text{if } x \neq 0 \\ 0, & \text{if } x = 0 \end{cases} \tag{3-1}$$
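As a rough illustration of Sections 3.1 and 3.2, the sketch below indexes each block by its nearest code vector and then evaluates Eq. (3-1). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names and the toy codebook are hypothetical, "most similar" is taken to mean smallest squared Euclidean error (the usual choice in vector quantization, but not stated in the text), and the case where n does not divide N is simply rejected because the text does not say how it is handled.

```python
from typing import List, Sequence

Image = List[List[int]]  # N*N binary image, rows of 0/1 pixels


def split_into_blocks(image: Image, n: int) -> List[List[int]]:
    """Cut an N*N image into (N/n)^2 flattened n*n blocks, row by row."""
    size = len(image)
    if size % n != 0:
        raise ValueError("this sketch assumes that n divides N")
    blocks = []
    for top in range(0, size, n):
        for left in range(0, size, n):
            blocks.append([image[top + r][left + c]
                           for r in range(n) for c in range(n)])
    return blocks


def nearest_code_index(block: Sequence[float],
                       codebook: Sequence[Sequence[float]]) -> int:
    """Index of the code vector with the smallest squared error (assumed metric)."""
    def squared_error(code: Sequence[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(block, code))
    return min(range(len(codebook)), key=lambda k: squared_error(codebook[k]))


def extract_bfic(image: Image, n: int,
                 codebook: Sequence[Sequence[float]]) -> List[int]:
    """BFIC: one Block Feature Index per block of the image."""
    return [nearest_code_index(b, codebook) for b in split_into_blocks(image, n)]


def bfic_distance(f_q: Sequence[int], f_d: Sequence[int]) -> int:
    """Eq. (3-1): number of block positions where the two indices differ."""
    if len(f_q) != len(f_d):
        raise ValueError("both index codes must have (N/n)^2 entries")
    return sum(1 for q, d in zip(f_q, f_d) if q - d != 0)


# Hypothetical toy data: 4*4 images, 2*2 blocks, three code vectors.
codebook = [[0, 0, 0, 0], [1, 1, 1, 1], [1, 0, 1, 0]]
query = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [1, 0, 1, 1],
         [1, 0, 1, 1]]
other = [[0, 0, 0, 0],
         [0, 0, 0, 0],
         [1, 0, 1, 1],
         [1, 0, 1, 1]]
query_code = extract_bfic(query, 2, codebook)   # [0, 1, 2, 1]
other_code = extract_bfic(other, 2, codebook)   # [0, 0, 2, 1]
distance = bfic_distance(query_code, other_code)  # 1
```

Because the distance only counts mismatching indices, it is a Hamming-style measure: two blocks that map to different but visually close code vectors contribute as much as two completely different blocks.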
The outputs are the K database images with the smallest distances to the query image, that is, the K most similar images. The whole process is shown in Figure 1.
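The ranking step of the query process could look like the following sketch. It assumes the index codes of all database images have already been extracted and stored, as Section 3.1 describes; the function name `retrieve`, the image names, and the tie-breaking rule (database order) are hypothetical choices made only for illustration.

```python
from typing import Dict, List, Tuple


def bfic_distance(f_q: List[int], f_d: List[int]) -> int:
    """Eq. (3-1): number of block positions whose indices differ."""
    return sum(1 for q, d in zip(f_q, f_d) if q != d)


def retrieve(query_code: List[int],
             database: Dict[str, List[int]],
             k: int) -> List[Tuple[str, int]]:
    """Return the k database entries closest to the query, nearest first."""
    scored = [(name, bfic_distance(query_code, code))
              for name, code in database.items()]
    # list.sort is stable, so entries with equal distance keep database order.
    scored.sort(key=lambda item: item[1])
    return scored[:k]


# Hypothetical pre-extracted index codes for four database images.
database = {
    "mark_a": [0, 1, 2, 1],
    "mark_b": [0, 0, 2, 1],
    "mark_c": [3, 3, 3, 3],
    "mark_d": [0, 1, 2, 2],
}
top2 = retrieve([0, 1, 2, 1], database, k=2)
```

Scanning the whole database is linear in the number of stored index codes, which is affordable here because each comparison touches only $(N/n)^2$ small integers.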
Figure 1. The BFIC-based trademark query process

Since the proposed BFIC-based retrieval algorithm is not robust to rotation, translation, and scaling, we design a preprocessing step to counter such variations, whether they are deliberate or accidental. The preprocessing normalizes the position, size, and rotation of the image.

Assume the trademark image is of size N*N, and let max(X) and min(X) be the maximum and minimum X-axis coordinates of the object, respectively. Similarly, max(Y) and min(Y) are the corresponding Y-axis coordinates.

(A) Position Normalization. Find the center of the object, (X, Y), where X = (max(X)+min(X))/2 and Y = (max(Y)+min(Y))/2, and move the object to the center of the image, (N/2, N/2).

(B) Size Normalization. Enlarge the object to fit the image size N*N.
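Steps (A) and (B) can be sketched together as below. This is only one possible reading: the text does not say whether the aspect ratio is preserved or which resampling is used, so the sketch assumes uniform scaling of the bounding box and nearest-neighbor resampling. The function names are hypothetical, and object pixels are assumed to have value 1.

```python
from typing import List, Tuple

Image = List[List[int]]  # N*N binary image, object pixels are 1


def bounding_box(image: Image) -> Tuple[int, int, int, int]:
    """Return (min_x, max_x, min_y, max_y) of the object pixels."""
    xs = [x for row in image for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(image) if any(row)]
    if not xs:
        raise ValueError("image contains no object pixels")
    return min(xs), max(xs), min(ys), max(ys)


def normalize_position_and_size(image: Image) -> Image:
    """Center the object and enlarge it until its longer side fills the image."""
    size = len(image)
    min_x, max_x, min_y, max_y = bounding_box(image)
    width, height = max_x - min_x + 1, max_y - min_y + 1
    # Assumption: one scale factor for both axes (aspect ratio preserved).
    scale = size / max(width, height)
    out_w = max(1, round(width * scale))
    out_h = max(1, round(height * scale))
    # Offsets that place the scaled bounding box at the image center.
    off_x = (size - out_w) // 2
    off_y = (size - out_h) // 2
    out = [[0] * size for _ in range(size)]
    for y in range(out_h):
        src_y = min_y + min(height - 1, int(y / scale))  # nearest-neighbor row
        for x in range(out_w):
            src_x = min_x + min(width - 1, int(x / scale))
            out[off_y + y][off_x + x] = image[src_y][src_x]
    return out


# Toy 8*8 image: a 2*1 object near the bottom-right corner.
toy = [[0] * 8 for _ in range(8)]
toy[5][6] = toy[5][7] = 1
normalized = normalize_position_and_size(toy)
```

In the toy case the object is scaled by a factor of 4 and ends up as an 8*4 band centered vertically in the image.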
Figure 2. Position and size normalization (panels: Step 1, Step 2)

(C) Rotation Normalization. Find the two eigenvectors of the object using the Hotelling transform [13], then rotate the binary object accordingly, as shown in Figure 3.

Figure 3. Rotation normalization: (a) the binary object; (b) the orthogonal eigenvectors of the object; (c) the object rotated accordingly

3.3 Criteria for Performance Evaluation

Recall rate and precision are the two popular criteria for performance evaluation; they are defined for each query as follows:

$$\text{Recall rate} = \frac{a}{a+d} \tag{1}$$

$$\text{Precision} = \frac{a}{a+b} \tag{2}$$

where
a: number of correctly recalled images;
b: number of incorrectly recalled images;
d: number of correct (similar) but unrecalled images;
(a + b): total number of recalled images;
(a + d): total number of correct images in the database.

The purpose of trademark (image) retrieval is to maximize a and to minimize b and d. Recall rate is more important than precision in trademark registration, because if even one similar trademark image in the database is not recalled in response to a query, the consequences may involve serious legal issues. We therefore adopt recall rate as the criterion for performance evaluation.

3.4 Parameters in BFIC Codebook

Codebook size and codeword dimension are the two important parameters of a BFIC codebook. In this section, we examine the effect of these two parameters on recall rate. In all cases, the number of output images K is set to twice the total number of trademarks in the category to which the query image belongs. For example, if the query image belongs to a category of 100 images, the number of output images is set to 200.

Determining the Codebook Size

The codebook size is the number of codewords in the BFIC codebook. BFIC codebooks of 15 different sizes, i.e., 6, 8, 10, 12, 14, 16, 18, 20, 24, 28, 32, 40, 48, 56, and 64, are created with the dimension n*n set to 16.

Figure 4. Average Recall Rate vs Codebook Size

Figure 4 shows the average recall rate over all categories (Y axis) as a function of codebook size (X axis); a codebook size of 16 yields the best average recall rate.

Determining the Codeword Dimension

The dimension is the number of pixels in an n*n block. The experimental setup is the same as above, with the following dimensions: 1 (1*1), 4 (2*2), 16 (4*4), 64 (8*8), 144 (12*12), and 256 (16*16).
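The recall rate and precision of Section 3.3 can be computed directly from the counts a, b, and d. The sketch below is illustrative only: the function name and the image identifiers are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this example rather than something the text specifies.

```python
from typing import Iterable, Set, Tuple


def recall_and_precision(retrieved: Iterable[str],
                         relevant: Set[str]) -> Tuple[float, float]:
    """Return (recall rate, precision) for one query."""
    retrieved_set = set(retrieved)
    a = len(retrieved_set & relevant)   # correctly recalled images
    b = len(retrieved_set - relevant)   # incorrectly recalled images
    d = len(relevant - retrieved_set)   # correct but unrecalled images
    recall = a / (a + d) if (a + d) else 0.0
    precision = a / (a + b) if (a + b) else 0.0
    return recall, precision


# Hypothetical query: 4 similar images exist, 6 images are returned.
relevant = {"t1", "t2", "t3", "t4"}
retrieved = ["t1", "t9", "t3", "t7", "t2", "t8"]
recall, precision = recall_and_precision(retrieved, relevant)
# a = 3, b = 3, d = 1, so recall = 3/4 and precision = 3/6
```

Returning more images can only keep recall the same or raise it, while precision usually drops, which is why the experiments fix the output size relative to the category size.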
Figure 5. Average Recall Rate vs Codebook Dimension

Figure 5 shows that a dimension of 16 yields the best average recall rate over all categories. Based on the above experiments, we choose a codebook size of 16 and a codeword dimension of 16 (4*4).

4. PERFORMANCE EVALUATION AND COMPARISON

We compare the proposed BFIC-based algorithm with two well-known algorithms: Hu’s 7 moments (HU) [8] and Zernike Moments (ZKM) [9].
The experiments are set up as follows: (1) Query images: Table 4 shows the 10 query images, one for each of the 10 categories. (2) Table 5 lists the number of classified images in each of the 10 categories (human effort is required to classify each image). (3) The total number of output images is set to 4 times the number of images in the query's category. For example, there are 25 images in category 10; thus, for query image 10 (in Table 4), the output consists of the 100 (4x25) most similar images, i.e., those with the smallest distances.
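The evaluation protocol above, one query per category with the output list cut at a multiple of the category size, could be scripted roughly as follows. This is a sketch under assumptions: the function name, the data layout, and the toy identifiers are hypothetical, and the text does not say whether the query image itself counts as a category member, so the sketch simply uses whatever membership set it is given.

```python
from typing import Dict, List, Set


def average_recall(rankings: Dict[str, List[str]],
                   categories: Dict[str, Set[str]],
                   multiplier: int = 4) -> float:
    """Average recall rate over categories.

    rankings[c] is the database ranked by distance to the query of category c
    (nearest first); categories[c] is the set of images that belong to c.
    """
    rates = []
    for category, ranked in rankings.items():
        members = categories[category]
        k = multiplier * len(members)          # output size for this query
        recalled = set(ranked[:k]) & members   # correctly recalled images
        rates.append(len(recalled) / len(members))
    return sum(rates) / len(rates)


# Hypothetical toy data with two categories.
categories = {"A": {"a1", "a2"}, "B": {"b1"}}
rankings = {
    # k = 8 for category A: "a2" sits at rank 9 and is therefore missed.
    "A": ["a1", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "a2"],
    # k = 4 for category B: "b1" sits at rank 2 and is recalled.
    "B": ["x1", "b1", "x2", "x3", "x4"],
}
avg = average_recall(rankings, categories)
```

Setting `multiplier=2` reproduces the cut-off used in the parameter experiments of Section 3.4.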
Table 4. The query images for each of the 10 categories (rows: Category 1 to 10, Query Image)

Table 5. Number of images in each of the 10 categories

Category   1    2    3    4    5     6     7    8    9    10
Number     41   19   15   11   116   100   87   21   77   25

Table 6. Comparisons of Recall Rate (%) for Each Trademark Category (K = 4 times the category size)

Category           BFIC     ZKM      HU
1                  67.4%    27.9%    14.0%
2                  55.0%    40.0%    5.0%
3                  62.5%    31.3%    6.3%
4                  54.6%    45.5%    45.5%
5                  81.0%    46.6%    8.6%
6                  99.0%    58.0%    52.0%
7                  84.3%    37.1%    18.0%
8                  59.1%    22.7%    9.1%
9                  80.0%    41.3%    21.3%
10                 50.0%    84.6%    11.5%
Avg. Recall Rate   69.3%    43.5%    19.1%

Based on the experimental results in Table 6, the proposed BFIC-based algorithm (average recall rate of 69.3%) outperforms ZKM [9] (43.5%) and HU [8] (19.1%) by a significant margin on average, although ZKM achieves the higher recall rate for category 10 (84.6% vs 50.0%). One point needs clarification: [9] claims a recall rate of 56.13%, rather than the 43.5% in Table 6. The main reason is that the database in [9] includes shifted, rotated, scaled, noisy, and smoothed versions of the original images; as a result, the recall rate is boosted by the biased database.

5. CONCLUSION

We propose a trademark database retrieval algorithm based on a binary image feature called BFIC (Block Feature Index Code), a concept borrowed from vector quantization (VQ) coding. Experiments are conducted to find the best codebook size and codeword dimension for the 3000 trademark images from the MPEG-7 database. We then compare the performance of the proposed BFIC feature with that of two well-known features, Zernike moments (ZKM) and Hu’s 7 invariant moments (HU); the experiments show that the proposed BFIC feature outperforms the other two by a large margin (69.3% vs 43.5% and 19.1%). Several features can be combined for further performance improvements. The goal should be a 100% recall rate with the fewest output images.

References

[1] P. Eakins, J.M. Boardman, and M.E. Graham, “Similarity retrieval of trademark images”, IEEE Multimedia, 5:53-63, April-June 1998.
[2] Y. Linde, A. Buzo, and R. Gray, “An algorithm for vector quantizer design”, IEEE Trans. on Communications, vol. 28, no. 1, pp. 84-95, Jan. 1980.
[3] V.N. Gudivada and V.V. Raghavan, eds., “Content-based Image Retrieval Systems”, Computer, vol. 28, no. 9, pp. 18-22, Sept. 1995.
[4] M. Bober, “MPEG-7 visual shape descriptors”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 6, pp. 716-719, June 2001.
[5] C.S. Lin and C.L. Hwang, “New forms of shape invariants from elliptic Fourier descriptors”, Pattern Recognition, vol. 20, no. 5, pp. 535-545, 1987.
[6] J. L. Shih and L. H. Chen, “A new system for trademark segmentation and retrieval”, Image and Vision Computing, vol. 19, pp. 1011-1018, 2001.
[7] Pisit Phokharatkul, Sombut Foitong, and Chom Kimpan, “Object recognition using characteristic component and genetic algorithms”, Proceedings of IEEE Region International Conference on Electrical and Electronic Technology, 2001, TENCON, vol. 1, pp. 345-349.
[8] M. K. Hu, “Visual Pattern Recognition by Moment Invariants”, IRE Trans. Info. Theory, vol. IT-8, pp. 179-187, 1962.
[9] Hae-Kwang Kim, “A modified Zernike moment shape descriptor invariant to translation, rotation and scale for similarity-based image retrieval”, IEEE International Conference on, Volume: 1, 30 July-2 Aug, 2000.
[10] Y. S. Kim and W. Y. Kim, “Content-based trademark retrieval system using visually salient features”, Image and Vision Computing, vol. 16, pp. 931-939, 1998.
[11] Wing Ho Leung and Tsuhan Chen, “Trademark Retrieval Using Contour-Skeleton Stroke Classification”, IEEE Intl. Conf. on Multimedia and Expo (ICME 2002), Lausanne, Switzerland, August 2002.
[12] J.M. Martinez, MPEG-7 Overview (Version 8), ISO/IEC JTC1/SC29/WG11 N4980, International Organization for Standardization, July 2002.
[13] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, 1st edition, Addison-Wesley, 1993.
[14] D.F. Shen and Kuo-Shu Chang, “Fast full search algorithm for VQ initial codebook design”, SPIE Symposium on Visual Communication and Image Processing, San Jose, CA, January 1998, vol. 3309, part II, pp. 842-850.
[15] Pasi Franti, Timo Kaukoranta, Day-Fann Shen and Kuo-Shu Chang, “Fast and Memory Efficient Implementation of the Exact PNN”, IEEE Transactions on Image Processing, vol. 9, no. 5, pp. 773-777, May 2000.