A Bag-of-Paths based Serialized Subgraph Matching for Symbol Spotting in Line Drawings

Anjan Dutta∗, Josep Lladós∗, and Umapada Pal+

∗ Computer Vision Centre, Edifici O, Campus UAB, 08193 Bellatera, Barcelona, Spain
+ CVPR Unit, Indian Statistical Institute, 203, B.T.Road, Kolkata-700108, India
{adutta,josep}@cvc.uab.es [email protected]

Abstract. In this paper we propose an error-tolerant subgraph matching algorithm based on bag-of-paths for solving the problem of symbol spotting in line drawings. Bag-of-paths is a factorized representation of graphs in which the factorization is done by considering all the acyclic paths between each pair of connected nodes. Similar paths within the whole collection of documents are clustered and organized in a lookup table for efficient indexing. Each entry of the lookup table contains the index key of a cluster and the corresponding list of locations. The mean path of each cluster serves as the index key of its table entry. Spotting is then formulated as a spatial voting scheme over the locations of the paths retrieved by searching for paths similar to those that compose the query symbol. Efficient indexing of common substructures helps to reduce the computational burden of usual graph based methods. The proposed method can also be seen as a way to serialize graphs, which reduces the complexity of subgraph isomorphism. We have encoded the paths in terms of both attributed strings and turning functions, and we present comparative results between them within the symbol spotting framework. Experiments on matching different shape silhouettes are also reported, and the method is shown to work in noisy environments as well.

Keywords: Symbol spotting, Serialization of graphs, Graph matching, Bag-of-paths, Attributed strings, Turning function, Graphical indexing, Mean paths.

1 Introduction

Information spotting is a major branch of indexing and retrieval methods. In document analysis, the research community has mainly focused on word spotting for textual documents and symbol spotting for graphical documents. Symbol spotting has attracted growing interest within the Graphics Recognition community, a subfield of Document Image Analysis and Recognition (DIAR). It can be defined as locating a given query symbol in a graphical document image, which is commonly referred to as focused retrieval. The main application of symbol spotting is indexing and retrieval in large databases of graphical documents, e.g. finding a mechanical part in a database of engineering drawings. The desired output for a particular query is a ranked list of retrieved symbols in which the true positives appear at the beginning.

Although symbol spotting is an emerging topic, several efforts have been made within the graphics recognition community to spot symbols in graphical documents [10]. The algorithms proposed by Messmer and Bunke [5] and by Müller and Rigoll [6] are among the first approaches to symbol spotting. Graph based methods [4] are also popular, but they often suffer from high computational complexity. Rusiñol and Lladós split the symbols into several primitives and used graph matching [9] and off-the-shelf shape descriptors [8] to represent them. Recently, Nayef and Breuel [7] proposed a branch and bound algorithm for spotting symbols in documents. The preprocessing is done by a simple morphological operation followed by thinning.

In this paper we propose a symbol spotting technique for line drawings based on attributed subgraph matching. Graphs are very suitable for representing graphical entities, in particular line drawings. Moreover, graph representations that store geometric information can efficiently handle various affine transformations, viz. rotation, translation and scaling. Hence symbol spotting can be solved by inexact subgraph isomorphism techniques, which benefit from the soundness of graph theory. Graphs have also long been adopted by the research community as a robust tool, and as a result many efficient methods and algorithms are available for graph based approaches. On the other hand, (sub)graph isomorphism is a computationally hard problem, so handling a large database of graphical documents using graphs is difficult.
To avoid this computational burden, we propose a method that factorizes the graphs representing the documents and finds the common factorized substructures within the entire collection. Finding the common substructures for the whole document collection before performing the graph matching helps to reduce the overall computational complexity. The proposed method can also be seen as a way to serialize graphs, i.e. to represent them by one-dimensional structures. This further reduces the complexity of the subgraph isomorphism process used to spot symbols in documents.

The rest of the paper is organized into three sections. Section 2 presents the detailed methodology of our algorithm with a brief description of the spotting architecture. Section 3 presents the experimental results of the proposed method. Finally, Section 4 concludes the paper and outlines future research lines to extend the present work.

2 Proposed methodology

In this work we propose an efficient error-tolerant subgraph matching algorithm based on the idea of bag-of-paths. The bag-of-paths of a particular graph is informally defined as the set of all acyclic paths between each pair of connected nodes of that graph. The construction of the bag-of-paths is based on the idea of graph factorization. In our case we factorize the graphs constructed from the documents in the database, as well as the graph of the query symbol. Similar paths across the whole document database are clustered into a lookup table so that the mean path of each cluster can serve as the index key of that cluster. The basic idea is to find the best matching cluster for each path in the model bag-of-paths and to apply a spatial voting scheme to the terminal points of the paths, so that the symbol is detected in the whole database simultaneously.

[Figure 1: block diagram. Offline step: New Document, Path Computation, Store (paths of the existing documents), Path Clustering, Construction of LUT. Online step: Query Symbol, Path Computation, Path Comparison, Voting, Spatial clustering of Points, Symbol Spotting, Ranked results.]

Fig. 1: Outline of the spotting architecture.

Our entire framework can be divided into two parts, viz. offline and online (see Fig. 1). The offline part includes the computation of all the acyclic paths, the clustering of those paths within the whole collection, the construction of the lookup table, and the computation of the mean path that acts as the index key of each table entry. Each time a new document is added to the database, the entire offline procedure is repeated to create the updated lookup table. For each document in the database, all the computed paths are stored to reduce subsequent path computation time. The online part includes the querying of a graphic symbol by an end user, the computation of all the acyclic paths of that symbol, a voting scheme based on the similarity of the paths that compose the query symbol, and a spatial clustering technique that detects the query symbol in the image database.

2.1 Bag-of-paths

Our bag-of-paths approach is motivated by an analogy with learning methods that use the bag-of-words representation for text categorization. The bag-of-paths of a particular graph G is defined as the set P of all acyclic paths between any two connected nodes of that graph. For instance, Fig. 2(a) shows a sample symbol in which all the acyclic paths from point A to point C are A − B − C, A − D − C and A − C; these are shown in Fig. 2(b), and all the acyclic paths of the symbol are shown in Fig. 2(c). So, for a graph corresponding to a vectorized document or model symbol, which can be thought of as a union of several symbols, we can find a set of paths, which is referred to as its bag-of-paths.

[Figure 2: four panels. (a) Example symbol with nodes A, B, C and D. (b) All acyclic paths between A and C. (c) Acyclic paths of the symbol grouped by node pair: A-B, A-C, A-D, B-C, B-D, C-D. (d) Input graphs with the bag of input paths, and the model graph with the bag of model paths.]

Fig. 2: Bag of paths representation. (a) An example symbol. (b) All the acyclic paths between the points A and C of the symbol. (c) All the acyclic paths of the symbol. (d) Bag-of-paths representation for the document database and query symbol.
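The bag-of-paths definition can be illustrated with a short sketch. The Python snippet below is a minimal illustration and not the authors' implementation. It assumes that the example symbol is a graph with edges A-B, B-C, C-D, D-A and A-C (an assumption chosen so that the paths between A and C match the three listed in the text), and the names `acyclic_paths`, `bag_of_paths` and `symbol` are hypothetical.

```python
from itertools import combinations


def acyclic_paths(adj, src, dst):
    """Return every acyclic (simple) path from src to dst as a tuple of nodes."""
    paths = []
    stack = [(src, (src,))]
    while stack:
        node, path = stack.pop()
        if node == dst:
            paths.append(path)
            continue
        for nxt in adj[node]:
            if nxt not in path:  # never revisit a node, so no cycles
                stack.append((nxt, path + (nxt,)))
    return paths


def bag_of_paths(adj):
    """Collect the acyclic paths between every unordered pair of nodes."""
    bag = []
    for u, v in combinations(sorted(adj), 2):
        bag.extend(acyclic_paths(adj, u, v))
    return bag


# Assumed adjacency list for the example symbol (square A-B-C-D plus diagonal A-C).
symbol = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["A", "C"],
}

paths_a_c = acyclic_paths(symbol, "A", "C")
bag = bag_of_paths(symbol)
for p in paths_a_c:
    print(" - ".join(p))
```

The number of acyclic paths can grow very quickly with the number of edges, which is one reason an indexing step over the paths is useful.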

In this work the paths are encoded in two different ways, viz. (1) attributed strings [11] and (2) turning functions [1]. The string edit distance [11] and an Lp distance based metric [1] are used, respectively, to measure the similarity between encoded paths. Finally, the performance of these two metrics is compared within a symbol spotting framework.

2.2 Construction of the LUT

The lookup table (LUT) is constructed by clustering the similar paths within the entire collection of documents. The clustering is intended to separate structurally dissimilar paths into different clusters and to gather similar paths into the same cluster. Each LUT entry consists of two items: a representative path of the cluster, which acts as the indexing key, and the list of locations at which the paths of the cluster occur in the document database. In our case the representative path is the mean path, which is efficiently computed from the mean turning function of the respective paths as detailed in [3]. The clustering of the paths is done in two steps: (1) all the paths in a single document are clustered and each cluster is represented by its mean path, and (2) all the mean paths in the document database are clustered and the final lookup table is constructed. In both steps we use hierarchical clustering on the proximity matrix of all the candidate paths, with the string edit distance [3] as the distance measure.
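The structure of the lookup table can be sketched as follows. This is a simplified illustration under several assumptions, not the method of [3]: paths are encoded as plain strings over a hypothetical alphabet of direction symbols, the plain Levenshtein distance replaces attributed string matching, single-linkage clustering cut at a fixed threshold stands in for the hierarchical clustering, and the cluster medoid replaces the mean path as the index key. All function names and the sample data are hypothetical.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def cluster_paths(codes, threshold):
    """Single-linkage clustering cut at `threshold`, implemented with union-find."""
    parent = list(range(len(codes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(codes)):
        for j in range(i + 1, len(codes)):
            if edit_distance(codes[i], codes[j]) <= threshold:
                parent[find(i)] = find(j)
    clusters = {}
    for i in range(len(codes)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def build_lut(paths, threshold=1):
    """paths: list of (encoded_path, location). Returns a list of LUT entries."""
    codes = [code for code, _ in paths]
    lut = []
    for members in cluster_paths(codes, threshold):
        # Medoid: the member with the smallest total distance to its cluster.
        key = min(members, key=lambda i: sum(edit_distance(codes[i], codes[j]) for j in members))
        lut.append({"key": codes[key], "locations": [paths[i][1] for i in members]})
    return lut


def best_entry(lut, query_code):
    """Online step: pick the entry whose key is closest to the query path."""
    return min(lut, key=lambda entry: edit_distance(entry["key"], query_code))


# Hypothetical encoded paths with (document id, path id) locations.
paths = [("RRUU", ("doc1", 0)), ("RRUL", ("doc1", 1)),
         ("DDDD", ("doc2", 0)), ("RRUU", ("doc2", 1))]
lut = build_lut(paths, threshold=1)
match = best_entry(lut, "RRUD")
```

With such a table, a query path is compared only against the index keys rather than against every stored path, which is where the saving in distance computations comes from.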

2.3 Voting scheme

A voting space is defined over the images of the database by dividing them into grids of several sizes (10 × 10, 20 × 20 and 30 × 30). Multiresolution grids are used to detect the symbols accurately within the image, and their sizes are determined experimentally. For a particular model path, we select the best matching cluster (entry) in the lookup table and accumulate votes in the nine grids neighboring (see Fig. 3(c)) each of the two terminal vertices of every path of the cluster. The vote given to a particular grid is inversely proportional to the path distance metric and is weighted by the Euclidean distance from the terminal of the selected path to the center of the respective grid. Fig. 3(e) shows the accumulation of votes as a 3D plot, in which the higher peaks clearly discriminate the occurrences of the query symbol in the document. The grids constituting the higher peaks are filtered by the k-means algorithm applied in the voting space with k=2. Finally, the occurrences of the query symbol in the documents are detected by another hierarchical clustering algorithm, which clusters the spatial points contributed by all the grids considered.

Fig. 3: Voting scheme for a given query and a particular document. (a) The vectorized floorplan; the circles denote the occurrences of the query symbol. (b) The given query symbol. (c) The nine neighboring grids of a particular terminal point. (d) Accumulated votes for a document; the circles denote the higher frequencies. (e) Accumulated votes shown as a 3D plot.
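The vote accumulation over the nine neighboring grids can be sketched as follows. The text does not give the exact weighting formula, so the expression used here (inverse path distance, damped by the distance from the terminal to the cell center) is an assumption, as are the interpretation of the grid size as a cell side length, the function name `cast_votes`, and the sample coordinates. The k-means filtering and the final spatial clustering are not included.

```python
import math
from collections import defaultdict


def cast_votes(votes, terminal, path_distance, cell=10.0, eps=1e-6):
    """Add votes for one path terminal to its own grid cell and the 8 neighbours."""
    x, y = terminal
    col, row = int(x // cell), int(y // cell)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            if r < 0 or c < 0:
                continue  # outside the image on the top/left; other bounds are not checked here
            centre_x, centre_y = (c + 0.5) * cell, (r + 0.5) * cell
            spatial = math.hypot(x - centre_x, y - centre_y)
            # Assumed weighting: larger for similar paths and for nearby cell centres.
            votes[(r, c)] += 1.0 / ((path_distance + eps) * (1.0 + spatial / cell))


# Hypothetical best matching cluster: ((start, end) terminals, distance to the model path).
matched_cluster = [
    (((15.0, 15.0), (45.0, 15.0)), 0.5),
    (((16.0, 14.0), (47.0, 18.0)), 1.0),
]

votes = defaultdict(float)
for (start, end), distance in matched_cluster:
    for terminal in (start, end):
        cast_votes(votes, terminal, distance)

ranked_cells = sorted(votes, key=votes.get, reverse=True)
```

Running the same accumulation with several values of `cell` would give the multiresolution voting space mentioned in the text.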

3 Experimental results

In order to evaluate the proposed spotting methodology, we present two different experiments. The first one focuses only on the bag-of-paths based shape matching algorithm as a distance measure between different shapes. The second experiment is designed to test the symbol spotting method on a document image database of real architectural drawings. This experiment also reports a comparative study between the two path comparison metrics, viz. (i) attributed string and (ii) turning function.

3.1 Shape matching experiments

This experiment tests the efficiency of the bag-of-paths based shape matching algorithm as a shape descriptor. The algorithm is used to measure the distance between two shapes represented by bags of paths. It is expected that the algorithm gives a lower distance for a match than for a mismatch. We used two isolated symbol datasets for this purpose: (i) SESYD Queries (floorplans) [2] and (ii) GREC-POLY [8]. The results of the experiment are shown as confusion matrices (see Fig. 4). From the confusion matrices we can conclude that the method succeeds for most of the model classes, but is confused when symbols share significant structural similarity. This is due to the generation of similar factorized substructures (paths).

Fig. 4: Confusion matrices for the two datasets: (a) SESYD Queries (floorplans), (b) GREC-POLY.

3.2 Symbol spotting experiments

Finally, we have tested our method with a collection of ten floorplans and twelve different symbols as queries. This dataset is a subset of the FPLAN-POLY benchmark [8], which is available in vectorized form; the vectorization was done with the Qgar software (http://www.qgar.org). The floorplans in the database consist of approximately 90,000 paths, and after lookup table construction these paths result in 7,135 entries. The number of string comparison metric computations is thus reduced by a factor of about 12.6 (90,000 / 7,135) compared with sequential access to the paths of the whole collection. The query symbols for the experiment are shown in Table 1. The spotting experiments are performed by encoding the paths in terms of (i) attributed strings and (ii) turning functions. In Table 2 we present a detailed set of measures to evaluate the performance of the algorithm for the two metrics.

Table 1: Query symbols used for our experiments.
Symbol-01 Symbol-02 Symbol-03 Symbol-04 Symbol-05 Symbol-06
Symbol-07 Symbol-08 Symbol-09 Symbol-10 Symbol-11 Symbol-12

Table 2: Values of different measures of our spotting experiments. Time is in secs./doc.

Symbols    | Attributed string                          | Turning function
           | Precision  Recall  F-index    AveP    Time | Precision  Recall  F-index    AveP    Time
Symbol-01  |     57.14  100.00    72.72   70.95   26.20 |     50.00  100.00    66.67   43.33    2.99
Symbol-02  |     50.00  100.00    66.67   76.03   22.04 |     46.67  100.00    63.64   40.31    3.19
Symbol-03  |     57.14  100.00    72.72   47.62   31.21 |     66.67  100.00    80.00   77.08    4.45
Symbol-04  |     72.72  100.00    84.21   89.94   36.09 |     72.73  100.00    84.21   84.09    4.99
Symbol-05  |     57.14  100.00    72.72   74.70   15.36 |     10.81  100.00    19.51   45.97    2.24
Symbol-06  |     57.14  100.00    72.72   66.79   60.18 |     50.00  100.00    66.67   75.00    4.76
Symbol-07  |     80.00  100.00    88.89   95.00  107.93 |    100.00  100.00   100.00  100.00    7.08
Symbol-08  |    100.00  100.00   100.00  100.00   10.41 |    100.00  100.00   100.00  100.00    2.22
Symbol-09  |     23.53  100.00    38.10   66.30    2.97 |     30.77  100.00    47.06   60.19    1.17
Symbol-10  |     80.00  100.00    88.89   80.42    4.85 |     50.00  100.00    66.67   50.00    1.50
Symbol-11  |    100.00  100.00   100.00  100.00    2.46 |    100.00  100.00   100.00  100.00    0.90
Symbol-12  |     50.00  100.00    66.67   75.00    2.08 |     50.00  100.00    66.67   72.92    0.80
Mean       |     65.40  100.00    77.03   78.56   26.81 |     60.64  100.00    71.75   70.74    3.02

We can see that the recall values for all the symbols reach 100%, which shows that the algorithm is able to retrieve all the occurrences of all the symbols in the whole database. However, a considerable number of false positives appear, and this affects the precision. The important point is that we obtain good average precision values for almost all the symbols and for both metrics. This is crucial for any retrieval method because it ensures that the true positives appear at the beginning of the ranked list. The two path encoding techniques are competitive with each other: the attributed string encoding achieves a higher average precision than the turning function, but it is less efficient in terms of computation time.
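The F-index values of Table 2 appear consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The paper does not state the formula, so this reading is an assumption. The short Python check below recomputes the F-index for a few rows of Table 2; the function name `f_index` is hypothetical.

```python
def f_index(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, F-index as reported in Table 2)
reported = {
    "Symbol-01, attributed string": (57.14, 100.00, 72.72),
    "Symbol-09, attributed string": (23.53, 100.00, 38.10),
    "Symbol-05, turning function": (10.81, 100.00, 19.51),
}

for name, (p, r, f_reported) in reported.items():
    print(f"{name}: computed {f_index(p, r):.2f}, reported {f_reported:.2f}")
```

Because recall is 100% for every symbol, differences in the F-index between symbols and between encodings come entirely from precision.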

4 Conclusions and future work

In this paper we have proposed an error-tolerant subgraph matching algorithm based on the idea of bag-of-paths. The bag-of-paths of a particular collection of graphs is the set of all acyclic paths between each pair of connected nodes, which gives a factorized representation of the graphs. Finding the common factorized substructures within the whole collection and then applying the serialized subgraph isomorphism reduces the computational complexity of the usual graph based methods. Although the performance of our method is quite high both for symbol matching and for symbol spotting, the method has an important limitation in dealing with real-world databases. This is due to the clustering technique we have used for clustering the paths. Clustering of structural information is a separate research issue and needs further investigation. Our future research will also focus on other graph serialization methods that reduce the computational complexity of usual graph based methods.

Acknowledgement

This work has been partially supported by the Spanish projects TIN2009-14633-C03-03, TIN2008-04998 and CONSOLIDER-INGENIO 2010 (CSD2007-00018).

References

1. Esther M. Arkin, L. Paul Chew, Daniel P. Huttenlocher, Klara Kedem, and Joseph S. B. Mitchell, An efficiently computable metric for comparing polygonal shapes, IEEE Trans. Pattern Anal. Mach. Intell. 13 (1991), no. 3, 209–216.
2. Mathieu Delalandre, Tony Pridmore, Ernest Valveny, Hervé Locteau, and Eric Trupin, Building synthetic graphical documents for performance evaluation, pp. 288–298, Springer-Verlag, Berlin, Heidelberg, 2008.
3. Anjan Dutta, Symbol spotting in graphical documents by serialized subgraph matching, Master's thesis, Computer Vision Centre, Universitat Autònoma de Barcelona, Edifici O, Campus UAB, 08193 Bellatera, Barcelona, Spain, September 2010.
4. Josep Lladós, E. Martí, and Juan José Villanueva, Symbol recognition by error-tolerant subgraph matching between region adjacency graphs, IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (2001), 1137–1143.
5. Bruno T. Messmer and Horst Bunke, A new algorithm for error-tolerant subgraph isomorphism detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (1998), 493–504.
6. Stefan Müller and Gerhard Rigoll, Engineering drawing database retrieval using statistical pattern spotting techniques, Graphics Recognition Recent Advances (Atul Chhabra and Dov Dori, eds.), Lecture Notes in Computer Science, vol. 1941, Springer Berlin / Heidelberg, 2000, pp. 246–255.
7. Nibal Nayef and Thomas M. Breuel, A branch and bound algorithm for graphical symbol recognition in document images, Proceedings of Ninth IAPR International Workshop on Document Analysis System (DAS'2010), 2010, pp. 543–546.
8. M. Rusiñol, A. Borràs, and J. Lladós, Relational indexing of vectorial primitives for symbol spotting in line-drawing images, Pattern Recognition Letters 31 (2010), no. 3, 188–201.
9. M. Rusiñol, J. Lladós, and G. Sánchez, Symbol spotting in vectorized technical drawings through a lookup table of region strings, Pattern Analysis and Applications 13 (2009), 1–11.
10. Karl Tombre and B. Lamiroy, Pattern recognition methods for querying and browsing technical documentation, 13th Iberoamerican Congress on Pattern Recognition, CIARP 2008, LNCS, vol. 5197, Springer-Verlag, 2008, pp. 504–518.
11. Wen-Hsiang Tsai and Shiaw-Shian Yu, Attributed string matching with merging for shape recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 7 (1985), 453–462.
