ICICS-PCM 2003 15-18 December 2003 Singapore


Learning Semantic Cluster for Image Retrieval Using Association Rule Hypergraph Partitioning

Lijuan Duan(1), Yiqiang Chen(2), Wen Gao(2,3,4)

(1) College of Computer Science, Beijing University of Technology, Beijing 100022, China
(2) Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
(3) Graduate School, Chinese Academy of Sciences, Beijing 100039, China
(4) Department of Computer Science, Harbin Institute of Technology, Harbin 150001, China

Abstract
Semantic clustering is an important and challenging task for content-based image database management. This paper proposes a semantic clustering learning technique that collects relevance feedback image retrieval transactions, uses a hypergraph to represent the correlations among images, and then obtains semantic clusters by hypergraph partitioning. Experiments show that it is simple and efficient.

1. Introduction
Content-based image retrieval (CBIR) is a set of techniques for retrieving semantically relevant images from an image database based on automatically derived image features [1,2,3]. Semantic feature extraction and description is the cornerstone of CBIR. For humans, it is not difficult to extract the semantic information from an image, and background knowledge plays an important role in human object recognition. Manual or semi-automatic image annotation methods are therefore adopted by some CBIR systems, as in traditional image database systems. However, annotating images in large databases manually is tedious, and without the help of human beings it is very difficult to extract the semantic content of an image automatically [4].

Fig.1 Image feature extraction [4]: database images can be indexed at the semantic, object, and visual-feature levels. Semantic and object indexing ("driving"; "car", "road") is manual and hard, while visual-feature extraction (color, shape) is automatic and easy; the user may query at any of these levels or by example.
To resolve this problem, relevance feedback is introduced into CBIR. Relevance feedback is a powerful technique that improves retrieval performance by adjusting the original query based on the relevant and irrelevant image examples designated by users [1,2,5].

Many feedback-based image retrieval systems [6,7,8] have been developed to support content-based image retrieval, but most feedback approaches have no learning mechanism to memorize the feedback conducted previously and reuse it for future queries [9]. This paper introduces a semantic learning method that analyzes the relevance feedback image retrieval process and clusters semantically related images into the same class. In a relevance feedback image retrieval system, images relevant to the query are marked as positive examples, while images irrelevant to the query are marked as negative examples. It is assumed that the positive examples are related in semantic content, so the correlations among images can be collected from the retrieval processes, and semantic retrieval and clustering can then be carried out based on these association relationships.

For a relevance feedback image retrieval system, it is not realistic to capture the semantic relationships of all images by analyzing a single query or a single feedback round. Even for the same query requirement, different users have their own interests, so their query results may not be consistent with one another. The "correct" semantics of a multimedia object is what most people (but not necessarily all people) agree upon [14]. Noisy information must therefore be identified and eliminated, and the learning algorithm must be robust and efficient. Zhuang [9] proposes a graph-theoretic model for incremental relevance feedback in image retrieval. In fact, the correlations among images are very complex, so we use a hypergraph to describe the relationships among images and apply the association rule hypergraph partitioning method [10,11,12] to obtain the semantic clusters. The association rule hypergraph partitioning algorithm (ARHP) is a clustering method based on generalizations of graph partitioning that does not require prespecified ad hoc distance functions [10,11,12]; it has been used to cluster related items in transaction databases.

The rest of this paper is organized as follows. Section 2 introduces the details of the learning algorithm. The effectiveness of the algorithm is presented in Section 3. Section 4 concludes the paper.

______________________
This paper is supported by the Ph.D. Research Fund of Beijing University of Technology.


2. Learning Algorithms
The basic idea of the semantic learning algorithm is very simple: it collects relevance feedback image retrieval transactions, uses a hypergraph to represent the correlations among images, and then obtains semantic clusters by hypergraph partitioning. Sections 2.1, 2.2, and 2.3 describe the details of these steps.

2.1 Constructing the image retrieval transaction database
In information systems, the transaction is a useful concept: it is the basic logical unit of execution. An image retrieval transaction is one complete query process in a relevance feedback image retrieval system. The outcome R_i of a relevance feedback image retrieval is represented by an image retrieval transaction <Q, P, N, I>, where Q is the query example, P = {p_1, p_2, ..., p_n} is the set of positive examples, N = {n_1, n_2, ..., n_l} is the set of negative examples, and I = {i_1, i_2, ..., i_m} is the set of retrieved images.

Images that are related at the semantic level are called semantically correlated images. For example, Q and P = {p_1, p_2, ..., p_n} are semantically correlated images, whereas Q and N = {n_1, n_2, ..., n_l} are not.
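For concreteness, the following minimal Python sketch (illustrative only, not the system's actual implementation; the image identifiers and field names are assumed) shows how one such transaction <Q, P, N, I> could be recorded after each feedback session:

    # Minimal sketch of recording one image retrieval transaction <Q, P, N, I>
    # per relevance feedback query (identifiers are illustrative).
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class RetrievalTransaction:
        query: str             # Q: the query example (image id)
        positives: List[str]   # P: images marked relevant by the user
        negatives: List[str]   # N: images marked irrelevant by the user
        retrieved: List[str]   # I: all images returned for this query

    transaction_db = []        # the image retrieval transaction database

    # After a user finishes a feedback session, append one transaction.
    transaction_db.append(RetrievalTransaction(
        query="img_0042",
        positives=["img_0042", "img_0101", "img_0256"],
        negatives=["img_0999"],
        retrieved=["img_0042", "img_0101", "img_0256", "img_0999", "img_0311"],
    ))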

2.2 Representing image correlations using an association rule hypergraph
A hypergraph H = (V, E) consists of a set of vertices V and a set of hyperedges E [10,11,12]. A hypergraph is an extension of a graph in the sense that each hyperedge can connect more than two vertices [10,11,12]. A key problem in modeling data items as a hypergraph is determining which related items can be grouped as hyperedges and what the weights of the hyperedges should be [10,11,12]. In our case, hyperedges represent the frequent item sets found by the association rule discovery algorithm. The set of vertices V corresponds to the set of images being clustered, and each hyperedge e in E corresponds to a set of related images. A frequent item set found by the association rule discovery algorithm corresponds to a set of images that have been fed back as positive examples simultaneously. These frequent item sets are mapped into hyperedges of the hypergraph.
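As a rough sketch of this mapping (assumed, not the paper's code), sets of images that frequently occur together as positive examples can be enumerated and turned into weighted hyperedges. A real system would use an association rule miner such as Apriori, and ARHP weights hyperedges by the confidence of the associated rules [10,11,12]; for simplicity the sketch below uses plain support as the weight, which is only one possible choice.

    # Rough sketch: map sets of images that are frequently fed back as positive
    # examples together into weighted hyperedges (weight = support here).
    from itertools import combinations
    from collections import Counter

    def build_hyperedges(transactions, min_support=0.01, max_size=4):
        """transactions: list of sets of image ids (Q plus the positive examples P)."""
        counts = Counter()
        for t in transactions:
            for size in range(2, min(max_size, len(t)) + 1):
                for itemset in combinations(sorted(t), size):
                    counts[itemset] += 1
        n = len(transactions)
        # Each frequent item set becomes a hyperedge keyed by its vertex tuple.
        return {itemset: c / n for itemset, c in counts.items() if c / n >= min_support}

    transactions = [{"beach1", "beach2", "beach3"}, {"beach1", "beach2"}, {"rose1", "rose2"}]
    print(build_hyperedges(transactions, min_support=0.5))  # {('beach1', 'beach2'): 0.67}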

2.3 Learning the semantic clusters
The hypergraph representation can then be used to cluster relatively large groups of related images by partitioning them into highly connected partitions. We use HMETIS to partition the complex hypergraph obtained in Section 2.2.

HMETIS is a multilevel hypergraph partitioning algorithm that has been shown to produce high-quality bisections on a wide range of problems arising in scientific applications [11,12,13], such as VLSI design, document categorization, and traffic assignment. HMETIS first partitions the hypergraph into two parts such that the weight of the hyperedges cut by the partitioning is minimized. Each of these two parts can then be bisected recursively until each partition is highly connected. At each recursion, HMETIS minimizes the weighted hyperedge cut and thus tends to create partitions in which the connectivity among the vertices of each partition is high, resulting in good clusters. Bad clusters are eliminated using the following cluster fitness criterion [13]. The fitness function measures the ratio between the weight of the edges that lie entirely within a partition and the weight of the edges involving any vertex of that partition. Let e be a set of vertices representing a hyperedge and C be a set of vertices representing a partition. The fitness of partition C is defined as [12,13]:

fitness(C) = \frac{\sum_{e \subseteq C} \mathrm{Weight}(e)}{\sum_{|e \cap C| > 0} \mathrm{Weight}(e)}    (1)

A high fitness value implies that most of the weight of the edges touching the partition comes from edges connecting vertices within the partition [12,13]. To filter out vertices that are not highly connected to the rest of the partition, each partition is examined using a second measure defined in ARHP, the connectivity, which is the fraction of the partition's edges that a vertex is associated with. A high connectivity value indicates that the vertex has edges connecting it to a good proportion of the vertices in the partition. The connectivity of a vertex v in partition C is defined as [12,13]:

connectivity(v, C) = \frac{|\{e \mid e \subseteq C, v \in e\}|}{|\{e \mid e \subseteq C\}|}    (2)

Vertices whose connectivity is greater than a given threshold are considered to belong to the partition, and the remaining vertices are dropped from it [12,13]. In ARHP, the support criterion of the association rule discovery algorithm is also used to filter out non-relevant images: depending on the support threshold, images that do not meet the support criterion (i.e., images that are not related to other images) are pruned.
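The cluster post-processing in formulas (1) and (2) can be illustrated with a minimal Python sketch (this is not HMETIS itself, only the fitness and connectivity filters; the hyperedge weights and threshold value are illustrative):

    # Sketch of the cluster post-processing criteria in formulas (1) and (2).
    # Hyperedges are (frozenset of image ids, weight) pairs; a partition C is a set of ids.

    def fitness(C, hyperedges):
        """Formula (1): weight of hyperedges fully inside C over weight of hyperedges touching C."""
        inside = sum(w for e, w in hyperedges if e <= C)
        touching = sum(w for e, w in hyperedges if e & C)
        return inside / touching if touching else 0.0

    def connectivity(v, C, hyperedges):
        """Formula (2): fraction of hyperedges inside C that contain vertex v."""
        inside = [e for e, _ in hyperedges if e <= C]
        if not inside:
            return 0.0
        return sum(1 for e in inside if v in e) / len(inside)

    # Example: drop weakly connected vertices from a candidate cluster.
    hyperedges = [(frozenset({1, 2, 3}), 2.0), (frozenset({2, 3, 4}), 1.5),
                  (frozenset({5, 6}), 1.0)]
    C = {1, 2, 3, 4}
    threshold = 0.5  # illustrative value, not from the paper
    C = {v for v in C if connectivity(v, C, hyperedges) >= threshold}
    print(fitness(C, hyperedges), C)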

3. Experiment
To test the efficiency of the proposed algorithm, an experimental system is built that adopts the Rich Get Richer relevance feedback strategy [15]. Tests are performed on the commercial Corel Gallery database, which contains 1,000,000 images classified into many semantic groups. We create a test database by randomly selecting 20 categories of Corel Photographs (30 images per category). Retrieval results are collected in two stages. In the first stage, 113 retrieval processes are collected, each starting from a different query image; in other words, 113 distinct images are used as query examples for relevance feedback image retrieval. In the second stage, 7196 retrievals are collected, with 600 images used as query examples; the number of queries per image of the test database varies. Section 3.1 illustrates the semantic clustering results obtained with the algorithm of Section 2, Section 3.2 compares the results of different clustering methods, and Section 3.3 illustrates the semantic retrieval results.

3.1 Semantic clustering results
The image transaction databases are constructed using the method described in Section 2.1. The first transaction database (called I) includes 113 transactions involving 555 images of the test database. The second transaction database (called II) includes 7196 transactions involving 600 images. Non-relevant images are filtered out using the support criterion in the association rule discovery process: depending on the support and confidence thresholds, images that do not meet the thresholds are pruned. In the experiment, the support threshold is 1%, the confidence threshold is 10%, and each hyperedge has 3 to 15 vertices. For transaction database I the hypergraph has 2,187,190 hyperedges; for transaction database II it has 331,102 hyperedges. Fig.2 illustrates the hypergraph creation process for transaction database I. The hypergraph representation is then used to cluster relatively large groups of related images by partitioning them into highly connected partitions with HMETIS. Fig.3 displays the semantic clustering process for transaction database II.

Fig.2 Creating hypergraph


Fig.3 Hypergraph partitioning

Transaction Database   Transactions   Images   Support   Confidence   Hyperedges   Precision
I                      113            555      1%        10%          2,187,190    79.5%
II                     7196           600      1%        10%          331,102      85.8%

Tab.1 Clustering results

The hypergraph is partitioned into 20 classes in Fig.3. It takes 14 minutes to obtain the clustering result on a 1.4 GHz Pentium with 256 MB of RAM. According to the partitioning rule of HMETIS, relevant images should be placed in the same cluster. The test database includes 20 categories of Corel Photographs (30 images per category), and we expect the semantic clustering results to be consistent with these categories. As shown in Fig.3, the number of images per cluster ranges from 23 to 37, so there is clearly some error in the process. To verify the clustering results, each cluster is examined manually and a topic is assigned by the observer. In fact, the topic comes from the Corel Photographs categories; that is, each cluster is assigned the category to which most of its images belong. For example, cluster 1 contains 35 images, 28 of which are roses, so its topic is "rose" and the remaining 7 images are errors. Tab.1 shows the clustering precision for each database: 79.5% for database I and 85.8% for database II, which shows that the clustering method is effective.
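The precision figures in Tab.1 can be computed as sketched below (an assumed helper, not from the paper): each cluster is labeled with the Corel category that most of its images belong to, and precision is the fraction of images that match their cluster's majority category.

    # Sketch of the cluster-precision measure used above (majority-category labeling).
    from collections import Counter

    def clustering_precision(clusters, true_category):
        """clusters: list of lists of image ids; true_category: image id -> Corel category."""
        correct = 0
        total = 0
        for cluster in clusters:
            counts = Counter(true_category[img] for img in cluster)
            majority_label, majority_count = counts.most_common(1)[0]
            correct += majority_count     # images agreeing with the cluster topic
            total += len(cluster)
        return correct / total

    # Toy example mirroring the "rose" cluster described above (28 of 35 correct).
    true_category = {i: "rose" for i in range(28)}
    true_category.update({i: "car" for i in range(28, 35)})
    print(clustering_precision([list(range(35))], true_category))  # 0.8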


3.2 Comparison with traditional clustering methods
To illustrate the effectiveness of the semantic clustering method, we compare it with the results of a traditional clustering mechanism. Traditional image clustering techniques are based on visual features such as color or texture. Fig.4 displays some predominantly red images that are grouped into one cluster by the color feature; these images obviously could be partitioned into many different semantic classes. Fig.5 displays some clusters obtained with the algorithm of Section 2; the images related at the semantic (high-level) feature level are clustered together.

Fig.5 Semantic clustering results (a)-(c)

3.3 Experimental results of semantic retrieval
The clustering result can then be used for semantic retrieval of similar images. Fig.6 gives an example of semantic retrieval using the clustering result of Section 3.1, and Fig.7 gives an example of retrieval based on visual features. In Fig.6 and Fig.7, the query image is displayed at the upper left corner; the query images of the two experiments are the same and come from the "sand" category of the Corel database. The top 32 retrieved images are displayed. In Fig.6, 90% of the images belong to the "sand" category, while in Fig.7, 80% of the images do not belong to the "sand" category. This clearly shows that the clustering method is very useful for semantic retrieval.
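The paper does not spell out how the cluster result is used at query time; one plausible reading, sketched below under that assumption, is to restrict the candidate set to the query's semantic cluster and rank candidates by a visual distance. All names and the ranking choice here are illustrative.

    # Assumed sketch of cluster-based semantic retrieval (not the paper's exact method):
    # return images from the query's semantic cluster, nearest in visual feature space first.
    import numpy as np

    def semantic_retrieve(query_id, cluster_of, clusters, features, top_k=32):
        """cluster_of: image id -> cluster id; clusters: cluster id -> list of ids;
        features: image id -> visual feature vector (e.g., a color histogram)."""
        candidates = [i for i in clusters[cluster_of[query_id]] if i != query_id]
        q = features[query_id]
        candidates.sort(key=lambda i: np.linalg.norm(features[i] - q))  # nearest first
        return candidates[:top_k]

    # Toy usage with random features and two clusters.
    rng = np.random.default_rng(0)
    features = {i: rng.random(8) for i in range(10)}
    clusters = {0: [0, 1, 2, 3, 4], 1: [5, 6, 7, 8, 9]}
    cluster_of = {i: 0 if i < 5 else 1 for i in range(10)}
    print(semantic_retrieve(0, cluster_of, clusters, features, top_k=3))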

Fig.4 Image clustering result based on visual feature

Fig.6 Retrieval result using the clustering result

Fig.7 Retrieval result based on visual feature

4. Conclusion
A semantic clustering learning technique is proposed in this paper. Unlike traditional image clustering techniques, it captures an image's semantic content by analyzing correlations among images rather than by image understanding techniques. The basic idea of the semantic learning algorithm is very simple: it collects the correlations among images from retrieval processes, uses a hypergraph to represent these relationships, and obtains semantic clusters of images with a hypergraph partitioning method. Experiments show that it is efficient and simple.

References
[1] Yong Rui, Thomas S. Huang, and Shih-Fu Chang. "Image Retrieval: Past, Present, and Future". Invited paper, Int. Symposium on Multimedia Information Processing, Taipei, Taiwan, Dec. 11-13, 1997.
[2] Catherine Lee, Wei-Ying Ma, and Hongjiang Zhang. "Information Embedding Based on User's Relevance Feedback for Image Retrieval". Technical report, HP Labs, 1998.
[3] James Z. Wang, Jia Li, and Gio Wiederhold. "SIMPLIcity: Semantics-sensitive Integrated Matching for Picture Libraries". IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 9, September 2001.
[4] J. R. Smith and Shih-Fu Chang. "Automated Image Retrieval Using Color and Texture". Columbia University, Dept. of Electrical Engineering and Center for Telecommunications Research, 1995.
[5] Yong Rui, Thomas S. Huang, Sharad Mehrotra, and Michael Ortega. "Relevance Feedback: A Power Tool for Interactive Content-Based Image Retrieval". IEEE Trans. Circuits and Systems for Video Technology, Vol. 8, No. 5, pp. 644-655, Sep. 1998.
[6] Y. Lu, C.-H. Hu, X.-Q. Zhu, H.-J. Zhang, and Q. Yang. "A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems". ACM Multimedia, 2000.
[7] I. J. Cox, M. L. Miller, S. M. Omohundro, and P. N. Yianilos. "PicHunter: Bayesian Relevance Feedback for Image Retrieval". Intl. Conf. on Pattern Recognition, Vienna, Austria, August 1996, pp. 361-369.
[8] Jun Yang, Liu Wenyin, Hongjiang Zhang, and Yueting Zhuang. "An Approach to Semantics-Based Image Retrieval and Browsing". International Workshop on Multimedia Database Systems, Taipei, Sept. 2001.
[9] Yueting Zhuang, Jun Yang, Qing Li, and Yunhe Pan. "A Graphic-Theoretic Model for Incremental Relevance Feedback in Image Retrieval". ICIP 2002.
[10] E. H. Han, G. Karypis, V. Kumar, and B. Mobasher. "Clustering Based on Association Rule Hypergraphs". Workshop on Research Issues on Data Mining and Knowledge Discovery, Tucson, Arizona, 1997, pp. 9-13.
[11] E. H. Han, G. Karypis, V. Kumar, and B. Mobasher. "Hypergraph Based Clustering in High-Dimensional Data Sets: A Summary of Results". Bulletin of the Technical Committee on Data Engineering, Vol. 21, No. 1, pp. 15-22, 1998.
[12] D. Boley, M. Gini, R. Gross, E. H. Han, K. Hastings, G. Karypis, V. Kumar, B. Mobasher, and J. Moore. "Partitioning-Based Clustering for Web Document Categorization". Decision Support Systems, Vol. 27, pp. 329-341, 1999.
[13] E. H. Han, G. Karypis, V. Kumar, and B. Mobasher. "Clustering in a High-Dimensional Space Using Hypergraph Models". Technical Report TR-97-063, Department of Computer Science, University of Minnesota, Minneapolis, 1997.
[14] Q. Li, J. Yang, and Y. T. Zhuang. "Web-Based Multimedia Retrieval: Balancing Out between Common Knowledge and Personalized Views". Proceedings of the 2nd International Conference on Web Information Systems Engineering (WISE'01), Kyoto, Japan, 3-6 December 2001, Vol. 1, pp. 92-101.
[15] L. J. Duan, W. Gao, and J. Y. Ma. "A Rich Get Richer Strategy for Content-Based Image Retrieval". Fourth International Conference on Visual Information Systems, Lyon, France, November 2000.