IEICE TRANS. INF. & SYST., VOL.E88–D, NO.5 MAY 2005

PAPER

Special Section on Cyberworlds

Efficient Web Browsing with Semantic Annotation: A Case Study of Product Images in E-Commerce Sites

Jason J. JUNG†a), Kee-Sung LEE†, Seung-Bo PARK†, and Geun-Sik JO†, Nonmembers

SUMMARY Web browsing is essentially a depth-first search, so finding relevant information on the Web can be very tedious. In this paper, we propose a personal browsing assistant system based on user intention modeling. Before a user explicitly requests a page, the system analyzes the resources prefetched from the hyperlinked Webpages and compares them with the estimated user intention, so that it can help the user decide which Webpage to request next. A more important problem is the semantic heterogeneity between Web spaces, which makes locally annotated resources harder to understand. We apply semantic annotation, which is a transcoding procedure based on a global ontology. Each local metadata item can thereby be semantically enriched and efficiently compared. As the test bed of our experiment, we organized three different online clothes stores whose images are annotated with semantically heterogeneous metadata. We simulated virtual customers navigating these cyberspaces. According to the predefined preferences of the customer models, they conducted comparison-shopping. We have shown that supporting Web browsing in this way is reasonable, and we evaluated its performance by measuring the total size of the browsed hyperspace.
key words: Web browsing, semantic annotation, user intention modeling, comparison shopping

Manuscript received August 28, 2004.
Manuscript revised November 22, 2004.
† The authors are with the School of Computer and Information Engineering, Inha University, Korea.
a) E-mail: [email protected]
DOI: 10.1093/ietisy/e88–d.5.843

1. Introduction

Since the Internet infrastructure became widespread, many users have turned to the Web to search for useful information. In particular, electronic commerce (e-commerce) sites have become popular for online product purchasing. A great many online shopping malls and products have appeared on the Web. They have caused an information overload problem that makes users lose their way when searching for relevant product items in e-commerce sites [1]. As the number of products increases, navigation with simple Web browsers becomes more time-consuming and boring. We need to develop smart and adaptive assistants that identify and discriminate the most interesting and valuable information [2]. As a close example, “Letizia” is a personal agent system that interleaves with the user [6]. It can model the browsing process of the user and infer the user’s interests based on a simple keyword set. In particular, in commercial settings, a software component assisting users has to be rational, because of the various purchasing behaviors related to economic mechanisms [3]. In order to solve this problem, this paper proposes a
novel browsing method for supporting user activities in Web spaces such as e-commerce sites. There are two main contributions in this paper. The first is semantic annotation of Web resources for dealing with semantic heterogeneities among various application domains. In practice, Web site designers and administrators have to annotate the resources in their Web sites for several reasons, such as efficient Web site management and accessibility to the Web crawlers (or robots) of search engines. However, because existing annotation methods are based on domain-specific ontologies, they can cause integration problems between these Web sites. There are two types of semantic heterogeneity: i) name heterogeneity, regarding lexical conflicts such as synonyms and homonyms, and ii) structural heterogeneity, concerning the representation of information in different metadata (or conceptual schemas). We employ semantic annotation, which is a transcoding technique based on a global ontology. Resources annotated with local domain ontologies are enriched by the global ontology, which is built with DAML+OIL as the ontology language. The second contribution is user intention modeling from the access patterns of annotated resources. After conducting semantic annotation, we exploit a probabilistic approach to recognize user intentions, so that a user can be supported by the guidance of assistant agents. In order to extract features from Webpages, we focus on the annotated resources in each Webpage. Such resources are not only HTML code but also multimedia data such as image, sound, and movie files. Semantic transcoding makes the features extracted from annotated resources semantically homogeneous. Therefore, resources can be compared with each other and with the estimated user intentions.
Copyright © 2005 The Institute of Electronics, Information and Communication Engineers

We formulate three different heuristic equations for measuring the similarity between a particular user intention and resources. These heuristics are related to the size of the feature vector, the number of matched features, and the cosine distance between vectors. As the case study of this paper, we chose comparison-shopping tasks in online stores. Most online shopping malls strategically advertise with various multimedia data formats that are more sensuous and intuitively identifiable than textual data [4]. In particular, image data is the most general and appropriate way to publish Webpages in shopping mall sites for products like clothes and shoes. We have assumed that visual data formats like images are an intuitively better way to present a
particular product in shopping malls than textual formats. Online customers try to find the most relevant items by comparing the downloaded product images. The agent system proactively searches for relevant product images. Each agent therefore has to monitor the browsing patterns of its user and extract the user’s intentions, such as which products the user is trying to buy. Semantic annotation enables software assistants to unify and compare images annotated by different shopping malls. They can then suggest to their customers which hyperpaths are better and more efficient. In the following section, we review previous work related to our contributions. Section 3 describes the problems we are trying to deal with. Section 4 describes the use of a global annotation ontology for semantic interoperability among heterogeneous Web spaces, and Sect. 5 presents a novel browsing method based on user intention modeling, using similarity measurement between annotated images. In Sects. 6 and 7, we show how we implemented and verified our system. Finally, Sect. 8 gives concluding remarks and future work.

2. Backgrounds and Related Work

For retrieving resources such as image data from the Web, there have been two main approaches: text-based and content-based retrieval [5]. Features can be extracted from the resources themselves or from the textual information. The two approaches, both valid for the Web case, are classified by where the features are extracted from. We have been exploiting the text-based image retrieval approach, based on annotation using the surrounding text and keywords.
While annotation was traditionally regarded as a comment added to particular resources by designers, data annotation has recently become an important kind of metadata that occurs in the form of externally assigned descriptions of particular features in Web pages [8]. Some commercial systems, such as Google image search [9] and AltaVista photo finder [10], use this technique. With the emergence of semantic Web technology, ontology languages like DAML+OIL have become applicable to resource annotation [23]. We define semantic annotation as the ontology-based transcoding of resources annotated with local domain knowledge. Basically, an ontology is defined as the specification of a conceptualization [14]. Similar to our case study, the Alice [27] framework is based on the use of ontologies for representing knowledge related to online shopping. For example, e-commerce ontologies cover item names, property names, and concepts about business processes. Additionally, there have been several annotation tools that help Web designers index resources such as images in their local databases. Annotea [24] provides an open framework for RDF-based Web annotations. S-CREAM [25] and Melita [26] are interactive annotation tools that make use of a separate training phase to
learn annotation rules that are used to make suggestions to users for subsequent texts.

3. Problem Description

There are several difficulties in browsing the Web space for relevant information. We note the general problems as follows.
• The size of the search space. The more items online stores deal with, the more the search space overwhelms users. In particular, as items become more finely subdivided, the depth of the search space increases, as shown in Fig. 1 (a).
• Domain-specific knowledge. Most users lack domain-specific knowledge, which makes it hard for them to formulate a query for the specific item they are looking for.
• Semantic heterogeneity. The link structure of each online store differs from the others, as shown in Fig. 1 (b). More seriously, keywords describing an item can be semantically the same but literally different, as with synonyms. In comparison-shopping, this is closely related to the two former problems: it can be one of the serious causes of an exponentially growing search space, and it aggravates the lack of domain-specific knowledge.

Fig. 1 (a) Difficulties of browsing Web space; (b) Search space on the shopping malls.

We now define the notions for formulating the basic manipulation functions of annotated resources. A set of resources (or images) in a hyperspace is represented as \(X = \{x_1, x_2, \ldots, x_n\}\). Let \(F_i\) be the set of features of a resource \(x_i\). For the case study of product images in e-commerce sites, we assume that every product image is assigned a category feature giving its hierarchical path in the category tree, in order to retrieve relationships between items and to improve the precision of comparison across semantically different shopping malls. For example, a product image’s feature set can be represented as {id: 141-D2538, name: Basic Editions Stone Twill Pant, price: $12.99, color: Gray, size: 32 × 30, category: Top>Clothing>Men>Pants}.

4. Semantic Annotation for Product Images

To advertise a specific product in an e-commerce site, its corresponding image is shown in the Webpage. We can extract this image by simple HTML parsing, and each image has to be annotated according to the domain concept hierarchy of the specific shopping mall. Generally, image retrieval systems follow content-based or keyword-based approaches. Keyword-based image retrieval requires images to be annotated in advance. However, annotating with noise such as subjective opinion causes misunderstanding and lower performance [7]. In order to solve this problem, we conduct semantic annotation of images based on a global ontology.

4.1 Image Annotation

Traditionally, there are two kinds of image annotation: automatic and manual models. While automatic models such as the co-occurrence model, translation model, and cross-media relevance model need content analysis based on image processing [11], the manual model is simply generated from the domain experts’ heuristics and languages. In practice, a metadata editor like RDFPic [12] can be utilized. In order to embed metadata in an image, we have been considering an RDF (Resource Description Framework) based representation. An RDF(S) (RDF Schema) is essential for describing RDF documents [13]. The RDF(S) is generated from the heuristics of domain experts or from the local concept hierarchy. For example, a product on a Web page has several attributes such as name, id, price, and color, as shown in Fig. 2. Web designers can annotate this image by using classes and vocabularies defined in this RDF(S), such as File, Item Name, Item Cost, and so on. However, it is difficult for foreign agent systems to understand this locally annotated resource.
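As an illustration of the feature set notation from Sect. 3, the following Python sketch stores the example product image as a small record and splits its category feature into the levels of the category tree. The class name ProductImage, its field types, and the category_path helper are assumptions made for this sketch only; in the paper the metadata is embedded as RDF.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProductImage:
    """Feature set F_i of one annotated product image (hypothetical record type)."""
    id: str
    name: str
    price: float
    color: str
    size: str
    category: str  # hierarchical path; levels are separated by '>'

    def category_path(self) -> List[str]:
        """Return the category as a list of levels, starting at the root."""
        return [level.strip() for level in self.category.split(">")]


# The example feature set given in Sect. 3.
pant = ProductImage(
    id="141-D2538",
    name="Basic Editions Stone Twill Pant",
    price=12.99,
    color="Gray",
    size="32 x 30",
    category="Top>Clothing>Men>Pants",
)

print(pant.category_path())
```

Keeping the category as an ordered list of levels makes it straightforward to count how many levels two items share from the root, which is what the categorical similarity in Sect. 5.2 relies on.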
4.2 Ontology-Supported Transcoding

For global understandability, we apply a semantic annotation scheme based on higher-level ontologies. This procedure transforms the metadata used for domain-specific annotation so that it becomes comprehensible. It means that every tag (or container) has to be conceptualized. Then, we can perform feature extraction. So far, there have been several studies on ontology-based information integration. Maedche and Staab [15] proposed a way of measuring the similarity between local ontologies for retrieving relevant ontologies, and Clio’s schema mapping [16] and Soo et al. [17] presented ontological mapping methods. However, these methods focus on mapping a particular XML schema to free textual documents by using ontologies. In this paper, we want to transform two tagged documents so that they become comparable and the relations between them can be extracted. We simply employ a tag replacement based on a high-level ontology. As shown in Fig. 2, a pair of tags before and behind content like “” and “” can be replaced with “” and “”. Additionally, for conceptualizing the content between a pair of tags, we conduct basic data preprocessing procedures such as parsing, stop-word removal, and stemming. In particular, numeric attributes such as price have to be discretized.

Fig. 2 Simple image annotation and semantic annotation with transcoding.

4.3 Comparison between Semantically Annotated Images

The comparison between annotated images can be regarded as the comparison of the semi-structured documents extracted from these images. We propose a heuristic-based formulation. Let \(img_i\) and \(F_i\) be an image annotated with a local domain ontology and its extracted feature set, respectively. The similarity between two images, \(img_i\) and \(img_j\), is measured by

\[ \mathrm{Sim}(img_i, img_j) = \sum_{n=1}^{N} H\big(\tilde{o}(F_i), \tilde{o}(F_j)\big) \tag{1} \]

where the function \(\tilde{o}\) denotes the tag replacement procedure that looks up the high-level ontology, so that semantically homogeneous features are obtained. More importantly, \(H\) denotes the heuristic functions that users can apply to quantify the similarity between two images.

5. Anticipatory Browsing Based on Heuristic Search

Basically, anticipatory browsing is composed of two steps: estimating the user intention and comparing it with annotated images. The agent system therefore has to monitor the browsing patterns in order to estimate the intention of the corresponding user. Then, by looking ahead at the Webpages linked from the current Webpage, it can predict the degree of association between the estimated user intention and the annotated resources on those Webpages. Best-first search is one of the well-known search strategies, based on the following cost function

\[ f' = g + h' \tag{2} \]

where \(f'\) is the total estimated cost of reaching the goal. As shown in Fig. 1 (b), the hyperspace of a shopping mall is organized as a tree structure consisting of hyperlinks to child nodes and images on certain Web pages, obtained through simple HTML parsing. Therefore, this cost can be the time taken to search the Web pages of relevant items. Now, we focus on how to formulate the cost function \(h'\), which computes the similarity between the estimated user’s goal and the next states. We propose three different measurements for quantifying the similarity between images. Furthermore, we describe how to compare two annotated images for conducting comparison-shopping tasks. However, even an exactly matched image is not necessarily the goal image in the end. User interactions during browsing continuously update the probabilities of the feature set.

5.1 Estimating User Intention

We assume that a user tries to find relevant items according to his or her own intentions. These intentions are represented as a set of features \(F_U\), and each feature \(f_i\) in this set is represented by the probability that the user is interested in it. This probability lies in the interval [0, 1]. More importantly, these features are assumed to be mutually exclusive and equally weighted. It means that each feature is independent and that the probabilities of the values of each feature always sum to one, as shown in the following equations.

\[ F_U = \{ f_{color}, f_{brand}, f_{size} \}_U \tag{3} \]

\[ \sum_{a} f_{color,a} = \sum_{b} f_{brand,b} = \sum_{c} f_{size,c} = 1 \tag{4} \]
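Eqs. (3) and (4) can be read as one discrete probability distribution per feature. The Python sketch below builds such a model and checks the constraint of Eq. (4). The feature values, the function names, and the uniform initialisation are assumptions for illustration only and are not taken from the IRCS implementation.

```python
import math


def uniform_intention(feature_values):
    """Build F_U with one uniform distribution per feature (assumed initialisation)."""
    return {
        feature: {value: 1.0 / len(values) for value in values}
        for feature, values in feature_values.items()
    }


def satisfies_eq4(intention, tol=1e-9):
    """Check Eq. (4): the value probabilities of every feature sum to one."""
    return all(
        math.isclose(sum(dist.values()), 1.0, abs_tol=tol)
        for dist in intention.values()
    )


# Hypothetical value sets for the three features named in Eq. (3).
F_U = uniform_intention({
    "color": ["red", "gray", "blue", "black"],
    "brand": ["brand_a", "brand_b"],
    "size": ["S", "M", "L"],
})

print(satisfies_eq4(F_U))
```

A nested dictionary keeps the per-feature distributions separate, so a check such as satisfies_eq4 can be rerun after each update of the probabilities.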

Additionally, features are adaptable. Their probabilities are continuously changed by the stream of user browsing actions. The estimated probability of feature value \(f_{i,j}\) at time \(t + 1\) is

\[ f_{i,j}^{(t+1)} = f_{i,j}^{(t)} \times (1 + \eta) \tag{5} \]

where \(\eta\) is the learning rate coefficient. By Eqs. (3), (4), and (5), the other probabilities have to be updated by the following equation

\[ f_{i,k}^{(t+1)} = f_{i,k}^{(t)} \times \left( 1 - \frac{f_{i,k}^{(t+1)} - f_{i,k}^{(t)}}{|f_i| - 1} \right) = f_{i,k}^{(t)} \times \left( 1 - \eta \times \frac{f_{i,k}^{(t)}}{|f_i| - 1} \right) \tag{6} \]

where \(k\) ranges from 1 to \(|f_i|\) except \(j\), and \(|f_i|\) is the number of values of feature \(f_i\). The initial probability of features is 0.5.

5.2 Measuring Similarity between Images

There are two issues related to measuring the similarity between images. The first is to compare the estimated model of user intentions with a set of images retrieved from the next nodes. Based on Eq. (1), we formulate several heuristics, as shown in the following equations

\[ \begin{aligned} \mathrm{Sim}(img_i, F_U) &= \frac{N}{\max(|img_i|, |F_U|)} \\ &= \max_{n=1}^{N} \big( F_{U,n}^{(t)} \big) \\ &= \frac{\sum_{n=1}^{N} F_{U,n}^{(t)}}{N} \end{aligned} \tag{7} \]

where \(N\) is the number of matched features. These three equations return the number of matched features (normalized by the larger feature set size), the maximum probability among the matched features, and the mean probability of the matched features, respectively.

The next issue is to compare two images annotated by two different shopping malls, in order to support the customer’s comparison-shopping. In this case, the feature “category” plays an important role in improving the precision. We consider this feature more important, so it is additionally used as a threshold condition. We can simply measure the categorical similarity between both images, as shown in the following equation

\[ \mathrm{Sim}_C(f_{i,C}, f_{j,C}) = \frac{\theta}{\max\big((L_i - \theta), (L_j - \theta)\big)}, \qquad \theta = \min(M_i, M_j) \tag{8} \]

where \(L_i\) and \(M_i\) are the length of \(f_{i,C}\) and the length of the matched path from the root of \(f_{i,C}\), respectively. Similarly, the similarity between images is measured as

\[ \mathrm{Sim}_C(img_i, img_j) = \frac{N}{\max(|img_i|, |img_j|)} \quad \text{if } \mathrm{Sim}_C(f_{i,C}, f_{j,C}) \geq \varepsilon \tag{9} \]

where \(\varepsilon\) is the threshold.

5.3 Looking-Ahead for Best-First Search

Users, before starting shopping, remind themselves of
some features of the specific items they are interested in. While clicking hyperlinks, the user’s interests can often change, since they depend strongly on the local context of the other contents [18]. Therefore, we have to extract and maintain the initial intentions of the users in order to shorten their browsing paths. By monitoring the user’s browsing, we obtain the sequence of images on the Web pages that the user has visited. We apply these data to adapt our model to the user’s real intention. The estimated user intention is compared with a set of images prefetched from the hyperlinked next Web pages. After the similarity measurement described above, we recommend the top-N images estimated to be the most satisfying to the user. Those images are shown in order of relevance. Sometimes auxiliary pages, which only bridge to other pages, appear as child pages. In this case, we expand the child nodes at the next depth until images are found. For comparison shopping, we retrieve from the other shopping malls the items that are related to the item selected among the recommended images. We assume that when customers choose a specific item, they are looking for items in the same product category. Therefore, similarity measurement based on the feature “category” can minimize false-negative errors. For example, customers trying to compare the prices of “Red Shirt for Men” do not want to get information about “Red Pant for Men”.

6. Implementation

We implemented this system, named IRCS (Image Retrieval for Comparison Shopping mall), by using Borland Delphi. For ontology design, we focused on DAML+OIL. As the application domain of this system, we chose clothing and online clothes stores, and we used DAML+OIL for representing the clothing ontology. The whole data set is available from our IRCS Web site (http://eslab.inha.ac.kr/ircs).
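The look-ahead recommendation of Sect. 5.3 can be illustrated with a short Python sketch. It prefetches the child pages of the current page, expands auxiliary pages (pages without images) until images are found, scores every image with the third heuristic of Eq. (7), and returns the top-N images. The toy site, the dictionary layout, the function names, and the reading of a “matched feature” as an image feature whose value has a probability in the intention model are all assumptions of this sketch; the actual IRCS system was written in Borland Delphi.

```python
import heapq
from collections import deque


def mean_matched_probability(image, intention):
    """Third heuristic of Eq. (7): mean probability of the matched features."""
    matched = [
        intention[feature][value]
        for feature, value in image.items()
        if feature in intention and value in intention[feature]
    ]
    return sum(matched) / len(matched) if matched else 0.0


def look_ahead(page, site):
    """Collect (url, image) pairs from the pages linked from `page`.

    Auxiliary child pages, which carry no images, are expanded to the
    next depth until images are found.
    """
    found = []
    queue = deque(site[page]["links"])
    seen = set(queue)
    while queue:
        child = queue.popleft()
        images = site[child]["images"]
        if images:
            found.extend((child, image) for image in images)
        else:
            for grandchild in site[child]["links"]:
                if grandchild not in seen:
                    seen.add(grandchild)
                    queue.append(grandchild)
    return found


def recommend(page, site, intention, top_n=5):
    """Return the top-N (score, url, name) triples, best first."""
    scored = [
        (mean_matched_probability(image, intention), url, image["name"])
        for url, image in look_ahead(page, site)
    ]
    return heapq.nlargest(top_n, scored)


# Hypothetical shopping mall; "men" is an auxiliary page.
site = {
    "home": {"images": [], "links": ["men", "women"]},
    "men": {"images": [], "links": ["men/shirts", "men/pants"]},
    "women": {"images": [{"name": "Blue Skirt", "color": "blue", "size": "M"}], "links": []},
    "men/shirts": {"images": [{"name": "Red Shirt", "color": "red", "size": "L"}], "links": []},
    "men/pants": {"images": [{"name": "Gray Pant", "color": "gray", "size": "L"}], "links": []},
}

# Hypothetical estimated user intention F_U.
intention = {
    "color": {"red": 0.6, "gray": 0.3, "blue": 0.1},
    "size": {"S": 0.1, "M": 0.2, "L": 0.7},
}

top = recommend("home", site, intention, top_n=2)
for score, url, name in top:
    print(f"{name} ({url}): {score:.2f}")
```

Swapping mean_matched_probability for one of the other two heuristics of Eq. (7) only changes the scoring function; the look-ahead and ranking steps stay the same.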
In order to organize the test bed, we collected image and content data as well as hierarchical product categories from three real shopping malls.

6.1 Configuration of Shopping Malls

We organized three shopping malls as the test bed. The specifications of each shopping mall are shown in Table 1. About 11.3% of the total Web pages are auxiliary pages that contain only link information about their child nodes. More importantly, the hierarchical structure of these synthesized shopping malls is a generic tree. It means users have to backtrack through Web pages they have already visited in order to search the other branches.

Table 1 Specification of shopping malls organized as testing bed.

Shopping Mall   Domain Ontology Language   Tree Page Number   Tree Max. Depth   Images Number
A               RDF(S)                     183                3                 338
B               DAML+OIL                   226                4                 267
C               DAML+OIL                   171                3                 304

6.2 Navigation for Comparison-Shopping

Users are “implicitly and continuously” given recommendations by the IRCS browser. Figure 3 shows a snapshot of the user interface of the IRCS system.

Fig. 3 User interface of IRCS.

The GUI of this system consists of
four frames. Frame I presents the HTML documents of the requested URLs, just like normal browsers, and frame II shows only the annotated images extracted from the current Web pages. As the main function of the IRCS browser, frame III contains the set of relevant product items generated by comparison with the estimated user intentions. Thus, users can suspend the browsing task by choosing a particular item from this frame. Users can adjust the amount of recommendations by controlling the threshold value. In frame IV, they also select the shopping malls whose items they want to compare.

7. Experimental Results

In order to evaluate the performance of this system, we conducted two kinds of comparative experiments. The first is the evaluation of the heuristic functions measuring the similarities. The second is a comparative evaluation against item-based recommendation algorithms.

7.1 Evaluation of Similarity Measurement

We compared and evaluated the three heuristics in Eq. (7) proposed for comparing an image with the estimated user intentions, as mentioned in Sect. 5. With each heuristic, five people tried to find ten predefined items, and their browsing paths, including backtracking, were recorded, as shown in Tables 2 and 3. The learning rate coefficient was 0.05.

Table 2 Experiment result of total length of browsing path during comparison shopping.

        Total length of browsing path
Users   (Single)   (with IRCS)
1       73         79/64/53
2       96         93/71/41
3       86         74/68/44
4       82         69/54/35
5       72         75/62/59

Table 3 Experiment result of total length of backtracking during comparison shopping.

        Total length of backtracking
Users   (Single)   (with IRCS)
1       35         37/24/13
2       54         53/35/4
3       48         35/27/8
4       41         36/16/18
5       37         31/25/14

As a result, the third heuristic, the mean of the feature probabilities, is the most suitable evaluation method for most users (Users 1, 2, 3, and 5). User 2, in particular, completed the search tasks with the least backtracking, only 7.4% of the backtracking in single navigation. In the case of User 4’s browsing, we found that the second heuristic is more efficient than the third one. We found that a feature weighting or subset selection process is needed to discriminate the more important and user-associated features according to the user profile. As a second experiment, we tried to support comparison-shopping for users. Lee et al. have tried to conduct comparison-shopping on the semantic Web [19]. However, they assumed that all shopping malls apply the same ontology to image annotation. By human evaluation, we verified the clustering of items from heterogeneous shopping malls. This experiment is intended to demonstrate the semantic matching of the IRCS system. While searching for items in shopping mall C, users gave relevance feedback on the recommended items retrieved from shopping malls A and B, respectively. The IRCS system showed almost the same level of item clustering performance for RDF(S) and DAML+OIL. Shopping mall C, described in DAML+OIL, can support comparison shopping based on information sharing not only with shopping mall B, which uses the same ontology language, but also with shopping mall A, which uses RDF(S).

7.2 Comparative Evaluation with Item-Based Recommendation Algorithms

In order to deal with the drawbacks of user-based recommendation, several studies developed and evaluated top-N item-based recommendation algorithms such as cosine-based similarity and conditional probability-based similarity [28], [29]. After splitting the product images of the three different shopping malls into training and testing sets, we applied the two item-based similarity algorithms and the third heuristic function (Eq. (7)) to compute the recall, as shown in Table 4. We used N = 5 in all experiments. With semantic transcoding of the annotation information, the heuristic function improved by approximately 92 percent. More importantly, while the experiments on the single shopping mall datasets (B, C) showed that the conditional probability-based item recommendation algorithm is the best, the experiment on the combined dataset (A+B+C) showed the heuristic function with semantic transcoding to be the most efficient method.

Table 4 The recall measured by two item-based algorithms and heuristic function in Eq. (7).

                                               Recall with shopping malls (A, B, C)
                                               A      B      C      A+B+C
Cosine-based item similarity                   0.23   0.17   0.19   0.215
Conditional probability based item similarity  0.19   0.22   0.27   0.227
Eq. (7) without semantic transcoding           0.24   0.18   0.21   0.194
Eq. (7) with semantic transcoding              -      -      -      0.372

8. Conclusions and Future Work

We proposed ontology-based semantic annotation for improving the understandability of images from heterogeneous information sources. Semantic annotation made it possible to compare images and measure the similarities between them. We then embedded the function of manipulating this annotation into an agent model, in order to support users’ activities while they browse the Web for products. To improve the efficiency of the agent system, we deploy a heuristic-based search scheme, best-first search, while estimating the user intention model from the sequential clickstream in the Web space. Typically, a simple browsing task on the Web is a very solitary activity. We have therefore regarded implicit recommendation as the most useful solution. Our target domain was extracting the most suitable items from online clothing shopping malls and conducting comparison-shopping. We have proposed the IRCS system for supporting user browsing based on semantic annotation of images. With the advent of the semantic Web [20], our work is becoming more important, because visual data such as images and movie files are well suited to semantic annotation. To verify our system, we compared browsing with and without implicit recommendation. Backtracking decreased dramatically with our system’s support. We also demonstrated semantic interoperability through users’ relevance feedback. As future work, we will conduct large-scale experiments with more users and product images, in order to efficiently detect the preference variation among individuals. Meanwhile, because Web resources may not be well annotated in some real industrial fields, we have to estimate their relationships with other well-annotated resources.
We are also concerned with user-centered query generation for explicitly representing the features of relevant images, and more varied feature similarity measuring methods, such as feature subset selection and feature ranking, should be studied. Like the RDF/XML based image annotation on P2P (peer to peer) networks introduced in [21], a distributed environment will be the next test bed, in order to improve the scalability of our system. Furthermore, we are focusing on extending our system to pervasive computing environments. Similar to AURA [22], mobile devices could then help users travel and be guided in smart spaces in which objects are semantically annotated.

References

[1] M. Klusch, ed., Intelligent Information Agents: Agent-Based Information Discovery and Management on the Internet, Springer-Verlag, Berlin, 1999.
[2] P. Maes, “Agents that reduce work and information overload,” Commun. ACM, vol.37, no.7, pp.30–40, 1994.
[3] N. Vulkan, “Economic implications of agent technology and E-commerce,” Econ. J., vol.109, no.453, pp.67–90, 1999.
[4] V.C. Storey, D.W. Straub, K.A. Stewart, and R.J. Welke, “A conceptual investigation of the E-commerce industry,” Commun. ACM, vol.43, no.7, pp.117–123, 2000.
[5] M.L. Kherfi and D. Ziou, “Image retrieval from the World Wide Web: Issues, techniques, and systems,” ACM Comput. Surv., vol.36, no.1, pp.35–67, 2004.
[6] H. Lieberman, “Letizia: An agent that assists web browsing,” Proc. 14th International Joint Conf. on Artificial Intelligence (IJCAI-95), pp.924–929, Montreal, Canada, Aug. 1995.
[7] E. Hyvonen, S. Saarela, A. Styrman, and K. Viljanen, “Ontology-based image retrieval,” Proc. 12th International World Wide Web Conference (WWW 12), Budapest, Hungary, 2003.
[8] M. Gertz, K.-U. Sattler, F. Gorin, M. Hogarth, and J. Stone, “Annotating scientific images: A concept-based approach,” Proc. 14th International Conf. on Scientific and Statistical Database Management (SSDBM’02), pp.59–68, 2002.
[9] Google image search, http://www.google.com
[10] AltaVista photo finder, http://www.altavista.com
[11] J. Jeon, V. Lavrenko, and R. Manmatha, “Automatic image annotation and retrieval using cross-media relevance models,” Proc. 26th International ACM SIGIR Conf. on Research and Development in Information Retrieval, pp.119–126, 2003.
[12] World Wide Web Consortium, Describing and retrieving photos using RDF, http://www.w3.org/TR/photo-rdf/, 2002.
[13] World Wide Web Consortium, RDF vocabulary description language 1.0, http://www.w3.org/TR/2004/REC-rdf-schema-20040210/, 2004.
[14] T.R. Gruber, “A translation approach to portable ontologies,” Knowledge Acquisition, vol.5, no.2, pp.199–220, 1993.
[15] A. Maedche and S. Staab, “Measuring similarity between ontologies,” Proc. 13th European Conf. on Knowledge Engineering and Knowledge Management, pp.251–263, Siguenza, Spain, 2002.
[16] M.A. Hernandez, R.J. Miller, L.M. Haas, L. Yan, C.T.H. Ho, and X. Tian, “Clio: A semi-automatic tool for schema mapping,” Proc. ACM SIGMOD International Conf. on Management of Data, pp.607–608, Santa Barbara, USA, 2001.
[17] V.-W. Soo, C.-Y. Lee, C.-C. Li, S.L. Chen, and C.-C. Chen, “Automated semantic annotation and retrieval based on sharable ontology and case-based learning techniques,” Proc. 3rd ACM/IEEE-CS Joint Conf. on Digital Libraries, pp.61–72, 2003.
[18] T. Hirashima, N. Matsuda, T. Nomoto, and J. Toyoda, “Context-sensitive filtering for browsing in hypertext,” Proc. International Conf. on Intelligent User Interface, pp.119–126, 1998.
[19] K.-S. Lee, Y.-H. Yu, and G.-S. Jo, “Comparison shopping system using image retrieval on the semantic web,” Proc. 2004 Korean Information Science Society Conf., pp.256–258, 2004.
[20] T. Berners-Lee, J. Hendler, and O. Lassila, “The semantic Web,” Scientific American, vol.285, no.5, pp.34–43, 2001.
[21] L. Miller and M. Poulter, “Easy image annotation for the semantic Web,” ILRT Technical Report, no.1065, University of Bristol, 2003.
[22] M. Smith, D. Davenport, and H. Hwa, “AURA: A mobile platform for object and location annotation,” Proc. International Conf. on Ubiquitous Computing, pp.48–52, 2003.
[23] S. Bechhofer and C. Goble, “Towards annotation using DAML+OIL,” Proc. International Workshop Knowledge Markup and Semantic Annotation, pp.13–21, 2001.
[24] J. Kahan and M. Koivunen, “Annotea: An open RDF infrastructure for shared Web annotations,” Proc. 10th World Wide Web Conference (WWW 10), pp.623–632, Hong Kong, 2001.
[25] S. Handschuh, S. Staab, and F. Ciravegna, “S-CREAM: Semi-automatic CREAtion of metadata,” Proc. 13th International Conf. on Knowledge Engineering and Knowledge Management (EKAW 2002), pp.358–372, Spain, 2002.
[26] F. Ciravegna, A. Dingli, D. Petrelli, and Y. Wilks, “User-system cooperation in document annotation based on information extraction,” Proc. 13th International Conf. on Knowledge Engineering and Knowledge Management (EKAW 2002), pp.122–137, Spain, 2002.
[27] J. Domingue, M. Martins, J. Tan, A. Stutt, and H. Pertusson, “Alice: Assisting online shoppers through ontologies and novel interface metaphors,” Proc. 13th International Conf. on Knowledge Engineering and Knowledge Management (EKAW 2002), pp.335–351, Spain, 2002.
[28] G. Karypis, “Evaluation of item-based top-N recommendation algorithms,” Proc. 10th International Conference on Information and Knowledge Management (CIKM 2001), pp.247–254, Atlanta, Georgia, USA, 2001.
[29] B. Kitts, D. Freed, and M. Vrieze, “Cross-sell: A fast promotion-tunable customer-item recommendation method based on conditional independent probabilities,” Proc. 6th ACM SIGKDD International Conf. on Knowledge Discovery and Data Mining, pp.437–446, Boston, USA, Aug. 2000.

Jason J. Jung received the B.Eng. and M.Eng. degrees in Computer Science and Engineering from Inha University in 1999 and 2002, respectively. He is now a Ph.D. candidate in the Intelligent E-Commerce Systems (IES) Laboratory, Inha University. He visited the Fraunhofer Institute (FIRST) in Berlin, Germany, for six months. His research topics are semi-supervised learning, Web mining, social network analysis, and ambient intelligence.

Kee-Sung Lee received the B.S. degree in Computer Science from Cheon-An University in 2003. He is now an M.Eng. student in the Intelligent E-Commerce Systems (IES) Laboratory, Inha University.

Seung-Bo Park is currently a Ph.D. student at the School of Computer Science and Engineering, Inha University, Korea. His research interests include AI, information search, intelligent agents, and summarization.

Geun-Sik Jo is currently a professor in the School of Computer and Information Engineering, Inha University, Korea, where he is also the Chief Information Officer. He received the B.S. degree in Computer Science from Inha University in 1982, and the M.S. and Ph.D. degrees in Computer Science from the City University of New York in 1985 and 1991, respectively. His research interests include intelligent scheduling, intelligent electronic commerce systems, intelligent agents, and constraint programming.