User Model User-Adap Inter (2010) 20:41–86 DOI 10.1007/s11257-010-9072-6 ORIGINAL PAPER

A query expansion and user profile enrichment approach to improve the performance of recommender systems operating on a folksonomy Pasquale De Meo · Giovanni Quattrone · Domenico Ursino

Received: 17 December 2008 / Accepted in revised form: 12 January 2010 / Published online: 28 January 2010 © Springer Science+Business Media B.V. 2010

Abstract In this paper we propose a query expansion and user profile enrichment approach to improve the performance of recommender systems operating on a folksonomy, storing and classifying the tags used to label a set of available resources. Our approach builds and maintains a profile for each user. When he submits a query (consisting of a set of tags) on this folksonomy to retrieve a set of resources of his interest, it automatically finds further “authoritative” tags to enrich his query and proposes them to him. All “authoritative” tags considered interesting by the user are exploited to refine his query and, along with those tags directly specified by him, are stored in his profile in such a way as to enrich it. The expansion of user queries and the enrichment of user profiles allow any content-based recommender system operating on the folksonomy to retrieve and suggest a high number of resources matching user needs and desires. Moreover, enriched user profiles can guide any collaborative filtering recommender system to proactively discover and suggest to a user many resources relevant to him, even if he has not explicitly searched for them.

Keywords Folksonomies · Query expansion · Recommender systems · Tag ranking · Social tagging · Personalised query answering

P. De Meo · G. Quattrone · D. Ursino (B) DIMET, Università Mediterranea di Reggio, Calabria, Via Graziella, Località Feo di Vito, 89122 Reggio, Calabria, Italy e-mail: [email protected]; [email protected] P. De Meo e-mail: [email protected] G. Quattrone e-mail: [email protected]


P. De Meo et al.

1 Introduction

Social media applications, such as blogs, multimedia and link sharing sites, question answering systems, wikis and online forums, are growing at an unprecedented rate and are estimated to generate a significant amount of the contents currently available on the Web (Amer-Yahia et al. 2008b; Heymann et al. 2008). One of the most valuable applications in this scenario is represented by collaborative tagging systems; collaborative tagging is the process of collectively annotating and classifying resources of various types (e.g., URLs, photos, videos, scientific papers, and so on) by means of plain keywords known as tags. The result of the collaborative tagging practice is also known as a folksonomy.

Folksonomy users are generally provided with (quite simple) tools allowing folksonomies to be browsed and queried so as to retrieve the resources of their interest. However, due to the limitations of these tools, users may experience some difficulties in formulating their queries precisely and, ultimately, in effectively retrieving the resources of their interest. To make these difficulties clearer we observe that in traditional folksonomies the resource retrieval task is carried out by applying algorithms defined in the Database and Information Retrieval research fields. Specifically, once a user submits a query, all the resources labelled with at least one term of this query are retrieved; each retrieved resource is then ranked according to some specific functions. Such a ranking methodology suffers from some limitations in that traditional ranking functions take neither social nor behavioural facts into account. Some authors have suggested extending traditional object retrieval approaches by implementing modules capable of keeping track of social and behavioural facts; for this purpose a promising solution is given by recommender systems (Amer-Yahia et al. 2008a,b; Garg and Weber 2008; Jaschke et al. 2007; Tso-Sutter et al. 2008).
In a recommender system user preferences and needs are stored in a suitable data structure called user profile. Recommendations provided by these systems are often obtained by means of two basic strategies called content-based (hereafter, CB) and collaborative filtering (hereafter, CF). In CB strategy, an object is classified as relevant to a user u if it is similar to objects that, in the past, were recommended to u and accepted by him. In CF strategy an object is suggested to a user u if it was rated as relevant by a group of users having a profile similar to the one of u. In the folksonomy scenario, many approaches suggest to define a tag-based user profile, i.e. to use tags to represent user profiles (Carmagnola et al. 2007; Firan et al. 2007; Zanardi and Capra 2008; Zhao et al. 2008). The tags appearing in the profile of a user can be those directly specified by him to label folksonomy resources (Carmagnola et al. 2007; Zanardi and Capra 2008), or those labelling the resources accessed by him (Firan et al. 2007; Zhao et al. 2008) or, finally, those chosen by him from a list of suggested tags (Carmagnola et al. 2007). Actually such a definition of user profile induces a notion of similarity between a user profile and a resource profile and a notion of similarity between two user profiles which are not effective in many practical contexts; this often makes the suggestions returned by traditional CB and CF recommender systems not adequate when they work to retrieve resources of interest


to a user by performing browsing and searching activities on a folksonomy storing and classifying the tags used to label available resources. Specifically, CF systems often show a poor performance, in the scenario into consideration, owing to the “power law” distribution of tags and resources. In fact, some experimental studies carried out on real folksonomies (Cattuto et al. 2007; Golder and Huberman 2006; Zanardi and Capra 2008) reveal that only a few users label resources in a folksonomy; most users, instead, do not perform any labelling action. This deeply impacts on CF systems where the similarity degree of two users depends on the ratio of the number of resources they have jointly labelled to the total number of available resources or to the number of resources labelled by at least one of them. In such a context, if the two users into consideration labelled many resources the system works well; however, due to the “power law” distribution, these users are only a small portion of the set of folksonomy ones. In all the other cases the system works badly. In fact, if a user labelled many resources and another labelled few resources, their similarity degree is low, even if the former labelled all the resources labelled by the latter. If two users labelled few resources the probability that both of them labelled the same resources is extremely low; as a consequence, their similarity degree is probably low even if they share the same interests. This reasoning is confirmed by the experimental studies described in Zanardi and Capra (2008) where almost all pairs of users examined in the corresponding tests showed a very low similarity degree. As for CB systems, the main reason underlying their poor performances when applied in the scenario into consideration regards the poor knowledge about a user that they can handle. In fact, due to the “power law” distribution, the vast majority of users examines few resources and applies few tags to label a resource. 
As a consequence, the profile of a user generally reflects only a partial and limited view of his preferences and needs. This implies that many resources, even if relevant to a user, would be filtered out because their description does not match with the contents of his profile. In this paper we propose a query expansion and user profile enrichment approach aiming at solving the problems introduced above. Our approach builds and maintains a profile for each folksonomy user, as well as a knowledge base consisting of two graphs called Tag Resource Graph (hereafter, TRG) and Tag User Graph (hereafter, TUG). These graphs register the tags exploited in the folksonomy and the way they label involved resources (as for TRG) or the way they are registered in the user profiles (as for TUG). When a user submits a query (consisting of a set of tags) our approach automatically finds further “authoritative” tags (i.e., tags having a high PageRank in TRG and/or in TUG) to enrich this query and proposes them to him. All “authoritative” tags considered interesting by the user are exploited to refine his query and, along with those tags directly specified by him, are stored in his profile in such a way to enrich it. The expansion of user queries and the enrichment of user profiles allow any CB system operating on the folksonomy to retrieve and suggest a high number of resources matching with user needs and desires; as a consequence, our approach overcomes the limitations of classical CB systems specified above. Moreover, enriched user profiles can guide any CF system to proactively discover and suggest to a user many resources


relevant to him even if he has not explicitly searched for them; therefore, our approach overcomes the limitations of classical CF systems specified above.

The main novelties introduced by our approach are the following:

• Our approach is capable of deriving the most “authoritative” tags to enrich a user query and a user profile. The “authoritative” tags proposed by it and accepted by the user are appended to his query and his profile. In this way the problems affecting traditional CB or CF recommender systems when applied to folksonomies are avoided. This favours the discovery of resources really interesting to the user and, ultimately, allows recommender systems operating on a folksonomy to produce more refined recommendations.

• In our approach the profile of a user u does not contain only the tags explicitly specified by him in his queries, but it is enriched with the tags discovered by exploring TRG and TUG. Therefore, this profile is rich and expressive and contains tags representing interests which u was not able to explicitly specify in the past. As a consequence, the similarity between a user profile and a resource profile or between two user profiles can be computed in a more precise fashion, and this ultimately allows high quality suggestions to be produced.

• In our approach the construction of TRG and TUG is almost automatic and requires limited human effort. In the literature many recommender systems adopt a support knowledge base (Adomavicius and Tuzhilin 2005); they produce accurate and complete recommendations but they require costly and time-consuming methods to acquire and organise the support knowledge. Thanks to the presence of TRG and TUG, our approach can produce accurate and complete recommendations without requiring great effort from the user to build and maintain these graphs.

• Our approach is able to produce two different types of tag lists, corresponding to two different and independent perspectives; both of them are then exploited to enrich user queries and profiles and, ultimately, to obtain more refined recommendations. The former list considers how tags are related to each other in labelling resources; its construction requires the exploration of TRG. The latter list considers the collective behaviour of users in specifying tags; its construction requires the exploration of TUG.

The structure of this paper is as follows: in Sect. 2 we introduce some preliminary concepts. In Sect. 3 we provide a detailed description of our approach. Section 4 illustrates the experiments we carried out to measure its performance. In Sect. 5 we present some related approaches and analyse the similarities and the differences between each of them and ours. Finally, in Sect. 6, we draw our conclusions.

2 Preliminaries

In this section we introduce some preliminary concepts that will be extensively used in the following parts. The first concept we consider is that of folksonomy (Hotho et al. 2006).


Fig. 1 The tag cloud associated with our folksonomy fragment

Table 1 Distribution of the 5 considered tags among the 10 resources into consideration (rows: tags t1 = “Database”, t2 = “Relational algebra”, t3 = “SQL”, t4 = “XML”, t5 = “XQuery”; columns: resources r1, ..., r10; an “X” marks each resource labelled by the tag)

Definition 2.1 Let $US = \{u_1, \ldots, u_p\}$ be a set of users, let $RS = \{r_1, \ldots, r_m\}$ be a set of resource URIs and let $TS = \{t_1, \ldots, t_s\}$ be a set of tags. A folksonomy $F$ is a tuple $F = \langle US, RS, TS, AS \rangle$, where $AS \subseteq US \times RS \times TS$ is a ternary relationship called tag assignment set.

In this definition we do not make any hypothesis about the nature of involved resources; they could be URLs, photos, videos, music files, text documents, and so on. An element $a = \langle u, r, t \rangle$, belonging to $AS$, indicates that the user $u$ labelled the resource identified by $r$ with the tag $t$. Observe that a user can tag many resources and that a resource can be labelled with multiple tags by a user.

Example 2.1 Let us consider a simple folksonomy which refers to a scenario dealing with Databases and Information Systems. Some of the tags appearing in this folksonomy are reported in Fig. 1 in the form of a tag cloud. This is a graphical representation of folksonomy tags; the size of the font of a tag reflects its popularity (i.e., the number of times it has been used). Due to space constraints, in order to represent how users labelled involved resources, we consider only: (i) 5 tags, namely “Database”, “Relational Algebra”, “SQL”, “XML” and “XQuery”; (ii) 10 resources, namely $r_1, \ldots, r_{10}$; (iii) 10 users, namely $u_1, \ldots, u_{10}$. The way the 5 tags are associated with the 10 resources is reported in Table 1; in this table rows are associated with tags and columns with resources; the symbol “X” at the intersection of the row corresponding to the tag $t_i$ and the column corresponding to the resource $r_j$ indicates that $t_i$ is one of the tags labelling $r_j$. For instance, Table 1 shows that $r_1$ is labelled only with the tag “SQL”. Analogously, Table 2 specifies the tags adopted by each user to label the resources of his interest. In this table rows are associated with tags and columns with users; the


Table 2 Adoption of the 5 considered tags performed by the 10 users into consideration to label resources of their interest (rows: tags t1 = “Database”, t2 = “Relational algebra”, t3 = “SQL”, t4 = “XML”, t5 = “XQuery”; columns: users u1, ..., u10; an “X” marks each user exploiting the tag)

symbol “X” at the intersection of the row corresponding to the tag $t_i$ and the column corresponding to the user $u_j$ indicates that $t_i$ is one of the tags exploited by $u_j$. For instance, Table 2 shows that $u_1$ exploits only the tag “Database”.

In our approach user profiles play a key role. A user profile can be defined as follows:

Definition 2.2 Let $u$ be a user of $US$; the profile $UP$ of $u$ is a set of pairs of the form $UP = \{\langle t_1, oc_1 \rangle, \ldots, \langle t_q, oc_q \rangle\}$, where $t_i$, for $i = 1 \ldots q$, is a tag of $TS$ and $oc_i$ (called occurrence number) is an integer coefficient.

Now we introduce a definition to relate a tag to the resources labelled by it and to the users storing it in the corresponding profiles.

Definition 2.3 Let $F = \langle US, RS, TS, AS \rangle$ be a folksonomy and let $t_i$ be a tag of $TS$. The projection $\pi^{RS}_{t_i}$ of $RS$ over $t_i$ is defined as the set of resources of $RS$ labelled by $t_i$; analogously, the projection $\pi^{US}_{t_i}$ of $US$ over $t_i$ is defined as the set of users of $US$ such that $t_i$ is stored in their profiles.

Example 2.2 Consider the tag $t_4$ of the folksonomy fragment introduced in Example 2.1; it is straightforward to check that $\pi^{RS}_{t_4} = \{r_6, r_7, r_8, r_9, r_{10}\}$ and $\pi^{US}_{t_4} = \{u_2, u_5, u_7, u_8, u_9\}$.

Starting from the concepts introduced in Definition 2.3 we are interested in defining some parameters for measuring how much a tag can be considered related to another one. In the literature, the co-occurrence frequency of two tags is often regarded as a reliable indicator of their correlation degree (Li et al. 2007; Sigurbjornsson and van Zwol 2008). We agree with this intuition and measure the co-occurrence of two tags by considering both the resources they jointly label and the users who jointly exploited them.

As for resources, we observe that a tag $t_j$ can be considered related to a tag $t_i$ if: (i) the resources labelled by $t_i$ are frequently labelled by $t_j$; (ii) the presence of $t_i$ and/or $t_j$ as labels of resources of $RS$ is not sporadic.

Analogously, as for users, a tag $t_j$ can be considered related to a tag $t_i$ if: (i) the users who adopted $t_i$ to label the resources of their interest frequently adopted also $t_j$; (ii) many users of $US$ adopted $t_i$ and/or $t_j$ in the past. Observe that these two conditions correspond to the concepts of confidence and support introduced in association rules (Han and Kamber 2006). These reasonings can be formalized as follows:


Definition 2.4 Let $F = \langle US, RS, TS, AS \rangle$ be a folksonomy and let $\langle \theta_{RS}, \vartheta_{RS} \rangle$ be a pair of real values belonging to the interval [0,1]. Given two tags $t_i$ and $t_j$ of $TS$ we say that $t_j$ is R-related (resource related) to $t_i$ with respect to $\langle \theta_{RS}, \vartheta_{RS} \rangle$ if:

(i) the ratio of the number of resources labelled by $t_i$ or $t_j$ to the whole number of folksonomy resources is higher than or equal to $\theta_{RS}$, i.e.,

$$\frac{|\pi^{RS}_{t_i} \cup \pi^{RS}_{t_j}|}{|RS|} \geq \theta_{RS};$$

(ii) the ratio of the number of resources jointly labelled by $t_i$ and $t_j$ to the number of resources labelled by $t_i$ is higher than or equal to $\vartheta_{RS}$, i.e.,

$$\frac{|\pi^{RS}_{t_i} \cap \pi^{RS}_{t_j}|}{|\pi^{RS}_{t_i}|} \geq \vartheta_{RS}.$$

Definition 2.5 Let $F = \langle US, RS, TS, AS \rangle$ be a folksonomy and let $\langle \theta_{US}, \vartheta_{US} \rangle$ be a pair of real values belonging to the interval [0,1]. Given two tags $t_i$ and $t_j$ of $TS$ we say that $t_j$ is U-related (user related) to $t_i$ with respect to $\langle \theta_{US}, \vartheta_{US} \rangle$ if:

(i) the ratio of the number of users who adopted $t_i$ or $t_j$ to label the resources of their interest to the whole number of folksonomy users is higher than or equal to $\theta_{US}$, i.e.,

$$\frac{|\pi^{US}_{t_i} \cup \pi^{US}_{t_j}|}{|US|} \geq \theta_{US};$$

(ii) the ratio of the number of users who adopted both $t_i$ and $t_j$ to label the resources of their interest to the number of users who adopted $t_i$ is higher than or equal to $\vartheta_{US}$, i.e.,

$$\frac{|\pi^{US}_{t_i} \cap \pi^{US}_{t_j}|}{|\pi^{US}_{t_i}|} \geq \vartheta_{US}.$$

 

Example 2.3 Consider the tags $t_4$ and $t_1$ of the folksonomy fragment introduced in Example 2.1. Assume that $\theta_{RS}$ and $\theta_{US}$ have been set to 0.10, whereas $\vartheta_{RS}$ and $\vartheta_{US}$ have been set to 0.50. Then, it is possible to verify that:

$$\frac{|\pi^{RS}_{t_4} \cup \pi^{RS}_{t_1}|}{|RS|} = \frac{|\{r_2, r_3, r_4, r_5, r_6, r_7, r_8, r_9, r_{10}\}|}{10} = \frac{9}{10} \geq \theta_{RS}$$

$$\frac{|\pi^{RS}_{t_4} \cap \pi^{RS}_{t_1}|}{|\pi^{RS}_{t_4}|} = \frac{|\{r_6, r_7, r_8\}|}{|\{r_6, r_7, r_8, r_9, r_{10}\}|} = \frac{3}{5} \geq \vartheta_{RS}$$

and that:

$$\frac{|\pi^{US}_{t_4} \cup \pi^{US}_{t_1}|}{|US|} = \frac{|\{u_1, u_2, u_3, u_4, u_5, u_6, u_7, u_8, u_9, u_{10}\}|}{10} = \frac{10}{10} \geq \theta_{US}$$

$$\frac{|\pi^{US}_{t_4} \cap \pi^{US}_{t_1}|}{|\pi^{US}_{t_4}|} = \frac{|\{u_7, u_9\}|}{|\{u_2, u_5, u_7, u_8, u_9\}|} = \frac{2}{5} < \vartheta_{US}$$

As a consequence of these computations, it is possible to conclude that $t_1$ is R-related to $t_4$, whereas $t_1$ is not U-related to $t_4$.

As for the tuning of $\theta_{RS}$, $\vartheta_{RS}$, $\theta_{US}$ and $\vartheta_{US}$ the following considerations hold:

1. The values of $\theta_{RS}$ and $\theta_{US}$ should be low. In fact, given two tags $t_i$ and $t_j$, if $\theta_{RS}$ were high, then the condition $\frac{|\pi^{RS}_{t_i} \cup \pi^{RS}_{t_j}|}{|RS|} \geq \theta_{RS}$ would be satisfied only if the number of resources labelled by $t_i$ or $t_j$ is large in comparison to the total number of resources in the folksonomy. This hypothesis is clearly unrealistic because, in real folksonomies, most of the tags usually label few resources (see Sigurbjornsson and van Zwol 2008; Zanardi and Capra 2008 for an experimental analysis about this fact). This implies that, in general, $|\pi^{RS}_{t_i}|$, $|\pi^{RS}_{t_j}|$ and, then, $|\pi^{RS}_{t_i} \cup \pi^{RS}_{t_j}|$ are very small in comparison with $|RS|$; as a consequence, almost all pairs $\langle t_i, t_j \rangle$ of tags would fail to satisfy the condition $\frac{|\pi^{RS}_{t_i} \cup \pi^{RS}_{t_j}|}{|RS|} \geq \theta_{RS}$. An analogous reasoning holds for $\theta_{US}$.

2. The values of $\vartheta_{RS}$ and $\vartheta_{US}$ should be quite high. In fact, given two tags $t_i$ and $t_j$, if $\vartheta_{RS}$ were low, it would be sufficient that $t_j$ labels few resources labelled by $t_i$ to satisfy the condition $\frac{|\pi^{RS}_{t_i} \cap \pi^{RS}_{t_j}|}{|\pi^{RS}_{t_i}|} \geq \vartheta_{RS}$. In this case, a fortuitous co-occurrence of two tags in labelling few resources would be wrongly interpreted as a form of correlation between them. Clearly, such a correlation would be false and, in our case, it would lower the quality of performed recommendations. Therefore, low values of $\vartheta_{RS}$ should be avoided. However, very high values of $\vartheta_{RS}$ are inadvisable too. In fact, if $\vartheta_{RS}$ were very high, a lot of relevant correlations among tags would be filtered out, and this would negatively impact recommendation activities. An analogous reasoning holds for $\vartheta_{US}$.

As will be clear later on, Definitions 2.4 and 2.5 play a key role in the computation of the authoritativeness of tags. The concept of authoritativeness is intrinsically asymmetric and the algorithms generally exploited to compute it (e.g., PageRank, on which our approach is also based) operate on directed graphs, which are intrinsically asymmetric. For all these reasons the concepts of R-relatedness and U-relatedness expressed in these definitions are necessarily asymmetric.
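As a quick check of Definitions 2.4 and 2.5 and of their asymmetry, the following Python sketch evaluates both conditions on the projections of Examples 2.2 and 2.3. It is only illustrative: the function name `is_related` is hypothetical, and the projections of t1 are not listed explicitly in the text but are assumed here as implied by the unions and intersections reported in Example 2.3.

```python
def is_related(proj_i, proj_j, universe_size, theta, vartheta):
    """Return True if t_j is related to t_i (Definition 2.4 or 2.5).

    proj_i, proj_j: projections (sets of resources or of users) of t_i and t_j.
    universe_size:  |RS| or |US|.
    """
    if not proj_i:
        return False  # the ratio in condition (ii) is undefined for an empty projection
    support = len(proj_i | proj_j) / universe_size        # condition (i)
    confidence = len(proj_i & proj_j) / len(proj_i)       # condition (ii)
    return support >= theta and confidence >= vartheta


# Projections of t4 ("XML") as given in Example 2.2.
res_t4 = {"r6", "r7", "r8", "r9", "r10"}
usr_t4 = {"u2", "u5", "u7", "u8", "u9"}
# Projections of t1 ("Database"): ASSUMED, derived from the union and
# intersection sets shown in Example 2.3.
res_t1 = {"r2", "r3", "r4", "r5", "r6", "r7", "r8"}
usr_t1 = {"u1", "u3", "u4", "u6", "u7", "u9", "u10"}

# Thresholds used in Example 2.3.
THETA, VARTHETA = 0.10, 0.50

t1_r_related_to_t4 = is_related(res_t4, res_t1, 10, THETA, VARTHETA)  # 9/10 and 3/5
t4_r_related_to_t1 = is_related(res_t1, res_t4, 10, THETA, VARTHETA)  # 9/10 and 3/7
t1_u_related_to_t4 = is_related(usr_t4, usr_t1, 10, THETA, VARTHETA)  # 10/10 and 2/5

print(t1_r_related_to_t4, t4_r_related_to_t1, t1_u_related_to_t4)
```

Under these assumptions t1 turns out to be R-related to t4 while t4 is not R-related to t1, since the denominator of condition (ii) changes from 5 to 7; this is the asymmetry discussed above.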

3 Approach description

In this section we illustrate the various steps of our approach. Specifically, we first present the construction of the support data structures (Sect. 3.1). Then, we describe the algorithm to generate the lists of candidate tags for each query Q that a user submits to the reference folksonomy (Sect. 3.2). After this, we present the application of the Borda Count technique to merge the lists of candidate tags into a unique list (Sect. 3.3). Finally, we describe how candidate tags are analysed by the user and how those selected by him are processed in such a way as to produce the final recommendations and to enrich the user profile (Sect. 3.4).

3.1 Construction of the support data structures

As pointed out in Sect. 2, a folksonomy is a “tri-dimensional” data structure whose “dimensions” are represented by users, tags and resources (i.e., users label resources by means of tags). Such a concept has been formalized at an abstract level in the approach of Mika (2007), which represents a folksonomy as a tripartite graph with hyperedges. Since this model is quite difficult to handle, many authors have suggested reducing it to more manageable data structures; for instance, Mika (2007) proposes to map the tripartite graph representing a folksonomy into three bipartite graphs. We agree


with these ideas and decided to map a folksonomy onto two “bi-dimensional” data structures. Specifically, these support data structures consist of two directed graphs called Tag Resource Graph (hereafter, TRG) and Tag User Graph (hereafter, TUG). TRG and TUG represent a form of ontological knowledge that helps our approach in expanding queries and enriching user profiles and which is built with limited effort. Light ontologies have been explored in a broad range of applications like ubiquitous computing (Niu and Kay forthcoming), service adaptivity (De Meo et al. 2007a,b) and Web 2.0 (Specia and Motta 2007). TRG captures the relationships among tags on the basis of the resources labelled by them; TUG handles the relationships among tags on the basis of the users exploiting them.

Before introducing TRG and TUG we observe that, in a folksonomy, there could exist some tags (i.e., qualitative or subjective tags, like “beautiful” or “fun”) which reflect the user standpoint and/or his moods in particular circumstances and, therefore, are not able to describe the content of a resource; clearly, these tags are useless to our goals and should be filtered out. In the literature, various approaches have been proposed to perform this filtering task; one of them is presented in Carmagnola et al. (2007). In the following we indicate by $\overline{TS}$ the set of tags obtained by removing qualitative tags from $TS$ (we used the approach of Carmagnola et al. (2007) to perform this removal). A formal definition of TRG and TUG is given below.

Definition 3.1 Let $F = \langle US, RS, TS, AS \rangle$ be a folksonomy and let $\langle \theta_{RS}, \vartheta_{RS} \rangle$ be a pair of real values belonging to the interval [0,1]. The Tag Resource Graph corresponding to $F$ is a directed graph $TRG = \langle NRG, ARG \rangle$ where:

• $NRG$ is a set of nodes; there is a node $nr_i \in NRG$ for each tag $t_i \in \overline{TS}$; each node has an associated real value, called rank; a function $\phi_{RS}(\cdot)$ can be defined that receives a node and computes its rank.

• $ARG$ is a set of arcs; an arc $ar_{ij}$ from $nr_i$ to $nr_j$ indicates that $t_j$ is R-related to $t_i$ with respect to $\langle \theta_{RS}, \vartheta_{RS} \rangle$.

Definition 3.2 Let $F = \langle US, RS, TS, AS \rangle$ be a folksonomy and let $\langle \theta_{US}, \vartheta_{US} \rangle$ be a pair of real values belonging to the interval [0,1]. The Tag User Graph corresponding to $F$ is a directed graph $TUG = \langle NUG, AUG \rangle$ where:

• $NUG$ is a set of nodes; there is a node $nu_i \in NUG$ for each tag $t_i \in \overline{TS}$; each node has an associated real value, called rank; a function $\phi_{US}(\cdot)$ can be defined that receives a node and computes its rank.

• $AUG$ is a set of arcs; an arc $au_{ij}$ from $nu_i$ to $nu_j$ indicates that $t_j$ is U-related to $t_i$ with respect to $\langle \theta_{US}, \vartheta_{US} \rangle$.

The idea of labelling each node of TRG and TUG with a rank is explained by the need to find the most “authoritative” nodes to complete user queries and profiles. In our approach the “authoritativeness” of a tag is computed by exploiting some notions of graph theory; specifically, a node in TRG (resp., TUG) is “authoritative” if many “authoritative” nodes are R-related (resp., U-related) to it. Such an assumption recalls the concept of eigenvector centrality. This suggests choosing the well-known PageRank


Fig. 2 The TRG associated with our folksonomy fragment (nodes: nr1 “Database”, nr2 “Relational Algebra”, nr3 “SQL”, nr4 “XML”, nr5 “XQuery”)

algorithm (which is based on eigenvector centrality) to compute the “authoritativeness” of a node and, therefore, to implement functions $\phi_{RS}(\cdot)$ and $\phi_{US}(\cdot)$. More specifically, $\phi_{RS}(\cdot)$ can be defined as follows:

$$\phi_{RS}(nr_i) = d + (1 - d) \sum_{nr_j \in R_i} \frac{\phi_{RS}(nr_j)}{out(nr_j)}$$

Here $d$ is a coefficient ranging in the real interval [0, 1] and called damping factor; it is usually set to 0.15. $R_i$ indicates the set of nodes of TRG pointing to $nr_i$. Finally, $out(\cdot)$ receives a node $nr_j$ in TRG and returns the outdegree of $nr_j$, i.e. the number of arcs outgoing from $nr_j$. In an analogous fashion it is possible to define the function $\phi_{US}(\cdot)$.

Example 3.1 In order to illustrate how TRG and TUG are built and how the “authoritativeness” of their nodes can be computed, let us consider the folksonomy fragment introduced in Example 2.1. If we examine Table 1, we observe that most of the resources labelled by “XML”, “SQL” and “Relational Algebra” are also labelled by “Database”, and that all the resources labelled by “XQuery” are also labelled by “XML”. By contrast, it does not happen that most of the resources tagged by “XQuery” are also tagged by “SQL” or that most of the resources tagged by “Database” are also tagged by “XML” or by “SQL”. This leads us to conclude that “Database” is an authoritative tag because many resources labelled by different other tags are also labelled by “Database”. By contrast “SQL”, “XQuery” and “Relational Algebra” are not authoritative tags, whereas “XML” has a certain authoritativeness, even if less than that of “Database” (due to the fact that all the resources labelled by “XQuery” are also labelled by “XML”). In order to verify whether the function $\phi_{RS}$ is capable of capturing this intuition we first construct the TRG associated with our folksonomy fragment (in this construction we have set $\theta_{RS} = 0.10$, $\vartheta_{RS} = 0.50$ and $d = 0.15$). Figure 2 reports it. The function $\phi_{RS}(\cdot)$, when applied to $nr_5$, returns 0.15 because there is no node in TRG that points to $nr_5$ (and, therefore, $R_5$ is empty). Analogously, $\phi_{RS}(nr_2) = 0.15$ and $\phi_{RS}(nr_3) = 0.15$.


As for nodes $nr_4$ and $nr_1$ we obtain:

$$\phi_{RS}(nr_4) = 0.15 + 0.85 \times \frac{\phi_{RS}(nr_5)}{out(nr_5)} = 0.15 + 0.85 \times 0.15 = 0.28$$

$$\phi_{RS}(nr_1) = 0.15 + 0.85 \times \left( \frac{\phi_{RS}(nr_4)}{out(nr_4)} + \frac{\phi_{RS}(nr_3)}{out(nr_3)} + \frac{\phi_{RS}(nr_2)}{out(nr_2)} \right) = 0.15 + 0.85 \times (0.28 + 0.15 + 0.15) = 0.64$$
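The rank computation of Example 3.1 can be reproduced with a short fixed-point iteration of the rank formula. The Python sketch below is illustrative only: the arc list is an assumption inferred from the sets of predecessors and the outdegrees used in the computations above (nr2, nr3 and nr4 point to nr1; nr5 points to nr4), and the names `trg_ranks`, `nodes` and `arcs` are hypothetical.

```python
def trg_ranks(nodes, arcs, d=0.15, tol=1e-12, max_iter=1000):
    """Iterate phi(n_i) = d + (1 - d) * sum(phi(n_j) / out(n_j)),
    where the sum ranges over the nodes n_j pointing to n_i."""
    out_degree = {n: 0 for n in nodes}
    predecessors = {n: [] for n in nodes}
    for src, dst in arcs:
        out_degree[src] += 1
        predecessors[dst].append(src)

    rank = {n: d for n in nodes}  # nodes without predecessors keep rank d
    for _ in range(max_iter):
        new_rank = {
            n: d + (1 - d) * sum(rank[p] / out_degree[p] for p in predecessors[n])
            for n in nodes
        }
        delta = max(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:  # stop once the ranks no longer change
            break
    return rank


# ASSUMED arcs of the TRG of Fig. 2, inferred from Example 3.1.
nodes = ["nr1", "nr2", "nr3", "nr4", "nr5"]
arcs = [("nr2", "nr1"), ("nr3", "nr1"), ("nr4", "nr1"), ("nr5", "nr4")]

ranks = trg_ranks(nodes, arcs)
for node in nodes:
    print(node, f"{ranks[node]:.4f}")
```

On this small acyclic graph the iteration settles after a few passes, giving a rank of 0.2775 for nr4 and 0.640875 for nr1, which agree with the values 0.28 and 0.64 of the example up to rounding.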

Values obtained for $\phi_{RS}(\cdot)$, when applied to nodes $nr_1, \ldots, nr_5$, confirm that this function is capable of capturing the intuitive idea of authoritativeness. The structure of TUG and the computation of the ranks of its nodes are analogous to the corresponding ones that we have seen for TRG; due to space limitations, we do not illustrate them here.

As for computational complexity we observe that the construction of TRG (resp., TUG) requires considering all pairs of tags and determining whether they are R-related (resp., U-related). The total number of pairs of tags is $|TS|^2$. The computation of R-relatedness and U-relatedness between a pair of tags is based on the computation of the intersection or the union of two sets of resources or users. Since the number of resources labelled by a tag and the number of users exploiting a tag can be huge, and since they may rapidly vary over time (because resources may be inserted in or removed from the folksonomy and users can join or leave it), a classical technique for the computation of the intersection and the union of two sets could have prohibitively high computational costs. For this reason we decided to apply a Monte Carlo technique, originally proposed in Cohen (1997) to estimate the size of the transitive closure of a graph, which performs an approximate, but fast, computation of the Jaccard coefficient of two sets; this computation is based on the construction of $\ell$ random permutations of the involved sets. With a negligible effort, this technique can be extended to determine the size of both the intersection and the union of two sets. In “Appendix A” we show how it can be applied in such a way as to fulfil our goals. If we apply the approach described in Appendix A, the computational complexity to determine if two tags are R-related (resp., U-related) is $O(\ell)$, where $\ell$ is a fixed number (see Appendix A). As a consequence, the computational complexity of the construction of TRG (resp., TUG) is $O(\ell \cdot |TS|^2)$.
Owing to the cardinality of T S such a complexity could appear excessively high in some real cases; however, it is worth pointing out that the construction and/or the update of TRG (resp., TUG) can be performed offline and asynchronously, after a certain number of resources (resp., users) has been inserted, modified or removed. In fact, TRG (resp., TUG) is constructed on the basis of the concept of R-relatedness (resp., U-relatedness) and the computations underlying this concept, as specified in Definition 2.4 (resp., 2.5), are not sensitive to small changes in the set of resources (resp., users) owing to the presence of |RS| (resp., |U S|) in the denominator of the corresponding fractions. Interestingly enough, this choice is compliant with what happens in most Web search engines (e.g., Google).
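To give a feeling for how a permutation-based estimate of the Jaccard coefficient can yield the sizes of the union and the intersection of two projections, the Python sketch below uses generic min-wise hashing. It is not the procedure of Appendix A, which is not reproduced here: seeded hash functions stand in for random permutations, the function names are hypothetical, and the sizes are derived from the identities |A ∪ B| = (|A| + |B|) / (1 + J) and |A ∩ B| = J · |A ∪ B|.

```python
import hashlib


def _hash(item, seed):
    """Deterministic 64-bit hash of (seed, item); each seed plays the role
    of one pseudo-random permutation of the items."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def signature(items, num_perm):
    """Smallest hash value of a non-empty set under each pseudo-random ordering."""
    return [min(_hash(x, seed) for x in items) for seed in range(num_perm)]


def estimate_sizes(set_a, set_b, num_perm=512):
    """Estimate the Jaccard coefficient J of two non-empty sets, then derive
    the approximate sizes of their union and intersection from J."""
    sig_a = signature(set_a, num_perm)
    sig_b = signature(set_b, num_perm)
    # Two sets share the minimum under a random ordering with probability J.
    jaccard = sum(a == b for a, b in zip(sig_a, sig_b)) / num_perm
    union = (len(set_a) + len(set_b)) / (1 + jaccard)
    return jaccard, union, jaccard * union


# Hypothetical projections of two tags: 100 resources each, 50 in common,
# so the exact values are J = 1/3, |union| = 150, |intersection| = 50.
proj_a = {f"r{k}" for k in range(0, 100)}
proj_b = {f"r{k}" for k in range(50, 150)}

j_est, union_est, inter_est = estimate_sizes(proj_a, proj_b)
print(f"J ~ {j_est:.3f}, union ~ {union_est:.1f}, intersection ~ {inter_est:.1f}")
```

In a setting like the one described above, the signature of each tag would presumably be computed once and stored, so that comparing two tags only requires scanning two fixed-length signatures regardless of how many resources or users the tags involve.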

123

52

P. De Meo et al.

3.2 Generation of the lists of candidate tags

Our approach to generate the lists of candidate tags receives a query $Q = \{t_1, t_2, \ldots, t_n\}$, submitted by a user $u$, and, for each tag $t_i \in Q$, performs the following steps:

• It identifies the node $nr_i$ (resp., $nu_i$) of TRG (resp., TUG) corresponding to $t_i$.

• It builds an empty list $TRL_i$ (resp., $TUL_i$) of candidate tags.

• It performs an Iterative Deepening Depth-First Search (hereafter, IDDFS) on TRG (resp., TUG) starting from $nr_i$ (resp., $nu_i$). IDDFS starts from a node $n$ in a graph and selects the first node adjacent to $n$. It continues to search in the forward (deeper) direction; it stops when it reaches nodes whose depth with respect to $n$ is higher than a value MAX_DEPTH specified by $u$. Since TRG (resp., TUG) is an unweighted graph and since no pruning is applied to it, it is possible to show that IDDFS is able to find the shortest path linking $nr_i$ (resp., $nu_i$) with any node it encounters (Russell and Norvig 2002). A score is associated with each node $nr_j$ (resp., $nu_j$) reached by IDDFS; this score is obtained as the ratio of the rank of $nr_j$ (resp., $nu_j$) to the length of the shortest path from $nr_i$ (resp., $nu_i$) to $nr_j$ (resp., $nu_j$). This formula can be explained by the following reasoning. Given a query $Q$ our approach aims at enriching it by adding tags that are “authoritative” and at the same time highly correlated to the tags already present in $Q$. In our approach an indicator of the “authoritativeness” of a tag is given by its rank, since this is computed by means of the PageRank algorithm. This explains the presence of the rank of $nr_j$ (resp., $nu_j$) at the numerator of the ratio. An indicator of the correlation of two tags is given by the length of the shortest path connecting the corresponding nodes in TRG (resp., TUG), since we assume that the tag correlation decreases as the path length increases. For this reason the length of the shortest path from $nr_i$ (resp., $nu_i$) to $nr_j$ (resp., $nu_j$) is present at the denominator of the ratio. After this, the tags associated with the reached nodes, along with the corresponding scores, are inserted in $TRL_i$ (resp., $TUL_i$) in decreasing order of their scores.

• It maintains in $TRL_i$ (resp., $TUL_i$) the top-$V$ tags and removes the other ones.

• It returns $TRL_i$ and $TUL_i$.

We selected IDDFS because it shows some properties which perfectly adhere to the specificities of our application scenario. Specifically, we can observe that:

• It is space-efficient (Russell and Norvig 2002). This feature is particularly valuable in our scenario because, due to the large size of real folksonomies, TRG and TUG may be significantly large.

• It fits well with the philosophy underlying our definition of score. In fact, let us consider a tag $t_i$ of the user query; it has a node $nr_i$ (resp., $nu_i$) associated in TRG (resp., TUG); the nodes “too far” from $nr_i$ (resp., $nu_i$) in TRG (resp., TUG) can be considered loosely correlated to it and, then, the contribution they can give to query expansion can be regarded as negligible. IDDFS finds all the nodes in TRG (resp., TUG) which are up to MAX_DEPTH hops far from $nr_i$ (resp., $nu_i$) and cuts off all the other nodes, i.e., those nodes which are irrelevant to query expansion.


• It is time-efficient; in fact, since the parameter MAX_DEPTH is fixed, IDDFS explores only a portion of TRG (resp., TUG) rather than the whole graph. In real cases MAX_DEPTH is low (i.e., not higher than 3) and, then, the computation of the shortest paths is fast.

From a computational standpoint we observe that:

• The identification of the node $nr_i$ in TRG costs $O(\log |N_{RG}|)$ with the support of a table mapping tags onto the nodes of TRG; this table must be ordered by the lexicographic order of tags.
• The execution of IDDFS on TRG costs $O(|N_{RG}| + |A_{RG}|)$ (Russell and Norvig 2002); since $|A_{RG}|$ is $O(|N_{RG}|^2)$, this execution costs $O(|N_{RG}|^2)$.
• The construction of $TRL_i$ costs $O(|N_{RG}| \cdot \log |N_{RG}|)$ if $TRL_i$ is implemented by means of a heap.
• The maintenance of the top-$V$ tags in $TRL_i$ and the removal of the other ones costs $O(V)$, where $V$ is a constant number less than or equal to $|N_{RG}|$.

As a consequence, the computational complexity of the construction of $TRL_i$ is $O(\log |N_{RG}|) + O(|N_{RG}|^2) + O(|N_{RG}| \cdot \log |N_{RG}|) + O(V)$, that is $O(|N_{RG}|^2)$. Since the number of query tags is $n$, the overall computational complexity of the construction of $TRL_1, \ldots, TRL_n$ is $O(n \cdot |N_{RG}|^2)$. An analogous reasoning allows us to conclude that the computational complexity of the construction of $TUL_1, \ldots, TUL_n$ is $O(n \cdot |N_{UG}|^2)$.

3.3 Merging of the lists of candidate tags

At the end of the second step our approach generates $2 \cdot n$ lists of candidate tags, each having its own tags that could (either partially or totally) differ from the tags of the other lists. Our goal is to merge the tags coming from these lists in such a way as to obtain a global list of tags which is better (i.e., more sound and complete) than each original list. The task of merging multiple lists of tags is hard because the information represented by the various lists could be conflicting. As a consequence, it appears necessary to achieve a consensus among all the available lists of candidate tags.
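As a reference point for the lists that the merging step receives, the candidate generation of Sect. 3.2 can be illustrated with a minimal Python sketch. It is not the authors' Java prototype: the function name `candidate_list`, the toy graph and the rank values are hypothetical, the ranks merely stand in for PageRank values assumed to be precomputed on TRG (or TUG), and ties are broken alphabetically as an additional assumption.

```python
def candidate_list(graph, rank, start, max_depth, top_v):
    """Return the top_v (tag, score) pairs reachable from `start`.

    graph: dict mapping a tag to its adjacent tags (unweighted graph)
    rank:  dict mapping a tag to its (assumed precomputed) rank
    score: rank of the reached tag / shortest-path length from `start`
    """
    shortest = {start: 0}  # tag -> shortest-path length found so far

    def depth_limited(node, depth, limit, on_path):
        # Plain depth-first search cut off at `limit` hops.
        if depth == limit:
            # A tag first met with limit d has shortest distance d,
            # because all smaller limits were explored before.
            shortest.setdefault(node, depth)
            return
        for nxt in graph.get(node, ()):
            if nxt not in on_path:  # avoid cycles on the current path
                depth_limited(nxt, depth + 1, limit, on_path | {nxt})

    # Iterative deepening: repeat the search with limits 1..max_depth.
    for limit in range(1, max_depth + 1):
        depth_limited(start, 0, limit, frozenset([start]))

    scored = [(tag, rank[tag] / dist)
              for tag, dist in shortest.items() if tag != start]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))  # decreasing score
    return scored[:top_v]


# Hypothetical toy fragment of a tag graph and its ranks.
toy_graph = {
    "Database": ["MySQL", "Oracle"],
    "MySQL": ["Database", "PHP"],
    "Oracle": ["Database"],
    "PHP": ["MySQL", "HTML"],
    "HTML": ["PHP"],
}
toy_rank = {"Database": 0.3, "MySQL": 0.25, "Oracle": 0.1,
            "PHP": 0.3, "HTML": 0.05}

result = candidate_list(toy_graph, toy_rank, "Database", max_depth=2, top_v=3)
deep = candidate_list(toy_graph, toy_rank, "Database", max_depth=3, top_v=10)
```

With MAX_DEPTH set to 2, the tag "HTML" (three hops away from "Database") is cut off, whereas "PHP" outranks "Oracle" despite being farther away because its rank is higher.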
The following example clarifies this problem.

Example 3.2 Consider the folksonomy fragment introduced in Example 2.1 and assume that a user submits the query $Q$ = {“Database”, “PHP”} because he wants to find material about the construction of dynamic Web pages in PHP. In the following we will indicate by $t_1$ the tag “Database” and by $t_2$ the tag “PHP”. Assume that the lists of candidate tags associated with $t_1$ and $t_2$ are:

• $TRL_1$ = {“Oracle”, “MySQL”, “XML”};
• $TUL_1$ = {“CMS”, “ODBC”, “MySQL”};
• $TRL_2$ = {“HTML”, “MySQL”, “Ajax”};
• $TUL_2$ = {“HTML”, “ODBC”, “MySQL”}.

This example clearly highlights the need of achieving a consensus among all available lists of candidate tags since the information represented by them could be conflicting. For instance, the following questions could arise.


Is the tag “Oracle” a good candidate to complete the query $Q$, or is it better to complete this query with other tags, such as “ODBC” or “MySQL”? The fact that the tag “Oracle” takes the first position in $TRL_1$ could suggest that it is a good candidate to complete $Q$. However, the fact that it does not belong to any other list could suggest that it is not an excellent candidate to complete $Q$. On the other hand, the tag “MySQL” takes the second position in $TRL_1$; this could suggest that it is less adequate than “Oracle” to complete $Q$. However, “MySQL” is also present in all the other lists; this could suggest that it is the best candidate to complete $Q$.

In order to merge lists of candidate tags, our approach adopts a popular voting system called Borda Count (Saari 2001). In the Borda Count technique we have a group of $C$ candidates of an election and a group of voters. Each voter is asked to rank candidates according to his preferences; the $i$th ranked candidate receives $C - i + 1$ votes. Some candidates can be left unranked and, in this case, the votes not yet assigned are equally divided among all unranked candidates. Once all voters have expressed their preferences, the votes received by each candidate are summed up. Finally, the top-$V$ candidates are returned as winners. The main reasons underlying the choice to adopt the Borda Count technique can be expressed as follows:

• Firstly, the Borda Count technique tends to favour candidates supported by a broad consensus among voters. In our context, this feature can be explained as follows: given a query $Q = \{t_1, t_2, \ldots, t_n\}$ and the lists $TRL_i$ and $TUL_i$, $1 \le i \le n$, of candidate tags, the tags in $Q$ can be considered as voters whereas those in $TRL_i$ and $TUL_i$ can be considered as candidates. The Borda Count technique selects those candidate tags globally significant to most tags of $Q$ rather than favouring those tags which are very relevant to few tags of $Q$ and totally useless to the remaining ones. This has a deep practical impact because it improves our approach’s capability of producing sound and complete recommendations. Example 3.2 provides an illustration of this fact.
• The Borda Count technique does not make any hypothesis about the number of voters and candidates and works well even when the number of voters is small. In our reference context, this means that it returns satisfactory results even when the number of tags appearing in a query is small. Such a situation is quite frequent in folksonomies, where users often submit short queries.
• Finally, the Borda Count technique does not require any initial training phase and is fast. In fact, in order to produce the final list of tags, it simply needs to scan all the lists of candidate tags; as a consequence, the time required to produce the final list of tags is linear in the total number of candidate tags.

In our approach we applied the Borda Count technique to the $2 \cdot n$ lists of candidate tags returned at the end of the second step. For this purpose we first defined a function (called voting function) that receives a tag $t_k \in TRL_i$ (resp., $TUL_i$) and returns an integer value $v_{ik}$ representing the number of votes assigned to $t_k$ in $TRL_i$ (resp., $TUL_i$). Since the number of tags in $TRL_i$ (resp., $TUL_i$) is $V$, $v_{ik}$ ranges between 1 and $V$. We have defined three strategies to implement the voting function, namely:

• Unpersonalised strategy. With this strategy a tag $t_k \in TRL_i$ (resp., $TUL_i$) receives $v_{ik}$ votes if it has the $(V - v_{ik} + 1)$th highest score among the tags


of $TRL_i$ (resp., $TUL_i$). As a consequence, the tag of $TRL_i$ (resp., $TUL_i$) having the highest score receives $V$ votes, whereas the one having the lowest score receives only one vote. Observe that this strategy does not consider any information about user preferences. Later on we will denote by $v_{ik}^{u-R}$ (resp., $v_{ik}^{u-U}$) the value returned by the voting function when it is applied to the tag $t_k \in TRL_i$ (resp., $t_k \in TUL_i$) and the Unpersonalised strategy is adopted.

• Personalised strategy. With this strategy a tag $t_k \in TRL_i$ (resp., $TUL_i$) receives its votes as follows:
– The score of $t_k$ is set to the ratio of its occurrence number in the profile $UP$ of $u$ to the length of the shortest path from $nr_i$ (resp., $nu_i$) to $n_k$ in TRG (resp., TUG), if $t_k$ is present in $UP$, or to 0 otherwise.
– If $t_k$ has a score different from 0, it receives $v_{ik}$ votes if it has the $(V - v_{ik} + 1)$th highest score in $TRL_i$ (resp., $TUL_i$).
– If $t_k$ has a score equal to 0, it is considered as an “unranked candidate” and its votes are computed as $\frac{\chi(TRL_i)+1}{2}$ (resp., $\frac{\chi(TUL_i)+1}{2}$), where $\chi(TRL_i)$ (resp., $\chi(TUL_i)$) denotes the total number of unranked candidates of $TRL_i$ (resp., $TUL_i$).

It is straightforward to show that any tag of $TRL_i$ (resp., $TUL_i$) present in $UP$ receives more votes than any tag of $TRL_i$ (resp., $TUL_i$) not present therein. Even if a tag $t_k$ does not appear in $UP$, we cannot conclude that it is not actually related to user needs; in fact, $t_k$ could represent a need which $u$ was not able to express in the past. These considerations led us to pay particular attention to the number of votes to assign to a tag which does not appear in $UP$. In fact, if we assigned 0 votes (or, in general, an excessively low number of votes) to $t_k$, then the tags which do not appear in $UP$ would be excessively penalised; this would make our system unable to discover new tags capable of expressing hidden user interests.
As an opposite case, we could assign to $t_k$ a large number of votes; for instance, this number could be set equal to the minimum number of votes assigned to a “ranked” tag of $UP$ and $TRL_i$ (resp., $TUL_i$). If this happens, tags not related to $UP$ but appearing in many lists are likely to total a large number of votes and, then, they could be included in user queries. If these tags are not really related to user needs, they would simply introduce “noise”, and many resources not relevant to $u$ would be suggested to him. In order to avoid these two extreme situations we decided to assign an “intermediate” number of votes to $t_k$, i.e., $\frac{\chi(TRL_i)+1}{2}$ (resp., $\frac{\chi(TUL_i)+1}{2}$). We have also decided to assign the same number of votes to all “unranked tags” because, in the absence of any further information, all of them must be treated equally. Later on we shall denote by $v_{ik}^{p-R}$ (resp., $v_{ik}^{p-U}$) the value returned by the voting function when it is applied to the tag $t_k \in TRL_i$ (resp., $t_k \in TUL_i$) and the Personalised strategy is adopted.

• Hybrid strategy. The Hybrid strategy mixes the Unpersonalised and the Personalised strategies. In fact, in this case, given a tag $t_k \in TRL_i$, the value $v_{ik}^{h-R}$ returned by the voting function is computed as:

$$v_{ik}^{h-R} = \alpha \cdot v_{ik}^{u-R} + (1 - \alpha) \cdot v_{ik}^{p-R}$$

Analogously, given a tag $t_k \in TUL_i$, the value $v_{ik}^{h-U}$ is computed as:

$$v_{ik}^{h-U} = \alpha \cdot v_{ik}^{u-U} + (1 - \alpha) \cdot v_{ik}^{p-U}$$

In both formulas $\alpha$ belongs to the real interval [0, 1].

These three strategies have been defined to provide our system with high flexibility. In fact, they show different behaviours. The user is provided with a user-friendly wizard allowing him to choose among them on the basis of his needs; for instance, if he is a “low tagger” his profile is poor and the wizard can suggest using the UnPers strategy, since it is not based on the user profile and takes advantage of the experience of the other users. Specifically:

• UnPers selects candidate tags on the basis of their scores. Recall that the score of a node is high when its PageRank in TRG and TUG is high and the corresponding tag is closely correlated to the tags composing the user query. Tags corresponding to nodes with a high PageRank are “authoritative” and, then, label many resources; this allows UnPers to detect a lot of resources relevant to the user. However, a certain number of proposed resources could be uninteresting to the user. As a consequence, a user who wants to receive many resource proposals, running the risk that a certain number of them are uninteresting, can choose UnPers.
• An opposite behaviour is shown by Pers. In fact, this strategy selects only tags already present in the user profile. In this way a user query is mainly enriched with tags accurately describing user needs; as a consequence, almost all the proposed resources are presumably relevant to the user. However, Pers filters out tags not directly present in the user profile even if the corresponding nodes have a high PageRank. These “authoritative” tags could label many resources, some of which could be relevant to the user; this might lead to the loss of relevant resources. As a consequence, a user who wants to receive only a few but reliable resource proposals, running the risk of also filtering out those resources potentially relevant to him but not labelled with the tags already present in his profile, can choose Pers.
• As for Hyb, we can observe that its behaviour lies between those of UnPers and Pers and can be tuned by choosing suitable values of $\alpha$; this provides our system with the maximum possible degree of flexibility.

A comparative analysis of these three strategies is reported in Sect. 4. By applying one of the strategies described above, a number of votes is assigned to each tag present in at least one of the lists of candidate tags. If a tag appears in more than one list, the number of votes it receives is equal to the sum of the numbers of votes assigned to it in each list where it appears. Finally, the $V$ tags receiving the highest numbers of votes form the final list of tags presented to $u$.
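The voting and merging procedure described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions, not the prototype's code: the function names are hypothetical, each candidate list is assumed to be already sorted by decreasing score, the profile and the path lengths used for the Personalised strategy are invented for the example, and ties in the final ranking are broken alphabetically (the paper does not specify a tie-breaking rule).

```python
from collections import Counter


def unpersonalised_votes(candidates):
    """candidates: tags sorted by decreasing score; best gets V votes."""
    v = len(candidates)
    return {tag: v - pos for pos, tag in enumerate(candidates)}


def personalised_votes(candidates, path_len, profile):
    """path_len: tag -> shortest-path length from the query tag;
    profile: tag -> occurrence number in the user profile UP."""
    v = len(candidates)
    score = {t: profile.get(t, 0) / path_len[t] for t in candidates}
    ranked = sorted((t for t in candidates if score[t] > 0),
                    key=lambda t: -score[t])
    unranked = [t for t in candidates if score[t] == 0]
    votes = {t: v - pos for pos, t in enumerate(ranked)}
    # Unranked candidates all receive the intermediate value (chi + 1) / 2.
    votes.update({t: (len(unranked) + 1) / 2 for t in unranked})
    return votes


def hybrid_votes(unpers, pers, alpha):
    """Convex combination of the two strategies, alpha in [0, 1]."""
    return {t: alpha * unpers[t] + (1 - alpha) * pers[t] for t in unpers}


def merge(vote_tables, top_v):
    """Sum the votes of each tag over all lists and keep the top_v tags."""
    total = Counter()
    for table in vote_tables:
        for tag, votes in table.items():
            total[tag] += votes
    return sorted(total.items(), key=lambda p: (-p[1], p[0]))[:top_v]


# The four candidate lists of Example 3.2 (V = 3).
lists = [
    ["Oracle", "MySQL", "XML"],   # TRL_1
    ["CMS", "ODBC", "MySQL"],     # TUL_1
    ["HTML", "MySQL", "Ajax"],    # TRL_2
    ["HTML", "ODBC", "MySQL"],    # TUL_2
]
merged = merge([unpersonalised_votes(l) for l in lists], top_v=3)

# Hypothetical profile and path lengths for the Personalised strategy.
pers = personalised_votes(lists[0],
                          path_len={"Oracle": 1, "MySQL": 1, "XML": 1},
                          profile={"MySQL": 4})
hyb = hybrid_votes(unpersonalised_votes(lists[0]), pers, alpha=0.5)
```

Under the Unpersonalised strategy, “MySQL” and “HTML” both total 6 votes and “ODBC” totals 4, whereas “Oracle”, although first in its own list, collects only 3 votes and is left out; this mirrors the consensus argument made in Example 3.2.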


3.4 User query processing and user profile update

During the last step of our approach the following tasks are performed:

1. $u$ receives the final list of tags constructed during the previous step and he is required to accept or reject each tag of the list; accepted tags are appended to $Q$.
2. $Q$ is processed by means of any recommender system operating on the folksonomy.
3. The resources retrieved by the recommender system are presented to $u$, who specifies those relevant to him.
4. The profile $UP$ of $u$ is updated. This task is carried out as follows; let $t_i$ be a tag of $Q$ after its enrichment performed by our approach; two cases may be considered, namely:
• $t_i$ is already registered in $UP$, i.e., there exists a pair $\langle t_i, oc_i \rangle$ in $UP$; in this case $oc_i$ is increased by 1;
• $t_i$ is not registered in $UP$; in this case the pair $\langle t_i, 1 \rangle$ is inserted therein.

Example 3.3 In order to better understand the first task mentioned above, consider the folksonomy fragment introduced in Example 2.1 and assume that $u$ has submitted the query $Q$ = {“CMS”, “PHP”} because he wants information about the personalisation of a CMS written in PHP. Assume that, after processing $Q$, our system returns the following list of tags: {“XML”, “AJAX”, “CSS”, “JDBC”, “MySQL”, “Tutorial”}. In this case $u$ accepts the tags “CSS”, “MySQL” and “Tutorial” because he believes they are relevant to retrieve resources useful to his goals; for instance, the tag “CSS” would allow the retrieval of resources about the usage of CSS, which plays a key role in the personalisation of a CMS. On the contrary, $u$ rejects tags like “XML”, “JDBC” and “AJAX” because he does not want to use these technologies.
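Tasks 1 and 4 above can be illustrated with a short Python sketch applied to Example 3.3. The function names and the starting profile are hypothetical; the profile is modelled as a dictionary mapping each tag to its occurrence number, which is one possible representation of the pairs stored in $UP$.

```python
def enrich_query(query, suggested, accepted):
    """Task 1: append to the query the suggested tags accepted by the user."""
    return list(query) + [t for t in suggested
                          if t in accepted and t not in query]


def update_profile(profile, enriched_query):
    """Task 4: increase the occurrence number of each query tag by 1,
    inserting the tag with occurrence number 1 if it is not registered."""
    for tag in enriched_query:
        profile[tag] = profile.get(tag, 0) + 1
    return profile


# Example 3.3: query, suggested tags and the tags accepted by the user.
query = ["CMS", "PHP"]
suggested = ["XML", "AJAX", "CSS", "JDBC", "MySQL", "Tutorial"]
accepted = {"CSS", "MySQL", "Tutorial"}

enriched = enrich_query(query, suggested, accepted)
# Hypothetical starting profile in which "PHP" already occurred twice.
profile = update_profile({"PHP": 2}, enriched)
```

After the update, the occurrence number of “PHP” grows from 2 to 3, while “CMS” and the three accepted tags enter the profile with occurrence number 1; the rejected tags leave no trace in it.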
It is worth pointing out that our approach allows the construction and the management of very rich and complete user profiles containing not only tags explicitly provided by users in their submitted queries, but also many “authoritative” tags that users did not explicitly include in their queries, but that have been suggested by the system and accepted by them. Due to the query expansion and the profile enrichment capabilities, our approach is able to improve the performance of both CB and CF recommender systems operating on a folksonomy and can overcome the limitations (outlined in the Introduction) shown by these systems in this application scenario. Specifically, as far as CB systems are concerned, the expansion of user queries allows a large set of resources relevant to user goals to be discovered. In addition, the enrichment of user profiles provides a more detailed picture of user needs. As a consequence of these features, a CB system can operate on an enlarged set of resources and has a more refined knowledge of user needs and desires; therefore, it has many more resources and much more information at its disposal to perform its recommendations and to satisfy its users. In the absence of query expansions and profile enrichments, a CB system could recommend only the resources labelled by the tags strictly matching those


represented in user queries and could handle only the information about user needs and desires explicitly provided by the user himself; as a consequence, it would be able to recommend only a small number of resources fitting the needs and the desires explicitly specified by a user. Interestingly enough, our approach, when operating in conjunction with a CB system, shows a reactive behaviour, because it provides its suggestions as answers to user queries. As far as CF systems are concerned, our approach is able to enrich user profiles; in this way it can consider desires and needs not directly specified by a user (for instance, because he did not know of their existence). Moreover, it could happen that two user profiles that did not appear similar on the basis of the information explicitly provided by users are recognised to be similar after their enrichment. These two facts lead to an increase of both the number and the quality of provided suggestions. Interestingly enough, our approach, when operating in conjunction with a CF system, shows a proactive behaviour because it provides its suggestions to a user whenever it finds another user having a similar profile, without the need of any user query. We remark again that our system is orthogonal to the adopted recommender system and can increase the performance of the latter. In the prototype implementing our approach we have chosen, as underlying recommender system, X-Compass, an XML agent-based recommender system supporting a user in his search for information on the Web (Garruzzo et al. 2006). In X-Compass a user profile consists of a set of interests (represented as keywords) and a set of relationships between interests; two kinds of relationships are considered, namely: (i) is-a relationships, representing subsumption relationships between interests, and (ii) associative relationships, stating correlations between interests on the basis of past user behaviours.
X-Compass can behave as a content-based or as a collaborative filtering recommender system. As a CB system, X-Compass ranks each Web page by solving a maximum weight matching problem on a bipartite graph whose nodes represent both user interests and keywords extracted from the Web page itself. As a CF system, X-Compass computes the similarity degree of two users by solving a maximum weight matching problem on a bipartite graph so that the first group of nodes represents the interests of the first user and the second group of nodes represents the interests of the second user. We have used X-Compass since its way to organise the underlying information is very close to the one adopted by our system. As a matter of fact, a user profile built by our system can be mapped without any effort onto a user profile handled by X-Compass. In fact: (i) in our system a user profile consists of a set of tags and each tag corresponds to an interest of an X-Compass user profile; (ii) R-relatedness and U-relatedness relationships defined in our system correspond to associative relationships in X-Compass. It is worth pointing out that the usage of X-Compass as underlying recommender system does not produce any bias in the experimental results about the performance of our system. In fact, the adopted recommender system has to be seen as a black box which receives user profiles and queries and provides recommendations. Our experimental campaign aims at measuring the improvements produced by our system to the performance measures returned by the black box when it operates alone. As a consequence, the core of the evaluation is represented by the improvements and not by the


absolute values achieved by the black box (which, in principle, could be different on the basis of the algorithm implemented by it). In our experiments and, more in general, in all the exploitations of our system, we have used the CB component of X-Compass. Only in the experiments illustrated in Sect. 4.4 the CF component has been used; this was due to the intrinsic nature of these experiments (see Sect. 4.4 for all details).

4 Experiments

In order to evaluate our approach we built a prototype in Java and MySQL. We carried out all experiments on a Personal Computer equipped with a Core Duo T2500 Processor and 2 GB of RAM. The dataset exploited in our tests was extracted from del.icio.us (Delicious 2009); this is a Web-based social bookmarking tool which allows a user to manage a personal collection of links (bookmarks) to Web sites and to annotate these links with one or more tags. Users of del.icio.us are mostly young and technologically aware (Mika 2007). Each time a user introduces and annotates new bookmarks he receives an instant gratification in the form of bookmarks to other Web pages. These factors contribute to making del.icio.us one of the most popular examples of folksonomies. Clearly, other folksonomies are available and they differ in the nature of the resources stored in them (e.g., photos (Sigurbjornsson and van Zwol 2008), academic papers (Zanardi and Capra 2008) and information about cultural events (Carmagnola et al. 2007)). However, it is worth observing that our approach works on the tags labelling folksonomy resources or composing user queries, and makes no assumption on the nature of available resources. As a consequence, the results it can obtain depend on the quality of the tags specified by users when they label their resources or compose their queries, and not on the nature of available resources. With regard to the tag quality, a study described in Carmagnola et al. (2007) shows that most of the tags exploited by a user in a folksonomy (i.e., about 60%) are specific (i.e., they describe a resource feature) and have a positive impact on the system performance; only a small fraction of them (i.e., about 2%) are subjective (i.e., they reflect user opinions) and have a negative impact on the system performance.
The fractions of specific and subjective tags differ on the basis of the resource nature (e.g., the fraction of subjective tags is likely to be higher in photos than in bookmarks), but these differences are very slight and have a negligible impact on the system performance. We observe also that the querying activity managed by our system comprises all the tasks performed from the first submission of a query to its final enrichment. After the query has been enriched it is processed by the underlying recommender system, which determines the resources potentially answering it and presents them to the user; the latter can either accept or refuse any proposed resource. In order to facilitate this last task and, ultimately, to improve the overall performance, it would be desirable for the recommender system to be provided with a user-friendly, non-invasive and user-transparent graphical interface; for instance, this interface should present a summary for each resource of real interest to the user. This would allow a better selection of the resources of real interest to the user; as a consequence of this fact, the


Table 3 Distribution of retrieved tags and bookmarks over the corresponding topics

Topic                    Percentage of tags (%)   Percentage of bookmarks (%)
Database design          19.68                    19.53
Querying                 21.19                    24.57
Storage and indexing      2.18                     2.38
Data warehouse           12.99                    11.11
SQL server and PL/SQL    18.12                    15.34
Other                    25.84                    27.07

feedback provided by him would be more precise and reliable. As pointed out in Sect. 3.4, the recommender system exploited in the prototype implementing our system is X-Compass, whose graphical interface presents the desirable characteristics mentioned above. We collected data from del.icio.us according to the methodology described in Hotho et al. (2006). Specifically, we used an open source Web crawler called wget. Initially, we applied it to del.icio.us starting from its top page and we obtained 496 folksonomy users, 17,057 tags and 31,465 bookmarks. After this, we applied wget in a recursive fashion on del.icio.us users, to obtain new bookmarks, and on bookmarks, to obtain new del.icio.us users. At the end of this process we obtained a dataset consisting of 8,267 different del.icio.us users, 146,326 tags and 710,495 bookmarks. Retrieved tags and bookmarks refer to the Database and Information Systems domains; in Table 3 we report the topics they refer to, along with their distribution over these topics. Furthermore, we defined a set of 20 users, called TUS (Test User Set), different from those present in del.icio.us, and we asked them to take part in our experiments by using the folksonomy fragment described above. For this purpose, we first asked each user to specify an initial set of tags representing his interests; in this way it was possible to associate a starting, even if quite simple, profile with each user. In order to refine user profiles and to enrich our folksonomy fragment we asked users to submit queries in our folksonomy fragment for a period of 10 days. For each query submission our system suggested some bookmarks to the corresponding user and, at the same time, enriched his profile. A user was able to accept or reject any bookmark recommended by our system. For each accepted bookmark we asked the corresponding user to access it and to add the desired tags to it.
As a consequence of this behaviour, each query submission led to an enrichment of the profiles of both the corresponding user and the bookmarks accessed by him. Clearly, users submitting many queries in this training period enriched their profiles much more than users submitting few queries. For this reason, after this training period, we grouped users into three categories, namely: heavy taggers (made up of users who tagged more than 50 bookmarks), medium taggers (made up of users who tagged between 10 and 50 bookmarks), and low taggers (made up of users who tagged fewer than 10 bookmarks). Clearly, heavy taggers had rich profiles, low taggers had poor profiles, whereas the profile of medium taggers was good even if potentially incomplete. More specifically, the minimum number of tags present in the profile of


Fig. 3 Distribution of bookmarks against the number of tags labelling them (x-axis: number of tags, from 1 to 20; y-axis: number of bookmarks, from 0 to 160,000)

Fig. 4 Cumulative distribution of bookmarks against tag quantiles (x-axis: tag quantiles, from 1 (5%) to 20 (100%); y-axis: percentage of tagged bookmarks, from 0% to 100%)

a heavy tagger was 38; the number of tags present in the profile of a medium tagger ranged from 12 to 44; finally, the maximum number of tags present in the profile of a low tagger was 14. We carried out a preliminary analysis on the available data; the corresponding results are reported in Figs. 3 and 4. Specifically, Fig. 3 plots how many bookmarks were labelled with a specific number of tags; from this figure it is possible to observe that 73% of bookmarks were labelled by fewer than 5 tags, 22% of them were labelled by 5–10 tags and only 5% of them were labelled with more than 10 tags. For convenience, we classified available bookmarks into two categories, namely: (i) very labelled bookmarks (hereafter, vl-bookmarks), i.e., bookmarks labelled by five or more tags, and (ii) little


labelled bookmarks (hereafter, ll-bookmarks), i.e., bookmarks labelled by fewer than five tags. After this, we computed the popularity of each tag (defined as the number of bookmarks labelled by it) and we sorted tags according to their popularities. Then, we split the sorted set of tags into 20 quantiles, each comprising 5% of available tags. Finally, we computed the cumulative distribution of bookmarks against tag quantiles; the obtained results are depicted in Fig. 4. From the analysis of this figure we can observe that about 20% of tags are enough to label more than 70% of bookmarks, whereas 40% of tags are able to label about 95% of bookmarks. This analysis supplies some meaningful insights. In fact, it indicates that some observations presented in the Introduction and that influenced our approach (i.e., the fact that, in a folksonomy, a resource is usually labelled with few tags or that only a small number of tags are frequently used) find a strong confirmation in the real world. As shown in the Introduction, in this scenario, a traditional recommender system adopted to recommend resources in a folksonomy would fail to produce high quality results; this highlights the usefulness of our proposal to support the activity of a recommender system (see Sect. 3.4 for all details).

The construction of TRG and TUG depends on the computation of the R-relatedness and U-relatedness relationships among tags. To carry out this computation it is necessary to tune $\theta_{RS}$, $\theta_{US}$, $\vartheta_{RS}$ and $\vartheta_{US}$. For this purpose we performed a tuning activity which consisted of the following steps:

• We chose ten values of $\theta_{RS}$ (resp., $\theta_{US}$) uniformly distributed in the real interval [0, 0.02], and ten values of $\vartheta_{RS}$ (resp., $\vartheta_{US}$) uniformly distributed in the real interval [0.25, 0.75]. With regard to this choice we observe that, at first glance, the chosen values of $\theta_{RS}$ and $\theta_{US}$ could appear extremely low. Actually, these values are reasonable also from a theoretical point of view. In fact, in a real folksonomy, there are millions of users and resources; as a consequence, the denominator of $\theta_{RS}$ (resp., $\theta_{US}$) is extremely high. Due to the power law distribution of tags, the number of resources labelled by a tag (resp., the number of users who applied a tag to label the resources of their interest) is generally low. As a consequence $\pi^{RS}_{t_i}$ and $\pi^{RS}_{t_j}$ (resp., $\pi^{US}_{t_i}$ and $\pi^{US}_{t_j}$) are small and $|\pi^{RS}_{t_i} \cap \pi^{RS}_{t_j}|$ (resp., $|\pi^{US}_{t_i} \cap \pi^{US}_{t_j}|$), which represents the numerator of $\theta_{RS}$ (resp., $\theta_{US}$), is generally extremely low.
• We considered all the possible pairs of values $\langle \theta_{RS}, \vartheta_{RS} \rangle$ (resp., $\langle \theta_{US}, \vartheta_{US} \rangle$) and, for each pair, we examined the corresponding graphs.1 Specifically:
– Low values of $\langle \theta_{RS}, \vartheta_{RS} \rangle$ (resp., $\langle \theta_{US}, \vartheta_{US} \rangle$) generated a TRG (resp., TUG) with many arcs. Some of these arcs originated R-relatedness (resp., U-relatedness) relationships considered incorrect by the user.
– High values of $\langle \theta_{RS}, \vartheta_{RS} \rangle$ (resp., $\langle \theta_{US}, \vartheta_{US} \rangle$) generated a TRG (resp., TUG) with few arcs. As a consequence, some actual R-relatedness (resp., U-relatedness) relationships were not detected by our system.

1 Recall that an arc in TRG (resp., TUG) denotes that the linked nodes are R-related (resp., U-related); see Definitions 3.1 and 3.2.

  – Intermediate values of ($\theta^{RS}$, $\vartheta^{RS}$) (resp., ($\theta^{US}$, $\vartheta^{US}$)) led our system to detect many correct R-relatedness (resp., U-relatedness) relationships and to discard most of the incorrect ones.
• From the previous examination we found that the best trade-off was obtained when $\theta^{RS}$ and $\theta^{US}$ are equal to 0.01 and $\vartheta^{RS}$ and $\vartheta^{US}$ are equal to 0.45. These are the final values we chose.

Since the number of involved tags is enormous, we applied the heuristics described in “Appendix A” to compute the R-relatedness and the U-relatedness relationships and, ultimately, to construct TRG and TUG. This choice could lead to approximate versions of TRG and TUG; we call these versions TRGapp and TUGapp. We carried out some tests to tune the parameter that controls this approximation (see Sect. 3.1 and “Appendix A”) and observed that, for low values of this parameter (i.e., values not greater than 10), the computation of TRGapp (resp., TUGapp) was very fast but TRGapp (resp., TUGapp) did not perfectly match TRG (resp., TUG). Specifically, TRGapp (resp., TUGapp) contained some arcs not existing in TRG (resp., TUG) and, at the same time, some arcs of TRG (resp., TUG) were missing in TRGapp (resp., TUGapp). By increasing the parameter we observed that the “errors” appearing in TRGapp (resp., TUGapp) were drastically reduced at the expense of a slight increase in the time required to construct this graph. We found that, for values of at least 40, TRGapp (resp., TUGapp) always coincided with TRG (resp., TUG). In order to have a stronger guarantee about the coincidence between TRGapp and TRG (resp., TUGapp and TUG), we set the parameter to 50.

Our experimental tests have been designed to answer the following questions:

• Is our approach capable of producing accurate (i.e., both correct and complete) results?
• Is our approach able to find bookmarks even if they are labelled with few tags?
• Is our query expansion activity effective? Is it competitive against the already existing approaches?
• Is our profile enrichment activity effective?
Is it competitive against the already existing approaches?

The experiments carried out to answer these questions are reported in the following subsections.

4.1 Analysis of our approach’s accuracy

In order to analyse the accuracy of our approach we adopted two popular parameters defined in Information Retrieval, namely Precision and Recall (Baeza-Yates and Ribeiro-Neto 1999). Precision measures the capability of our system to find bookmarks considered relevant by the user, whereas Recall measures its capability of not missing bookmarks relevant to the user. While the evaluation of Precision does not pose particular problems, the computation of Recall is a daunting task. In fact, in its purest sense, this computation would require each user to view and validate all the bookmarks of the whole folksonomy fragment constructed previously. Such a task is clearly unfeasible since the number of available bookmarks of our reference folksonomy fragment exceeds 700,000. For this reason, for the computation of Recall, we

decided to construct a sample FolkFragSample of our folksonomy fragment. This contained 710 bookmarks and was built in such a way as to mirror the features of the original fragment. For this purpose a stratified sampling technique was applied to our folksonomy fragment. Stratified sampling is a technique widely applied when the population to sample consists of various sub-populations (each of them called a stratum) which differ significantly from each other. In stratified sampling the population is subdivided into strata and, for each stratum, a sampling is applied. Strata must be exhaustive and mutually exclusive (i.e., each element of the population is assigned to exactly one stratum). The hypotheses underlying stratified sampling perfectly fit our scenario due to the power law distribution of bookmarks against the number of tags labelling them (see Fig. 3) and, hence, to the existence of various bookmark sub-populations which differ in the number of tags labelling each bookmark. The 710 bookmarks of FolkFragSample were obtained by sampling 142 bookmarks for each stratum. We used five strata which were equally dense, i.e., the ratio of the number of bookmarks in a stratum to the total number of tags labelling them was constant. A larger FolkFragSample would have required each user to examine a very large set of bookmarks and to indicate, for each of them, whether he was interested in it or not. Such an activity would clearly have been time-consuming and boring, and many users might have decided to abandon the validation phase.

The computation of Precision and Recall was performed as follows:

1. Let u be a user of TUS and let Q be a query submitted by him.
Q was processed by applying the following approaches: (i) Basic, i.e., no query expansion was applied to it; (ii) UnPers, i.e., the Unpersonalised strategy was applied to expand it; (iii) Pers, i.e., the Personalised strategy was applied to expand it; (iv) Hyb, i.e., the Hybrid strategy was applied to expand it; for this last strategy, the parameter α was set to 0.25, 0.50 and 0.75, respectively (see Sect. 3.3). Let SysRS be the set of Web pages whose URLs coincide with the bookmarks proposed to u.
2. u was asked to specify the set CorrectRS (resp., CompleteRS) of Web pages of SysRS (resp., FolkFragSample) that he considered relevant to him.
3. The Precision Pre and the Recall Rec were defined as:

$$\mathit{Pre} = \frac{|\mathit{CorrectRS}|}{|\mathit{SysRS}|} \qquad \mathit{Rec} = \frac{|\mathit{CorrectRS} \cap \mathit{CompleteRS}|}{|\mathit{CompleteRS}|}$$
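The two definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `precision_recall`, the toy page identifiers and the handling of empty sets (returning 0.0) are not taken from the paper, which does not say how degenerate cases were treated.

```python
def precision_recall(sys_rs, correct_rs, complete_rs):
    """Return (Pre, Rec) for a single query.

    sys_rs      -- pages proposed by the system (SysRS)
    correct_rs  -- pages of SysRS judged relevant by the user (CorrectRS)
    complete_rs -- pages of the sample judged relevant by the user (CompleteRS)
    """
    sys_rs, correct_rs, complete_rs = set(sys_rs), set(correct_rs), set(complete_rs)
    # Assumption: an empty denominator yields 0.0 (not specified in the paper).
    pre = len(correct_rs) / len(sys_rs) if sys_rs else 0.0
    rec = len(correct_rs & complete_rs) / len(complete_rs) if complete_rs else 0.0
    return pre, rec


# Hypothetical toy data for one query.
sys_rs = {"p1", "p2", "p3", "p4"}        # four pages proposed to the user
correct_rs = {"p1", "p2", "p3"}          # three of them judged relevant
complete_rs = {"p1", "p2", "p5", "p6"}   # relevant pages in the sample

pre, rec = precision_recall(sys_rs, correct_rs, complete_rs)
print(pre, rec)
```

With these toy sets, three of the four proposed pages are relevant, and two of the four relevant pages of the sample are among those found by the system.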

Here Pre represents the fraction of the set SysRS of Web pages returned by our system that were considered relevant by u, whereas Rec represents the fraction of the Web pages of FolkFragSample considered relevant by u (i.e., CompleteRS) that have been discovered by our system. Both Precision and Recall range in the real interval [0, 1]; the higher they are, the better our system works.

Our experiment was performed as follows. We asked each user of TUS to submit five queries and, for each query, we computed the Precision and the Recall achieved by our system when each of the possible strategies (i.e., Basic, UnPers, Pers, Hyb) was applied. After this, we computed the Average Precision and the Average Recall

Fig. 5 Average Precision of our approach (Basic, UnPers, Pers strategies). [Plot: Average Precision (0.00 to 0.80) for Heavy, Medium and Low Taggers; series: Basic, UnPers, Pers]

Fig. 6 Average Precision of our approach (Hyb strategy). [Plot: Average Precision (0.00 to 0.80) for Heavy, Medium and Low Taggers; series: Hyb with α = 0.25, 0.50, 0.75]

of all queries submitted by heavy, medium and low taggers. The obtained results are reported in Figs. 5, 6, 7 and 8.

From the analysis of these figures we can conclude that, in all scenarios, both the Precision and the Recall of our approach are better than those achieved by the Basic one. This result indicates that the query expansions and profile enrichments performed by our approach are really able to help a user when he queries a folksonomy. From Figs. 5 and 7 it is possible to observe that UnPers (resp., Pers) privileges Recall (resp., Precision) over Precision (resp., Recall). As for Hyb (Figs. 6 and 8), we can observe that the Precision and the Recall achieved by it lie between those achieved by UnPers and Pers and can be tuned by choosing suitable values of α; this confers a high degree of flexibility on this strategy. As an example, in a real scenario, a user with a consolidated knowledge in some fields and no knowledge in other related ones (e.g., a student having a good knowledge of SQL but no knowledge of XQuery) could start by choosing a high value of α to obtain a wide set of bookmarks for each of his queries. Then, he could access these bookmarks and refine his knowledge about the new fields accordingly. After this, he could decide to set α to a low value to

Fig. 7 Average Recall of our approach (Basic, UnPers, Pers strategies). [Plot: Average Recall (0.00 to 0.80) for Heavy, Medium and Low Taggers; series: Basic, UnPers, Pers]

Fig. 8 Average Recall of our approach (Hyb strategy). [Plot: Average Recall (0.00 to 0.80) for Heavy, Medium and Low Taggers; series: Hyb with α = 0.25, 0.50, 0.75]

process his next queries; this would allow him to receive, for each query, a narrow set of bookmarks strictly adhering to his expectations. The behaviours of UnPers, Pers and Hyb are directly explained by considering the meaning of Precision and Recall and by taking into account the theoretical observations reported in Sect. 3.3, which are fully confirmed by this experiment.

Our approach is particularly useful to medium and low taggers. The frequency of the tagging activities performed by these users is low and, hence, their knowledge about available bookmarks and tags is vague. Thus, without any suggestion, their queries would produce imprecise results because many irrelevant bookmarks could be retrieved and, at the same time, many relevant bookmarks could be filtered out. As a consequence, for these groups of users, the Precision and the Recall achieved by the Basic approach are quite low. By contrast, when our approach is applied, both Precision and Recall increase considerably and achieve values comparable with those obtained by heavy taggers.

From the analysis of Figs. 5, 6, 7 and 8 it is possible to infer a further observation; specifically, in the presence of low taggers, the choice between UnPers and Pers

or the tuning of the value of the α parameter in Hyb does not strongly influence the system performance. This behaviour can be explained by considering that the profiles of low taggers are quite poor. The difference between UnPers and Pers is that, to perform its suggestions, the former strategy considers only user queries and resource profiles, whereas the latter also considers user profiles. If user profiles are poor, the two strategies tend to coincide; this is also the reason why the choice of the α parameter in Hyb does not considerably affect the values of Precision and Recall for low taggers. The opposite behaviour can be observed for heavy taggers; in fact, they have very rich profiles and, hence, the choice between UnPers and Pers or the tuning of the value of the α parameter in Hyb influences the values of Precision and Recall more evidently.

4.2 Analysis of our approach’s capability of supporting a user to retrieve little labelled bookmarks

In this section we are interested in verifying whether our system is able to find ll-bookmarks, i.e., bookmarks labelled by fewer than five tags. This is a challenging issue because some of these bookmarks could be relevant to the user but are hard to retrieve by applying the Basic approach. In fact, the overlap between the tags associated with a user query and those labelling an ll-bookmark is often empty and, hence, the Basic approach would fail to retrieve this bookmark.

In order to verify whether our system is able to retrieve ll-bookmarks we considered the three categories of users introduced in the previous section, i.e., heavy, medium and low taggers. For each user category we considered the queries submitted by the corresponding members. For each submitted query we considered the corresponding set CorrectRS, as defined in Sect. 4.1, and we computed the percentage of its components that are ll-bookmarks.
After this, for each user category, we averaged the percentages of ll-bookmarks obtained for all queries submitted by the corresponding members. The final results are reported in Figs. 9 and 10.
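The procedure just described can be sketched as follows. The sketch assumes that each query is available as a pair (user category, CorrectRS) and that the tags of each bookmark are stored in a dictionary; the names `ll_percentage`, `average_by_category`, `tags_of` and the toy data are hypothetical, and returning 0.0 for an empty CorrectRS is an assumption.

```python
from collections import defaultdict
from statistics import mean

LL_THRESHOLD = 5  # an ll-bookmark is labelled by fewer than five tags


def ll_percentage(correct_rs, tags_of):
    """Percentage of the bookmarks in CorrectRS that are ll-bookmarks."""
    if not correct_rs:
        return 0.0  # assumption: an empty CorrectRS contributes 0%
    ll_count = sum(1 for b in correct_rs if len(tags_of[b]) < LL_THRESHOLD)
    return 100.0 * ll_count / len(correct_rs)


def average_by_category(queries, tags_of):
    """Average the per-query percentages over each user category.

    queries -- iterable of (user_category, correct_rs) pairs, one per query
    """
    per_category = defaultdict(list)
    for category, correct_rs in queries:
        per_category[category].append(ll_percentage(correct_rs, tags_of))
    return {category: mean(values) for category, values in per_category.items()}


# Hypothetical toy data: tags labelling each bookmark.
tags_of = {
    "b1": {"sql"},
    "b2": {"sql", "db", "query", "xml", "web"},
    "b3": {"xquery", "xml"},
    "b4": {"a", "b", "c", "d", "e", "f"},
}
queries = [
    ("heavy", {"b1", "b2"}),  # one ll-bookmark out of two
    ("heavy", {"b1", "b3"}),  # two ll-bookmarks out of two
    ("low", {"b2", "b4"}),    # no ll-bookmark
]
result = average_by_category(queries, tags_of)
print(result)
```

Note that the percentage is averaged per query and then per category, mirroring the order of the steps in the text, rather than being computed once over the union of all CorrectRS sets.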

Fig. 9 Percentage of ll-bookmarks retrieved by our approach (Basic, UnPers, Pers strategies). [Plot: Percentage of ll-bookmarks (0% to 50%) for Heavy, Medium and Low Taggers; series: Basic, UnPers, Pers]

Fig. 10 Percentage of ll-bookmarks retrieved by our approach (Hyb strategy). [Plot: Percentage of ll-bookmarks (0.00 to 0.50) for Heavy, Medium and Low Taggers; series: Hyb with α = 0.25, 0.50, 0.75]

From the analysis of these figures it is possible to observe that, with the Basic approach, from 24% (for low taggers) to 29% (for heavy taggers) of the retrieved bookmarks were ll-bookmarks; moreover, the percentage of ll-bookmarks retrieved by UnPers (resp., Pers, Hyb) ranges from 35% (resp., 33%, 34%), obtained for low taggers, to 42% (resp., 38%, 40%), obtained for heavy taggers. This improvement can be explained as follows. Our approach differs from the Basic one because it suggests to users some tags to enrich their queries. Suggested tags are “authoritative” (i.e., semantically relevant and commonly used) tags of the folksonomy that have been estimated to be sufficiently related to user queries. Users can exploit suggested tags to enrich their queries; as a consequence, enriched queries contain “authoritative” tags, and this lowers the probability of finding an empty overlap between the tags present in user queries and those labelling ll-bookmarks. Therefore, the additional tags suggested by our approach can create a bridge between user queries and ll-bookmarks.

From a further analysis of Figs. 9 and 10 it is possible to observe that UnPers is able to discover a higher number of ll-bookmarks than Pers and Hyb. This fact can be explained by considering that UnPers suggests tags only on the basis of their PageRank; as a consequence, it selects the most “authoritative” tags in the folksonomy that appear somehow related to user queries. By contrast, in Pers, candidate tags not related to the user profile are filtered out, even if they are “authoritative”; the absence of these tags in the expanded queries lowers the probability of retrieving ll-bookmarks. Therefore, the percentage of ll-bookmarks discovered by Pers is lower than the percentage discovered by UnPers. Finally, Hyb shows an intermediate behaviour between those characterizing UnPers and Pers.
In order to further investigate our intuition about the role of the “authoritative” tags in retrieving ll-bookmarks, we computed the Average Precision and the Average Recall for ll-bookmarks obtained by Basic, UnPers, Pers and Hyb (with α set to 0.5). The obtained results are reported in Figs. 11 and 12.

From the analysis of these figures it is possible to observe that the Average Precision for ll-bookmarks obtained by the four approaches is comparable with the one obtained by the same approaches for all bookmarks (see Figs. 5, 6 and 11). This result can be

Fig. 11 Average Precision of our approach (Basic, UnPers, Pers, Hyb strategies) for ll-bookmarks. [Plot: Average Precision (0.00 to 0.80) for Heavy, Medium and Low Taggers; series: Basic, UnPers, Pers, Hyb with α = 0.5]

Fig. 12 Average Recall of our approach (Basic, UnPers, Pers, Hyb strategies) for ll-bookmarks. [Plot: Average Recall (0.00 to 0.80) for Heavy, Medium and Low Taggers; series: Basic, UnPers, Pers, Hyb with α = 0.5]

explained by considering that the low number of tags labelling ll-bookmarks has a strong impact on the ability of a system to find these bookmarks. However, once found, they are bookmarks like any other and, therefore, have the same probability of being of interest to the user.

The low number of tags characterizing ll-bookmarks should have a strong impact on Recall. This intuition is confirmed by examining Figs. 7, 8 and 12, where we can see that the Average Recall obtained for ll-bookmarks is substantially lower than the one obtained for all bookmarks. This behaviour is explained by the fact that the probability of having an empty overlap between the tags composing a user query or a user profile and those labelling an ll-bookmark is high; when this happens, the involved ll-bookmark is filtered out and not suggested to the user, and this lowers Recall. Such a problem is very evident in the Basic approach, where no query expansion activity is executed; on the contrary, it is highly mitigated in our approach. In fact, our approach is based on the idea of expanding a query with “authoritative” tags and, as

specified above, these tags are very useful to create a bridge between user queries and ll-bookmarks. This also explains why the approach showing the best performance for ll-bookmarks is UnPers, which appends to a user query the most “authoritative” tags somehow correlated to those already present therein, without considering whether these tags are stored in the user profile. Pers suffers from the presence of ll-bookmarks more than UnPers since it filters out those “authoritative” tags not present in the user profile. Finally, Hyb shows an intermediate behaviour between those characterizing UnPers and Pers.

4.3 Comparative analysis with some query expansion approaches

In this section we present a quantitative comparison between our approach and some approaches for resource recommendation in folksonomies based on query expansion; in particular, we consider the approaches described in Zanardi and Capra (2008) and Sigurbjornsson and van Zwol (2008). The recommendation strategy followed by these approaches is “reactive”, in the sense that they perform their recommendations as a reaction to a user query. These approaches, along with their similarities and differences with ours, are described in detail in Sect. 5.

We implemented the two approaches described above and ran both of them and our approach on the dataset and with the test users described in Sect. 4. As for our approach, we selected the Hybrid configuration with α = 0.5. This choice is justified by the fact that, according to the results presented in Sect. 4.1, it provides the best trade-off between Precision and Recall. Moreover, in our approach, we chose the content-based component of X-Compass as the underlying recommender system.

We asked each test user to submit four queries and we ran the three systems under evaluation to answer these queries. For each submitted query we computed the corresponding values of Precision Pre and Recall Rec achieved by each system.
Since Precision and Recall specify the performance of a system from two totally different points of view, we also used a combined measure of them, namely F-Measure (Baeza-Yates and Ribeiro-Neto 1999). This represents the harmonic mean between Precision and Recall and is defined as:

$$\mathit{F\text{-}Measure} = 2 \cdot \frac{\mathit{Pre} \cdot \mathit{Rec}}{\mathit{Pre} + \mathit{Rec}}$$

F-Measure ranges in the real interval [0, 1]; the higher it is, the better a system works. At the end of this experiment we averaged the values of Precision, Recall and F-Measure across all submitted queries. In Table 4 we report the obtained results.

From the analysis of this table it is possible to observe that the three systems show different behaviours; this depends essentially on the different strategies used by them to rank tags. In more detail, we can observe that Social Ranking generally obtains a higher Recall but a lower Precision than our system. This behaviour can be explained by considering that in Social Ranking a tag is included in the expanded query if it is similar to at least one of the tags appearing in the original query. As a consequence, the expanded query could contain many tags. This implies that only few resources relevant to the user are missed by this system, and this explains the high values of Recall achieved by it. However, it may happen that some appended tags are very similar to only a few tags of the original query and quite dissimilar from all the remaining ones. This could lead

Table 4 Average Precision, Average Recall and Average F-Measure achieved by our approach and those of Zanardi and Capra (2008) and Sigurbjornsson and van Zwol (2008)

Approach                                          Avg. Precision   Avg. Recall   Avg. F-Measure
Social Ranking (Zanardi and Capra 2008)           0.45             0.56          0.49
Approach of Sigurbjornsson and van Zwol (2008)    0.65             0.38          0.47
Our approach                                      0.60             0.52          0.55

Social Ranking to propose to the user also resources not really relevant to him, which lowers the values of Precision.

A different behaviour emerges from the analysis of the approach of Sigurbjornsson and van Zwol (2008). In fact, the tag selection strategy adopted by this approach is “severe”: a tag must undergo various tests before it is deemed suitable to enrich a user query. This means that this approach associates a high score with few tags which are presumably very effective to expand the user query, and this explains the high value of Precision obtained by it. However, a low score is associated with most of the candidate tags, including those which could, at least partially, satisfy user needs. As a consequence, many tags can have almost the same, very low, associated score; if this happens, the approach of Sigurbjornsson and van Zwol (2008) could be unable to distinguish the tags partially useful for the user from those useless for him, and could fail to propose some useful tags; this lowers the value of Recall.

Our approach ranks tags on the basis of their PageRank values in TRG and TUG; then, a tag is appended to a query if it is related to the query’s tags and is “authoritative” in TRG and TUG. Since “authoritative” tags generally point to many resources, our approach can propose a large set of resources to the user and, therefore, may obtain a good value of Recall. In addition, due to the adoption of the Borda Count technique, our approach selects a tag if it is “globally related” to most of the tags forming the original user query; in this way many “accidental” tags are filtered out, and this explains the high values of Precision achieved by our approach.

The values of Precision and Recall achieved by our approach are very close to the maximum values, which are obtained by the approach of Sigurbjornsson and van Zwol (2008) (as for Precision) and by Social Ranking (as for Recall).
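The role played by the Borda Count in filtering out “accidental” tags can be illustrated with a minimal sketch of a standard Borda Count. It assumes that each tag of the original query yields one ranked list of candidate tags, that a candidate at 0-based position k in a list of length n receives n - k points (and no points from lists in which it does not appear), and that ties are broken alphabetically. The function name `borda_merge`, the `top_k` parameter and the toy lists are illustrative; the exact scoring variant adopted by our approach is not restated in this section.

```python
from collections import defaultdict


def borda_merge(ranked_lists, top_k=None):
    """Merge several ranked candidate lists with a standard Borda Count."""
    scores = defaultdict(int)
    for ranking in ranked_lists:
        n = len(ranking)
        for position, tag in enumerate(ranking):
            # Assumed scoring rule: first place gets n points, last place gets 1.
            scores[tag] += n - position
    # Highest score first; ties are broken alphabetically (an assumption).
    merged = sorted(scores, key=lambda tag: (-scores[tag], tag))
    return merged if top_k is None else merged[:top_k]


# Hypothetical candidate lists, one per tag of a three-tag query.
candidate_lists = [
    ["database", "query", "xml"],
    ["xml", "database", "xpath"],
    ["database", "xml", "tutorial"],
]
merged = borda_merge(candidate_lists)
print(merged)
print(borda_merge(candidate_lists, top_k=2))
```

In this toy example, “query”, “xpath” and “tutorial” each appear in only one list and end up at the bottom of the merged ranking, whereas “database” and “xml”, which are related to all three query tags, are ranked first.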
However, if we consider the F-Measure, we can observe that our approach achieves the best value. This implies that it is the most adequate when the user wants to obtain a trade-off between Precision and Recall without privileging one of them to the detriment of the other.

4.4 Comparative analysis with some profile enrichment approaches

In this section we propose a quantitative comparison between our approach and some approaches for resource recommendation in folksonomies based on the enrichment of the profiles of involved resources and users; in particular, we consider the approaches described in Tso-Sutter et al. (2008) and Zhao et al. (2008). Differently from the approaches described in the previous section, which recommend resources as a reaction to user queries, these approaches proactively suggest resources to users on the

Table 5 Average Precision, Average Recall and Average F-Measure achieved by our approach and those of Tso-Sutter et al. (2008) and Zhao et al. (2008). Each approach uses (the collaborative filtering component of) X-Compass as underlying recommender system

Approach                                Avg. Precision   Avg. Recall   Avg. F-Measure
Approach of Tso-Sutter et al. (2008)    0.31             0.43          0.34
Approach of Zhao et al. (2008)          0.46             0.28          0.32
Our approach                            0.42             0.39          0.40

basis of the similarity of the corresponding profiles. The recommendation strategy followed by them is clearly a collaborative filtering one. These approaches, along with their similarities and differences with ours, are described in detail in Sect. 5.

Analogously to the experiment described in Sect. 4.3, we implemented the approaches of Tso-Sutter et al. (2008) and Zhao et al. (2008) and ran both of them and our approach on the dataset and with the test users described in Sect. 4. As for our approach, we selected the Hybrid configuration with α = 0.5. As for the approach of Tso-Sutter et al. (2008), we set the parameter λ to 0.4, as suggested by the authors. In order to perform a more uniform evaluation of the three approaches we chose X-Compass (specifically, its collaborative filtering component) as the support recommender system for them.

In order to compare the three approaches we asked our test users to exploit each of them on our folksonomy fragment for a week; the information gathered on user activity allowed the three systems to construct rich user profiles. After this period, we performed the proactive resource recommendation activity by means of the three approaches. For each approach, we computed Precision, Recall and F-Measure for each performed resource recommendation and we averaged the corresponding values. The obtained results are reported in Table 5.

The analysis of this table indicates that: (i) the approach of Tso-Sutter et al. (2008) achieves the best Recall and the lowest Precision; (ii) the approach of Zhao et al. (2008) obtains the best Precision and the lowest Recall; (iii) our approach achieves high values of Precision and Recall and the best value of F-Measure. Observe that the values of Precision, Recall and F-Measure obtained in this experiment are generally lower than those obtained in the experiment described in Sect. 4.3; this trend was expected and is fully explainable if we consider that the resource recommendation in this experiment is proactive, whereas in the previous experiment it was reactive.

The results of Table 5 can be explained by considering the policies that each approach uses for selecting tags and building user profiles. In fact, the approach of Tso-Sutter et al. (2008) uses all the tags present in the folksonomy (even those infrequently exploited by users) to complete the user-item matrix. As a consequence, since no filter is applied on tags, no resource somehow related to user needs is missed, and this implies high

values of Recall. However, some of the proposed resources could be uninteresting to the user, and this lowers Precision.

The opposite behaviour can be observed for the approach of Zhao et al. (2008); in fact, in this approach, the similarity of two users is high if the value of the maximum weight matching computed on a graph representing the sets of tags used by both of them is high. This condition is reasonably satisfied if: (i) the two users share a large number of tags, and (ii) the tags used by them are strongly related to each other. This behaviour offers solid guarantees of recommending resources really relevant to a user, and this has a positive impact on Precision; however, it is very “severe” and, therefore, it could discard resources potentially interesting for the user, and this has a negative impact on Recall.

Our approach applies a voting mechanism to selected tags. Due to this mechanism, only tags strongly related to user needs and desires are added to his profile. Therefore, if two users have similar profiles, they really share the same needs and desires. As a consequence, when X-Compass is applied to the user profiles handled by our approach, it can achieve good values of Precision. The tags selected by our approach are also “authoritative” and, therefore, each of them presumably refers to many resources; this allows X-Compass to retrieve a large number of resources, with a positive impact on Recall.

4.5 Summary of the limitations of our approach

After having examined both the technical details of our approach and the tests carried out to evaluate its performance, we are able to discuss its main limitations. These can be summarized as follows:

• Our approach considers the “authoritativeness” of tags in selecting other tags useful to expand user queries. These tags are likely to point to many relevant resources but they might not be related to real user needs and preferences.
As a consequence, the usage of “authoritative” tags may imply the insertion, in the set of retrieved results, of some resources which are later regarded as “noise” by the user. This leads our approach to achieve a Precision lower than the one obtained by other approaches, like those described in Sigurbjornsson and van Zwol (2008) and Zhao et al. (2008).
• Our approach does not consider the specific reference context when it proposes a tag to the user. Actually, a tag might have different meanings in different contexts (think, for instance, of the tag “apple” in a context referring to fruits and in another one referring to computer science); on the other hand, different tags might represent the same meaning in a given context (think, for instance, of the tags “employee” and “subordinate” in a context referring to jobs). Finally, a community of users may create its own vocabulary (possibly including acronyms and jargon terms) and the meaning of a tag within a community could be totally different from the one it has in common language. This problem could lead our system to provide its users with some suggestions that are rejected by them. Taking the reference context into account could help to solve this problem. For this purpose the algorithms proposed in De Meo et al. (2009) could be applied.

These algorithms aim at finding synonymies and homonymies between tags; they perform this task by grouping tags into semantically homogeneous clusters, each of which can be seen as a context. In this way a reference context could be associated with each tag. Another possible improvement which could help to solve this problem would be to use the R-relatedness and the U-relatedness properties jointly, rather than considering them as independent similarity indicators. Following this idea, two tags would be considered related in a given context if many resources referring to this context have been jointly labelled by them and, at the same time, many users have exploited them to label these resources.
• Our approach applies a voting mechanism based on the Borda Count technique to select the candidate tags for expanding user queries and to filter out the tags considered not relevant. Some of the filtered tags could point to resources partially relevant to the user. This leads our approach to obtain a Recall lower than the one achieved by other approaches, like those described in Tso-Sutter et al. (2008) and Zanardi and Capra (2008).
• Our approach uses some thresholds which can be tuned to prioritise Precision over Recall, or vice versa. This increases its flexibility. Clearly, setting these thresholds requires a certain level of experience (and a certain effort) from the user in order to maximize the performance of our approach. However, from our experiments, we have found a parameter configuration which can guarantee the best trade-off between Precision and Recall; this configuration can be regarded as the most satisfactory one by most of the users of our approach.

4.6 Conclusions

From the analysis of the experiments presented in this paper it is possible to draw the following conclusions:

• A user who wants to receive many resource proposals, running the risk that a certain number of them are uninteresting, can choose UnPers or Hyb with a value of α close to 1.
• A user who wants to receive only a few but reliable resource proposals, running the risk of filtering out also resources potentially interesting for him but not labelled with tags already present in his profile, can choose Pers or Hyb with a value of α close to 0.
• A user who does not want to privilege Precision over Recall, or vice versa, can choose Hyb with a value of α close to 0.5.
• In the presence of low taggers the differences among the three strategies are quite slight. By contrast, the differences discussed in the previous items are much stronger for heavy taggers.
• UnPers is able to discover a higher number of ll-bookmarks than Pers and Hyb.

5 Related works

In this section we examine some approaches conceived to produce recommendations to folksonomy users and highlight their similarities and differences with respect


to our approach. An extensive survey on the usage of semantic technologies in the context of Web 2.0 and, in particular, of folksonomies can be found in Torre (forthcoming). In Zanardi and Capra (2008) the authors propose Social Ranking, an approach to answer queries of folksonomy users by applying the collaborative filtering technology. In Social Ranking a tag similarity measure is defined; this measure considers how many times a tag has been used in resource labelling. A user query Q is parsed and, for each tag ti ∈ Q, the tags most similar to ti are selected and appended to Q. The selection of the tags completing Q is based on the K-Nearest Neighbour algorithm. There are some similarities between our approach and Social Ranking; specifically, both of them: (i) provide a mechanism to expand user queries; (ii) represent a folksonomy as a three-dimensional data structure relating users, tags and resources and provide a mechanism to map it into two bi-dimensional data structures; these consist of two square matrixes in Social Ranking and in two graphs in our approach. As for the main differences we observe that: (i) in Social Ranking queries are expanded by applying the K-Nearest Neighbour search algorithm; on the contrary, our approach performs multiple IDDFS searches on both TRG and TUG to build lists of candidate tags and merges these last ones by applying the Borda Count technique; (ii) the notion of tag similarity adopted in Zanardi and Capra (2008) is symmetric; on the contrary, our approach defines a pair of asymmetric measures inspired to support and confidence measures introduced in association rules. In Sigurbjornsson and van Zwol (2008) the authors propose a tag recommendation strategy to annotate photos in Flickr. Specifically, given a photo p that a user wants to label and a set T of tags already specified by him for this purpose, this approach builds a list of candidate tags for each tag in T on the basis of their co-occurrence frequency. 
A score is associated with each candidate tag; this score depends on many parameters (for instance, very general tags like “2006” are regarded as not significant and, then, they are penalised). We can recognise some similarities between our approach and the one proposed in Sigurbjornsson and van Zwol (2008). Specifically, both of them: (i) construct lists of candidate tags for each tag specified by the user; (ii) propose a technique to score candidate tags and to generate a final list of recommendations. As for the main differences, we observe that: (i) the goal of the approach of Sigurbjornsson and van Zwol (2008) is quite different from our approach’s one; in fact, the approach of Sigurbjornsson and van Zwol (2008) helps a user to choose the tags best describing the photos he is willing to post; on the contrary, our approach has been designed to help a user to find resources relevant to his needs and/or desires; (ii) the process of identifying candidate tags proposed in Sigurbjornsson and van Zwol (2008) relies on a statistical analysis of tag properties; on the contrary, our approach assigns a score to each tag depending on both its “importance” in TRG and TUG and its distance from the starting node. In Jaschke et al. (2007) the authors propose FolkRank, an approach to identify the potentially relevant tags to a folksonomy user. FolkRank converts a folksonomy into a tripartite and undirected graph whose nodes represent tags, users and resources. After this, it applies the PageRank algorithm twice on this graph; the first time it applies a modified version of PageRank which considers user preferences, whereas the second


time it applies the traditional PageRank. Finally, it computes the difference of fixpoints returned by these two executions and suggests to the user the tags corresponding to the top-N nodes of the difference vector. There are some similarities between our approach and FolkRank; specifically, both of them: (i) rely on a weight propagation algorithm which allows the computation of the importance of a tag; (ii) consider user preferences in the computation of the importance of a tag. As for the main differences between FolkRank and our approach we observe that: (i) FolkRank applies PageRank on a tripartite graph representing users, resources and tags; on the contrary, our approach applies PageRank on two directed graphs representing only tags; (ii) our approach is query-dependent, since its recommendations depend on the query submitted by the user; by contrast, FolkRank is query-independent; (iii) FolkRank applies a modified version of PageRank which relies on a preference vector whose components model user preferences; as a consequence, it produces a list of folksonomy tags sorted on the basis of these preferences; by contrast, our approach uses relationships among tags to enrich queries and profiles which are, then, processed by a content-based or a collaborative filtering Recommender System; this allows folksonomy resources to be ranked according to user preferences; in addition, in our approach, the rank of a tag is computed by applying the traditional version of PageRank on TRG and TUG, and, then, it depends only on the topological properties of these graphs; (iv) our approach associates a score with each tag; this score depends on the rank of the tag as well as on its distance, in TRG or TUG, from each tag appearing in the user query into examination; such a feature is not present in FolkRank because it is query-independent; (v) our approach computes two scores for each tag (and, therefore, it evaluates each tag from two different points of view); by contrast, 
FolkRank computes only one score for each tag. In Lerman et al. (2007) a probabilistic generative method to represent the behaviour of Flickr users is proposed. This method applies the Expectation Maximization algorithm to compute the probability that a photo is relevant to a given user. There are few similarities between our approach and the one of Lerman et al. (2007). Specifically, both of them: (i) assume that tags are a reliable indicator of user interests; (ii) exploit information about user interactions to personalise search results; this information is encoded in the U-relatedness concept in our approach whereas it coincides with the list of user contacts in the approach of Lerman et al. (2007). The main differences between the two approaches are the following: (i) the generative model adopted in Lerman et al. (2007) is very refined and capable of capturing relationships among tags, users and resources; experimental studies indicate that it achieves a high degree of accuracy and, then, that it effectively solves the information overload problem; however, some hypotheses concerning the underlying model (e.g., the fact that photos are tagged only by their owners) may be not valid in other contexts; (ii) the approach of Lerman et al. (2007) requires a time-consuming training phase; by contrast, our approach does not require any training activities. In Firan et al. (2007) the authors illustrate a tag-based recommender system for the users of Last.fm, a music portal allowing its users to tag available tracks. The approach of Firan et al. (2007) represents the profile of a user u as a list of tags; a score is associated with each tag t; this score depends on both the number of times


u listened to a track labelled with t and the number of times a track labelled with t has been listened to by the other users; overly popular tags are penalised. User profiles constructed in this way are then exploited by various recommender algorithms to produce their recommendations. There are some similarities between our approach and the one of Firan et al. (2007). Specifically, both of them: (i) represent a user profile as a set of tags; (ii) associate a relevance coefficient with each tag in the user profile. As for the main differences, we observe that: (i) our approach provides a methodology that expands user queries on the basis of the relationships among tags; on the contrary, the approach of Firan et al. (2007) does not perform any query expansion and tracks the behaviour of a user to determine the list of tags best representing his preferences; (ii) our approach includes a relevance feedback mechanism to update user profiles; such a feature is not present in the approach of Firan et al. (2007); (iii) the approach of Firan et al. (2007) exploits several recommending strategies, each tailored to a specific kind of user; this feature is not available in our approach.

In Amer-Yahia et al. (2008a) the authors describe X.QUI.SITE, a system for producing recommendations in social tagging systems like del.icio.us. X.QUI.SITE has been designed to produce various kinds of recommendations; in fact, it can recommend resources, tags and people. Given a user u, it generates a set of seed users, i.e., users helpful in the discovery of recommendations for u. Specifically, it builds two sets of seed users: the former consists of all users sharing a significant number of tags with u, whereas the latter consists of all users who have bookmarked a large fraction of the URLs already bookmarked by u. After seed users have been constructed, X.QUI.SITE applies suitable algorithms on them to generate its recommendations.
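The construction of the two sets of seed users described above can be sketched as follows. This is only an illustration of the idea, not the X.QUI.SITE implementation: the function name `seed_users`, the two thresholds `min_shared_tags` and `min_url_fraction` (standing in for "a significant number of tags" and "a large fraction of the URLs") and the sample data are all assumptions made for the example.

```python
from typing import Dict, Set, Tuple


def seed_users(
    target: str,
    tags_by_user: Dict[str, Set[str]],
    urls_by_user: Dict[str, Set[str]],
    min_shared_tags: int = 2,       # hypothetical threshold
    min_url_fraction: float = 0.5,  # hypothetical threshold
) -> Tuple[Set[str], Set[str]]:
    """Return (tag-based seed users, URL-based seed users) for the target user."""
    target_tags = tags_by_user.get(target, set())
    target_urls = urls_by_user.get(target, set())
    tag_seeds: Set[str] = set()
    url_seeds: Set[str] = set()
    for user in set(tags_by_user) | set(urls_by_user):
        if user == target:
            continue
        # First kind: users sharing enough tags with the target user.
        shared_tags = target_tags & tags_by_user.get(user, set())
        if len(shared_tags) >= min_shared_tags:
            tag_seeds.add(user)
        # Second kind: users who bookmarked a large fraction of the target's URLs.
        if target_urls:
            shared_urls = target_urls & urls_by_user.get(user, set())
            if len(shared_urls) / len(target_urls) >= min_url_fraction:
                url_seeds.add(user)
    return tag_seeds, url_seeds


# Made-up sample data.
TAGS_BY_USER = {
    "u": {"jazz", "vinyl", "live"},
    "v": {"jazz", "vinyl"},
    "w": {"rock"},
}
URLS_BY_USER = {
    "u": {"a.example", "b.example"},
    "v": {"c.example"},
    "w": {"a.example", "b.example", "d.example"},
}

tag_seeds, url_seeds = seed_users("u", TAGS_BY_USER, URLS_BY_USER)
print(tag_seeds, url_seeds)  # {'v'} {'w'}
```

Keeping the two sets separate mirrors the description, in which the recommendation algorithms are applied to the seed users after they have been built.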
Some similarities between X.QUI.SITE and our approach can be recognised. Specifically, both of them: (i) use a notion of “context” to produce recommendations; in X.QUI.SITE this context is given by the set of seed users, whereas our approach uses TRG and TUG for this purpose; (ii) are able to operate with a wide range of recommender systems. As for the main differences, we observe that: (i) X.QUI.SITE has been designed to recommend not only resources but also tags and people; (ii) our approach is query-dependent whereas X.QUI.SITE is query-independent.

In Tso-Sutter et al. (2008) the authors propose an approach that exploits relationships among tags, users and resources to enhance the accuracy of a collaborative filtering recommender system. This approach first constructs a matrix, called user-item matrix, whose generic element M[i, j] indicates whether the user u_i has considered the resource r_j relevant to him. After this, it constructs two further matrices, starting from the user-item matrix, by considering the tags previously specified by users and the similarities among users, respectively. The recommender system to be enhanced is then applied to these two matrices and two independent sets of item ratings are obtained. The top-N ranked items are finally proposed to the user. As for the main similarities between our approach and the one of Tso-Sutter et al. (2008), we can observe that both of them: (i) use the “triadic” structure of folksonomies, i.e., consider the relationships among users, tags and resources; (ii) associate multiple ratings with a single resource (in the approach of Tso-Sutter et al. 2008)


or with a single tag (in our approach) and provide a methodology to combine these ratings. As for the main differences we observe that: (i) in the approach of Tso-Sutter et al. (2008) the quality of the provided recommendations is improved by suitably filling the empty elements of the user-item matrix; on the contrary, our approach carries out multiple IDDFS searches on both TRG and TUG and merges the obtained results by means of the Borda Count technique; (ii) the approach of Tso-Sutter et al. (2008) is query-independent whereas our approach is query-dependent.

In Xu et al. (2006) the authors propose an approach to identify a set of tags to label a folksonomy resource. Specifically, for each resource r, this approach considers the whole set T of tags adopted by all the folksonomy users to label r and associates a score, called goodness, with each tag t ∈ T; this score assesses how good t is at labelling r; it is computed by applying the HITS algorithm (Kleinberg 1999). Once goodnesses have been computed, a set of iterations based on the penalty-reward mechanism is performed; at each iteration, the tag t ∈ T with the highest goodness is selected, associated with r and removed from T. After this, the goodness of each tag t ∈ T is recomputed; specifically, it is reduced if the information provided by t is redundant with respect to the one provided by the already selected tags, whereas it is increased if t co-occurred with at least one of the already selected tags. The algorithm ends when a pre-defined number of iterations has been carried out. There are some similarities between the approach of Xu et al. (2006) and ours. Specifically, both of them: (i) associate a relative score, rather than an absolute one, with a tag; (ii) consider the co-occurrence patterns of tags in labelling a resource. As for the main differences between the two approaches, we observe that: (i) their goals are different; in fact, the approach of Xu et al.
(2006) aims at finding the set of tags best describing a resource, whereas our approach explores tag correlations (encoded in TRG and TUG) to determine if two users (or a user and a resource) are similar; (ii) in Xu et al. (2006) the goodness of a tag is computed by applying the HITS algorithm on the bipartite graph formed by users and tags, whereas our approach computes the score of a tag by applying the PageRank algorithm on TRG and TUG. In Zhao et al. (2008) a tag-based collaborative filtering recommender system is proposed. Given two users u and v, this system considers the set of items rated by them, along with the set T of tags used to label these items. Each tag of T is mapped onto a node of the Concept Tree of WordNet and the semantic similarity of two tags is determined by computing the length of the shortest path connecting the corresponding nodes. Once these similarities have been computed, a bipartite and weighted graph G is built. The nodes of G correspond to the tags of T ; the weight of an edge in G is set equal to the semantic similarity of the tags associated with the nodes linked by it. The maximum weight matching algorithm is, finally, applied on G to determine the similarity existing between u and v. Once all user similarities have been found, the system performs its recommendations as follows: for each user u it determines the top-N nearest users to u and applies any collaborative filtering recommender system on both u and each of these users to find those items to be recommended to u. The system of Zhao et al. (2008) and our own are similar in that both of them aim at improving the performance of a recommender system by handling additional information provided by tags. In our approach this is achieved by constructing patterns of


co-occurrences of tags, whereas in the approach of Zhao et al. (2008) this is obtained by exploiting WordNet to determine their similarity degree. As for the main differences between the two systems we observe that: (i) the system of Zhao et al. (2008) has been explicitly conceived to improve the performance of a collaborative filtering recommender system; by contrast, our system can support both content-based and collaborative filtering recommender systems; (ii) in our system tags are selected on the basis of their centrality in TRG and TUG, whereas in the system of Zhao et al. (2008) they are selected on the basis of their semantic similarity. In Schwarzkopf et al. (2007) the authors propose an approach to create profiles for folksonomy users. This approach first applies association rules to organise tags in a hierarchy. After this, given a tag t and a user u, it computes the Jaccard Coefficient between the set of tags exploited by u and that formed by t and all its sub-tags in the induced hierarchy; the resulting value indicates how much t is relevant to u. Relevant tags are inserted in the profile of u even if they have not been explicitly used by him. There are some similarities between our approach and the one of Schwarzkopf et al. (2007). Specifically, both of them: (i) explore the space of tags to create user profiles; (ii) analyse the co-occurrence patterns of tags to find relationships among tags; in both the approaches, these relationships can be intuitively represented as association rules. As for the main differences, we can observe that: (i) the approach of Schwarzkopf et al. 
(2007) organises tags in a hierarchical fashion; for this purpose it defines a subsumption relationship between tags which resembles our definition of R-relatedness; by contrast, our approach uses two graphs which encode two different kinds of relationships between tags (i.e., they specify if tags co-occur in labelling resources and if they are frequently adopted together by two users); (ii) the approach of Schwarzkopf et al. (2007) does not perform tag ranking tasks. In van Setten et al. (2006) the authors discuss the role of annotations in supporting users to find relevant information. They consider various kinds of annotations among which there are social annotations that correspond to our notion of tags. The authors conjecture that annotations not only describe a resource but also reflect the opinion of the “annotator” about it. They argue that the knowledge of the annotator is a precious tool to find useful information; in fact, if a user believes that a particular annotator is authoritative, then it is likely that the resources tagged by that annotator will be of interest to him. The authors present four case studies giving an anecdotal evidence of their theses and indicate some practical ways to include the knowledge of annotators in collaborative filtering and case-based recommender systems. As for the main similarities between our approach and the one of van Setten et al. (2006), we observe that both of them: (i) propose a methodology to handle controversies (i.e., the absence of a global consensus in evaluating the relevance of a tag to a user); to perform this task, the approach of van Setten et al. (2006) relies on information like the role or the affiliation of an annotator, whereas our approach uses the Borda Count technique; (ii) propose the usage of tags to enrich user profiles. 
As for the main differences we observe that: (i) our approach exploits tag relationships to help a user in finding information; by contrast, the approach of van Setten et al. (2006) relies on the knowledge of annotators; (ii) our approach provides a mechanism to rank tags; such a feature is not present in the approach of van Setten et al. (2006).
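The Borda Count merging mentioned above as our way of handling controversies among tag rankings can be illustrated with a minimal sketch of the classical technique. The function name `borda_merge`, the scoring convention (a tag in position p of a list of n tags, counting from 0, earns n − p points, and tags missing from a list earn nothing from it), the alphabetical tie-break and the sample tags are assumptions made for this example; they are not details taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence


def borda_merge(ranked_lists: Sequence[Sequence[str]], top_k: Optional[int] = None) -> List[str]:
    """Merge several ranked lists of candidate tags with a classical Borda Count."""
    scores: Dict[str, int] = defaultdict(int)
    for ranking in ranked_lists:
        n = len(ranking)
        for position, tag in enumerate(ranking):
            # The first tag of a list of n tags earns n points, the last earns 1.
            scores[tag] += n - position
    # Highest total first; ties broken alphabetically (an assumption of this sketch).
    merged = sorted(scores, key=lambda tag: (-scores[tag], tag))
    return merged if top_k is None else merged[:top_k]


# Hypothetical candidate lists, e.g. one derived from TRG and one from TUG.
trg_candidates = ["python", "programming", "tutorial"]
tug_candidates = ["python", "web", "programming"]

print(borda_merge([trg_candidates, tug_candidates]))
# ['python', 'programming', 'web', 'tutorial']
```

In this toy run "python" collects 3 + 3 = 6 points, "programming" 2 + 1 = 3, "web" 2 and "tutorial" 1, so a tag ranked highly in both lists outranks one that appears in a single list.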


In Carmagnola et al. (2007) the authors investigate on the contribution that the analysis of the tagging activity can bring to the construction of user profiles. Specifically, they describe an experimental analysis performed by them on the users of iCITY, a system that provides suggestions on cultural events and allows users to tag these events. Tags are analysed with the support of WordNet in such a way to classify them in various categories (e.g., subjective tags, if they reflect the user standpoint, or free tags, if they are not derived from the textual description of the cultural event). The tagging behaviours of users are then analysed by considering these categories in such a way to infer the corresponding user profiles (for instance, if a user generally selects the most popular tags, then it is possible to conclude that he has a high level of trust in the community members; if he often specifies the same tags, it is possible to conclude that he is inclined to regular habits, and so on). There are some similarities between our approach and the one of Carmagnola et al. (2007); specifically, both of them: (i) exploit tags to construct user profiles; (ii) build a knowledge base by inspecting the involved folksonomy; specifically, the approach of Carmagnola et al. (2007) adopts a set of categories to classify tags, whereas our approach represents the relationships between tags according to the notions of R-relatedness and U-relatedness. As for the main differences between these two approaches we observe that: (i) the approach of Carmagnola et al. (2007) focuses on building complex and refined user profiles by analysing the tagging behaviours, whereas our approach focuses on improving the quality of suggestions provided by a recommender system by including the knowledge coming from a folksonomy; (ii) in our approach tags are ranked according to their relevance for a user query; such a feature is not present in the approach of Carmagnola et al. (2007). 
In Table 6 we report a summary of all the related approaches along with their similarities and differences with ours.

Table 6 A comparison between our approach and the related ones

| System | Does it model user preferences? | Does it support profile enrichment? | Exploited technology |
|---|---|---|---|
| Our Approach | Yes | Yes | IDDFS and Borda count |
| Social Ranking (Zanardi and Capra 2008) | No | No | K-nearest neighbour |
| Approach of Sigurbjornsson and van Zwol (2008) | Yes | Yes | Statistical analysis of tags |
| FolkRank (Jaschke et al. 2007) | Yes | No | Personalised PageRank |
| Approach of Lerman et al. (2007) | Yes | No | EM algorithm |
| Approach of Firan et al. (2007) | Yes | No | Collaborative filtering |
| X.QUI.SITE (Amer-Yahia et al. 2008a) | No | No | Recommender system |
| Approach of Tso-Sutter et al. (2008) | Yes | Yes | Collaborative filtering |
| Approach of Xu et al. (2006) | No | No | HITS plus a penalty-reward algorithm |
| Approach of Zhao et al. (2008) | Yes | Yes | Collaborative filtering plus WordNet |
| Approach of Schwarzkopf et al. (2007) | Yes | Yes | Association rules |
| Approach of van Setten et al. (2006) | Yes | Yes | Usage of information on annotators |
| Approach of Carmagnola et al. (2007) | Yes | Yes | Construction of a knowledge base by tracking user behaviours |

[Table 6 also has the columns “Does it support query expansion?” and “Does it support a mechanism for scoring tags?”; their per-system Yes/No values could not be recovered from the extracted layout.]

6 Conclusions

In this paper we have presented an approach aimed at improving the quality of suggestions to users browsing a folksonomy. Our approach builds and maintains off-line two suitable data structures, namely a Tag Resource Graph (TRG) and a Tag User Graph (TUG); nodes in TRG (resp., TUG) represent tags, whereas an arc between two nodes indicates that the corresponding tags frequently co-occur in labelling resources (resp., users frequently exploit together the corresponding tags to label their resources). Each time a user submits a query Q, our system parses it and, for each encountered tag, constructs two lists of candidate tags, derived by browsing TRG and TUG, respectively. All the lists of candidate tags constructed in this way are then merged by applying the Borda Count technique. The final list of tags is presented to the user, who can accept or reject each tag of the list. Accepted tags are appended to Q and, along with those directly specified in Q by the user, are exploited to update his profile.

In the future we plan to extend our work by applying stemming techniques on query tags in such a way as to find all the morphological variants of a tag. Stemming techniques are a powerful tool to deal with the lexical structure of tags; they are able to find substrings contained in a tag denoting a more general concept than the tag itself. Moreover, we plan to analyse the possibility of modifying (instead of expanding) a user query, when either the system or the user realises that a query expansion is not enough to satisfy user needs. In addition, we plan to investigate the possibility of specifying semantic relationships among tags. For instance, these relationships could specify if two tags are synonymous (i.e., they represent the same concept yet have different names), homonymous (i.e., they represent different concepts yet have the same name), or one is a hyponym of the other (i.e., one has a more specific meaning than the other), and so on.

As a further extension of our approach we plan to study the possibility of constructing “compressed versions” of TRG and TUG by removing the least important nodes and/or by clustering semantically homogeneous nodes in such a way as to represent them by a unique “supernode”. The resulting graphs would contain a reduced number of nodes and arcs with respect to the original graphs, and this would significantly speed up the execution of the algorithms presented in this paper. Clearly, it would be necessary to investigate how much the approximations induced by these “graph compressions” impact on the quality of the provided recommendations.

Acknowledgments The authors thank the anonymous Referees whose valuable suggestions allowed them to greatly improve the quality of this paper.

Appendix A: Fast and approximate computation of R-relatedness and U-relatedness

In this section we illustrate a technique for efficiently computing the Jaccard coefficient $\frac{|A \cap B|}{|A \cup B|}$ of two sets $A$ and $B$. After this, we show how this technique can be adapted to approximately compute $|A \cap B|$ and $|A \cup B|$. This technique plays a key role in our approach in order to determine if two tags are R-related or U-related (see Definitions 2.4 and 2.5).

Consider a set $U$ (also known as the universe) and let $A$ and $B$ be two subsets of $U$. Let $P$ be a permutation of $U$ and let $\pi_A(P)$ (resp., $\pi_B(P)$) be the first element of $P$ belonging to $A$ (resp., $B$). Assume, now, that $\ell$ permutations of $U$ are performed independently and at random (here, $\ell$ is an arbitrary fixed integer) and let $P_1, \ldots, P_\ell$ be these permutations. We define the signature $sig_A$ (resp., $sig_B$) of $A$ (resp., $B$) as an array that satisfies the following properties: (i) $sig_A$ (resp., $sig_B$) has $\ell$ entries; (ii) $sig_A[i] = \pi_A(P_i)$ (resp., $sig_B[i] = \pi_B(P_i)$), $1 \le i \le \ell$.

Let us define a coefficient $\hat{j}$ as the ratio of the number of pairwise matches of $sig_A$ and $sig_B$ to $\ell$; in other words, $\hat{j}$ can be computed by counting how many times $sig_A[i] = sig_B[i]$, $i = 1, \ldots, \ell$, and by dividing this value by $\ell$. Interestingly enough, the computation of $\hat{j}$ requires only a scan of $sig_A$ and $sig_B$. It is possible to show that $\hat{j}$ is a good estimator of the Jaccard coefficient (Cohen 1997). With regard to this, Cohen (1997) showed that the probability that $\hat{j}$ differs from the true value of the Jaccard coefficient is exponentially decreasing with $\ell$. An interesting analysis about the tuning of $\ell$ has been performed in Chen et al. (2003); in this paper the authors showed that, in their reference scenario, $\ell$ should range between 20 and 100.

The main drawback of this technique regards the ability to efficiently generate $\ell$ permutations of $U$. This activity could be excessively time consuming and could make the described approach unfeasible. A solution to this problem has been outlined in (Broder 1997; Chen et al. 2003; Cohen 1997) where the authors propose to exploit suitable hash functions to emulate a permutation of $U$.

The previous result can be used also to efficiently compute the cardinality of both the intersection and the union of two sets. In fact, from set theory we have that $|A \cup B| = |A| + |B| - |A \cap B|$. Now, if $\hat{j}$ is a good estimator of the Jaccard coefficient we have that:

$$\hat{j} \simeq \frac{|A \cap B|}{|A \cup B|} \;\Rightarrow\; \hat{j} \simeq \frac{|A \cap B|}{|A| + |B| - |A \cap B|} \;\Rightarrow\; |A|\,\hat{j} + |B|\,\hat{j} - |A \cap B|\,\hat{j} \simeq |A \cap B| \;\Rightarrow\; |A \cap B| \simeq \frac{\hat{j}}{\hat{j}+1} \cdot (|A| + |B|);$$

in addition, since $|A \cup B| = |A| + |B| - |A \cap B|$, we have that

$$|A \cup B| \simeq |A| + |B| - \frac{\hat{j}}{\hat{j}+1}\,(|A| + |B|) = \frac{|A| + |B|}{\hat{j}+1}.$$
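The estimator can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `random_permutations`, `signature` and `estimate` are hypothetical, permutations are stored explicitly (the text suggests emulating them with hash functions when the universe is large), and the sample data are the sets and the four permutations of Example 6.1.

```python
import random
from typing import Hashable, List, Sequence, Set, Tuple


def random_permutations(universe: Sequence[Hashable], ell: int, seed: int = 0) -> List[List[Hashable]]:
    """Draw ell independent random permutations of the universe U (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.sample(list(universe), k=len(universe)) for _ in range(ell)]


def signature(subset: Set[Hashable], permutations: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """sig[i] is the first element of permutation P_i that belongs to the subset.

    Assumes the subset is a non-empty subset of U, so such an element always exists.
    """
    return [next(x for x in perm if x in subset) for perm in permutations]


def estimate(a: Set[Hashable], b: Set[Hashable],
             permutations: Sequence[Sequence[Hashable]]) -> Tuple[float, float, float]:
    """Return (j_hat, estimated |A ∩ B|, estimated |A ∪ B|)."""
    sig_a = signature(a, permutations)
    sig_b = signature(b, permutations)
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    j_hat = matches / len(permutations)
    size_sum = len(a) + len(b)                     # |A| + |B|, assumed to be known
    intersection = j_hat * size_sum / (j_hat + 1)  # j_hat / (j_hat + 1) * (|A| + |B|)
    union = size_sum / (j_hat + 1)                 # (|A| + |B|) / (j_hat + 1)
    return j_hat, intersection, union


# Sets and permutations of Example 6.1.
A = {1, 2, 3}
B = {2, 3, 4}
PERMS = [[5, 3, 4, 2, 1], [4, 3, 1, 5, 2], [1, 2, 4, 3, 5], [3, 1, 2, 4, 5]]

j_hat, inter, union = estimate(A, B, PERMS)
print(j_hat, inter, union)  # 0.5 2.0 4.0
```

Once the two signatures are available, the comparison is a single pass over their entries, so its cost depends on the number of permutations and not on the sizes of the two sets.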

As a consequence, $|A \cap B|$ and $|A \cup B|$ can be estimated if $\hat{j}$, $|A|$ and $|B|$ are known. $\hat{j}$ can be estimated by applying the technique illustrated above. $|A|$ and $|B|$ are immediately available if each set is provided with a supplementary variable denoting its current size. This variable is updated each time one or more elements are inserted in or removed from the set. This is an important result; in fact, it specifies that the computational complexity to determine if two tags are R-related (resp., U-related) is $O(\ell)$, rather than $O(\min\{|\pi^{RS}_{t_j}| \cdot \log |\pi^{RS}_{t_i}|, |\pi^{RS}_{t_i}| \cdot \log |\pi^{RS}_{t_j}|\})$ (resp., $O(\min\{|\pi^{US}_{t_j}| \cdot \log |\pi^{US}_{t_i}|, |\pi^{US}_{t_i}| \cdot \log |\pi^{US}_{t_j}|\})$), as required by a classical approach; at the same time we have strong guarantees that this estimation is highly reliable if $\ell$ is large enough.

Example 6.1 In order to illustrate how the technique presented above works, consider the following scenario: $U = \{1, 2, 3, 4, 5\}$; $A = \{1, 2, 3\}$; $B = \{2, 3, 4\}$. If we set $\ell = 4$ we must consider four permutations of $U$. As an example, consider the following permutations: $P_1 = \{5, 3, 4, 2, 1\}$, $P_2 = \{4, 3, 1, 5, 2\}$, $P_3 = \{1, 2, 4, 3, 5\}$, $P_4 = \{3, 1, 2, 4, 5\}$. In this case:

$sig_A[1] = \pi_A(P_1) = 3$, $sig_B[1] = \pi_B(P_1) = 3$,
$sig_A[2] = \pi_A(P_2) = 3$, $sig_B[2] = \pi_B(P_2) = 4$,
$sig_A[3] = \pi_A(P_3) = 1$, $sig_B[3] = \pi_B(P_3) = 2$,
$sig_A[4] = \pi_A(P_4) = 3$, $sig_B[4] = \pi_B(P_4) = 3$.

Since $sig_A[1] = sig_B[1] = 3$, $sig_A[2] \neq sig_B[2]$, $sig_A[3] \neq sig_B[3]$ and $sig_A[4] = sig_B[4] = 3$, we have that $\hat{j} = \frac{2}{4} = 0.5$, $|A \cap B| = \frac{0.5}{1.5} \cdot (3 + 3) = 2$, $|A \cup B| = \frac{3+3}{1.5} = 4$. In this example the values of $\hat{j}$, $|A \cap B|$ and $|A \cup B|$ obtained by applying the proposed heuristics coincide with the actual values of the Jaccard coefficient as well as with the actual values of $|A \cap B|$ and $|A \cup B|$.

References

Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 17(6), 734–749 (2005)
Amer-Yahia, S., Galland, A., Stoyanovich, J., Yu, C.: From del.icio.us to x.qui.site: recommendations in social tagging sites. In: Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD 2008), pp. 1323–1326. ACM Press, Vancouver (2008a)
Amer-Yahia, S., Markl, V., Halevy, A.Y., Doan, A., Alonso, G., Kossmann, D., Weikum, G.: Databases and web 2.0 panel at VLDB 2007. SIGMOD Rec. 37(1), 49–52 (2008b)
Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison Wesley Longman, Prentice Hall, Boston, MA (1999)
Broder, A.: On the resemblance and containment of documents. In: Proceedings of the International Conference on the Compression and Complexity of Sequences (SEQUENCES ’97), pp. 21–29. IEEE Computer Society, Positano (1997)
Carmagnola, F., Cena, F., Cortassa, O., Gena, C., Torre, I.: Towards a tag-based user model: how can user model benefit from tags? In: Proceedings of the International Conference on User Modeling (UM 2007), Corfù, Greece. Lecture Notes in Computer Science, pp. 445–449. Springer (2007)
Cattuto, C., Schmitz, C., Baldassarri, A., Servedio, V.D.P., Loreto, V., Hotho, A., Grahl, M., Stumme, G.: Network properties of folksonomies. Artif. Intell. Commun. 20(4), 245–262 (2007)
Chen, Z., Korn, F., Koudas, N., Muthukrishnan, S.: Generalized substring selectivity estimation. J. Comput. Syst. Sci. 66(1), 98–132 (2003)
Cohen, E.: Size-estimation framework with applications to transitive closure and reachability. J. Comput. Syst. Sci. 55(3), 441–453 (1997)
Delicious. http://del.icio.us/, 2009
De Meo, P., Garro, A., Terracina, G., Ursino, D.: Personalizing learning programs with X-Learn, an XML-based “user-device” adaptive multi-agent system. Inf. Sci. 177(8), 1729–1770 (2007a)
De Meo, P., Quattrone, G., Terracina, G., Ursino, D.: An XML-based multi-agent system for supporting online recruitment services. IEEE Trans. Syst. Man Ad Cyber. A 37(4), 467–480 (2007b)
De Meo, P., Quattrone, G., Ursino, D.: Exploitation of semantic relationships and hierarchical data structures to support a user in his annotation and browsing activities in folksonomies. Inf. Syst. 34, 511–535 (2009)
Firan, C.S., Nejdl, W., Paiu, R.: The benefit of using tag-based profiles. In: Proceedings of the Latin American Web Congress (LA-Web 2007), pp. 32–41. IEEE Computer Society, Santiago de Chile (2007)
Garg, N., Weber, I.: Personalized tag suggestion for Flickr. In: Proceedings of the International Conference on World Wide Web (WWW ’08), pp. 1063–1064. ACM Press, Beijing (2008)
Garruzzo, S., Modafferi, S., Rosaci, D., Ursino, D.: An XML-based agent model for supporting user activities on the web. Web Intell. Agent Syst. J. 4(2), 181–207 (2006)
Golder, S.A., Huberman, B.A.: Usage patterns of collaborative tagging systems. J. Inf. Sci. 32(2), 198–208 (2006)
Han, J., Kamber, M.: Data Mining: Concepts and Techniques, 2nd edn. Morgan Kaufmann Publishers, San Francisco (2006)
Heymann, P., Koutrika, G., Garcia-Molina, H.: Can social bookmarks improve web search? In: Proceedings of International Conference on Web Search and Data Mining (WSDM 2008), pp. 195–206. ACM Press, Stanford (2008)

123

A query expansion and user profile enrichment approach

85

Hotho, A., Jaschke, R., Schmitz, C., Stumme, G.: Information retrieval in folksonomies: search and ranking. In: Proceedings of the European Semantic Web Conference (ESWC’06), pp. 411–426. Lecture Notes in Computer Science. Springer, Budva (2006) Jaschke, R., Marinho, L.B., Hotho, A., Schmidt-Thieme, L., Stumme, G.: Tag recommendations in folksonomies. In: Proceedings of European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2007), pp. 506–514. Lecture Notes in Computer Science. Springer, Warsaw (2007) Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999) Lerman, K., Plangprasopchok, A., Wong, C.: Personalizing image search results on Flickr. In: Proceedings of the International Workshop on Intelligent Web Personalization (ITWP 2007), pp. 65–75. AAAI, Vancouver (2007) Li, R., Bao, S., Yu, Y., Fei, B., Su, Z.: Towards effective browsing of large scale social annotations. In: Proceedings of the International Conference on World Wide Web (WWW ’07), pp. 943–952. ACM Press, Banff (2007) Mika, P.: Ontologies are us: a unified model of social networks and semantics. Web Seman. 5(1), 5–15 (2007) Niu, W.T., Kay, J.: PERSONAF: framework for personalised ontological reasoning in pervasive computing. User Model. User-Adap. Inter. doi:10.1007/s11257-009-9068-2 (2009) Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall, Inc, Upper Saddle River (2002) Saari, D.G.: Chaotic Elections! A mathematician Looks at Voting. American Mathematical Society (2001) Schwarzkopf, E., Heckmann, D., Dengler, D., Kroner, A.: Mining the structure of tag spaces for user modeling. In: Proceedings of the International Workshop on Data Mining for User Modeling (DMUM’07), pp. 63–75. Corfù, Greece (2007) Sigurbjornsson, B., van Zwol, R.: Flickr tag recommendation based on collective knowledge. In: Proceedings of the International Conference on World Wide Web (WWW ’08), pp. 327–336. 
ACM Press, Beijing (2008) Specia, L., Motta, E.: Integrating folksonomies with the semantic web. In: Proceedings of the European Semantic Web Conference (ESWC 2007), pp. 624–639. Lecture Notes in Computer Science. Springer, Innsbruck (2007) Torre, I.: Adaptive systems in the era of the semantic and social web, a survey. User Model. User-Adap. Inter. 19(5), 433–486 (2009) Tso-Sutter, K.H.L., Marinho, L.B., Schmidt-Thieme, L.: Tag-aware recommender systems by fusion of collaborative filtering algorithms. In: Proceedings of the ACM Symposium on Applied Computing (SAC 2008), pp. 1995–1999. ACM Press, Fortaleza (2008) van Setten, M., Brussee, R., van Vliet, H., Gazendam, L., van Houten, Y., Veenstra, M.: On the importance of “Who Tagged What”. In: Proceedings of the International Workshop on the Social Navigation and Community based Adaptation Technologies, pp. 552–561. Dublin, Ireland (2006) Xu, Z., Fu, Y., Mao, J., Su, D.: Towards the semantic web: collaborative tag suggestions. In: Proceedings of the International Workshop on Collaborative Web Tagging. Edinburgh (2006) Zanardi, V., Capra, L.: Social ranking: uncovering relevant content using tag-based recommender systems. In: Proceedings of the International Conference on Recommender Systems (ACM RecSys 2008), pp. 51–58. ACM Press, Lausanne (2008) Zhao, S., Du, N., Nauerz, A., Zhang, X., Yuan, Q., Fu, R.: Improved recommendation based on collaborative tagging behaviors. In: Proceedings of the International Conference on Intelligent User Interfaces (IUI ’08), pp. 413–416. ACM Press, Gran Canaria (2008)
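As a hedged illustration, the arithmetic of the signature example can be reproduced with a short Python sketch. It assumes the estimate $\hat{j}$ is the fraction of signature positions on which the two signatures agree, and that the set sizes follow from $|A \cup B| = (|A| + |B|) / (1 + j)$ and $|A \cap B| = j \cdot |A \cup B|$, which is consistent with the numbers in the example. The function names are hypothetical, and the values used for position 2 of the signatures are placeholders chosen only so that they differ; positions 1, 3 and 4 use the values quoted in the example.

```python
from fractions import Fraction


def estimate_jaccard(sig_a, sig_b):
    """Return the fraction of positions on which two signatures agree."""
    if len(sig_a) != len(sig_b) or not sig_a:
        raise ValueError("signatures must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return Fraction(matches, len(sig_a))


def estimate_sizes(j_hat, size_a, size_b):
    """Derive estimates of |A ∩ B| and |A ∪ B| from j_hat and the set sizes.

    Uses |A ∪ B| = (|A| + |B|) / (1 + j) and |A ∩ B| = j * |A ∪ B|.
    """
    union = Fraction(size_a + size_b) / (1 + j_hat)
    intersection = j_hat * union
    return intersection, union


# Positions 1, 3 and 4 follow the example; position 2 holds assumed
# placeholder values that merely differ between the two signatures.
sig_a = [3, 1, 1, 3]
sig_b = [3, 2, 2, 3]

j_hat = estimate_jaccard(sig_a, sig_b)
intersection, union = estimate_sizes(j_hat, 3, 3)
print(j_hat, intersection, union)  # 1/2 2 4
```

Exact rational arithmetic (`fractions.Fraction`) is used here only to avoid floating-point rounding when comparing the estimates with the values reported in the example.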

Author Biographies

Pasquale De Meo received the Laurea Degree in Electrical Engineering from the University Mediterranea of Reggio Calabria in May 2002 and the Ph.D. in System Engineering and Computer Science at the University of Calabria in February 2006. His research interests include user modelling, intelligent agents, e-commerce, e-government, e-health, machine learning, knowledge extraction and representation, scheme integration, XML, Cooperative Information Systems, Folksonomies, Social Internetworking.


Giovanni Quattrone received the Laurea Degree in Electrical Engineering from the University Mediterranea of Reggio Calabria in July 2003 and the Ph.D. in Computer Science, Biomedical and Telecommunications Engineering at the University of Reggio Calabria in February 2007. His research interests include user modelling, intelligent agents, e-commerce, machine learning, knowledge extraction and representation, scheme integration, XML, Cooperative Information Systems, Folksonomies, Social Internetworking.

Domenico Ursino received the Laurea Degree in Computer Engineering from the University of Calabria in July 1995. From September 1995 to October 2000 he was a member of the Knowledge Engineering group at DEIS, University of Calabria. He received the Ph.D. in System Engineering and Computer Science from the University of Calabria in January 2000. From November 2000 to December 2004 he was an Assistant Professor at the University Mediterranea of Reggio Calabria. Since January 2005 he has been an Associate Professor at the University Mediterranea of Reggio Calabria. His research interests include user modelling, intelligent agents, e-commerce, knowledge extraction and representation, scheme integration and abstraction, semi-structured data and XML, Cooperative Information Systems, Folksonomies, Social Internetworking.
