SFViz: Interest-based Friends Exploration and Recommendation in Social Networks

Liang Gou1, Fang You2,*, Jun Guo2, Luqi Wu2, Xiaolong (Luke) Zhang1

1 College of Information Science & Technology, The Penn State University, University Park, PA, USA

2 School of Communication and Design, Sun Yat-Sen University, Guangzhou, China

[email protected], youfang@mail.sysu.edu.cn, [email protected], wuluqi@student.sysu.edu.cn, [email protected]

ABSTRACT Friend recommendation is popular in social network services to help people make new friends and expand their networks. Friend recommendation is either based on topological structures of a social network, or derived from profile information of users. However, dynamically recommending friends by considering both social connections and the context of those connections (e.g., similar interests) through visual exploration is not well supported by existing tools. In this paper, we propose a novel visual system, SFViz (Social Friends Visualization), to help users explore and find friends interactively under a context of interest. Our approach leverages both the semantic structure of activity data and the topological structures of social networks. In SFViz, a hierarchical structure of social tags is generated to help users navigate through a network of interest. Multiscale and cross-scale aggregations of similarity among people are presented in the hierarchy to help users seek potential friends. We report a case study using SFViz to explore recommended friends based on people’s tagging behaviors in a music community, Last.fm. The results indicate that our system can enhance users’ awareness of their social networks under different interest contexts, and help users seek potential friends sharing similar interests in an interactive way.

ACM Classification Keywords H5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous.

General Terms Design, Experimentation, Human Factors.

Keywords Social Recommendation, Social Tags, Social Network, SFViz.

*Corresponding author

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. VINCI’11, August 04- August 05 2011, Hong Kong, China. Copyright © 2010 ACM 978-1-4503-0875-5/11/08…$10.00.

INTRODUCTION Friend recommendation is a popular and important service in online social networks. This service suggests new potential friends so that people can expand their networks. Friend recommendation can utilize topological structures of a social network. In this approach, recommendation is purely based on how people are currently connected and whether friends of a person's existing friends can become new friends of that person. Recommendation can also include non-topological information. For example, one method is to use a user’s profile, such as education and professional background, to make recommendations. While this method can enhance recommendation performance, it largely focuses on static information and overlooks users' changing interests.

How can friends be recommended by considering both the existing social connections of a person and her dynamic interests? In this paper, we propose a novel approach that helps users explore and find friends interactively under different interests. More specifically, we first extract information about users' dynamic interests from the tags they create in a social network service. We construct tag networks to reflect a user’s interests, and a hierarchical tag structure that implies a knowledge structure shared among the people who generated the tags [1]. Then, we calculate similarities among people from the tag networks and social networks to recommend friends to users. We design a visualization system, SFViz (Social Friends Visualization), to help users explore recommendations under different interests in an interactive way.

This paper is organized as follows. Section 2 reviews relevant literature on tag visualization, social recommendation visualization, and profile-based and topology-based recommendation approaches. Section 3 presents our framework and a hybrid approach of social tags and social networks. Section 4 describes the visualization design and implementation of a system, SFViz, to support user interaction and exploration. Section 5 illustrates a case study of using SFViz for friend exploration and recommendation in a music community, Last.fm. The paper concludes with a discussion of the findings and implications of our research, and future research directions.

RELATED WORK AND BACKGROUND In this section we review relevant literature in the areas of tag visualization, social recommendation based on user profiles, recommendation based on network structures, and the visualization of recommendation results.

Tag Visualization The most common visualization design for tags is the tag cloud. In a tag cloud, popular tags are organized alphabetically and the size of a tag is weighted by the frequency of the tag [2, 3]. This approach primarily focuses on the presentation of tags and their popularity, but lacks in-depth information about relationships among tags. More sophisticated tag visualization designs have been proposed. For example, tags can be positioned based on their semantic relationships [4], or tag clouds can be organized in a multi-level structure to support semantic browsing [5]. Tags can also be embedded into certain knowledge structures, so that the tag visualization helps users see both individual tags and the associated knowledge structures. For example, Klerkx and Duval used a pre-defined class schema to group and organize tags [6]. Gou et al. [1] developed an algorithm to cluster and aggregate tags and build a multiscale tag network. However, the visualization of such tag networks is still a challenge. Designs using a node-link approach can easily overwhelm users when networks become large, and more importantly, presenting knowledge structures as a flat network may not accurately reflect what users have in their minds.

Profile-based Recommendation A social recommendation system based on user profiles can offer a user suggestions on information (e.g., relevant articles) or products (e.g., movie titles) based on certain characteristics of the user. The system usually uses a collection of profiles of the user, and then calculates the similarity between the user and others based on their profiles. The profiles or preferences are constructed either by explicitly asking users to add descriptions or rate items, or by implicitly tracking users’ behaviors, such as web pages visited and purchasing history. There are mainly three approaches in recommendation systems: content-based filtering, collaborative filtering, and hybrid.

Content-based filtering methods match the preferences or profiles derived from attribute information of users, such as their self-description and demographic data [7], and meta-data of items users visited or purchased [8, 9]. A user’s profile is represented as a vector of key terms, and similarity among user profiles can be calculated with machine-learning or information-retrieval techniques, such as naive Bayesian classifiers [10]. The limitations of this approach are twofold: first, it can only deal with textual information and cannot work with non-text items, such as music or images; second, it ignores the information of social actions between users, such as the common interests of people from the same community.

Collaborative filtering makes automatic predictions (filtering) about the interests of a user by collecting taste information (ratings of items) from many users. For example, a collaborative filtering system for books can predict which book a user may be interested in based on a partial list of the user's book tastes, as well as the book tastes of other users. Collaborative filtering algorithms can be either memory-based, which use pairwise user-user similarity [11, 12] or item-item similarity [13, 14] to make recommendations [15], or model-based, which rely on data-mining or machine-learning techniques to build prediction models from user rating data [16, 17]. One challenge collaborative filtering approaches face is the cold-start problem, which refers to the requirement for initial information of user ratings to make recommendations effectively [18].

Hybrid recommendation systems combine collaborative filtering and content information [19, 20]. For example, content-based predictors can be used as rating data to make recommendations with collaborative filtering [21].

Topology-based Recommendation Topology-based recommendation methods use intrinsic properties of network structure to determine similarity among nodes. Commonly used topology-based approaches include Jaccard similarity [22] and cosine similarity [9], both of which are based on the assumption that two nodes are more similar if they have more common neighbors. Adamic and Adar [23] proposed an improved method that assigns different weights to different nodes (e.g., more weight to nodes with lower degrees). Recommendation can also be based on preferential attachment [24], a phenomenon in which new links are more likely to connect to high-degree nodes. Newman [25] proposed a method that links the probability of a new link to the product of the degrees of two nodes. Research by Zhou et al. [26] focused on using local topology to calculate node similarities. Katz [27] used global topology information (e.g., the total number of simple paths between nodes, with lower weights given to longer paths) to compare node similarity. More complex methods that use global topology measures include SimRank [28], Leicht-Holme-Newman (LHN) [29], and P-Rank [30], all of which define node similarity measures recursively, i.e., two nodes are similar if their directly connected neighbors are themselves similar. Although global topology-based methods lead to more complete node similarity measures by considering a broader and better picture of the network, they are usually computationally expensive. Efforts have been made to reduce the complexity by using approximations. Gou et al. [31, 32] simplified the LHN method by clustering nodes hierarchically in a social network to reduce the graph size. Li et al. [33] approximated SimRank with an incremental updating method.

Visualization of social recommendation Some research has used visualization to facilitate the understanding of social recommendation. FaceTag [34] is a tool that helps users see the relationship between user-generated tags and recommended facets in a formalized classification scheme. Kammerer et al. [35] designed a tag-based search browser to recommend relevant tags for further search. However, research in this stream focuses only on information and meta-information concerning documents, and ignores the users who contributed such information and the relationships among those users. Several projects explore visualization of social recommendation in the context of social networks. PeerChooser [36] is a collaborative recommendation system in which users can not only explore a recommendation process visually, but also control at what level of granularity social filtering should be conducted. SmallWorlds [37] is an interactive system that allows users to see and manipulate their preferences and profiles used for social filtering, as well as recommendation results from different perspectives (e.g., recommended items, similar friends, and distant friends and items). These systems, with their graph-based visualization designs, do not fully take advantage of the structure of social tags to navigate social recommendation in the visualization.

SFViz: A Hybrid Approach of Social Tags and Social Networks To support interactive exploration and friend seeking in a social network based on interest, we developed the SFViz (Social Friends Visualization) system. This system is based on a framework that considers both social connections among users and user interests. Here, user interest information is derived from users' tagging behaviors.

SFViz Framework The SFViz framework is shown in Figure 1. This framework follows the idea of the reference model by Card et al. [38]. At the top, the data model takes social networks and social tags as input, and the data model is then transformed into visual forms that users can explore and interact with. Users can manipulate both the data model and the visual forms on demand. In the data model, social tags are converted to a tag network (Section 3.1), and a tag hierarchy is then created from the tag network (Section 3.2). People in social networks are matched to the tag hierarchy. This results in a compound graph that includes both tree and network structures (Section 3.4). Then, actor similarity under a specified context of interest is calculated to recommend friends to users (Section 3.5). In the visual form, a compound graph is visually represented as an integration of an RSF tree and a circular network (Section 4). Various tools are designed to support user interaction with the compound graph. Recommended friends are visually encoded in the compound graph with similarity scores. Users can change the tag of interest to update recommended friends with the context information.

Figure 2. Tag-item network decomposition.

Generating Tag Hierarchy Hierarchical clustering of a tag network is a promising way to capture the main structure of knowledge [1]. Tags in a cluster usually have similar semantic meanings in a tag network. If we can capture these clusters hierarchically, a hierarchical knowledge structure can be generated. As shown in previous work [1], clustering a tag network hierarchically with a betweenness-based method, the GN algorithm [41], can generate a knowledge structure similar to the one in people’s minds. We used the same approach as in [1] for this study. The idea is to apply the GN algorithm to a network to generate the required number of clusters (sub-networks). Repeating this process on a cluster (sub-network) divides it into smaller clusters and generates a tree structure. For each cluster, we select the node with the highest degree as the representative node (label). If a node has been selected as the label of a higher-level cluster, it cannot be selected again at a lower level. For the details of the implementation, please refer to [1]. Notice that the leaf nodes in a tag hierarchy are individual tags, and the non-leaf nodes can be regarded as categories with broader meaning.
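The recursive clustering and labelling steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation from [1]: the graph is treated as unweighted, each GN step stops as soon as one additional component appears (the text speaks of a "required number of clusters"), and the names `edge_betweenness`, `gn_split`, `build_hierarchy` and the `min_size` stopping rule are hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Edge betweenness (Brandes-style BFS) for an unweighted, undirected graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, order, queue = {s: 0}, [], deque([s])
        sigma = {v: 0.0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate dependencies back toward s
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return score

def components(adj):
    """Connected components, visited in the key order of adj."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def gn_split(nodes, edges):
    """One GN step: drop highest-betweenness edges until the graph splits."""
    adj = {v: set() for v in sorted(nodes)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = len(components(adj))
    while any(adj.values()):
        comps = components(adj)
        if len(comps) > start:
            return comps
        score = edge_betweenness(adj)
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

def build_hierarchy(nodes, edges, used=None, min_size=3):
    """Recursively split a tag network and label each cluster.

    The label is the highest-degree tag not already used by an ancestor.
    min_size is an assumed stopping rule, not taken from the paper.
    """
    used = set() if used is None else used
    nodes = set(nodes)
    inner = [(u, v) for u, v in edges if u in nodes and v in nodes]
    degree = {v: 0 for v in nodes}
    for u, v in inner:
        degree[u] += 1
        degree[v] += 1
    candidates = sorted(v for v in nodes if v not in used)
    label = max(candidates, key=degree.get) if candidates else None
    if label is not None:
        used.add(label)
    children = []
    if len(nodes) >= min_size and inner:
        parts = gn_split(nodes, inner)
        if len(parts) > 1:
            children = [build_hierarchy(p, inner, used, min_size) for p in parts]
    return {"label": label, "tags": sorted(nodes), "children": children}

# Hypothetical tag network: two triangles joined by the bridge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
tree = build_hierarchy({"a", "b", "c", "d", "e", "f"}, edges, min_size=4)
```

In this toy network the bridge carries every shortest path between the two triangles, so it is removed first and the root splits into the two triangles. Because "c" labels the root, the first child has to pick a different representative.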

The Matched Compound Graph: Linking Social Network to Tag Hierarchy In SFViz, one design rationale is that a tag hierarchy can help users explore and navigate their social connections. The task here is to create a matched compound graph linking social networks to the tag hierarchy generated in Section 3.3. We define a matched compound graph (MCG) in this study as follows. A compound graph is a graph consisting of two subgraphs, a tree and a network, and a node matching schema linking network nodes to tree leaf nodes. An MCG is written as the triple given in Equation (1).

Figure 1. SFViz framework: visualizing friend recommendation.

Constructing Tag Networks Representing Knowledge Structure Relations among social tags imply a knowledge structure shared among the people who generated them [1]. The tags in social tagging systems describe users’ understanding of a labelled item. Tags assigned to the same items by users indicate an implicit semantic relation between them and can be linked together to form tag networks [39]. Thus, a tag network shows the ontological structure of knowledge. We construct a tag network from the relations among tags and the items that these tags are assigned to. This is similar to a 2-network decomposition approach in social network analysis [40]. As shown in Figure 2, if two tags are assigned to the same collection of items (Figure 2a), we create a link between the two tags (Figure 2b). The links are weighted with the normalized number of shared items.
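As a rough illustration of this tag-item decomposition, the sketch below links two tags whenever they share at least one item and weights the link by the number of shared items. The normalization is not specified in the text, so dividing by the largest shared-item count is an assumption made here, and the function name `build_tag_network` and the sample tags are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_tag_network(tag_items):
    """Project a tag-item relation onto a weighted tag-tag network.

    tag_items maps each tag to the set of items it was assigned to.
    Returns {(tag_a, tag_b): weight} with tag_a < tag_b.
    """
    # Invert the relation: which tags were assigned to each item?
    item_tags = defaultdict(set)
    for tag, items in tag_items.items():
        for item in items:
            item_tags[item].add(tag)

    # Count the items shared by every pair of co-assigned tags.
    shared = defaultdict(int)
    for tags in item_tags.values():
        for a, b in combinations(sorted(tags), 2):
            shared[(a, b)] += 1

    # Assumed normalization: divide by the largest shared-item count.
    max_shared = max(shared.values(), default=1)
    return {edge: count / max_shared for edge, count in shared.items()}

# Hypothetical tagging data (tags on tracks s1..s4).
tag_items = {
    "rock": {"s1", "s2", "s3"},
    "indie": {"s2", "s3"},
    "jazz": {"s4"},
    "live": {"s3", "s4"},
}
network = build_tag_network(tag_items)
```

Here "indie" and "rock" share two tracks and therefore get the strongest link, while "jazz" and "rock" share none and stay unconnected.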

$$MCG = (T, N, \rho) \qquad (1)$$

where $T = (V_T, E_T)$ is a tree subgraph with a node set $V_T$ and an edge set $E_T$; $N = (V_N, E_N)$ is a network subgraph with a node set $V_N$ and an edge set $E_N$; and $\rho$ is a mapping function $\rho(n_T) \rightarrow n_N$, with $n_T \in V_{TL}$ and $n_N \in V_N$, where $V_{TL}$ is the set of leaf nodes in the tree $T$.

Matching Score A key issue here is how to assign actors in a social network to a tag tree, namely, how to implement the mapping function ρ. We first introduce a matching score, which measures how well an actor fits the smallest category (the deepest non-leaf node) defined in a tag hierarchy. Each actor in a social network has tags used to label items, and each non-leaf node (category) in a tag tree also includes a collection of tags. The matching score is calculated from the overlap between the tags of an actor and the tags of a category. A social network node is linked to the category with the largest matching score in the tree. The matching score, ms(u, C), between the node u in the social network and the category C in the tree, is given by:

$$ms(u, C) = \frac{\sum_{t \in Tgs(u) \cap allTgs(C)} f(t) \cdot d(t)}{\sum_{v \in allTgs(C)} f(v)} \qquad (2)$$

Figure 4. Node C1 and C33 are collapsed and the sub-nodes are aggregated and hidden in Figure 5a. This results in an aggregated view shown in Figure 5b. Notice that the edges between node C1 and node 6 are the aggregated results of the edge (1, 6) and edge (2, 6).

where f(t) is the frequency of a tag t used in the tag network, d(t) is the depth of the tag t, the function Tgs(u) returns all tags the node u has, and the function allTgs(C) returns the tags from the sub-tree of C and all ancestors of C. An example of calculating a matching score is shown in Figure 3. Suppose we are calculating the matching score between node 3 and category C21, ms(3, C21). Node 3 has tags t1, t7, C21 and C2, namely Tgs(3) = {t1, t7, C21, C2}, and all tags related to category C21 are allTgs(C21) = {t1, t2, t3, C21, C2}. Thus the overlapping tags are Tgs(3) ∩ allTgs(C21) = {t1, C21, C2}. Suppose we have the following frequencies: f(t1)=1, f(t2)=2, f(t3)=1, f(C21)=3, f(C2)=6, and the depths d(t1)=3, d(C21)=2, d(C2)=1. Then ms(3, C21) = (f(t1)d(t1) + f(C21)d(C21) + f(C2)d(C2)) / (f(t1) + f(t2) + f(t3) + f(C21) + f(C2)) = (1*3 + 3*2 + 6*1) / (1+2+1+3+6) = 15/13.
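The worked example can be checked with a few lines of code. This is a minimal sketch of Equation (2), not the authors' implementation: the function name `matching_score` is hypothetical, the depths of t1, C21 and C2 are the ones implied by the arithmetic of the example, and the depths of t2 and t3 are assumed to equal that of t1 (they do not affect the result).

```python
from fractions import Fraction

def matching_score(actor_tags, category_tags, freq, depth):
    """Equation (2): depth-weighted tag overlap between an actor and a category."""
    overlap = actor_tags & category_tags
    numerator = sum(freq[t] * depth[t] for t in overlap)
    denominator = sum(freq[v] for v in category_tags)
    return Fraction(numerator, denominator)

# Values from the worked example for node 3 and category C21.
freq = {"t1": 1, "t2": 2, "t3": 1, "C21": 3, "C2": 6}
depth = {"t1": 3, "t2": 3, "t3": 3, "C21": 2, "C2": 1}  # t2, t3 depths assumed
tgs_node3 = {"t1", "t7", "C21", "C2"}
all_tgs_c21 = {"t1", "t2", "t3", "C21", "C2"}

score = matching_score(tgs_node3, all_tgs_c21, freq, depth)
print(score)  # 15/13
```

Assigning an actor then amounts to evaluating this score for every candidate category and keeping the one with the largest value.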

(b) (a) Figure 5. An aggregation view of a matched compound graph.

Actor Similarity Calculation The key component in a recommendation system is to calculate similarity scores of other objects to the object of interest. For friend recommendation, a recommendation algorithm is to calculate similarity scores of other people in a system to the current user, and then recommend people with Top-N similarity scores to the user. The actor similarity algorithm in SFViz considers both structure similarity in a social network and semantic similarity in a tag network. One important feature is that the semantic similarity is dynamically plugged into social actor similarity. Therefore recommended results are updated once a user changes the topic/category of interest.

Figure 3. An example to calculate a matching score.

With matching scores, we can assign each network node to the category with the largest score in the tree. Figure 4 shows an example of a matched compound graph, in which network nodes with numbers are assigned to a tree. For example, node 3 in the social network is under the category C12 in the tag hierarchy. Note that the category node (e.g., C12) to which a network node is linked is the deepest non-leaf node in the tag hierarchy. The deepest non-leaf node, rather than a leaf node, is used because leaf nodes are too specific.

Social Network Similarity

In social networks, an important premise is “birds of a feather flock together” [42], which indicates that users tend to become friends if they are similar in some way, such as sharing interests, attending the same school, or collaborating on a project. With this premise, we can measure actor similarity by using the topology of a social network [40]. In SFViz, we use cosine similarity to calculate the actor similarity in a social network. In this approach, two actors are similar if they share many neighbors in the social network. The actor similarity value AS(i, j) between actors i and j in a social network is defined as:

$$AS(i, j) = \cos(i, j) = \frac{v_i \cdot v_j}{|v_i| \cdot |v_j|} \qquad (3)$$

where $v_i$ is a row vector whose k-th entry is 1 if actor i is connected to actor k and 0 otherwise, and $v_i \cdot v_j$ denotes the dot product of the two vectors. Thus, the numerator is the number of shared friends, and the denominator is the product of the Euclidean lengths of the two vectors.
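Because the vectors are binary, the cosine in Equation (3) reduces to set operations on friend lists: the dot product is the number of shared friends, and each Euclidean length is the square root of an actor's friend count. The sketch below assumes an undirected friendship graph stored as neighbor sets; the function name and the small sample network are hypothetical.

```python
import math

def actor_similarity(friends, i, j):
    """Equation (3) for binary adjacency vectors stored as neighbor sets."""
    shared = len(friends[i] & friends[j])                # dot product
    norm = math.sqrt(len(friends[i]) * len(friends[j]))  # |v_i| * |v_j|
    return shared / norm if norm else 0.0

# Hypothetical undirected friendship network (actor -> set of friends).
friends = {
    2: {3, 6},
    3: {2, 4, 7},
    4: {3},
    6: {2, 7},
    7: {3, 6},
    9: set(),
}

sim_3_6 = actor_similarity(friends, 3, 6)  # actors 3 and 6 share friends 2 and 7
sim_3_9 = actor_similarity(friends, 3, 9)  # actor 9 has no friends
```

Returning 0 for an actor without friends is a convention chosen here to avoid a division by zero; the text does not discuss this case.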

(a) (b) Figure 4. A matched compound graph.

Aggregation View We can generate an aggregation view of a matched compound graph by expanding or collapsing non-leaf nodes in a tag tree. A collapsed non-leaf node is an aggregated node whose aggregated edges are obtained by summing the edges of the nodes under it. Figure 5 shows an example of aggregating the matched compound graph in Figure 4.

Tag Network Similarity

Tag similarity is derived from a tag tree. A link in a tag network indicates a semantic relation between a pair of tags assigned to items together. A tag tree is constructed by recursively detecting the closely linked groups of tags in the tag network. Therefore, the structure of a tag tree also indicates the semantic similarity of tags. In a tag tree, we have two intuitive observations about tag similarity. First, a tag is more similar to the tags from the same category (the same parent node) than to those from other categories. This is related to the number of hops between two tags in the tag tree. For example, for the tag tree shown in Figure 3, C11 is more similar to C12 (nhop=2) than to C21 (nhop=3). Second, of two pairs of tags separated by the same number of hops, the pair lying deeper in the tree is more similar. For example, in Figure 3, the similarity between C11 and C12 is larger than the one between C1 and C2. With these two observations, we define the tag similarity TS(i, j) between tags i and j as:

$$TS(i, j) = \frac{1}{\sum_{k,\,k+1 \in SP(i,j)} \frac{2}{d(k) + d(k+1)}} \qquad (4)$$

graphs of tree and network, and social recommendation into visual forms. To help users explore and interact with recommended friends in a compound graph, the SFViz design should support:

• tag tree exploration and interaction, showing context and detail information (parent-child relations, sibling nodes);

• social network exploration and interaction, showing highly connected cliques, direct friends, a critical path to a friend and so on;

• friend recommendation with context of interest, showing potential friends with specified interests at different granularities in a tag tree and how to reach these users.

In Equation (4), SP(i, j) is the sequence of nodes on the shortest path between nodes i and j, nodes k and k+1 are two consecutive nodes on that path, and d(k) is the depth of node k. In other words, the denominator in Equation (4) gives the weighted number of hops between nodes i and j in a tag tree, where the weight of each hop is the reciprocal of the average depth of its two end nodes. We also define the tag similarity of a tag with itself (i = j) to be d(i).
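A small sketch makes the two observations behind Equation (4) concrete. It assumes the tree is given as a child-to-parent map and that the root has depth 0 (which agrees with the depths used in the matching-score example); the toy tree and the function names are hypothetical and are not a reconstruction of Figure 3.

```python
def depths(parent):
    """Depth of every node, with the root at depth 0 (an assumption)."""
    depth = {}
    def walk(node):
        if node not in depth:
            depth[node] = 0 if parent[node] is None else walk(parent[node]) + 1
        return depth[node]
    for node in parent:
        walk(node)
    return depth

def tag_similarity(parent, depth, i, j):
    """Equation (4): reciprocal of the depth-weighted hop count between i and j."""
    if i == j:
        return float(depth[i])          # self-similarity is defined as d(i)

    def chain_to_root(node):
        chain = [node]
        while parent[chain[-1]] is not None:
            chain.append(parent[chain[-1]])
        return chain

    up_i, up_j = chain_to_root(i), chain_to_root(j)
    common = next(n for n in up_i if n in up_j)   # lowest common ancestor
    # Shortest path in a tree: i up to the common ancestor, then down to j.
    path = up_i[:up_i.index(common) + 1] + up_j[:up_j.index(common)][::-1]
    weighted_hops = sum(2.0 / (depth[a] + depth[b]) for a, b in zip(path, path[1:]))
    return 1.0 / weighted_hops

# Hypothetical tag tree: root -> C1, C2; C1 -> C11, C12; C2 -> C21.
parent = {"root": None, "C1": "root", "C2": "root",
          "C11": "C1", "C12": "C1", "C21": "C2"}
depth = depths(parent)

ts_siblings = tag_similarity(parent, depth, "C11", "C12")  # same parent
ts_cousins = tag_similarity(parent, depth, "C11", "C21")   # different branches
ts_top = tag_similarity(parent, depth, "C1", "C2")         # two hops, shallow
```

On this toy tree the sibling pair scores higher than the pair from different branches (first observation), and higher than C1 and C2, which are also two hops apart but closer to the root (second observation).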

To meet these requirements, we design and implement SFViz with several key visualization techniques: a Radial, Space-Filling (RSF) technique to visualize a tag tree, a circle layout with edge bundling to show a social network, highlighted social recommendation views and several interactions.

Layout Tag Tree with RSF

The tag tree is represented with a Radial, Space-Filling (RSF) technique in SFViz. The RSF technique uses nested circles to show the parent-child relationship: the root node is at the centre of a circle, and child nodes are placed within the arc subtended by their parents. The RSF technique clearly illustrates the parent-child relationship in the tree and can also use node area to present node properties [43][44].

Figure 6a is an example of the RSF visualization of the tag tree structure shown in Figure 4a. The root node is placed in the center and shown as transparent. Tag nodes of the same depth are placed along a circle, with color showing their depth. The tree hierarchy information is shown through the inclusion relationship in the representation. We can also see that the node width in a circle is proportional to the count of all its children, and leaf nodes have a uniform size. Node width can be adjusted to show more or fewer details of a node and its descendants.

Final Similarity Score

The final similarity score between two users is obtained by integrating the social actor similarity and the tag network similarity discussed above. Because each node in a social network is assigned to a category (which is a tag) in a tag tree, we can easily plug tag similarity into actor similarity. We write the final similarity score, FSS(i, j), as:

$$FSS(i, j) = \alpha \, AS(i, j) + (1 - \alpha) \, TS_{norm}(C_i, C_j) \qquad (5)$$

where $C_i$ is the parent tree node of the network node i, and α is a control parameter with value 0.75 in the implementation. We can also obtain the similarity between node i and category $C_j$ by aggregating the similarity of the people under this category:

$$FSS(i, C_j) = \sum_{j \in C_j} FSS(i, j) \qquad (6)$$

This aggregation can be done at any level of a tag tree. After aggregating people under a category, the edges of these people are also aggregated to generate an aggregated view of the social network. With formulas (5) and (6), we can easily recommend the Top-N people or tags to a user of interest by ranking similarity values.
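Equations (5) and (6) combine into a simple ranking procedure, sketched below. The normalization behind TSnorm is not spelled out in the text, so the sketch assumes tag similarities already scaled to [0, 1]; all similarity values, actor ids and function names are hypothetical illustrations rather than results from SFViz.

```python
def pair(a, b):
    """Order-independent key for a pair of actors or categories."""
    return frozenset((a, b))

def final_similarity(actor_sim, tag_sim_norm, alpha=0.75):
    """Equation (5): blend of structural and semantic similarity."""
    return alpha * actor_sim + (1.0 - alpha) * tag_sim_norm

def score_candidates(user, category, actor_sim, tag_sim_norm, alpha=0.75):
    """FSS between the user and every other actor."""
    scores = {}
    for other in category:
        if other == user:
            continue
        a = actor_sim.get(pair(user, other), 0.0)
        t = tag_sim_norm.get(pair(category[user], category[other]), 0.0)
        scores[other] = final_similarity(a, t, alpha)
    return scores

def top_n(scores, n):
    """Top-N recommendation by descending final similarity."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

def category_score(scores, category, target):
    """Equation (6): sum of FSS over the people assigned to one category."""
    return sum(s for person, s in scores.items() if category[person] == target)

# Hypothetical inputs: actor -> category, plus precomputed similarities.
category = {"3": "C11", "6": "C12", "8": "C11", "9": "C21"}
actor_sim = {pair("3", "6"): 0.8, pair("3", "8"): 0.5, pair("3", "9"): 0.4}
tag_sim_norm = {pair("C11", "C11"): 1.0,   # assumed already scaled to [0, 1]
                pair("C11", "C12"): 0.375,
                pair("C11", "C21"): 0.1}

scores = score_candidates("3", category, actor_sim, tag_sim_norm)
recommended = top_n(scores, 2)
c11_score = category_score(scores, category, "C11")
```

With α = 0.75 the structural term dominates, so actor "6" still ranks first even though actor "8" shares the user's category; lowering α would shift the ranking toward semantic closeness.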

VISUALIZATION DESIGN

Circularly Layout Social Network In SFViz, we use a circular layout to show a social network. Nodes in a social network are also the leaf nodes in the tag tree in a matched compound graph. To reflect the matched relationship, we use a RSF tree as the supporting structure and layout a social network over the RSF tree. The idea is to circularly arrange the

With SFViz framework shown in Figure 1 (Section 3.1), we need transform a matched compound graph consisting of two sub-

(a) (b) (c) (d) Figure 6. A compound graph visualization: (a) Layout the tag tree in Figure 4a with Radial, Space-Filling (RSF); (b) A circular layout of a social network in Figure 4b overlaid on RSF tree; (c) RSF circular layout of the aggregated network in Figure 5; (d) A view after edge bundling with β =0.75.

Figure 7. Control points in hierarchical edge bundling.

Figure 8. Social recommendation view.

network nodes to corresponding positions in the circle outlined by the RSF tree, and then connect the node sectors within the circle. This design integrates both network and tree structures in a single graph without introducing extra nodes and links. Two examples of the circular layout of social networks in the SFViz design are shown in Figures 6b and 6c. In Figure 6b, the social network in Figure 4b is placed circularly along an RSF tree. The aggregated social network of Figure 5 is shown in Figure 6c. We show expanded parent tags, which are transparent and labelled in grey, as the context of child tags. We use the color of the higher-level tag to encode an aggregated edge that connects tags from two different levels. For example, the color of C1 is used for the edge between node 6 and C1, because C1 is at a higher scale than node 6. The aggregated weight of an edge is shown by its thickness. Both Figure 6b and Figure 6c use straight lines to show edges in the circular layout, which causes some edges to be occluded by node sectors. For example, the node sector C1 interrupts the edge between node 6 and C1. To alleviate this issue, we introduce an edge bundling technique in the following section.

Figure 9. Shared neighbors view.

Edge Bundling

One widely used approach to improve visual aesthetics and reduce visual clutter in graph-based visualization is edge bundling [45]. Edge bundling techniques coarsen and abstract edges that appear in a close area or are related in some way. In SFViz, we use a hierarchical edge bundling approach adapted from [45]. The idea is to bundle edges hierarchically with B-spline curves, using the centers of node sector areas in an RSF tree as control points. Figure 6d shows the result of edge bundling on the social network with bundling strength β = 0.75 (a larger β, β ∈ [0,1], leads to more closely bundled edges). Figure 7 shows the control points in the RSF tree with highlighted circles. The occlusion between edges and node sectors is alleviated after edge bundling.

Friend Recommendation View

Recommendation View

Once a current user of interest is specified, a social recommendation view can be presented upon the user’s request. Friend recommendation results are encoded with a gradient color. The gradient color is calculated from the similarity score between the current user and the other user. If no tag category of interest is chosen by a user, the similarity values come only from the social network similarity shown in Equation (3); otherwise we use Equation (5). Figure 8 shows an example with a center user “3” in purple and the top 4 recommended friends (users “6”, “8”, “9” and “1”, from top to bottom in the ranking) in gradient colors from red to yellow. In this example, no tag/category of interest is selected, and thus the recommendation is based only on social network structure.

Shared Neighbors View

The shared neighbors view shows the friends shared between the current user and a recommended friend of interest. The common friends serve as evidence of why the current user may be interested in the recommended person. This view is activated by choosing a target recommended person. For example, in Figure 9, the friends shared between the current center user “3” (colored in purple) and a selected recommended user “6” are users “2” and “7”, who are shown in gray. The connections among them are also presented.

Interactions

Node Sector Distortion

Node distortion allows users to dynamically adjust the angular width of a node sector. Users can scroll the mouse wheel up to increase the width of a node, and down to decrease it. The widths of the children under a distorted node change proportionally. This distortion can provide increased detail on nodes of interest. For example, in Figure 7, the sectors of collapsed nodes C1 and C33 are shrunk to decrease detail compared with the two nodes in Figure 6d.

Ego-network View

An ego-network view shows the local network of a node, which contains its direct neighbors and the links among them. Users can activate an ego-network view by right-clicking on a node. When this view is activated, only edges in the ego-network are shown and other edges are invisible. Examples are shown in our case study.

Full Text Search Highlighting

The node labels (tags) are indexed with Lucene [46] and can be searched through a user query. Matched tags are highlighted in the RSF tree view. The search results can be controlled with a DOI (Degree Of Interest) [48] specified by users. For example, with a DOI=0, only the matched nodes and their ancestors are shown in the results, but with a DOI=1, all nodes at a distance of 1 from the matched nodes are also presented.
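The DOI-controlled search result can be illustrated with a short sketch. It simplifies in two ways that are assumptions made here, not details from the text: plain case-insensitive substring matching stands in for the Lucene index, and the DOI is read as a tree distance from the matched nodes. The function name `visible_nodes` and the sample tag tree are hypothetical.

```python
def visible_nodes(parent, query, doi=0):
    """Nodes to show for a label search: matches, their ancestors,
    and every node within `doi` tree hops of a match."""
    children = {}
    for node, par in parent.items():
        if par is not None:
            children.setdefault(par, set()).add(node)

    # Substring matching stands in for the full-text index.
    matched = {n for n in parent if query.lower() in n.lower()}

    # Ancestors of matches are always kept as context.
    visible = set(matched)
    for node in matched:
        cur = parent[node]
        while cur is not None:
            visible.add(cur)
            cur = parent[cur]

    # Breadth-first expansion around the matches, one hop per DOI unit.
    seen, frontier = set(matched), set(matched)
    for _ in range(doi):
        step = set()
        for node in frontier:
            step |= children.get(node, set())
            if parent[node] is not None:
                step.add(parent[node])
        frontier = step - seen
        seen |= frontier
    return visible | seen

# Hypothetical tag tree, written as child -> parent.
parent = {"music": None, "rock": "music", "jazz": "music",
          "indie rock": "rock", "punk": "rock", "bebop": "jazz"}

doi0 = visible_nodes(parent, "rock", doi=0)
doi1 = visible_nodes(parent, "rock", doi=1)
```

With DOI=0 only the two matching tags and their ancestor remain; raising the DOI to 1 additionally reveals the sibling "punk", which lies one hop from the matched category "rock".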

Cross-scale and Multiscale Interaction A cross-scale view of the social network over the tag tree structure is obtained by expanding or collapsing a node. Users can double-click a collapsed non-leaf sector to expand it and show the details of its children. Similarly, users can collapse a sub-tree by double-clicking an expanded node. By holding the Control key while clicking, users can expand or collapse several nodes. SFViz also supports multiscale exploration of a compound graph. If users want an aggregated view of the network at a specific level of the tag tree, clicking node by node becomes tedious. A slider with the scales of the tag tree is therefore presented to users to control directly at which scale the network should be aggregated and displayed. This presents aggregated network patterns at different scales of interest.
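Collapsing tree nodes and summing the edges beneath them can be sketched as follows. The example mirrors the aggregation described for Figure 5, where edges (1, 6) and (2, 6) merge into a single edge between C1 and node 6, but the tree itself is a hypothetical stand-in because the figure is not reproduced here; the function names and unit edge weights are assumptions as well.

```python
from collections import defaultdict

def representative(node, parent, collapsed):
    """Topmost collapsed ancestor of a node, or the node itself if none."""
    rep, cur = node, parent.get(node)
    while cur is not None:
        if cur in collapsed:
            rep = cur
        cur = parent.get(cur)
    return rep

def aggregate_network(edges, parent, collapsed):
    """Sum edge weights between the representatives of their end nodes.

    Edges that fall entirely inside one collapsed sub-tree are hidden.
    """
    aggregated = defaultdict(float)
    for u, v, weight in edges:
        a = representative(u, parent, collapsed)
        b = representative(v, parent, collapsed)
        if a != b:
            aggregated[tuple(sorted((a, b)))] += weight
    return dict(aggregated)

# Hypothetical matched compound graph (child -> parent), actors as strings.
parent = {"root": None, "C1": "root", "C2": "root",
          "C11": "C1", "C21": "C2",
          "1": "C11", "2": "C11", "6": "C21"}
edges = [("1", "6", 1.0), ("2", "6", 1.0), ("1", "2", 1.0)]

view = aggregate_network(edges, parent, collapsed={"C1"})
```

A slider-style multiscale view can reuse the same function by passing every category at the chosen depth as the `collapsed` set.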

Figure 11. Edge weight distribution plot in the tag network.

Implementation

The implementation of our prototype system was guided by the SFViz framework. SFViz includes three modules: a data access module, a data model module and a visualization module. The data access module handles data retrieval, mainly access to graph data. The data model module builds a matched compound graph and computes actor similarity. The visualization module transforms the data model into a visualization and handles user interaction. The application was built in Java, and the data access module was implemented with MySQL and JDBC. The visualization module was built on Prefuse [47], a Java graph layout and visualization library. The implementation of RSF is built on the DocuBurst package [49].

Figure 12. A tag network of Last.fm data after filtering.

Figure 10. Last.fm data schema.

CASE STUDY: MUSIC COMMUNITY OF LAST.FM

Dataset

The dataset in our case study comes from the social music service Last.fm [50] and was retrieved by the Multimedia Information Retrieval Group at Glasgow University in November 2008 [51]. The Last.fm dataset we used includes information about users, tracks and tags, and the relationships among them. The relationships cover friendship between users, user-tag information indicating which tags a user has, and tag-track information showing which tags users have assigned to a track (Figure 10). The dataset has 3148 users, 30520 tracks and 12565 tags.

Data Preprocessing

We selected a fully connected component of the Last.fm friendship network as a test dataset. The network includes 1111 users and 3114 connections. We retrieved 4217 tags and 23314 tracks. Based on the retrieved tags and tracks, we built a tag network with 2200 nodes and 17807 edges using the approach introduced in Section 3.2. The edge weight distribution is shown in Figure 11; the mean edge weight is 6.01.

Figure 13. Tag (category) tree with RSF representation.

To construct a tag category hierarchy, we filtered unimportant nodes and edges out of the tag network. The rule is to filter out nodes with degree less than 5 and edges with weight less than 25 while keeping the tag network connected. The final tag network includes 971 nodes and 1519 edges (Figure 12). We then constructed a tag category hierarchy from the filtered tag network. The final tag hierarchy includes 150 non-leaf tags with a maximum depth of 4. The tag tree has 13 top categories, including “rock”, “hip hop”, “pop”, “electronic”, “soundtrack” and “female vocalists”. The tag category structure is shown in Figure 13, in which the leaf tag nodes are hidden and only non-leaf nodes are used as the hierarchical categories of music interest.

When we assigned the users in the social network to the tag category hierarchy, we found that about half of the users (566/1111=50.95%) could not be assigned to any category. The reason is that only about half of the users (603/1111=54.28%) had tag information, and some users with tags could not be assigned to any category. Therefore, we assigned these users to a dummy category labelled “unspecified”.
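The filtering rule used before building the tag hierarchy (drop low-degree nodes and light edges while keeping the tag network connected) can be sketched as follows. The paper does not state how connectivity is preserved, so this sketch makes two assumptions: degrees are taken from the unfiltered network, and a candidate node or edge is simply skipped when removing it would disconnect the network. All function names are hypothetical.

```python
def is_connected(nodes, edges):
    """Check connectivity of an undirected graph given as a node set and 2-element edges."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adjacency[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == nodes


def filter_tag_network(nodes, weights, min_degree=5, min_weight=25):
    """Drop low-degree nodes and light edges unless that would disconnect the network.

    weights -- dict mapping frozenset({u, v}) to the edge weight
    """
    nodes, edges = set(nodes), dict(weights)
    degree = {n: 0 for n in nodes}          # degrees in the unfiltered network (assumption)
    for edge in edges:
        for n in edge:
            degree[n] += 1

    # Pass 1: candidate nodes, lowest degree first.
    for node in sorted(nodes, key=lambda n: (degree[n], n)):
        if degree[node] >= min_degree:
            break
        rest = nodes - {node}
        kept = {e: w for e, w in edges.items() if node not in e}
        if rest and is_connected(rest, kept):
            nodes, edges = rest, kept

    # Pass 2: candidate edges, lightest first.
    for edge in sorted(edges, key=lambda e: (edges[e], sorted(e))):
        if edges[edge] >= min_weight:
            break
        kept = {e: w for e, w in edges.items() if e != edge}
        if is_connected(nodes, kept):
            edges = kept
    return nodes, edges
```

The thresholds default to the values reported above (degree 5, weight 25). Repeating a full connectivity check for every candidate is quadratic and only meant to make the rule explicit, not to be efficient.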

Tag-Based Multiscale and Cross-scale Exploration

With SFViz, users can examine their friendship patterns at different levels of the tag tree. Friendship patterns can be explored in two ways: multiscale and cross-scale. Multiscale exploration enables users to observe network patterns in which the tag categories are from the same level. This helps users understand the general popularity of music categories and how people with different music tastes make friends. For example, Figure 14 shows the friendship patterns at the top level of the tag tree in the Last.fm community. The line thickness indicates the strength of aggregated connections. We find strong friending patterns among people interested in “hip hop”, “pop” and “rock”.

Figure 14. Friendship patterns at the top level in the tag tree.

Cross-scale views present friendship patterns of users from different levels of the tag tree. Figure 15 shows how people from the sub-categories under “rock” make friends with people in other categories at the first level. We can see that most people under “rock” not only befriend each other within the category, but also connect to others outside it. This view indicates that people tend to make friends with people of diverse music tastes. In Figure 16, the ego-network view of the “classic rock” category also confirms this phenomenon. In a cross-scale view, if a user is specified as a person of interest, we can observe this user’s friendship network with all other related categories. The related categories are controllable with DOI. Figure 17 shows a cross-scale view of a user’s social network with DOI=1, which shows all related categories at distance 1.

Figure 15. A cross-scale view of categories under “rock” with other categories from the first level.

Figure 16. An ego-network view of the “classic rock” category in a cross-scale view.

Figure 17. A social network of a center user at all levels with DOI=1.

Figure 18. Top 10 recommended friends without a category of interest.

Figure 19. Top 10 recommended friends with a tag category of “hip hop”.

Friend Recommendation Exploration

SFViz can suggest friends to a user based on the user’s social network information and a given interest category in the tag tree. After the user is known (e.g., from log-on information or by selecting a node in the network), the Top-N recommended people are shown with gradient colors. Figure 18 shows the top 10 recommended friends for the center user, who is shown in purple at the bottom left. The recommended people are colored from red to yellow according to their rankings. In this example, no category of interest is specified.

When a category of interest is specified in the tag tree, the social recommendations are adjusted dynamically. Figure 19 shows the 10 recommended friends with the tag category “hip hop” for the same user as in Figure 18. We can see that the recommended friends are narrowed down to the category of “hip hop”.

SFViz also shows why the recommended people are similar to the center user with a shared neighbors view. In SFViz, we assume that the more shared neighbors two users have, the more similar they are. In the shared neighbors view, a user can see how he/she is connected to the recommended person. Figure 20 shows the shared friends view between the user and a recommended person. The shared neighbors are aggregated to the tag “hip hop”. The aggregated node can be expanded to the level of individuals, as shown in Figure 21.

Figure 20. A view of shared friends with aggregation.

Figure 21. A view of shared friends without aggregation.

CONCLUSION AND FUTURE WORK

In this paper, we present a novel visual system, SFViz, that supports users in interactively exploring and seeking friends within an interest context in social networks. Our approach leverages both the semantic structure of social tags and the topological information of social networks to make social recommendations. SFViz also provides novel visualization and interaction tools that enable users to explore social recommendations in a single view. The case study of SFViz with the social tags and social network of a social music community, Last.fm, shows that our approach has the potential to enhance users’ awareness of their social networks under different interest contexts, and to help them seek new friends with similar interests in an interactive way.

This study has some limitations. First, the category assignment of users is restrictive. In our current design, we assign each actor in a social network to the single leaf node in the tag tree with the largest matched score. In practice, however, a user in a social network can have multiple affiliations in the tree. For example, a user in a music community may have several different tastes. Second, if some actors in a social network do not have tag information, we cannot assign them to any category in the tag tree. Our current method is to use a dummy node in the tag tree to incorporate these actors. A possible solution is to leverage other information to classify users into the tag tree. For example, in the Last.fm case study, we could use information about users’ tracks to assign users to the tag tree.

We will extend the work in two directions. First, we will conduct more experiments and user studies of our approach. The experiments will assess the accuracy of our recommendations on labelled datasets, and in the user studies we may ask real users to rate the friend recommendations. Second, we will incorporate other methods and information, such as users’ profile information, to classify users into a tag hierarchy.

REFERENCES

[1] Gou, L., Zhang, S. K., Wang, J. & Zhang, X. L. (2010). TagNetLens: Visualizing knowledge structures with social tags. In Proc. of ACM VINCI’10, 18: 1-9.

[2] Marsh, D. R., Schroeder, D. G., Dearden, K. A., Sternin, J. & Sternin, M. (2004). The power of positive deviance. British Medical Journal, 329, pp. 1177-1179.

[3] Israel, B. A. (1982). Social networks and health status: Linking theory, research and practice. Patient Counseling and Health Education, 4(2), pp. 65-77.

[4] Davis, H., Vetere, F., Ashkanasy, et al. (2008). Towards Social Connection for Young People With Cancer. OzCHI, Queensland.

[5] Goswami, S., Köbler, F., Leimeister, J. M. & Krcmar, H. (2010). Using online social networking to enhance social connectedness and social support for the elderly. In Proc. of ICIS’10, pp. 109-120.

[6] Lampe, C., Ellison, N. & Steinfield, C. (2007). A familiar Face(book): Profile elements as signals in an online social network. In Proc. of CHI’07, pp. 435-444.

[7] Krulwich, B. (1997). Lifestyle finder: intelligent user profiling using large-scale demographic data. Artificial Intelligence Magazine, 18(2), pp. 37-45.

[8] Mooney, R. J. & Roy, L. (2000). Content-based book recommending using learning for text categorization. In Proc. of DL’00, pp. 195-204.

[9] Salton, G. & McGill, M. (1983). Introduction to Modern Information Retrieval. McGraw Hill, New York, USA.

[10] Pazzani, M. & Billsus, D. (1997). Learning and revising user profiles: the identification of interesting web sites. Machine Learning, 27(3), pp. 313-331.

[11] Hill, W., Stead, L., Rosenstein, M. & Furnas, G. W. (1995). Recommending and Evaluating Choices in a Virtual Community of Use. In Proc. of CHI’95, pp. 194-201.

[12] Dahlen, B. J., Konstan, J. A., Herlocker, J. L., Good, N., Borchers, A. & Riedl, J. (1998). Jump-starting movielens: User benefits of starting a collaborative filtering system with "dead data". University of Minnesota TR, pp. 98-017.

[13] Sarwar, B., Karypis, G., Konstan, J. & Riedl, J. (2001). Item-based collaborative filtering recommendation algorithms. In Proc. of WWW’01, pp. 285-295.

[14] Linden, G., Smith, B. & York, J. (2003). Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Internet Computing, 7(1), pp. 76-80.

[15] Breese, J. S., Heckerman, D. & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. In Proc. of CUAI’98, pp. 43-52.

[16] Condliff, M. K., Lewis, D. D., Madigan, D. & Posse, C. (1999). Bayesian mixed-effect models for recommender systems. In ACM SIGIR '99 Workshop on Recommender Systems: Algorithms and Evaluation.

[17] Hofmann, T. (2004). Latent semantic models for collaborative filtering. ACM Transactions on Information Systems, 22(1), pp. 89-115.

[18] Schein, A. I., Popescul, A., Ungar, L. H. & Pennock, D. M. (2002). Methods and Metrics for Cold-Start Recommendations. In Proc. of SIGIR '02, pp. 253-260.

[19] Balabanovic, M. & Shoham, Y. (1997). Fab: Content-Based, Collaborative Recommendation. Comm. ACM, 40(3), pp. 66-72.

[20] Pazzani, M. (1999). A Framework for Collaborative, Content-Based, and Demographic Filtering. Artificial Intelligence Rev., pp. 393-408.

[21] Melville, P., Mooney, R. J., & Nagarajan, R. (2002). Content-boosted collaborative filtering for improved recommendations. In Proc. of AAAI’02, pp. 187-192.

[22] Tan, P., Steinbach, M., Kumar, V., et al. (2006). Introduction to data mining. Pearson Addison Wesley, Boston.

[23] Adamic, L. & Adar, E. (2003). Friends and neighbors on the web. Social Networks, 25(3), pp. 211-230.

[24] Barabasi, A. & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439):509.

[25] Newman, M. (2001). Clustering and preferential attachment in growing networks. Physical Review E, 64(2):25102.

[26] Zhou, T., Lu, L. & Zhang, Y.-C. (2009). Predicting missing links via local information. European Physical Journal B, 71(4), pp. 623-630.

[27] Katz, L. (1953). A new status index derived from sociometric analysis. Psychometrika, 18(1), pp. 39-43.

[28] Jeh, G. & Widom, J. (2002). SimRank: A measure of structural-context similarity. In Proc. of SIGKDD’02, pp. 538-543.

[29] Leicht, E., Holme, P., & Newman, M. (2006). Vertex similarity in networks. Physical Review E, 73(2):26120.

[30] Zhao, P., Han, J. & Sun, Y. (2009). P-Rank: a comprehensive structural similarity measure over information networks. In Proc. of CIKM’09, pp. 553-562.

[31] Gou, L., Chen, H., Kim, J., Zhang, X. L. & Giles, C. L. (2010). Social Network Document Ranking. In Proc. of JCDL’10, pp. 313-322.

[32] Gou, L., Chen, H., Kim, J., Zhang, X. L. & Giles, C. L. (2010). SNDocRank: a Social Network-Based Video Search Ranking Framework. In Proc. of ACM MIR’10, pp. 367-376.

[33] Li, C., Han, J., He, G., Jin, X., Sun, Y., Yu, Y. & Wu, T. (2010). Fast computation of simrank for static and dynamic information networks. In Proc. of EDBT’10, pp. 465-476.

[34] Tonkin, E., Corrado, E. M., Moulaison, H. L., Kipp, M. E. I., Resmini, A., Pfeiffer, H. D. & Zhang, Q. (2008). Collaborative and Social Tagging Networks. Ariadne, (54).

[35] Kammerer, Y., Nairn, R., Pirolli, P., & Chi, E. H. (2009). Signpost from the masses: learning effects in an exploratory social tag search browser. In Proc. of CHI '09, pp. 625-634.

[36] O'Donovan, J., Smyth, B., Gretarsson, B., Bostandjiev, S. & Höllerer, T. (2008). PeerChooser: visual interactive recommendation. In Proc. of CHI '08, pp. 1085-1088.

[37] Gretarsson, V. B., O'Donovan, J., Bostandjiev, S., Hall, C. & Höllerer, T. (2010). SmallWorlds: visualizing social recommendations. In Proc. of EuroVis’10, pp. 833-842.

[38] Card, S. K., Mackinlay, J. D., & Shneiderman, B. (1999). Readings in Information Visualization: Using Vision to Think. pp. 17: Morgan Kaufmann.

[39] Halpin, H., Robu, V. & Shepherd, H. (2007). The complex dynamics of collaborative tagging. In Proc. of WWW’07, pp. 211-220.

[40] Wasserman, S. & Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge University Press.

[41] Girvan, M. & Newman, M. E. J. (2002). Community structure in social and biological networks. National Academy of Sciences, 99(12), pp. 7821-7831.

[42] McPherson, M., Smith-Lovin, L., et al. (2001). Birds of a Feather: Homophily in Social Networks. Annual Review of Sociology, 27, pp. 415-444.

[43] Stasko, J. & Zhang, E. (2000). Focus+Context Display and Navigation Techniques for Enhancing Radial, Space-Filling Hierarchy Visualizations. In Proceedings of the IEEE Symposium on Information Visualization 2000 (InfoVis '00), pp. 57-65.

[44] Yang, J., Ward, M. O. & Rundensteiner, E. A. (2002). InterRing: An Interactive Tool for Visually Navigating and Manipulating Hierarchical Structures. In Proceedings of the IEEE Symposium on Information Visualization (InfoVis'02), pp. 77-85.

[45] Holten, D. (2006). Hierarchical Edge Bundles: Visualization of Adjacency Relations in Hierarchical Data. IEEE Transactions on Visualization and Computer Graphics, 12, pp. 741-748.

[46] Lucene. http://lucene.apache.org/java/docs/index.html

[47] Heer, J., Card, S. K., & Landay, J. A. (2005). Prefuse: A Toolkit for Interactive Information Visualization. In Proc. of CHI’05.

[48] Furnas, G. W. (1986). Generalized fisheye views. In Proc. of CHI '86, pp. 16-23.

[49] Collins, C., Carpendale, S., & Penn, G. (2009). DocuBurst: Visualizing Document Content using Language Structure. In Proceedings of Eurographics/IEEE-VGTC Symposium on Visualization (EuroVis '09), 28(3), pp. 1039-1046.

[50] Last.fm. http://www.lastfm.com

[51] Konstas, I., Stathopoulos, V. & Jose, J. M. (2009). On social networks and collaborative recommendation. In Proc. of SIGIR '09, pp. 195-202.