visualizing the impact of content-based similarity ...

VISUALIZING THE IMPACT OF CONTENT-BASED SIMILARITY AND SPATIAL DISTANCE ON BOOK RECOMMENDATIONS

Sönke Knoch and Alexander Kröner
German Research Center for Artificial Intelligence
Saarbrücken, Germany
{soenke.knoch, alexander.kroener}@dfki.de

ABSTRACT
Starting from the assumption that an integration of digital services might help brick-and-mortar bookstores to compete with online sellers, we propose a recommender system which takes into account not only content-based similarity of books, but also local availability and spatial distance. This article addresses work in progress; its focus is on a particular question tightly connected to the user interface design of such a system: how to communicate results from these different information sources to a customer in a transparent way. We contribute to this topic with several design proposals and a discussion of feedback collected in a user study.

KEYWORDS
Recommendations, spatial information, transparency, bookstore, requirements.

1. INTRODUCTION
Nowadays, bookselling exists in two main forms: physically in a bookstore, or virtually online. While both forms seem to co-exist, the growing success of eBooks might become a serious challenge for brick-and-mortar bookstores [1]. Such stores may address this challenge by combining the advantage of the books’ physical presence with features known from online retailers. For instance, kiosk systems such as [2] allow a store’s customer to search for a book, check whether it is in stock, and get a print-out of the particulars of a book or a search. However, actually retrieving a book in such an extended store still has a very physical aspect: customers have to walk in order to reach a book of interest, a potential burden if, e.g., several books need to be investigated. A digital assistant could address this issue by recommending books which are not only of interest to the user, but also close by. The basis of such a service is the combination of a book’s content-related features with the spatial features of the given environment, such as the locations of shelves, books, and supporting infrastructure. This article focuses on the presentation aspect of this issue: how to make the impact of both feature categories on recommendations transparent to the user.

In the following, after a brief review of related work, we describe a kiosk prototype which recommends books on the basis of content-based similarity as well as spatial distance. For the presentation of recommendations, various design alternatives are proposed. Then, a small-scale study aiming at feedback concerning these alternatives is discussed. The article concludes with a discussion of the results and their impact on future work.

2. RELATED WORK
Our approach to interaction aims at guiding the user from a fixed kiosk to books and shelves of interest. It could be combined with systems which support the user on the way to or at the shelf, e.g., an intelligent environment able to highlight relevant areas in the user’s surroundings (e.g., [3]), or a mobile augmented reality system for book spine recognition which allows retrieving information about books on a bookshelf by snapping a photo (see, e.g., [4]). Regarding the proposed recommendation approach, a study by Ducheneaut et al. sheds light on the benefits of combining collaborative filtering with other information sources, including spatial distance [5]. It emphasizes the need to make the impact of these sources transparent and customizable. For presenting such recommendations, various approaches are known from research on location-based user support in mobile scenarios, especially where decisions about people, locations, and artifacts around the user are required. For instance, the SocialSearchBrowser exploits the user’s social network in order to enhance a (mobile) map-based display of queries related to the user’s location [6]. Our experiment includes a similar option; however, we were also interested in visualizations that abstract from the environment’s spatial details. This is related to visual information seeking for literature, e.g., in the area of digital libraries [7]. Here, our special interest was to convey a combination of spatial and content-related features in an easy-to-understand and transparent way.

Figure 1. The kiosk’s user interface and screen variations A, B, C, D for the display of recommendations.

3. SYSTEM PROTOTYPE
Infrastructure. Our prototype system relies on books tagged with visual codes or radio-frequency identification (RFID), and on a virtual model of a store’s physical structure. A relational database maintains coordinates for each object in the store. The shelves are subdivided into sections to allow subcategorization in the store, and the locally available books are assigned to these sections. In this early prototype, the actual assignment of coordinates to objects is done manually.

Interaction. The conceptual model of the interaction between user and kiosk follows the idea of direct manipulation. The user interacts via a touch screen and a code reader. Via the latter, the customer may use a book at hand to start searching and browsing the shop’s inventory. Alternatively, he or she may start an inquiry from an entrance screen displaying popular or new books.

Recommender. If the user requests similar books for a book at hand, the system computes a ranked list in two steps: (1) it determines the content-based similarity of books available in the store using the recommendations included in Amazon Web Services (AWS) [8], and (2) it reranks the resulting list of books using information about their spatial distance from the kiosk. Since the full book content was not available at the time of implementation, we decided to use the book’s sales rank to measure similarity among books. The spatial distance is calculated using the Euclidean metric. Both factors are weighted according to their importance as expressed by the user through a slider (default configuration: 0.5 = neutral).

Visualization. For presenting recommendations, a prototype was implemented which stays close to common list-based presentations (see Figure 1, screen A). Feedback from 5 users (IT experts) indicated a need for more transparency concerning the impact of spatial data. In order to address this request, we created mock-ups of three additional screens B, C, and D which might either replace or complement the original list-based display.

Screen A provides an overview of recommended books in the form of a single, cumulated list in descending order of relevance. To provide insight into the impact of spatial and content distance per book, a chart-like rectangle shows the share of each factor, painted in the respective factor’s color. A slider allows the user to adjust the weighting between both factors. Shifting the slider changes the ordering of the books.

Screen B presents the recommender’s result before content and spatial distances are combined. It relies on two separate fixed lists, one for spatial distance to the kiosk and one for similarity of content to the book at hand.

Screen C conveys information about the direction in which the customer would have to walk to obtain the recommended books. It uses a star-like graph with the books as leaves and the kiosk as the source, and indicates the viewing direction.

Screen D aims at indoor navigation support. It relies on an abstract map-based display of the store’s entrance, stairs, and shelves in the form of rectangles, together with the kiosk location. The shelves are numbered, and books are faded in above the shelf they belong to.

In summary, from A to D these screens provide increasingly accurate spatial information. The 1-D screens A and B show a simple (spatial and content) distance measure, whereas the 2-D screens C and D add information on product direction and location. However, the content-based similarity measure tends to disappear from A to D, while the spatial information is stated more precisely.
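The two-step ranking described under Recommender can be sketched as follows. This is a minimal illustration, not the prototype's implementation: the text does not state how the two factors are combined, so the linear mix below, the normalization of distances, and all names (`Book`, `rerank`, `spatial_weight`) are assumptions. The content similarity is taken as a given score in [0, 1]; how it is derived from AWS recommendations or sales rank is left out.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Book:
    title: str
    x: float  # position in the store model (hypothetical units)
    y: float
    content_similarity: float  # assumed to be normalized to [0, 1]


def rerank(books, kiosk_xy, spatial_weight=0.5):
    """Sort books by a weighted mix of proximity and content similarity.

    The linear combination is an assumption; 0.5 mirrors the neutral default.
    """
    if not 0.0 <= spatial_weight <= 1.0:
        raise ValueError("spatial_weight must lie in [0, 1]")
    books = list(books)
    kx, ky = kiosk_xy

    def distance(book):
        # Euclidean distance between the kiosk and the book's location.
        return hypot(book.x - kx, book.y - ky)

    # Scale distances to [0, 1]; fall back to 1.0 to avoid dividing by zero.
    max_distance = max((distance(b) for b in books), default=0.0) or 1.0

    def score(book):
        proximity = 1.0 - distance(book) / max_distance
        return (spatial_weight * proximity
                + (1.0 - spatial_weight) * book.content_similarity)

    return sorted(books, key=score, reverse=True)
```

Calling `rerank(books, (0.0, 0.0), spatial_weight=1.0)` orders purely by proximity to the kiosk, while `spatial_weight=0.0` orders purely by content similarity, which corresponds to the two ends of the slider on screen A.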

4. EXPERIMENT
To verify the appropriateness of these screens for the given task, and eventually to identify the most appropriate combination of screens, we conducted an experiment with a particular focus on answers to the following questions:

Organization. What is the customer’s opinion on spatial distance, and which kind of representation does he or she prefer: the cumulated variable list A, the split fixed list B, the graph C, or the map D?

Distance Visualization. Is a customer satisfied with a simple 1-D spatial distance measure, or does he or she need a 2-D visualization which additionally informs about product locations?

Classification. Is it possible to identify behavioral patterns that can be used to classify users according to their spatial preferences for the purpose of personalization?

To find the answers, a seven-page questionnaire was used in a 45-minute survey, which was divided into three phases. The paper-based questionnaires comprised closed and open questions. For closed questions, check boxes and ranges in the form of a five-level Likert scale were chosen to gather the data. A supervisor guided the participants through the experiment.

4.1 Sample and procedure
The study was conducted at a university. Altogether 21 people (11 f, 10 m) with an average age of 27 (median = 25) took part in the experiment. All participants were internet users, and most of them (81%) buy online multiple times per year, once a month, or multiple times per month.

Phase 1 started with a short written introduction. The participants were asked to imagine a bookstore with an interactive kiosk system equipped with a touch screen; the screen would represent the status after a book has been selected through browsing or code scanning. Then, the participants had to fill out a first questionnaire containing general questions and questions on the customer’s intention before and behavior during a bookstore visit.

In phase 2, the screens were presented on a 15.4” laptop screen in the fixed order A, B, C, D. To check the participants’ ability to interpret the screens, the screen-related questionnaire contained questions about layout and functionality. Then, it focused on the meaning of the presented recommendation and the comparison of B, C, and D with A. After the last screen, the participants were asked to name the most and second most helpful screen. Additionally, we asked for the preferred screen under three different behavioral patterns: buying one single book, buying multiple books, or buying spontaneously. At this point, the demographic data were gathered.

In phase 3, for each of the screens B, C, and D, the participants had to plan their favorite sequence of 3 recommended books on a route with a kiosk-to-kiosk connection. A corresponding questionnaire included numbered pictures of 3 book covers. The display comprised 5 books, including all of the 3 books in question. The experiment was conducted separately for each screen; the books were always the same. A supervisor measured the time each participant needed to define the sequence he or she found ideal to reach the books. The background for this test was that we wanted to check for B whether the user chooses the correct distance list, and for C and D whether the abstract graph outranks the map.
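To make the phase 3 task concrete, the following sketch computes a reference solution for it: the shortest kiosk-to-kiosk round trip over a small set of books, found by trying every visiting order. It is an illustration only and was not part of the reported study; the function name, the coordinate format, and the use of straight-line (Euclidean) walking distances that ignore shelves are assumptions.

```python
from itertools import permutations
from math import dist


def shortest_round_trip(kiosk, books):
    """Return (order, length) of the shortest kiosk-to-kiosk tour.

    kiosk is an (x, y) pair; books maps a book name to its (x, y) position.
    Brute force is adequate here because only 3 books are visited.
    """
    def tour_length(order):
        stops = [kiosk, *(books[name] for name in order), kiosk]
        return sum(dist(a, b) for a, b in zip(stops, stops[1:]))

    best = min(permutations(books), key=tour_length)
    return best, tour_length(best)
```

Because a round trip has the same length in both directions, at least two orders always tie; `min` returns the first one it encounters. A participant's chosen sequence could be compared against the returned length rather than against the returned order.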

5. RESULT AND DISCUSSION
In the following, the feedback on the three questions concerning Organization, Distance Visualization, and Classification is presented.

5.1 Feedback related to Organization
When participants were asked how important spatial distance is to them personally, the result was nearly balanced: 48% found that spatial distance is important. In contrast, according to the answers to the question about the most helpful screen, the map-based representation D performed best, followed by B and A (see Table 1). In the route planning task, screen C performed best with an average planning duration of 14.18 sec, followed by screen D with 15.74 sec and screen B with 18.92 sec.

Table 1. Screen rating: the percentage of participants that chose two screens as 1st and 2nd choice. To indicate differences in choice between the genders, the result is split according to the participants’ sex.

Rank  1st choice  2nd choice  Rating  Female  Male
1     Screen D    Screen B/C  65%     0%      100%
2     Screen B    Screen D    25%     80%     20%
3     Screen A    Screen D    10%     54%     46%

Nearly half of the participants think that spatial distance is important, although the sample consists predominantly of young people who buy online. This result indicates that people do ascribe importance to the distance between them and the desired product. On the other hand, a list-based representation does not seem to be the desired one: in the rating task, the graph and map representations beat the other screens. As expected, the route planning task was dominated by screen C, which carries the least information, closely followed by D. Finally, we can conclude that the participants ascribe importance to spatial distance and that the screen with the highest degree of spatial information performs best. The fact that A and D occur together on rank 3 with the best balance between genders indicates that they form a good combination.

5.2 Feedback related to Distance Visualization
Screen A. We wanted to check whether the meaning of the slider is transparent to a user. The participants were asked about their assumptions concerning the effect of shifting the slider to the far left. 43% replied correctly that the ordering of the presented books would change. 38% were also able to identify the kind of change, i.e., in which direction books move with regard to content similarity and with regard to physical distance. Only 19% answered completely correctly. Interestingly, 24% thought that the most content-related books instead of the closest books would move to the top of the list; they had swapped the meaning of the two colors. 47% identified the 3rd book as the most appropriate book instead of the 1st book. When we asked for the factors which are used to calculate the ranking, 33% were able to identify them.

Screen B. Regarding the list-like distance visualization, 48% thought that they could make a clear statement on product locations, which is impossible with this screen. 86% interpreted the leftmost and rightmost books as spatially most distant, which is false. The content-based distance of the same books was seen as smaller than their spatial distance by 76%. When solving the route planning task, 81% simply planned the route from left to right, as expected, and used the correct list.

Screen C. 10% saw a content-based relationship between two of the recommended books, which was not intended. Conversely, only 19% saw no such relationship. The remaining 71% checked no box. The location was perceived well by 91% of the participants. Interestingly, 57% of the people thought that the two books in the upper right corner were close to the kiosk, despite the lack of a distance unit. 43% thought that both were located in the same shelf row.

Screen D. Nearly all participants located the books correctly and as accurately as possible. In the route planning task, 86% planned to visit the two books in the same bookshelf corridor successively.

[Figure 2 consists of two diagrams. Left, "Explanation or the Same?": ratings for "Explains A well" and "Visualizes the same as A". Right, "Substitute or Complement?": ratings for "Better than A" and "Good in addition to A".]

Figure 2. Screens B, C, D compared to A on a scale between 1=strongly disagree and 5=strongly agree.

Screen comparison. The left diagram in Figure 2 shows that, from B to D, the screens explain A less and less well, and that the rating for visualizing the same as A decreases strictly. The right diagram shows that all screens are perceived as better than A and as good in addition to A. From B to D, there is a tendency that the screens fit increasingly better in addition to A (complement). In parallel, the screens are decreasingly rated as better than A (substitute).

The feedback on the 1-D screen A shows that a cumulated list representation with a slider to edit preferences and the current color coding is not transparent to the user. The split list in B prompts the user to read unintended relationships into the books, because they are connected through a colored line. The star-like graph in C is easy to understand, but some people tend to infer relationships such as content-based similarity, closeness, and shelf locations although there is no connecting line. The map screen D bridges this gap and is well understood by a large majority of the participants. The screen comparison shows that A and D complement one another best, which strengthens the indication of this combination from the feedback on organization. Finally, we can conclude that users have problems understanding the distance measure in a 1-D visualization. A 2-D visualization leads to less confusion and to a more precise interpretation. The map is the preferred visualization, because it caused the fewest misunderstandings and is seen as the best addition to A while being rated least as visualizing the same as A (complement).

5.3 Feedback related to Classification
The sample can be divided into online (62%) and offline buyers (38%). Many participants are goal-oriented customers: 57% investigate online before they visit a bookstore, 67% commonly visit a bookstore because of one single book, and 19% even prepare a book list. Regarding shopping behavior, the majority of customers in our sample commonly just browse a bookstore (91%). At the end of a bookstore visit, 76% commonly buy one single book and 30% buy multiple books; multiple answers were possible. So far, two groups exist concerning buying behavior: single and multi book buyers. Many people buy more books than they plan to. Thus, a group of people who buy spontaneously exists as well. The question about the most suitable screen under three different conditions (the behavioral patterns) showed that the demand depends on the behavioral pattern of the bookstore customer. Table 2 shows that single book buyers prefer D, multi book buyers are indifferent between B and D, and spontaneous buyers prefer B and reject D.

Table 2. Screen preferences under three conditions.

Condition (behavioral pattern)  A    B    C    D
Buy one single book             15%  0%   10%  75%
Buy multiple books              19%  38%  10%  33%
Spontaneous                     33%  48%  19%  0%
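As a rough illustration of how the preferences in Table 2 could drive personalization, the sketch below picks the most frequently preferred screen for a given behavioral pattern. The percentages are copied from Table 2; the dictionary keys, the function name, and the simple "highest share wins" rule are assumptions and not part of the prototype.

```python
# Percentages from Table 2; the pattern labels are hypothetical identifiers.
PREFERENCES = {
    "single book": {"A": 15, "B": 0, "C": 10, "D": 75},
    "multiple books": {"A": 19, "B": 38, "C": 10, "D": 33},
    "spontaneous": {"A": 33, "B": 48, "C": 19, "D": 0},
}


def default_screen(pattern):
    """Return the screen with the highest preference share for a pattern."""
    shares = PREFERENCES[pattern]
    return max(shares, key=shares.get)
```

For multi book buyers this rule selects B, although its margin over D is small (38% versus 33%), so a real system might offer both screens in that case.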

6. CONCLUSION
This article addressed an approach to integrating spatial features of books into a book recommendation system which could be deployed in a brick-and-mortar bookstore. An experiment addressed various ways to visualize recommendations computed by such a system. Results from that experiment indicate that spatial information might indeed help the customers of a bookstore to plan their shopping trip. Users benefited most from the integration of spatial data if the visualization supported route planning inside the store. Furthermore, customers’ preferences concerning spatial data differed, which could be exploited for classifying users according to their spatial preferences.

These observations have several limitations, which might become the subject of future work. For instance, participants did not experience any negative consequences after bad route planning, which might have affected their perception of spatial support. Therefore, an experiment is needed in which participants actually have to perform tasks similar to the reported ones in a store-like scenario where effort is needed to reach a book. Another limitation is the specific (narrow) user group in this experiment; here, further experiments with other user groups (e.g., elderly people) might provide additional insights into the topic. Finally, a technical extension of the prototype could provide a visualization which adapts to the behavioral patterns, and thus the spatial preferences, of customers.

ACKNOWLEDGEMENT
This work was developed in the context of the project Allianz Digitaler Warenfluss (ADiWa) that is funded by the German Federal Ministry of Education and Research under grant 01IA08006.

REFERENCES
[1] International Digital Publishing Forum. Wholesale eBook sales statistics. http://idpf.org/doc_library/industrystats.htm, accessed on August 10, 2010.
[2] Phosphor Essence. Book Finder Kiosk. http://bookfinderkiosk.com/, accessed on August 18, 2010.
[3] Butz, A., Schneider, M., and Spassova, L., 2004. SearchLight - A Lightweight Search Function for Pervasive Environments. Proceedings of the Second International Conference on Pervasive Computing, LNCS 3001, pp. 351-356. Springer: Berlin, Heidelberg.
[4] Chen, D., Tsai, S., Hsu, C.-H., Singh, J. P., and Girod, B., 2011. Mobile augmented reality for books on a shelf. Proceedings of IEEE Workshop on Visual Content Identification and Search (VCIDS), to appear.
[5] Ducheneaut, N., Partridge, K., Huang, Q., Price, B., Roberts, M., Chi, E.H., Bellotti, V., and Begole, B., 2009. Collaborative Filtering Is Not Enough? Experiments with a Mixed-Model Recommender for Leisure Activities. Proceedings of the 17th International Conference on User Modeling, Adaptation and Personalization (UMAP 2009), LNCS 5535, pp. 295-306. Springer: Berlin, Heidelberg.
[6] Church, K., Neumann, J., Cherubini, M., and Oliver, N., 2010. SocialSearchBrowser: A novel mobile search and information discovery tool. Proceedings of the 15th international conference on Intelligent user interfaces (IUI 2010), pp. 101-110. ACM Press, New York, NY, USA.
[7] Heilig, M., Demarmels, M., König, W.A., Gerken, J., Rexhausen, S., Jetter, H.C., and Reiterer, H., 2008. MedioVis: visual information seeking in digital libraries. Proceedings of the working conference on Advanced visual interfaces (AVI 2008), pp. 490-491. ACM Press, New York, NY, USA.
[8] Amazon.com. Amazon Web Services. http://aws.amazon.com/, accessed on September 1, 2010.
