Browsing a Website with Topographic Hints
∗
S. Rossi, A. Inserra and E. Burattini Dip. di Scienze Fisiche University of Naples ”Federico II” Naples, Italy
[email protected],
[email protected]
ABSTRACT This work proposes an adaptive web site in the field of cultural heritage that can dynamically suggest links, based on non-intrusive profiling methodologies integrated with topographical information. A fundamental issue, typical of web sites that refer to real sites, is helping the user to orient himself geographically. Our system can support the user in his exploration of the physical/virtual space by suggesting new physical locations structured as a thematic itinerary through the excavations.
1. INTRODUCTION
Web personalization is the process of adapting the information content and the format of an interface to meet the individual needs of a single user, starting from his browsing behavior and his interests. A particular class of these systems is represented by Recommendation Systems. Such systems are meant to suggest links to products, services and information that are considered attractive for the user. Such techniques are often considered an essential part of the customization process of web sites because they support the mechanism of adaptation to the characteristics of each user [5]. Recommendation Systems and Adaptive Hypermedia are intimately linked: both aim to identify the contents that may correspond to the user's interests, but while Adaptive Hypermedia aims to filter information, the goal of a Recommendation System is to provide additional sources of information. Most Recommendation Systems require active participation from the user. They require that the user makes his informative needs explicit (for example, by providing the system with a series of keywords that best represent his interests), or that the user expresses an opinion on documents (for example, by assigning a vote). While these methodologies are useful for sales recommendations, in our opinion they cannot be easily applied in the case of
∗ This work was partially supported by the S.Co.P.E. project.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. AVI ’08, 28-30 May , 2008, Napoli, Italy. Copyright 2008 ACM 1-978-60558-141-5 ...$5.00.
navigation in websites. These methodologies are highly invasive: there is not only the risk of boring the user, but also of making the system inflexible to quick changes in user behavior over time. Faced with these problems, we focus on those systems that, using implicit user profiling, implement adaptivity and recommendation simply by observing and analyzing the user interaction with the system [3, 6]. The goal of this work was to design and implement a recommendation system in the field of cultural heritage, with the aim of helping and involving users during the visit of a web site by identifying the different types of users and creating appropriate suggestions. These recommendations are meant to help the user find interesting contents and, at the same time, to help him orient himself in hyperspace by integrating a physical space with a virtual one [7]. The system, in fact, starting from the user profile, integrated with the topographical information contained in the database, helps the user to find a “path” throughout the navigation. One of the problems in large web sites is not just finding the right contents but also locating and recovering information one has seen before [2]. In recent years, spatial navigation has been proposed as a new metaphor of interaction in large portals, supported by the availability of geo-localized maps and portable GPS devices. This methodology has also been proposed as a way to enhance the interaction in common World-Wide Web sites, where the lack of apparent structure and the hyperlink navigation make the interaction problematic [8]. In the case of web sites connected to a real site, like a museum or, as in our case, an excavation site, the underlying spatial structure of the contents is already present and can be easily used to help the user to orient himself geographically, namely to understand the nature of the information that surrounds him.
In this way the user can perceive his location in the virtual space and consequently in the corresponding real space. More specifically, the objective of the system is not only to select relevant information for a user, but also to adjust the level of presentation of each topic to the physical space of the Herculaneum excavation. Therefore, we face the problem of choosing which information to show (as in all personalization systems), but also of generating recommendations that are as appropriate as possible to the spatial context, thus producing a personalized tour. Our system can therefore also support the user's exploration of the physical/virtual space, helping him to find what he is looking for and suggesting new physical locations structured as a thematic itinerary through the excavations. Finally, let us consider that a website related to a real site may also be used as a support before and during the visit [7]. Providing the user with virtual thematic paths on the web site, which reflect possible real paths, may help him to remember contents, starting from their relative locations, and may train him for the real visit. Human beings naturally have the ability to recall objects in space depending on their locations.
1.1 The Herculaneum Excavations Hypermedia
The Herculaneum Excavations Hypermedia¹ derives from a similar stand-alone system distributed on CD-ROM [9], and from an earlier on-line version presented in [10]. The knowledge base contains a large set of information, including historical data and mythological tales, excavation reports, descriptions of recovered buildings and objects, paintings and so on. This information comes in the form of texts, videos, images, and contemporary and old maps. The knowledge base was linked following the experts' suggestions, and the results of this analysis have been summarized in a tree where single chunks of knowledge are identified and displayed (see Fig. 1). The node labeled “building” represents one of the main points of view from which the knowledge on the Herculaneum Excavations may be explored, and its sub-nodes show the ways in which such knowledge may be enhanced.
Figure 1: The knowledge tree for the Herculaneum Excavations Hypermedia.

In [10] we proposed a context-independent architecture composed of modules that can be reused and adapted. Our system has no a priori knowledge of the web contents, but is able to manage a high-level representation (a knowledge tree) defined on this set. A content element in the database represents an instance of a node of the tree. For example, a resource instance may be the “excavation report” of a particular building or the “description of a fresco”. Such an instance inherits the description of its corresponding class in the tree. The work presented here, starting from the purposes of its predecessors, progresses by using a different technique for the user modeling and, thereafter, for the personalization of
¹ http://www.ercolano.unina.it
the contents. Moreover, we integrated a recommendation facility based on implicit user modeling and topographical information.
2. MODELING USERS AND RESOURCES IN THE SAME SPACE
The browsing behavior and the selection of particular information content are a source of knowledge for selecting relevant contents and modifying the layout of an interface. The user model has to be modified by the choices made by the user. Moreover, the selection of the resources “preferred” by the user depends on the state of the user model. This leads us to the need to represent the totality of resources through a resource model that can be easily linked to the user model. With the help of a group of architects, historians and archaeologists we identified the main properties that may characterize any information content of our hypermedia. Those features are related to the types of informative content that a resource may have and to the interests that a resource may match. For example, one feature is the “historical content”, i.e. how much a resource class contains information from historical data. We defined a resource class as a vector $\vec{r} = (w_1, \dots, w_n)$ in the space $\mathbb{R}^n$, where $n$ is the number of characteristics or features. The vector model of a resource class represents how much every specific characteristic is present in the class of resources. Differently from the common usage in data mining, the vector that represents a class of resources does not contain the frequency of occurrence of some specific words within the text [4]. It is representative of the whole class: it does not depend on a single text, and it represents how much a specific class can be classified according to some typologies of texts. Our approach is more similar to the “Concept Profiling” methodologies [11]; in fact, all the resource classes represent abstract topics, rather than specific words or sets of related words. Moreover, they are organized in a hierarchical structure and the relationships between resources are implicitly specified.
In the recommendation module developed for the Herculaneum Excavations we decided to implement a user model that takes into account only the current behavior of the user, starting from the specification of the behaviors of “ideal” users. Differently from the classical use of stereotypes [1], we do not try to classify a user in a specific class, but we start from the assumption that a real user may exhibit a behavior that is a combination of ideal classes. A user model is a vector $\vec{u}$ in the space of ideal user classes. When a user session starts, the user model vector components are set to zero. The user model vectors are represented as attribute-value pairs, and such values represent the percentage of similarity of the user model to the ideal user classes. This is to say that the user model $\vec{u}_j$ of the user $j$ is a linear combination of the ideal users' models ($\vec{u}_i$), such as $\vec{u}_j = \alpha \vec{u}_1 + \beta \vec{u}_2 + \dots + \gamma \vec{u}_n$, where Greek letters represent percentages. It is fundamental that each ideal class of users does not share any behavior with the other ideal classes, i.e. the vectors representing ideal classes form an orthogonal basis. One of the main problems of systems that use keyword representations is that the user model and the resources are not directly comparable. In our system, however, the user model can be mapped into the same space as the resources. In fact, each ideal user is characterized by an ideal resource, representing the optimum content for that particular ideal user ($\vec{u}_i \to \vec{r}_i$). This set of optimal resources is an orthogonal basis in the space of resources. The vectors $\vec{r}_1, \dots, \vec{r}_n$ of optimum resources for ideal users form a square matrix $U$ that we use in order to evaluate the ideal resource for the current user, to give recommendations (see Sec. 3), and to modify the user model according to his browsing behavior. Let us highlight that an optimal resource $\vec{r}_i$ may not correspond to any of the real resources $\vec{r}_j$ in the database. Every time the user selects a resource, the user model has to be updated. The selection of a resource can be either a click on a link or another action enabled by the system, such as saving personal notes, searching the database, and so on. Every time the user selects a resource $\vec{r}_k$ we evaluate the corresponding user model for that particular resource: $\vec{u}_k = \vec{r}_k \times U^{-1}$. This vector $(u_1, u_2, \dots, u_{14})$ represents qualitatively how much we have to modify the current user model in order to take into account his last action. In particular, we modify the current user model by taking a weighted average between the current values ($u_u$) of the user model and those arising from the last interaction with the system ($u_k$). The new components of the user model will be:

$$u_{new} = \frac{(u_u \cdot n_{click}) + u_k}{n_{click} + 1},$$

where $n_{click}$ is the number of interactions with the system. In this way we can minimize errors because, if a user clicks only a few times on different topics, this will have little effect compared to the whole interaction. Moreover, after a few interactions we are able to have a well-defined user model with which to give recommendations.
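The update rule above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the function names and the two-feature toy vectors are hypothetical, and the code assumes that the rows of $U$ are the ideal resources and that they are mutually orthogonal, so that $\vec{r}_k \times U^{-1}$ reduces to projecting $\vec{r}_k$ onto each ideal resource.

```python
def to_user_space(resource, ideal_resources):
    """Compute u_k = r_k x U^-1, assuming the rows of U (ideal_resources)
    are mutually orthogonal: component i is (r_k . r_i) / (r_i . r_i)."""
    coords = []
    for ideal in ideal_resources:
        norm_sq = sum(x * x for x in ideal)
        dot = sum(a * b for a, b in zip(resource, ideal))
        coords.append(dot / norm_sq)
    return coords


def update_user_model(user_model, n_click, u_k):
    """Weighted average of the current model (weight n_click) and the
    model induced by the last selected resource (weight 1)."""
    return [(u * n_click + k) / (n_click + 1) for u, k in zip(user_model, u_k)]


# Hypothetical toy data: two ideal users, two features.
U = [[2.0, 0.0], [0.0, 4.0]]
model, clicks = [0.0, 0.0], 0          # a new session starts from zero

for selected in ([1.0, 2.0], [2.0, 0.0]):
    model = update_user_model(model, clicks, to_user_space(selected, U))
    clicks += 1
```

After the first selection the model coincides with the user-space image of the selected resource; each later selection moves it by a progressively smaller amount, which is the damping effect described in the text.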
3. SELECTING THE RESOURCE CLASS
In the previous section we discussed how the user model is modified according to the user's actions. In this section we explain how the system selects the interesting resources according to the user model. Starting from the current user model, the system evaluates the ideal resource $\vec{r}_i = \vec{u}_i \times U$ for the user. As we already said, an ideal resource may not correspond to any of the real resources, so the system evaluates the “distance” from this vector to the real resources ($\vec{r}_j$) in the web site. To evaluate the distance between the real resources and the ideal resource, the system evaluates the angle between each pair of vectors as follows: $\cos\theta = \frac{\vec{r}_i \cdot \vec{r}_j}{|\vec{r}_i| |\vec{r}_j|}$. Once a threshold angle has been fixed, the system suggests to the user all the resource classes whose “distance” is less than the threshold. Let us notice that a single interaction with the system leads to an ideal resource for the user model that overlaps with the selected resource. This means that, after a single interaction, the distance between these two vectors is equal to zero (i.e., smaller than the threshold). However, the system should not start making suggestions after only one interaction. In order to evaluate the minimum number of interactions needed before starting the recommendation process and the value of the threshold, we performed a testing process. We conducted four sets of tests. During the tests, the user was requested to browse the web site with a specific information goal (see Fig. 2, tests 1.1, 1.3 and 1.4). In particular, test 1.1 asked the user to find information about two different and unrelated given topics (for example, frescos and technological installations). Test 1.3 asked to find information about two
given related topics (for example doors and windows) and test 1.4 asked to find information about a particular given topic (for example frescos) representing a class of resources. Another test was conducted with a random interaction (see Fig.2 test 1.2). From the results obtained we fixed a threshold for the distance and a minimum number of interactions required for the recommendation process.
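The selection step described in this section can be sketched as follows. This is only an illustration under stated assumptions: the function names, the toy vectors, the 20-degree threshold and the minimum of two interactions are all hypothetical, since the paper does not report the values fixed after the tests.

```python
import math


def ideal_resource(user_model, ideal_resources):
    """r = u x U, where the rows of U are the ideal resources."""
    n_features = len(ideal_resources[0])
    return [sum(u * row[c] for u, row in zip(user_model, ideal_resources))
            for c in range(n_features)]


def angle_degrees(a, b):
    """Angle between two non-zero vectors, from the cosine formula."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cos_theta = max(-1.0, min(1.0, dot / norms))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def candidate_classes(user_model, ideal_resources, real_resources,
                      threshold_deg, n_click, min_clicks):
    """Return (angle, class name) pairs below the threshold, closest first.
    Nothing is suggested before min_clicks interactions."""
    if n_click < min_clicks:
        return []
    target = ideal_resource(user_model, ideal_resources)
    if not any(target):
        return []  # empty user model: no direction to compare against
    scored = [(angle_degrees(target, vec), name)
              for name, vec in real_resources.items()]
    return sorted(pair for pair in scored if pair[0] < threshold_deg)


# Hypothetical toy data: two features, two real resource classes.
U = [[1.0, 0.0], [0.0, 1.0]]
real = {"frescos": [1.0, 0.2], "excavation reports": [0.0, 1.0]}
suggested = candidate_classes([0.75, 0.25], U, real,
                              threshold_deg=20.0, n_click=3, min_clicks=2)
```

Comparing angles rather than Euclidean distances makes the result independent of vector length, so only the relative weights of the features matter.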
Figure 2: Value of the angle θ between the user model and the resource to recommend.

Finally, we have to consider the case of two or more resource classes that have a distance less than the threshold. Let us recall that the resources are structured as a tree. If the selected resources have the same parent in the tree (for example, both the classes “floors” and “balconies” have the same parent “finishing elements”, see Fig. 1), the system suggests the parent; otherwise the system chooses the resource with the smallest distance.
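The tie-breaking rule just described is small enough to sketch directly. The function name and the dictionary-based tree encoding are assumptions made for illustration; the class “decorations” is a hypothetical parent, while “floors”, “balconies” and “finishing elements” come from the example in the text.

```python
def choose_suggestion(candidates, parent_of):
    """candidates: (angle, class name) pairs already below the threshold.
    parent_of: maps each class name to its parent in the knowledge tree.
    Returns the class to suggest, or None if there are no candidates."""
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0][1]
    parents = {parent_of.get(name) for _, name in candidates}
    if len(parents) == 1 and None not in parents:
        return parents.pop()          # all candidates are siblings
    return min(candidates)[1]         # otherwise, the smallest angle wins


parent_of = {
    "floors": "finishing elements",
    "balconies": "finishing elements",
    "frescos": "decorations",         # hypothetical parent class
}
siblings = choose_suggestion([(5.0, "floors"), (8.0, "balconies")], parent_of)
mixed = choose_suggestion([(5.0, "floors"), (3.0, "frescos")], parent_of)
```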
3.1 Thematic Tours
The creation of a path through a hypermedia requires solving two problems: deciding which information is interesting and determining how the web pages are to be presented. Concerning the first problem, the representation by classes allows us to understand whether and how much a user is interested in a particular class and to select, in this way, a set of resources to suggest. To solve the second problem one has to characterize each single instance in order to decide the order in which the resources have to be suggested. In order to have a flexible system that does not depend upon the help of experts for adding new resource instances, we decided to characterize resources at class level. Therefore, all the resources of a class are equivalent for the system. The only thing that characterizes a resource instance is its topographical information. In this way the choice of a recommendation does not depend only on the interests of the users, but also on the relative locations of resources in the virtual space. Moreover, even if we were able to have additional information on the single instances, ordering the presentation of resources according to their physical locations is fundamental for the orientation of the user and for preparing the user for the real visit. The topographical information defines, for all contiguous buildings, their respective geographic positions, such as north, south, east and west. The resources are combined to form a graph. Each edge of the graph is associated with a distance in meters that represents the cost of covering the edge, such as the distance from the entrance of one building to the entrance of the next one. During the first phase of the interaction, the suggestions proposed by the system refer to buildings that are “near” to the current position of the user in the virtual space (see Fig. 3).

Figure 3: The interface of the Herculaneum Excavation website. The navigation bar is at the top right, while the recommendation bar is at the bottom right.

In fact, in this phase, the system does not have sufficient information on the user. Then, when the system has enough information on the user, it assists the user during the navigation, dynamically creating a tour of the buildings that have the properties the user is looking for. For example, the system may create a tour covering all the buildings that have frescos. This tour is created by adding a list of links to the recommendation bar. This list contains links to the buildings that are interesting for the user and with which it is possible to construct a thematic tour of buildings. The user is guided with links to web pages indicating the direction to follow, starting from his current position in the virtual space. Moreover, the list of links is presented together with the topographical information needed to reach all the interesting buildings. Examples of such information are “turn right”, “walk down”, etc. (as if the user were walking in the space). While planning the tour, buildings on the path may be classified as buildings of interest, which should therefore be suggested, and buildings of no interest. If a building is of interest, the user can directly click on the corresponding link (to the specific resource within the building, for example the frescos within the building) and he will be guided “to continue” the tour. On the contrary, if the building is not of interest, there is no corresponding link. We decided to also add a reference to those buildings while describing the path, in order to give more precise indications on the path to follow. In fact, to go from one interesting building to another, the user has to pass by the uninteresting ones. In the list of recommendations the system adds information like, for example, “come through the Building One”.
This may help the user: the suggested path directly introduces the links to the appealing buildings, separating them from the buildings of no interest that are only a mandatory passage along the path. Finally, let us recall that the thematic tour is presented to the user in a dedicated area of the interface and does not directly constrain the normal browsing activity of the user.
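A thematic tour over the building graph could be sketched as below. The paper does not state which planning algorithm is used, so this example makes its own assumptions: it repeatedly moves to the nearest unvisited building of interest (a greedy strategy) along the shortest path computed with Dijkstra's algorithm. The function names, the building labels and the distances are hypothetical.

```python
import heapq


def shortest_path(graph, start, goal):
    """Dijkstra. graph[node] = {neighbor: (meters, direction)}.
    Returns (total meters, list of nodes), or (inf, []) if unreachable."""
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, (meters, _direction) in graph[node].items():
            if nxt not in seen:
                heapq.heappush(queue, (cost + meters, nxt, path + [nxt]))
    return float("inf"), []


def thematic_tour(graph, start, interesting):
    """Greedy tour: always walk to the nearest remaining building of
    interest. Each step is (kind, building, direction), where kind is
    "visit" (shown as a link) or "pass" (mentioned only as a landmark)."""
    remaining = set(interesting) - {start}
    current, steps = start, []
    while remaining:
        cost, path = min(shortest_path(graph, current, b)
                         for b in sorted(remaining))
        if not path:
            break  # the remaining buildings cannot be reached
        for prev, node in zip(path, path[1:]):
            kind = "visit" if node in remaining else "pass"
            steps.append((kind, node, graph[prev][node][1]))
            remaining.discard(node)
        current = path[-1]
    return steps


# Hypothetical site: four buildings in a chain, distances in meters.
site = {
    "A": {"B": (30, "east")},
    "B": {"A": (30, "west"), "C": (20, "east")},
    "C": {"B": (20, "west"), "D": (40, "south")},
    "D": {"C": (40, "north")},
}
tour = thematic_tour(site, "A", ["C", "D"])
```

In this toy run building B is not of interest, so it appears only as a “pass” step, mirroring the way the recommendation bar mentions intermediate buildings without linking to them.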
4. DISCUSSION
In this paper we presented a first approach to integrating topographical information within an informative web site. Moreover, the Herculaneum excavation hypermedia is able to implicitly create user profiles without requesting any direct information from the users. Starting from this profiling activity and the topographical information about the corresponding real site of the excavation, the system is able to suggest to the users personalized tours of the contents of the hypermedia. Finally, the suggestions about the links to follow or the places to visit are given to the user as if he were walking in the real space. In our opinion this approach will improve the involvement of the user during the navigation, and the recall of physical locations during the real visit. As future work, we will extend our portal so that it can be easily browsed also using PDAs, and we will integrate GPS information. In this way, a common web site can be easily used both from home, to search for information, and during the visit to the excavation, as a personal mobile guide. Moreover, we will extend the level of detail of the topographical information in order to deal with objects within the building area.
5. REFERENCES
[1] A. Kobsa. User modeling: Recent work, prospects and hazards. In Adaptive User Interfaces: Principles and Practice, pages 111–128. North-Holland, 1993.
[2] A. Dieberger. Providing spatial navigation for the world wide web. In A. U. Frank and W. Kuhn, editors, Spatial Information Theory - A Theoretical Basis for GIS (COSIT’95), pages 93–106. Springer, 1995.
[3] Y. Hijikata. Implicit user profiling for on demand relevance feedback. In IUI ’04: Proceedings of the 9th international conference on Intelligent user interfaces, pages 198–205, New York, NY, USA, 2004. ACM.
[4] J. B. Schafer, J. A. Konstan, and J. Riedl. E-commerce recommendation applications. Data Mining and Knowl. Discovery, 5(1/2):115–153, 2001.
[5] L. Terveen and W. Hill. Beyond recommender systems: Helping people help each other. In J. Carroll, editor, HCI in the New Millennium. Addison Wesley, 2001.
[6] K. Sugiyama, K. Hatano, and M. Yoshikawa. Adaptive web search based on user profile constructed without any effort from users. In WWW ’04: Proceedings of the 13th international conference on World Wide Web, pages 675–684, New York, NY, USA, 2004. ACM.
[7] E. Not, D. Petrelli, O. Stock, C. Strapparava, and M. Zancanaro. Person-oriented guided visits in a physical museum. In ICHIM, pages 69–79, 1997.
[8] A. Dieberger and A. U. Frank. A city metaphor to support navigation in complex information spaces. Journal of Visual Languages and Computing, 9(6):597–622, 1998.
[9] E. Burattini, F. Gaudino, and L. Serino. Hypermedia knowledge acquisition and a BDI agent for navigation assistance. A case study: Herculaneum excavations. In Europ. Conf. on Cognitive Science, pages 437–440, 1999.
[10] S. Rossi, V. Scognamiglio, and E. Burattini. Web contents and structural adaptivity by knowledge tree: The Herculaneum excavation hypermedia. In Proc. of the third Inter. Conf. on Web Information Systems and Technologies WEBIST 07, pages 270–275, 2007.
[11] S. Gauch, M. Speretta, A. Chandramouli, and A. Micarelli. User profiles for personalized information access. The Adaptive Web, pages 54–89, 2007.