Convergence of Web and TV Broadcast Data for Adaptive Content Access and Navigation

Pieter Bellekens, Kees van der Sluijs, Lora Aroyo*, Geert-Jan Houben**

Technische Universiteit Eindhoven, PO Box 513, NL 5600 MB, Eindhoven, The Netherlands
{p.a.e.bellekens,k.a.m.sluijs,l.m.aroyo,g.j.houben}@tue.nl

Abstract. iFanzy is a personalized TV guide application that aims to offer users television content in a personalized and context-sensitive way. It consists of a client-server system with multiple clients and devices, such that the user can ubiquitously use TV set-top box, mobile phone and Web-based applications to select and receive personalized TV content. TV content and background data from various heterogeneous sources are integrated to provide a transparent knowledge structure, which allows the user to navigate and browse the vast content sets available nowadays. Semantic Web techniques are applied for enriching and aligning Web data and (live) broadcast content. The resulting RDF/OWL knowledge structure is the basis for iFanzy’s main functionality, such as semantic search of the broadcast content and execution of context-sensitive recommendations.

1 Introduction

Current Web-based applications are characterized by the fact that users use many different types of devices to access content. As a consequence, engineering such applications has to respect the different environments and capabilities to ensure that the application adapts to the circumstances. A big advantage for the purpose of personalization is that the user spends more time with these devices than with the PC alone, which gives more information to assess and model the user’s situation. In the TV broadcasting domain, the combination of multiple devices such as mobile phone and TV with multiple applications accessible via the Web shows how access to content will evolve [1]. At the same time we see the integration of background information that relates to TV content, which can help the user in selecting and using this data. Much of this information is available via the Web, which leads to the significant trend that television viewers use their PC for the larger part of the TV content access process [2]. While advantageous, this trend also implies an information overload that calls for adaptation to the user’s knowledge, preferences and situation. Personalization in this setting can benefit from several different kinds of integration: friends or relatives watching TV content together, integrating (background) data from different connected applications, and integrating temporal and spatial-specific viewpoints in context modeling.

In this paper we present iFanzy, a personalized TV guide application that aims to offer users television content in a personalized and context-sensitive way (developed in collaboration with Stoneroos Interactive TV, Ltd.¹). It is currently available as a Web application and a TV set-top box front-end, while a mobile version is under development. In a client-server model, iFanzy acts as a client that uses a server framework called SenSee [3] as its underlying data source. SenSee takes care of content integration, user modeling and content recommendation.

Several other TV recommender systems exist, e.g. AVATAR [4], which focuses on reasoning over TV content metadata and user preferences. iFanzy differs from AVATAR mainly in its focus on combining and integrating the information from several large and live data sources. Many systems exist that focus on the recommendation part, for example the movie recommendation application MovieLens², which uses collaborative filtering. For an overview of different recommendation strategies, see e.g. [5].

* Also affiliated with Vrije Universiteit, Amsterdam, the Netherlands.
** Also affiliated with Vrije Universiteit Brussel, Brussels, Belgium.

2 Semantics-based Content Integration

The use of semantics is an important instrument for combining and integrating the content from different applications, and in this way for enhancing personalization. In this sense iFanzy represents a large class of multi-device applications with a high degree of interactivity where semantics is key to effective integration [3]. In our work we have applied a general strategy that supports this large class of semantics-based applications, illustrated here in terms of iFanzy.

Step 1: Making TV metadata available in RDF/OWL. As a first step we make the relevant metadata from various data sources available in RDF/OWL. In the current iFanzy demonstrator we use three live data sources: online TV guides in XMLTV format (e.g. 1.2M triples for the daily updated programs), online movie databases such as IMDB in custom XML format (e.g. 8M triples for 12K movies and trailers from Videodetective.com), and broadcast metadata available from BBC-backstage in TV-Anytime (http://www.tvanytime.org/) format (e.g. 92K triples, daily updated). Next to the live data we also use the W3C OWL Time Ontology³ to represent time information.

Step 2: Making relevant vocabularies available in RDF/OWL. Having the metadata available, it is also necessary to make relevant vocabularies available in RDF/OWL. In iFanzy we did this in a SKOS-based manner for the genre vocabularies (resulting in 5K triples), namely the TV-Anytime genres, the XMLTV genres, and the IMDB genres. All these genres play a role in the classification of the TV content and the user’s likings (supporting the recommendation). We also used WordNet 2.0 (http://www.w3.org/2006/03/wn/wn20/) as published by W3C (2M triples) and the locations used in IMDB (60K triples).

Step 3: Aligning and enriching vocabularies/metadata. Here we did (1) alignment of the Genre vocabularies, (2) semantic enrichment of the Genre vocabulary in TV-Anytime, and (3) semantic enrichment of TV metadata with IMDB movie metadata.

First, aligning the Genre vocabularies was a small semi-automated exercise in which several translations were specified towards the TV-Anytime vocabulary, such as the associations between xmltv:documentaire and tva:documentary, between imdb:thriller and tva:thriller, and between imdb:sci-fi and tva:science fiction.

Second, for the semantic enrichment of the Genre vocabulary:
– based on the original XML Term hierarchy, skos:narrower relations are introduced, for example between tva:news and tva:sport news;
– based on partial label matching, skos:related relations are defined, for example between tva:sport news and tva:sport;
– background design knowledge has been the motivation for distinguishing skos:related relations between siblings, such as between tva:rugby and tva:american football.

Third, for the semantic enrichment of the TV metadata (which can come from different grabbers in different languages), we use the country AKA-titles from IMDB to link each grabbed program to the associated concept in IMDB.

Step 4: Using the resulting RDF/OWL graph for recommendations. To recommend TV programs or movies, the resulting RDF/OWL graph is extended with the user model in a format such that the eventual RDF/OWL knowledge structure can be used directly for the recommendation. When a user rates a program P, the rating is applied implicitly not only to P but also to all programs that are related to P in the knowledge structure. Moreover, all programs with a genre that is related to a genre of P are rated, as well as the genres themselves, via skos:related and skos:narrower relations. Likewise, all actors, directors and other persons in imdb:persons that are associated with P are rated. In this way, ratings are added to the user model, within the user’s context.

¹ http://www.stoneroos.nl/, http://ifanzy.nl/
² http://www.movielens.org/
³ http://www.w3.org/TR/2006/WD-owl-time-20060927/
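The genre alignment of Step 3 and the rating propagation of Step 4 can be illustrated with a minimal Python sketch. It is not the SenSee implementation: the genre pairs are the examples named in the text, while the program identifiers, the `decay` factor, the rating range [0, 1] and the choice to follow skos:narrower only from broader to narrower genre are assumptions made purely for illustration.

```python
# Step 3 (1): translations towards the TV-Anytime vocabulary (pairs named in the text).
ALIGNMENT = {
    "xmltv:documentaire": "tva:documentary",
    "imdb:thriller": "tva:thriller",
    "imdb:sci-fi": "tva:science fiction",
}

# Step 3 (2): example enrichment relations named in the text.
NARROWER = {"tva:news": {"tva:sport news"}}           # skos:narrower
RELATED = [("tva:sport news", "tva:sport"),           # skos:related (symmetric)
           ("tva:rugby", "tva:american football")]

# Hypothetical programs with the genres assigned by their source.
PROGRAMS = {
    "P": ["imdb:sci-fi"],
    "Q": ["tva:science fiction"],
    "R": ["tva:sport news"],
    "S": ["tva:sport"],
}


def align(genres):
    """Map source genres onto TV-Anytime genres; unknown genres pass through."""
    return {ALIGNMENT.get(g, g) for g in genres}


def neighbours(genre):
    """Genres one skos:narrower (downwards) or skos:related step away."""
    out = set(NARROWER.get(genre, ()))
    for a, b in RELATED:
        if a == genre:
            out.add(b)
        elif b == genre:
            out.add(a)
    return out


def propagate(program, rating, programs, decay=0.5):
    """Return (genre ratings, program ratings) implied by rating `program`.

    `rating` is assumed to lie in [0, 1]; `decay` is a hypothetical damping
    factor applied once per hop away from the rated program.
    """
    genres = align(programs[program])
    genre_scores = {g: rating for g in genres}
    for g in genres:
        for n in neighbours(g):
            # Keep the strongest evidence if a genre is reached twice.
            genre_scores[n] = max(genre_scores.get(n, 0.0), rating * decay)

    program_scores = {program: rating}
    for other, other_genres in programs.items():
        if other == program:
            continue
        hits = [genre_scores[g] for g in align(other_genres) if g in genre_scores]
        if hits:
            program_scores[other] = max(hits) * decay
    return genre_scores, program_scores


if __name__ == "__main__":
    print(propagate("R", 1.0, PROGRAMS))
```

Rating the hypothetical program `R` also rates `tva:sport` and, more weakly, program `S`, because of the skos:related link between `tva:sport news` and `tva:sport`. Persons from imdb:persons could be handled the same way with one more relation table; that part is omitted here.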

3 iFanzy Architecture

An important requirement for iFanzy is to provide this service in a ubiquitous and responsive way, e.g. independent of the platform used or the current location of the user. Therefore we opted for a client-server architecture, where the user accesses the iFanzy front-end from different devices connected to the SenSee server. All heavy computation is done at the server side. This ensures that virtually any machine (including mobiles and set-top boxes) that can connect through the Internet can be linked to the system. The server deals with very large data collections of browsable content (hundreds of thousands of programs from various sources), as well as knowledge structures used for recommendation and semantic search. Thus, SenSee should handle the concurrent use of hundreds of potential users per server. Although we see many data-intensive Semantic Web applications, scalability is still an important research issue for truly real-time Web applications. In order to reach the desired scalability we performed many optimization steps [6].

The recommendation part depends heavily on the quality of the system’s knowledge of the user. To cope with the cold start, we devised a statistical recommendation algorithm that finds the most relevant programs based on a basic set of user registration data. Further, iFanzy’s training algorithm refines the user data from the user’s behaviour and explicit feedback. The Web client, for instance, tracks the clicks made on specific content items and the search terms used. The set-top box, on the other hand, monitors and stores the viewing behaviour. The user can also state specific likings (explicit feedback) to inform the system what he/she appreciates.
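The interplay between a cold-start estimate and later refinement from feedback can be sketched as follows. The paper does not specify the statistical or training algorithm, so everything here is an assumption for illustration: the `UserModel` class, the event names, the weights in `EVENT_WEIGHT`, and the use of a per-genre prior derived from registration data.

```python
from collections import defaultdict

# Hypothetical evidence weights per feedback type; the paper gives no values.
EVENT_WEIGHT = {"click": 0.1, "search": 0.1, "viewed": 0.3, "explicit": 1.0}


class UserModel:
    """Per-genre interest in [0, 1], seeded by a cold-start prior."""

    def __init__(self, prior=None):
        # `prior` stands in for the estimate derived from registration data.
        self.interest = defaultdict(float, prior or {})

    def observe(self, event, genre, value=1.0):
        """Move the interest in `genre` towards `value` by the event's weight."""
        weight = EVENT_WEIGHT[event]
        self.interest[genre] += weight * (value - self.interest[genre])

    def rank(self, programs):
        """Order program ids by mean interest in their genres, best first."""
        def score(pid):
            genres = programs[pid]
            return sum(self.interest.get(g, 0.0) for g in genres) / len(genres)
        return sorted(programs, key=score, reverse=True)


if __name__ == "__main__":
    user = UserModel(prior={"tva:sport": 0.5})          # cold-start estimate
    user.observe("viewed", "tva:sport")                 # set-top box behaviour
    user.observe("explicit", "tva:documentary", 1.0)    # explicit feedback
    print(user.rank({"a": ["tva:sport"], "b": ["tva:documentary"], "c": ["tva:news"]}))
```

In this sketch explicit feedback overrides the current estimate completely, while implicit signals such as clicks only nudge it. This is one plausible weighting among many, and it is not taken from the paper.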

4 Conclusion and future work

Different versions of the clients and server systems have been implemented and successfully evaluated in collaboration with our commercial partner Stoneroos. As future work we are redesigning the iFanzy front-end and SenSee back-end, based on our practical experiences, and we plan a next performance optimization step with parallel query evaluation and load-balancing strategies. Currently, an evaluation trial with 500 set-top boxes in Dutch households is being prepared together with Stoneroos.

References

1. Bjorkman, M., Aroyo, L., Bellekens, P., Dekker, T., Loef, E., Pulles, R.: Personalised home media centre using semantically enriched tv-anytime content. In: EuroITV 2006 Conference. (2006) 156–165
2. Aroyo, L., Bellekens, P., Bjorkman, M., Houben, G.J.: Semantic-based framework for personalised ambient media. Multimedia Tools and Applications 36(1-2) (2008) 71–87
3. Bellekens, P., van der Sluijs, K., Aroyo, L., Houben, G.J.: Engineering semantic-based interactive multi-device web applications. In: Proceedings of the 7th International Conference on Web Engineering (ICWE’07). Volume 4607 of Lecture Notes in Computer Science, Como, Italy, Springer (2007) 328–342
4. Blanco Fernández, Y., Pazos Arias, J.J., Gil Solla, A., Ramos Cabrer, M., López Nores, M.: Bringing together content-based methods, collaborative filtering and semantic inference to improve personalized tv. In: 4th European Conference on Interactive Television (EuroITV 2006). (May 2006)
5. van Setten, M.: Supporting people in finding information: Hybrid recommender systems and goal-based structuring. Telematica Instituut Fundamental Research Series, No. 016 (TI/FRS/016). Universal Press (2005)
6. Bellekens, P., van der Sluijs, K., van Woensel, W., Casteleyn, S., Houben, G.J.: Achieving efficient access to large integrated sets of semantic data in web applications. In: Proceedings of the 8th International Conference on Web Engineering (ICWE’08), to be published
