iiWAS 2003, The Fifth International Conference on Information and Web-based Applications & Services, Conference Proceedings, September 15-17,2003, Jakarta, Indonesia
Active Data Integration for Distribution of Up-to-date Tourism Media
Sebastian Kornexl*, Franz Pühretmair**, Wolfram Wöß*, Roland R. Wagner*
* Institute for Applied Knowledge Processing (FAW), Johannes Kepler University Linz, Austria
{skornexl, wwoess, rrwagner}@faw.uni-linz.ac.at
** Competence network information technology to support the integration of people with disabilities (KI-I), Hagenberg i. M., Austria
[email protected]
Abstract. The World Wide Web is becoming increasingly important as a source of information on any tourism-related topic. This is because of the vast quantity and broad spectrum of tourism information that is already distributed across various Web sites. The information requested by tourists is diverse: any piece of information that can be associated with the vacation could be of interest. The efficient use of the existing information is limited by the time-consuming task of locating and filtering information on the Web. Because options for defining personal preferences are missing, the tourist has to scan all of the information provided. Besides tourists, providers of tourism products also use information systems to make information of different categories available to their clients. Information technology support could save a great deal of time, especially if media have to be generated and distributed periodically. The approach described in this paper demonstrates how automating the collection, preparation and distribution of information can relieve both information providers and consumers.
1 Introduction
Before buying a tourism product, a tourist requests a large amount of diverse information. Tourism products are intangible and create a need for wide-ranging, distributed information. The reason for this accumulated demand is that products in this sector are composed of many different parts: a vacation consists at least of transportation, accommodation and other relevant services. To gather all this information, many different data sources have to be tapped [1]. The general trend towards last-minute booking, together with tourists' demand for more exact information and for the ability to carry out these activities autonomously, anytime and anywhere, accounts for the success of tourism information systems [2]. The quantity of information provided on the Internet is tremendous. The main problem is not whether the information is available but where to find it and how much time is spent finding the information that fits personal needs [3]. The availability of comprehensive and up-to-date tourism data is one of the main factors for the success of a tourism information system. Each user has an individual vision of a vacation and therefore an individual demand for information. The goal is to meet these individual needs as closely as possible by providing as much appropriate information as possible. Existing tourism information systems differ not only in look and feel and usability but also in the kind and type of information provided. If the data set of one information system is insufficient to answer the enquiry of a tourist, this information
has to be provided anyway and finally integrated from other sources. In many respects this kind of integration is problematic. The paper is structured as follows. The first part of chapter 2 points out categories of problems that occur when using existing tourism information systems and when designing new, efficient ones. It then presents technologies and techniques to eliminate these problems. The last part of chapter 2 specifies the architecture designed for implementing the tourism application introduced later. Chapter 3 discusses possible scenarios in the field of tourism that benefit from this approach. Finally, chapter 4 gives a conclusion and an outlook on further work.
2 Active Generation of Personalized Tourism Media
The quality of an Internet application can be enhanced in several respects. As mentioned in the previous chapter, the quality and quantity of the provided information are key factors for efficiency, but the more extensive the information, the harder it is for users to filter out what is relevant to them. Careful design of Web sites with an intelligent search and a clear structure simplifies retrieval but does not always deliver satisfying results. Personalization targets this problem. To make use of personalization, content and layout have to be separated, and the provided content has to be suitable for user-defined combination. An important aspect of the quality of the provided information is how up to date it is. This demands consistent maintenance of the system's own data source and has to be considered when choosing how to integrate foreign data sources. The more of the activities of collecting, preparing and distributing information the information system handles, the more efficient the use of this system becomes. Automated generation and distribution aims to simplify periodical or event-driven delivery of information. Depending on their interests, users might want to be informed about different topics. Normally this is achieved by continuously visiting various Web sites and collecting the requested information. In the presented approach this task is taken over by the integrated information system, which dispatches the requested information to the tourist. To automate this process, it must be possible to define media in such a way that the system can create the requested result without any manual input. The possibility to define schedules or events is a further feasible extension for information delivery. Furthermore, the approach can also be used for the generation of printable offline media such as up-to-date marketing materials.
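The requirement that a medium be defined completely enough for generation without manual input can be illustrated with a minimal Python sketch. It is not the Active Documents format itself: the class MediaDefinition, its fields, and the helper functions are hypothetical names chosen for illustration, and content items are assumed to be simple dictionaries with a "topic" key.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class MediaDefinition:
    """Hypothetical declarative description of one personalized medium."""
    recipient: str
    topics: List[str]                  # content categories selected by the user
    output_format: str = "pdf"         # destination format of the medium
    layout: str = "default"            # layout is kept separate from content
    every_days: Optional[int] = None   # interval for periodical generation


def select_content(items, definition):
    """Keep only the content items whose topic the user asked for."""
    wanted = set(definition.topics)
    return [item for item in items if item["topic"] in wanted]


def is_due(definition, last_run, today):
    """Return True if a periodical medium should be generated again."""
    if definition.every_days is None:
        return False  # not a periodical medium
    return today - last_run >= timedelta(days=definition.every_days)
```

Because content selection and layout are separate fields, the same selected items could be rendered in different layouts or formats without touching the selection logic.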
In the following sections a technology called Active Documents is presented. Among other things, it was developed to automate workflows that generate media in different formats and layouts from up-to-date online data and, if needed, distribute these media when certain criteria are met. The general purpose of Active Documents, developed in the EC-funded projects XML-KM [4] and WebSI [5], is the event-driven, automated visual preparation of XML data and its automated distribution. An Active Document consists of two parts. The XML data part describes the data structure to be included in the generated media. The XML-flow defines the workflow that will be automated. To trigger the Active Document Manager, a publisher/subscriber event-handling pattern is implemented. Different components can act as event publishers. An application can subscribe to a certain event, which is raised when well-defined criteria are met. If the criteria are fulfilled, the application is notified by the event publisher. After its generation, the medium is ready to be delivered, e.g. by mail. The automated delivery is handled by a mailing component. The architecture for a tourism application based on the concepts of active data integration and
distribution is shown in Figure 1. Most components of the architecture for active data integration and distribution are implemented as Web Services. A key advantage of Web Services is that communication is based on SOAP (Simple Object Access Protocol) and HTTP (Hypertext Transfer Protocol). SOAP is based on XML and is therefore platform independent, which is a big advantage for integration into heterogeneously developed applications [6]. This component-oriented architecture makes it easy to extend the architecture and to reuse certain components for other purposes.
Figure 1: Architecture of the tourism application
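The publisher/subscriber event handling described earlier in this chapter can be sketched in a few lines of Python. This is an in-process illustration under stated assumptions, not the Web Service implementation: the classes EventManager and ChangeMonitor, the event name, and the threshold rule are hypothetical, and the real components communicate via SOAP rather than direct method calls.

```python
class EventManager:
    """Minimal publisher/subscriber hub (illustrative, in-process only)."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, event_name, callback):
        """Register a callback to be notified when event_name is published."""
        self._subscribers.setdefault(event_name, []).append(callback)

    def publish(self, event_name, payload):
        """Notify every subscriber of event_name."""
        for callback in self._subscribers.get(event_name, []):
            callback(payload)


class ChangeMonitor:
    """Event publisher: fires when a watched value rises above a threshold."""

    def __init__(self, manager, event_name, threshold):
        self.manager = manager
        self.event_name = event_name
        self.threshold = threshold
        self._last = None

    def update(self, value):
        # Publish only on the transition from "not above" to "above".
        was_above = self._last is not None and self._last > self.threshold
        self._last = value
        if value > self.threshold and not was_above:
            self.manager.publish(self.event_name, {"value": value})


# Usage: a subscriber that would start document generation records each event.
generated = []
manager = EventManager()
manager.subscribe("value_changed", generated.append)
monitor = ChangeMonitor(manager, "value_changed", threshold=1.5)
for reading in (1.2, 1.6, 1.8, 1.4, 1.7):
    monitor.update(reading)
```

Publishing only on the upward transition is a design choice of this sketch; it avoids generating a new document for every reading while the value stays above the threshold.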
To harmonize access to the different data sources, the Data Integration Web Service was designed as the central component of the Data Integration Layer [7]. To allow easy extension and integration into other applications, this component is developed as a Web Service. Data used for media generation as well as application data, such as user details and preferences, are accessed and modified through the SOAP interface of the Data Integration Web Service. The advantage of this approach is that components using this service do not have to consider the underlying data structures and internal data organization. Each data source can be queried through one uniform interface. Additionally, a data source can be replaced by another one without any consequence for the query functions of the accessing application. The central Web Service of this architecture is the Document Manager. This service is responsible for the administration, initialization and processing of Active Documents. Each Active Document consists of a number of settings regarding content, schedule, destination format, type of delivery etc. Active Documents can be generated once, periodically or event driven. For periodical
and event-driven generation, the trigger that starts the generation of a document is controlled by the Event Manager and its event-publishing components. The Scheduler Web Service is responsible for periodically triggering generation, while the Change Monitor watches data updates to identify changes that cause the generation of an event-driven Active Document. For document transformation and delivery, the Document Manager is supported by Resource Web Services. The Document Converter transforms the transferred XML data into destination formats such as PDF, RTF or other file formats. The SOAP part of the request message contains information regarding the transformation. The attachment itself is an XML file containing the data from which the medium has to be generated. The result is either returned as an attachment of a SOAP message, or a URL for downloading the document is sent back. For the email distribution of documents the Mail Web Service is used. The SOAP message for sending an email contains the list of recipients, the subject and the reply address in the message body, and the content of the email as an attachment.
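To make the structure of the mail request more concrete, the following Python sketch builds a SOAP 1.1 envelope whose body carries the recipients, the subject and the reply address, and refers to the content by an attachment reference. The paper does not give the actual message schema, so the namespace urn:example:mail-service, the element names (SendMail, Recipients, Recipient, Subject, ReplyTo, Content) and the href attribute are assumptions made purely for illustration; only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
MAIL_NS = "urn:example:mail-service"  # hypothetical namespace, not from the paper


def build_mail_request(recipients, subject, reply_to, attachment_ref):
    """Build an illustrative SOAP request for a mail service.

    The body holds recipients, subject and reply address; the email
    content itself is only referenced, since it travels as an attachment.
    """
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    send = ET.SubElement(body, f"{{{MAIL_NS}}}SendMail")

    recipient_list = ET.SubElement(send, f"{{{MAIL_NS}}}Recipients")
    for address in recipients:
        ET.SubElement(recipient_list, f"{{{MAIL_NS}}}Recipient").text = address

    ET.SubElement(send, f"{{{MAIL_NS}}}Subject").text = subject
    ET.SubElement(send, f"{{{MAIL_NS}}}ReplyTo").text = reply_to
    # Assumed convention: the attachment is referenced by an href attribute.
    ET.SubElement(send, f"{{{MAIL_NS}}}Content", {"href": attachment_ref})

    return ET.tostring(envelope, encoding="unicode")
```

A call such as build_mail_request(["guest@example.org"], "Newsletter", "noreply@example.org", "cid:newsletter.pdf") returns the envelope as a string; transporting it over HTTP together with the attachment is outside the scope of this sketch.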
3 Application scenarios in the field of tourism
The following scenarios give real-life examples that show the advantages and possible fields of application of the approach explained above.
3.1 Personalized newsletter and electronic media
To relieve users of a tourism information system from periodically searching through Web sites, some systems already provide newsletters. Such a general newsletter may contain some interesting information for each user but also contains information that is irrelevant to that user. The introduced technology overcomes this shortcoming and provides more powerful newsletter functionality. Because each personalized newsletter is generated individually, its content and delivery schedule can be matched closely to the needs of the recipient. Additionally, data integration allows the generation of newsletters containing data from various sources. The way media are generated allows the recipient to define the content, based on a set of criteria, and the time of delivery. A possible application scenario might be as follows: a tourist would like to receive a newsletter when the snow level of a specific skiing region exceeds 1.5 meters. The generated newsletter should contain information about available rooms in 4* hotels including prices and hotel descriptions, events within the next two weeks, and information about operating ski lifts and prices.
3.2 Printable offline media
The generation of different data formats allows the definition of media designed and intended to be printed. For this application, too, the integration of data from different sources into one medium offers several advantages. The fact that the system collects, prepares and visualizes the information in definable layouts significantly relieves the user. A possible scenario is the so-called Morning Post, a daily generated and printed offline medium that informs hotel guests. Layout and type of information are often fixed, but the content has to be updated every day. Hence, the structure and layout of the Morning Post are defined initially. Thereafter, generation and printing of the Morning Post are triggered every day, thus providing a print medium covering the most up-to-date data possible without any further effort. Such a Morning
Post could contain data such as weather forecasts, snow conditions, lake temperatures, events, etc. To define the layout of the medium, predefined settings can be selected, or custom specifications can be made, such as defining fonts and including logos, to create a personally styled medium.
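The idea of a layout that is defined once and filled with fresh content every day can be illustrated with a small Python sketch. It renders plain text only and stands in for the real transformation into PDF or RTF; the template MORNING_POST_LAYOUT, the function render_morning_post and the keys of the data dictionary are hypothetical names introduced for this example.

```python
from datetime import date
from string import Template

# The layout is defined once; only the substituted content changes per day.
MORNING_POST_LAYOUT = Template(
    "$hotel Morning Post, $day\n"
    "Weather: $weather\n"
    "Snow: $snow\n"
    "Events: $events\n"
)


def render_morning_post(hotel, day, data):
    """Fill the fixed layout with the data collected for the given day.

    Missing entries are rendered as "n/a" (or "none" for events) so that
    generation never needs manual input.
    """
    return MORNING_POST_LAYOUT.substitute(
        hotel=hotel,
        day=day.isoformat(),
        weather=data.get("weather", "n/a"),
        snow=data.get("snow", "n/a"),
        events=", ".join(data.get("events", [])) or "none",
    )
```

In a daily run, a scheduler would call render_morning_post with the current date and the data retrieved from the integrated sources, and pass the result on to the printing or conversion step.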
4 Conclusion and Further Work
In many cases the required data is available somewhere on the Internet but scattered across many sources, which makes efficient use of distributed data difficult. The approach presented in this paper shows how already existing but heterogeneous data can be reused efficiently to create and deliver personalized, up-to-date tourism information. Providing extended data sets extracted from diverse data sources, with possibilities for personalization, relieves the user from searching for information. Data is integrated at the time of document generation, which ensures that the content is as up to date as possible. The individual media design and its automated distribution extend the fields of application and significantly increase the quality of information provision. The status of application development can be summarized as follows: a first preliminary version of periodical newsletters based on Active Document functionality has been developed in the XML-KM project [4]. The follow-up project WebSI [5] will extend the prototype with functionality such as printable versions of media, extended possibilities for triggering the generation, and direct layout adaptations of the documents.
Acknowledgement
This work has been partially funded by the European Union’s Fifth RTD Framework Program (under contracts WebSI IST-2001-35458 and XML-KM IST-1999-12030).
References
[1] Birgit Pröll, Werner Retschitzegger, Roland Wagner, "Tiscover eine generische Plattform für webbasierte Tourismusinformationssysteme", Informatik Forschung und Entwicklung (2001) Heft 16: 1-13, Springer Verlag
[2] Anton Dunzendorfer, "Tourism Information System – Query Language, Einheitlicher Datenaustausch zwischen heterogenen Tourismus-Informationssystemen", PhD thesis, Institute for Applied Knowledge Processing, Johannes Kepler University, Linz, 2001
[3] M. Haller, B. Pröll, W. Retschitzegger, A.M. Tjoa, R.R. Wagner, "Integrating Heterogeneous Tourism Information in Tiscover – The MIRO-Web Approach", ENTER2000, Information and Communication Technologies in Tourism, Barcelona, 2000
[4] IST-1999-12030, XML-KM project: XML Knowledge Mediator, http://www.ipsi.fraunhofer.de/oasys/projects/xmlkm/index_e.html, Accessed: June 2003
[5] IST-2001-35458, Web-SI project: Data Centric Web Service Integrator, http://www.ib-ia.com/websi/html/home.htm, Accessed: June 2003
[6] Andreas Schmidt, "Web Services", http://www.fzi.de/dbs/applAreas/Web_Services.pdf, Accessed: June 2003
[7] Gardarin G., Ceri S., Gomez J.: System Architecture, Project: Web-SI, Data Centric Web Service Integrator, IST-2001-35458, Deliverable D5 V1.0, IBERMATICA S.A., Politecnico di Milano, e-XMLmedia, 2002