Integration of distributed data sources for mobile services
Gianpietro Ammendola, Alessandro Andreadis, Giuliano Benelli, Giovanni Giambene
Dipartimento di Ingegneria dell’Informazione, Università di Siena
Via Roma 56, 53100, Siena, Italy
Tel.: +39 0577 234608, email: {andreadis,benelli,giambene}@unisi.it, [email protected]
ABSTRACT
This paper deals with the Personalised Access to Local Information and services for tOurists (PALIO-IST 20656) project within the fifth research framework of the European Commission. The focus is on tourists who need to access information on the city or region they are visiting by means of a cellular phone or at specific locations (kiosks). The PALIO project is committed to producing a prototype that shows the potential of combining user position and user preferences in order to provide context-aware information. This paper presents the envisaged architecture and proposes a solution to map and retrieve information contents from distributed databases.
I. INTRODUCTION
In recent years the Internet has provided a good environment for the development of all kinds of applications and has changed the development method from standalone PC and traditional client/server architectures to multi-layered ones. Many systems have been converted into Web applications accessible through thin clients: users do not need to install any particular software on their PC other than a Web browser [1]. On the other hand, the possible clients of Web applications are not only conventional browsers, but also new media, such as wireless phones, Web-TV, PDAs, car radios and hi-fi systems, that can access the Internet (directly or through a gateway) for different purposes and that can use several presentation markup languages, such as HTML, XHTML, CHTML, WML, or other new ones.
A natural choice for developing accessible and portable mobile applications is the use of Java and XML technologies. These technologies are the basis for the realisation of an integrated environment for information retrieval and presentation in the IST PALIO project. Java is a powerful language for server-side applications; it supports servlets and JavaServer Pages (JSP) [2-3], two efficient approaches for creating Web applications.
Any enterprise application of realistic size requires a consistent architectural vision. Enterprise applications with no clear requirements and poor design are destined to fail. A layered architecture provides natural access points for integration with existing systems and for the deployment of new systems and interfaces (if needed). The right compromise in the number of layers is three: in this case we have separation between User Interface (UI), business logic and DataBase Management System (DBMS). Each tier of the application has specific responsibilities [3]:
• The Client tier allows users to interact with the application.
• The Middle tier (middleware) is the intermediate software level with the task of processing user requests and accessing the data contained in the Enterprise Information Systems (EIS) tier.
• The EIS tier integrates the application with information systems (DBMS); it provides data storage and other information services to the application.
In large and important applications developed with Java enterprise technologies (J2EE), the middleware is decomposed (see figure 1) into two tiers [3]:
• The Web tier makes application functionality available on the World Wide Web. It accesses data and business functionality from other tiers and manages the screen flow.
• The Enterprise JavaBeans (EJB) tier provides portable, scalable, available, and high-performance access to enterprise data and business rules. It offers object persistence and access to business logic implemented as enterprise bean components.
In this paper, we focus on the PALIO system, based on the three-tier approach, and we describe the solutions adopted to provide personalised information to mobile client devices. PALIO is an innovative system that implements novel adaptivity functions well suited to giving mobile terminals access to information distributed on the Web. In particular, adaptation is based on both knowledge available to the system (usually prior to the initiation of interaction) and knowledge acquired by the system during interactive sessions. Examples of content adaptation are content filtering (keeping only relevant contents) and content variants (e.g., with or without images; images of different sizes and resolutions).
An example of presentation adaptation, instead, is a personalised and tailored view of the information.
[Figure 1 labels: Mobile Device; Client Tier; WEB Tier; EJB Tier; EIS Tier; Middleware; Servlet; EJB; DBMS; DB.]
Figure 1: Three-tier architecture.
In what follows, we first give a brief presentation of the general PALIO architecture; we then focus on the interactions among system modules and, in particular, on the lowest tier, which is in charge of retrieving information contents through the integration of distributed data sources.
II. MVC PATTERN
In addition to being divided into three tiers, the application architecture of the PALIO system is based on a Model-View-Controller (MVC) organisation [2]. The MVC abstract architectural pattern is a realistic solution for deploying extensible, maintainable, scalable and portable mobile applications. The MVC architecture organises an interactive application project by separating data presentation, data representation, and application behaviour (see figure 2).
[Figure 2 labels: Model; View; Controller; State Query; State Change; Change Notification; View Selection; User Gestures.]
Figure 2: MVC Model.
The Model represents the structure of the data in the application, as well as operations on those data. The View(s) presents data in some form to a user, in the context of some business function. The Controller translates user actions and user input into business object calls on the Model and selects the appropriate View based on both the user preferences and the Model state. In other words, a Model abstracts application state and functionality, a View abstracts application presentation, and a Controller selects application behaviour in response to user input.
MVC provides many benefits to a design. Separating Model from View (that is, separating data representation from presentation) makes it easy to add multiple data presentations for the same data and facilitates the inclusion of new types of data presentation as technology develops. Model and View components can vary independently, enhancing maintainability, extensibility, and testability. Separating Controller from View (application behaviour from presentation) permits run-time selection of appropriate Views based on workflow, user preferences, or Model state. Separating Controller from Model (application behaviour from data representation) allows configurable mapping of user actions on the Controller to application functions on the Model.
A Web tier Controller has the following tasks:
• Translates HTTP actions (POST, GET, PUT) into actions on the MVC Model.
• Selects the next View to be displayed, based on user activity and Model state.
• Delivers View content to the client.
MVC Views display data retrieved from or produced by the MVC Model. View components in the Web tier are usually WML or HTML pages, JSP pages, or servlets. HTML and WML pages contain static content, while dynamic content is usually generated by Web-tier JSP pages and servlets. JSP pages are well suited for generating text-based content, often WML, HTML or XML. Servlets are appropriate for generating binary content (e.g., images, PDF and PostScript files).
MVC Model classes implement business logic without reference to any specific presentation technology. Applications may implement the Model as a collection of conventional Java objects, often in the form of JavaBeans components used directly by JSP pages. Larger applications frequently replace these Web-tier model beans with enterprise beans, which offer scalability, concurrency, load balancing, and automatic resource management. JavaBeans components provide quick access to local data, while enterprise beans provide remote access to concurrent data and shared business logic.
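As an illustration only, the separation of roles described in this section can be sketched in a few lines of plain Java. The class names (MvcSketch, TouristInfoModel, HtmlView, WmlView), the sample hotel and the use of a MIME-type string to select the View are our own assumptions for the sake of the example; they are not taken from the PALIO implementation, and a real Web tier would use servlets and JSP pages rather than plain method calls.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Controller: maps a request onto a Model call and selects the View. */
class MvcSketch {
    private final TouristInfoModel model = new TouristInfoModel();

    String handle(String httpMethod, String city, String acceptHeader) {
        if (!"GET".equals(httpMethod)) {
            return "unsupported method";
        }
        String hotel = model.findHotel(city);              // action on the Model
        View view = acceptHeader.contains("wml")           // run-time View selection
                ? new WmlView() : new HtmlView();
        return view.render(city, hotel);                   // deliver View content
    }

    public static void main(String[] args) {
        MvcSketch controller = new MvcSketch();
        System.out.println(controller.handle("GET", "Siena", "text/vnd.wap.wml"));
        System.out.println(controller.handle("GET", "Siena", "text/html"));
    }
}

/** Model: application state and operations, with no presentation logic. */
class TouristInfoModel {
    private final Map<String, String> hotelsByCity = new LinkedHashMap<>();

    TouristInfoModel() {
        hotelsByCity.put("Siena", "Hotel Duomo");          // invented sample data
    }

    String findHotel(String city) {
        String hotel = hotelsByCity.get(city);
        return hotel == null ? "no result" : hotel;
    }
}

/** View: renders the same Model data in one markup language. */
interface View {
    String render(String city, String hotel);
}

class HtmlView implements View {
    public String render(String city, String hotel) {
        return "<html><body><p>" + city + ": " + hotel + "</p></body></html>";
    }
}

class WmlView implements View {
    public String render(String city, String hotel) {
        return "<wml><card><p>" + city + ": " + hotel + "</p></card></wml>";
    }
}
```

Adding a further presentation (for instance CHTML) would only require a new View class and one more branch in the Controller; the Model is untouched, which is the maintainability benefit argued above.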
III. SYSTEM ARCHITECTURE
The high-level PALIO system architecture shown in Figure 3 encompasses the main building blocks described below.
[Figure 3 labels: Communication platform; Distributed Information Centres; INTERNET; AVC; CL; Server; WAP/SMS Gateway; SMS gateway; MMS gateway; Localization Systems; ISDN - PSTN; GSM - GPRS - UMTS; PDA; Kiosk. Other diagram labels: Workstation; Laptop; GPS device; WAP/SMS device; WEB server; WAP gateway; Request handling; Response handling; Response transformation.]
Figure 3: PALIO high level architecture.
1. The communication platform comprises the Internet, network interfaces and gateways necessary to integrate the distributed components of the PALIO system.
2. The Augmented Virtual City Centre (AVC) contains adaptation and control functions.
3. The Distributed Information Centres in the territory.
4. The Localisation Systems comprise the GPS solution and the mobile network-based solution to identify the position of the tourist in outdoor environments.
The AVC centre consists of the following elements (see figure 4):
1. The Communication Layer (CL), an “abstract” entity allowing the AVC to communicate with the other servers (i.e., Web server, WAP and SMS gateway, and localisation server) in the common HTTP format.
2. The Service Control Centre (SCC), the heart of the PALIO system. It provides the runtime platform for the system information services.
3. The Generic Information Server (GIS) allows access to the distributed information space through the Internet. In particular, the GIS retrieves information from distributed databases and then provides the obtained contents to the SCC in XML format.
4. The adaptation infrastructure is responsible for content and interface adaptation. This is a crucial building block of the PALIO system, since information must be adapted on the basis of user preferences and the access device. In fact, since mobile devices have reduced display, storage and processing capabilities, special adaptation rules must be employed to tailor the contents provided to users. In particular, ad hoc XSL stylesheets are employed.
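To make the role of ad hoc XSL stylesheets more concrete, the following minimal Java sketch applies two different stylesheets to the same device-independent XML content using the standard javax.xml.transform API. The XML vocabulary (hotel, name, city), the sample values and both stylesheets are hypothetical and chosen only for illustration; the actual PALIO adaptation rules are not reproduced here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XslAdaptationSketch {

    // Hypothetical device-independent content, invented for this example.
    static final String CONTENT =
        "<hotel><name>Hotel Duomo</name><city>Siena</city></hotel>";

    // Hypothetical stylesheet producing a WML card for WAP phones.
    static final String TO_WML = stylesheet(
        "<wml><card><p><xsl:value-of select=\"name\"/>, "
      + "<xsl:value-of select=\"city\"/></p></card></wml>");

    // Hypothetical stylesheet producing an HTML-like page for browsers.
    static final String TO_HTML = stylesheet(
        "<html><body><h1><xsl:value-of select=\"name\"/></h1>"
      + "<p><xsl:value-of select=\"city\"/></p></body></html>");

    /** Wraps a template body for /hotel into a complete XSLT 1.0 stylesheet. */
    static String stylesheet(String templateBody) {
        return "<xsl:stylesheet version=\"1.0\""
             + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
             + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
             + "<xsl:template match=\"/hotel\">" + templateBody + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    /** Transforms the XML content with the stylesheet matching the device. */
    static String adapt(String xmlContent, boolean wapDevice) {
        String xsl = wapDevice ? TO_WML : TO_HTML;
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xmlContent)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(adapt(CONTENT, true));   // WML variant
        System.out.println(adapt(CONTENT, false));  // HTML variant
    }
}
```

Because the content itself never changes, supporting a new class of device in this scheme amounts to writing one more stylesheet.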
[Figure 4 labels: SCC; Adapter; GIS; Mappamondo DB; DB. Link labels: HTTP request, XML encoded data; HTTP response, XML encoded data; XML; service response; SOAP ontology query; SOAP ontology response. The steps are numbered from 1 to 10.]
Figure 4: Components and communication protocols in the PALIO framework.
The PALIO system implements the MVC architecture concepts as follows: (i) the XML format permits a general data representation, independent of visualisation; (ii) the Controller functions are implemented in the SCC, which receives client inputs through the CL module and calls the appropriate Model action; (iii) the Model functions are shared between the GIS, which accesses the requested information, and the SCC, which manages contents; (iv) the View is provided in suitable visualisation formats¹.
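The interplay among these modules can be sketched, under strong simplifying assumptions, as a chain of plain Java methods: the CL encodes the request in XML, the SCC acts as Controller, the GIS supplies contents in XML and the adaptation step selects a visualisation format. All method names, the XML element names and the stylesheet file names (wml.xsl, html.xsl) are hypothetical, the GIS is a stub, and parameter values are assumed to be XML-safe; the sketch only shows the direction of the calls, not the PALIO code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Wiring sketch of the AVC modules; all names and formats are hypothetical. */
class AvcSketch {

    /** Communication Layer: turns HTTP-style parameters into an XML request. */
    static String communicationLayer(Map<String, String> httpParams) {
        StringBuilder xml = new StringBuilder("<request>");
        for (Map.Entry<String, String> e : httpParams.entrySet()) {
            xml.append('<').append(e.getKey()).append('>')
               .append(e.getValue())                       // assumed XML-safe
               .append("</").append(e.getKey()).append('>');
        }
        return xml.append("</request>").toString();
    }

    /** Extracts the text of the first <tag>...</tag> pair, or "" if absent. */
    static String field(String xml, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int start = xml.indexOf(open);
        int end = xml.indexOf(close);
        return (start < 0 || end < 0) ? "" : xml.substring(start + open.length(), end);
    }

    /** Generic Information Server: returns contents in XML (stubbed here). */
    static String genericInformationServer(String service) {
        return "<contents service=\"" + service + "\"/>";
    }

    /** Adaptation: picks a stylesheet according to the access device. */
    static String selectStylesheet(String agent) {
        return agent.contains("WAP") ? "wml.xsl" : "html.xsl";
    }

    /** Service Control Centre: Controller that coordinates GIS and adaptation. */
    static String serviceControlCentre(String xmlRequest) {
        String agent = field(xmlRequest, "agent");
        String service = field(xmlRequest, "service");
        String contents = genericInformationServer(service);
        return "<response stylesheet=\"" + selectStylesheet(agent) + "\">"
                + contents + "</response>";
    }

    /** Entry point of the chain: client parameters in, XML response out. */
    static String handle(Map<String, String> httpParams) {
        return serviceControlCentre(communicationLayer(httpParams));
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("agent", "WAP-phone");
        params.put("service", "accommodation");
        System.out.println(handle(params));
    }
}
```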
IV. FLOW CONTROL
The operations performed by the AVC are detailed below, with the steps represented in figure 4.
1. The request coming from the client (i.e., the tourist) reaches the SCC by means of the HTTP protocol through the CL. The CL transforms the request into XML format before passing it to the SCC. The CL is also responsible for requesting the position information, according to the selected localisation method.
2. The SCC acquires from the client request all the necessary information (i.e., user identification, user agent type, location and type of request).
3. The Adapter receives the information regarding the request, the user, the client device, the client agent and the location, and it provides the necessary adaptations using its decision-making engine.
4. The request is forwarded to the GIS.
5. The request is analysed in order to formulate the query needed to interrogate the database that contains the desired information.
6. Queries are forwarded to the data sources.
7. The response is generated.
8. The responses are forwarded to the SCC.
9. All the information (provided by the GIS in XML format) is collected within the SCC, which builds the response.
10. Final construction of the response.
V. ONTOLOGY
The definition of ontology given in [5], namely a shared understanding and communication between people with different needs and viewpoints arising from their particular contexts, will be used in this paper. The ontology concept is about definitions of entities, their attributes and the relationships that exist in some domain of interest. Ontology is important for several reasons.
1. Ontology development is a prerequisite for database design and for the development of knowledge-based systems. This design is difficult and time-consuming. Adopting an existing ontology is a faster solution than developing a new one.
2. Several different databases that cover the same types of data can employ the same ontology. This fact simplifies the problem of database integration, i.e., of processing queries across multiple databases. The existence of different ontologies for the same types of data creates a semantic mismatch problem that complicates the multidatabase query problem; a unified ontology is necessary.
3. The use of ontologies coded in a formally specified, readily understandable form helps database developers who wish to make their database schemas available to their user communities.
¹ The original XML format is transformed into WML or HTML by means of translation (XSL stylesheets).
The XML-Based Ontology Exchange Language (XOL) [6] is a language for the exchange of ontology definitions. The XOL authors chose XML because it is simple and powerful. XOL is designed to provide a mechanism for encoding ontologies within a flat file that may be easily published on the Web. The language is designed to be readable by humans and easily analysed by programs of modest complexity, yet expressive enough to capture a rich variety of ontologies.
The ontology concept has been applied in the PALIO project for distributed data source representation. In particular, at the present level of PALIO implementation, the ontology approach has been employed to categorise data for tourist services such as transportation, accommodation and sightseeing. The simple example below describes the taxonomy adopted for the accommodation service in XOL format.
[XOL listing: the markup was lost in extraction; only the element contents survive, in the following order.]
Accommodation
Type Accommodation
Area Accommodation
FacilitiesGeneral Accommodation
PaymentMethod Accommodation
...
Hotel Type
Camping Type
Address Accomodation
RoomDescription Accommodation
...
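As a hedged illustration of how a program "of modest complexity" might read such a taxonomy, the Java sketch below parses a small XOL-style fragment with the standard DOM API and lists the slots whose domain is a given class. The element names (module, class, slot, name, domain) reflect our reading of XOL and should be treated as assumptions, as should the interpretation of the pairs above as slot name followed by domain; the module name is invented, and only three of the slots are reproduced.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class OntologySketch {

    // Assumed XOL-style encoding of part of the accommodation taxonomy.
    static final String XOL =
        "<module><name>accommodation-taxonomy</name>"
      + "<class><name>Accommodation</name></class>"
      + "<slot><name>Type</name><domain>Accommodation</domain></slot>"
      + "<slot><name>Area</name><domain>Accommodation</domain></slot>"
      + "<slot><name>PaymentMethod</name><domain>Accommodation</domain></slot>"
      + "</module>";

    /** Text of the first descendant element with the given tag, or "". */
    static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    /** Returns the names of all slots whose domain is the given class. */
    static List<String> slotsOf(String xol, String className) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xol.getBytes(StandardCharsets.UTF_8)));
            NodeList slots = doc.getElementsByTagName("slot");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < slots.getLength(); i++) {
                Element slot = (Element) slots.item(i);
                if (className.equals(childText(slot, "domain"))) {
                    names.add(childText(slot, "name"));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse ontology", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(slotsOf(XOL, "Accommodation"));
    }
}
```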
VI. INFORMATION RETRIEVAL
The mapping of the information contained in distributed data sources on the Internet is a task of the GIS module within the PALIO system. This operation is based on the “Mappamondo” database contained in the GIS (see figure 5). By using the Mappamondo database it is possible to know which data sources contain the information requested by the user, the necessary parameters for the access (URL, Driver, User-ID, Password, etc.) and how these data sources are structured (i.e., their relational entities). The Mappamondo DB structure follows the ontology described by an appropriate XOL document. Data are extracted by means of SQL queries and JDBC connections. Then, data are inserted in XML tags and the XML response is sent to the SCC through the Simple Object Access Protocol (SOAP). SOAP [7] is a lightweight XML-based protocol for the exchange of information in decentralised, distributed environments.
The interfaces for accessing information are depicted in figure 5, from the high level of the GIS to the low level of the data sources; this structure allows the system to be open and independent of the DB architecture, thus unifying the access to distributed information sources. In fact, it is possible to use different drivers, at different levels, to interface with different DB architectures. Hence, there is no need to modify the PALIO server or the query/answer operations for extracting the data stored in the available DBs. All these operations are implemented through a system of Java servlets and simple Java objects or JavaBeans that perform the process of connecting to a DB and generating the query. The GIS server obtains the parameters necessary to establish the connection (database URL, DBName, driver, UserId, Password, tables and field names that contain the requested data, etc.) by consulting the Mappamondo database.
[Figure 5 labels: AVC; GIS; Mappamondo DB; Driver Manager JDBC; Driver JDBC; Bridge ODBC/JDBC; Driver ODBC; DB JDBC; DB ODBC.]
In the PALIO system, the phases preceding the access to DBs (i.e., adaptation, ontology and metadata consulting) allow acquiring all the parameters necessary to access distributed databases. After this phase, the GIS server proceeds to the construction of the query in SQL. This SQL query is sent directly to the DB that contains the desired information through the JDBC Java API. The result (i.e., the extracted objects) is then processed and organised in XML format according to the PALIO ontology.
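A minimal sketch of this retrieval step, assuming a hypothetical Mappamondo record that carries the connection parameters and the table and field names, could look as follows in Java. The SourceInfo class, the JDBC URL, the credentials, the table and column names and the XML element names are all invented for illustration. Only buildQuery is exercised in main, because fetchAsXml needs a reachable database and a suitable JDBC driver on the class path.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.List;

class RetrievalSketch {

    /** Hypothetical Mappamondo record: where the data live and how to reach them. */
    static final class SourceInfo {
        final String jdbcUrl, user, password, table, filterColumn;
        final List<String> columns;

        SourceInfo(String jdbcUrl, String user, String password,
                   String table, String filterColumn, List<String> columns) {
            this.jdbcUrl = jdbcUrl;
            this.user = user;
            this.password = password;
            this.table = table;
            this.filterColumn = filterColumn;
            this.columns = columns;
        }
    }

    /** Table and field names cannot be bound as parameters, so they are checked. */
    static String identifier(String name) {
        if (!name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("bad identifier: " + name);
        }
        return name;
    }

    /** Builds the SQL text from the metadata; the filter value stays a parameter. */
    static String buildQuery(SourceInfo s) {
        StringBuilder sql = new StringBuilder("SELECT ");
        for (int i = 0; i < s.columns.size(); i++) {
            if (i > 0) sql.append(", ");
            sql.append(identifier(s.columns.get(i)));
        }
        return sql.append(" FROM ").append(identifier(s.table))
                  .append(" WHERE ").append(identifier(s.filterColumn))
                  .append(" = ?").toString();
    }

    static String escape(String text) {
        return text == null ? "" : text.replace("&", "&amp;").replace("<", "&lt;");
    }

    /** Runs the query through JDBC and wraps each row in XML tags. */
    static String fetchAsXml(SourceInfo s, String filterValue) throws SQLException {
        StringBuilder xml = new StringBuilder("<results>");
        try (Connection con = DriverManager.getConnection(s.jdbcUrl, s.user, s.password);
             PreparedStatement ps = con.prepareStatement(buildQuery(s))) {
            ps.setString(1, filterValue);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    xml.append("<item>");
                    for (String col : s.columns) {
                        xml.append('<').append(col).append('>')
                           .append(escape(rs.getString(col)))
                           .append("</").append(col).append('>');
                    }
                    xml.append("</item>");
                }
            }
        }
        return xml.append("</results>").toString();
    }

    public static void main(String[] args) {
        SourceInfo hotels = new SourceInfo("jdbc:example://host/tourism", "user",
                "secret", "hotels", "city", Arrays.asList("name", "address"));
        System.out.println(buildQuery(hotels));   // query text only, no connection
    }
}
```

Since the driver is chosen by DriverManager from the JDBC URL, the same code would serve a native JDBC source or one reached through an ODBC bridge, which is the independence from the DB architecture discussed above.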
Figure 5: Interfaces to distributed databases.
VII. CONCLUSIONS
A solution is proposed here to integrate various data sources that contain information for tourists. A three-tier architecture is presented, according to which adapted pages are dynamically built and delivered to mobile clients. The system we have described reflects the present achievements of the ongoing IST PALIO project.
REFERENCES
[1] H. Maruyama, K. Tamura, N. Uramoto, “XML and Java: Developing Web Applications”, Addison-Wesley, 1999.
[2] M. Hall, “Core Servlets and JavaServer Pages”, Prentice Hall, 2000.
[3] Java Blueprints, “Designing Enterprise Applications with the Java 2 Platform, Enterprise Edition”, http://www.java.sun.com.
[4] The PALIO Consortium, “PALIO project”, (WWW page) URL: http://palio.dii.unisi.it.
[5] M. Uschold, M. Gruninger, “Ontologies: Principles, Methods and Applications”, February 1996.
[6] “XOL: An XML-Based Ontology Exchange Language”, http://www.ai.sri.com/pkarp/xol/xol.html.
[7] “Simple Object Access Protocol (SOAP) 1.1”, http://www.w3.org/TR/SOAP.