An Approach to Lightweight Deployment of Web Services J. Gergic, J. Kleindienst Voice Systems and Technologies IBM Czech Republic Murmanska 4//1475 100 00 Prague 10 Czech Republic +420 2 72131715, +420 2 7213 1436
Y. Despotopoulos, J. Soldatos, G. Patikis, A. Anagnostou National Technical University of Athens, 9 Heroon Polytechneiou Str GR-15773 Zografou, Greece +30107721479, +30107721475
L. Polymenakos IBM Hellas 284 Kifisias Avenue., GR-15232 Chalandri, Greece +30106881635
{jankle,jaroslav_gergic}@cz.ibm.com
{jsoldat,ydes,gpatikis,aanag}@telecom.ntua.gr
ABSTRACT
Web Services is gradually becoming the most popular distributed computing paradigm for the Internet. Although several vendor and research efforts are in progress, fully-fledged deployment of Web Services on a wide scale has not yet been accomplished. The present contribution describes a framework for lightweight deployment of Web Services. This framework can be seen as a smooth transition step towards a complete large-scale deployment. The paper starts with a description of data access techniques supporting multiple data providers, devised and used in the scope of a European research project. It then illustrates how the schemes employed to provide remote, transparent access to the data providers evolved into a lightweight Web Services framework.
Categories and Subject Descriptors
D.2.2 [Software Engineering]: Design Tools and Techniques - Modules and interfaces
General Terms
Design, Languages.
Keywords
Web Services, HTTP, XML, WSDL, UDDI, SOAP, EIRI, CATCH-2004
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SEKE ’02, July 15-19, 2002, Ischia, Italy. Copyright 2002 ACM 1-58113-556-4/02/0700…$5.00.
1. INTRODUCTION
During the last few years we have witnessed the development of networks and the Internet at a remarkable rate. This has given rise to the concept of distributed computing, which led to the introduction of several distributed programming technologies such as CORBA [13], RMI [10] and DCOM. Such technologies allow for the distributed completion of tasks based on the concept of remote invocation of computing functions. Even though such models have been in place for some years, few mature products exist. This is because these technologies are quite complex and incur high communication costs and overheads, while at the same time providing low degrees of scalability and interoperability.
Recently it was envisaged that the use of lightweight and highly interoperable protocols, such as the Hypertext Transfer Protocol (HTTP), could support the distribution of services more efficiently. Based on this idea, the concept of Web Services has emerged as the next generation of distributed computing. Web services can be described as autonomous, modular applications that run over the Internet to perform a specific business task. Moreover, these services conform to a specific technical format, which eases their invocation and use over the WWW, as well as their combination into more complex business processes. Illustrative examples of Web services are a ticket reservation service, a yellow pages directory service, a language translation service, or even a service calculating medical claims. The advances in the Internet infrastructure and the rapid evolution of the WWW are definitely the major enablers of Web Services.
Web Services deployment is expected to have several positive implications for corporations, end users, operators and service providers [12]. This is chiefly because the paradigm provides platform- and vendor-independent descriptions of services, thus enabling interested parties to discover and exploit services independently of enabling technologies and hosting platforms. A rich set of benefits follows: integration efforts are minimized due to the common service format, while the tasks of publishing, registering, finding and utilizing business services become easier. Also, since web services act as individual business pieces, the task of constructing new business applications can be completed in a timely and cost-effective manner. As a result, innovative products and value-added services featuring minimal time-to-market can be developed and delivered. Operational costs are lower because corporations need to operate and administer only a small portion of services.
Footnote: Work partially funded by the European Union through the research project IST-1999-11103 CATCH2004 (Converse in Athens, Cologne and Helsinki) [4][2], in the IST (Information Society Technologies) research program [16].
Development costs are also minimized, since several components are developed only once and reused in the scope of various business applications. Furthermore, adaptation and customization of business applications to new requirements and
specifications is rendered easier than ever before. Finally, corporations can enjoy increased service reliability by simply deploying multiple instances of the necessary Web services [15].
Intensive work is being carried out on standards, protocols, APIs and tools for creating, discovering, registering, managing, integrating, deploying and testing web services [1][9][14]. Characteristic examples are: (a) Universal Description, Discovery, and Integration (UDDI) [17] and its associated specification, as a means of discovering web services; (b) the Web Services Description Language (WSDL) [19], which provides for the definition of the interface to a service and for the identification of actual service providers on the network; (c) the Simple Object Access Protocol (SOAP) [18], which offers a way for applications to communicate with one another over the Internet (note that SOAP uses HTTP to penetrate firewalls, relies on XML to define the format of the information and then adds the necessary HTTP headers to send it); (d) the ebXML specifications, which define registry services, as well as the model for a Web services registry; (e) XAML (Transaction Authority Markup Language), an initiative launched by Bowstreet, HP, IBM, Oracle and Sun. XAML is a set of XML interfaces that enable Web service providers to conduct business transactions involving multiple, distributed Web services.
The need for many complex specifications, tools and techniques explains why Web services deployment is still in its infancy. We believe that there is a need for a smooth transition path to web services deployment. The present paper aims at contributing to such a transition step by introducing a framework for lightweight deployment of web services. The term lightweight implies that the presented approach and its implementation adopt the web services concept without, however, addressing all aspects of Web services deployment.
For example, issues such as security, registration and discovery are not addressed. Also, instead of relying on complex standards, the proposed framework adopts simplified implementations of information exchange mechanisms, which are nevertheless similar to those introduced by technologies such as SOAP, WSDL and UDDI. It should be noted that the proposed framework is primarily designed to enable information retrieval and thus differs from SOAP in one key respect: instead of relying on the generalized, symmetric XML messaging exchange paradigm, it is highly asymmetric, allowing retrieval of XML data in response to simple queries (requests) consisting of sets of key-value pairs.
The motivation for creating this framework emerged in the scope of the CATCH2004 project. This project will deliver a novel prototype information retrieval system enabling users to access information through multiple modalities and their combinations, in multiple languages and using several terminal devices (e.g., telephones, PCs, info kiosks, PDAs, WAP phones etc.). The system comprises quite complex content acquisition and transformation components, which interface to several data providers. These interfaces are implemented as web services according to the proposed framework.
The paper has the following structure: Section 2, following this introduction, refers to the CATCH2004 project and explains the motivation for implementing Web services. Special emphasis is placed on describing the specification and standardization of a Web interface (namely the EIRI interface) to the system’s back end, which was implemented as a Web service. It is also illustrated how this interface evolved into a Web services deployment framework. Section 3 is devoted to describing this ‘lightweight’ framework. Finally, Section 4 concludes the contribution.
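The asymmetric exchange described above (a request made of key-value pairs, answered by an XML document) can be illustrated with a minimal Python sketch. The endpoint URL, the parameter names and the layout of the reply are assumptions made purely for illustration; they are not taken from the EIRI specification, and a canned reply stands in for the network round trip.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint, used only for illustration.
BASE_URL = "http://example.org/eiri/getEventList"


def build_request_url(base_url, params):
    """Encode a flat set of key-value pairs as an HTTP GET query string."""
    # Sorting the pairs keeps the resulting URL deterministic.
    return base_url + "?" + urlencode(sorted(params.items()))


def parse_reply(xml_text):
    """Turn the XML reply into a list of dictionaries, one per record."""
    root = ET.fromstring(xml_text)
    return [{field.tag: (field.text or "") for field in record} for record in root]


# A canned reply (hypothetical structure) replaces the actual HTTP call.
SAMPLE_REPLY = """<events>
  <event><title>Concert</title><city>Athens</city></event>
  <event><title>Exhibition</title><city>Helsinki</city></event>
</events>"""

url = build_request_url(BASE_URL, {"city": "Athens", "lang": "en"})
records = parse_reply(SAMPLE_REPLY)
```

The request side carries no XML at all, whereas the reply is a full XML document. This is the asymmetry that distinguishes the approach from a symmetric messaging scheme.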
2. CATCH2004 data tier as a Web Service
2.1 CATCH2004 Overview and the EIRI concept
CATCH 2004 stands for Converse in AThens, Cologne and Helsinki and is a research activity co-funded by the European Union in the scope of the Information Society Technologies (IST) Programme. The main goal of CATCH-2004 is to develop a multilingual, conversational system enabling retrieval of information and simple transactional services. The CATCH2004 system is based on a novel multi-tiered unifying architecture across devices and services [3], which supports multiple (input/interaction) modalities (e.g., speech, GUI modalities, and their combinations) and multiple client terminals (such as PCs, kiosks, phones, mobile phones, PDAs and smart wireless devices). The project has already developed two prototype demonstration systems, one in Helsinki and another in Athens.
The city of Athens is the central site for the demonstration of the CATCH-2004 technology. Athens has been selected to be the city where the 2004 Olympic Games will be held, and the Organizing Olympic Committee for the Athens 2004 Olympic Games is the participating government agency in this project. Athens will be offering a variety of information and transaction services to the millions of participants and visitors of the Olympic Games. Such services include scores and other information about the Olympic Games, cultural and entertainment information and events, and phone directory and banking services. These services will be available in kiosks around the city and over the telephone (standard or wireless). Athens provides an excellent testing ground for the deployment of a large-scale system combining human language technologies (HLT) and other technologies to assist visitors, participants and organizers. Further, Athens is a major tourist and cultural center with millions of tourists every year, and the system developed in CATCH-2004 could become part of the permanent infrastructure, identifying the city almost uniquely.
The city of Helsinki, with one of the highest levels of wireless communications penetration and service provision, was the celebrated Cultural Capital of Europe in 2000. Helsinki is therefore an excellent candidate for the demonstration of services for smart wireless devices and standard wireless telephones. Access to information and services will be available from local databases and from information presented by other demonstrator cities such as Athens or Cologne.
Demonstration applications will be developed in the two participating cities, Athens and Helsinki. A third city, Cologne, will be subcontracted as an additional tester of the demonstrators developed in the participating cities and will provide input on the usability of the developed architecture and systems. We expect that the demonstrators and testers of the developed systems will address the needs of both the cities and the public, since each demonstrator and tester site represents a different sector of public service (operator, city, international athletic event) and will be directed to different groups of society/public (citizens, immigrants, tourists). We describe briefly below how the two demonstrators and the additional tester will take advantage of the developed system. Each demonstrator provides several retrieval and/or transactional applications, such as city information retrieval, sports information retrieval, Olympic information retrieval, information about TV/radio/restaurants etc.
Figure 1 depicts the use of the EIRI concept towards interfacing to multiple data providers and/or applications. Observe that HTTP serves as a global method for information delivery to the various devices. This is because all device types (PDAs, WAP phones, PCs, Kiosks etc.) support HTTP towards accessing dynamic content. CATCH2004 implements HTTP access (based on Java Servlets technology [11]) to services of the data tier.
The goal of supporting multimodal, multi-device access to multilingual content imposed certain requirements on the application logic and the backend infrastructure. In particular, the information access and retrieval system had to support:
• Information transformation and presentation in various formats corresponding to the different modalities
• Adaptation of retrieved information to various terminal types and models
• Information personalization based on user preferences
• Different database systems and schemas corresponding to diverse demonstrators (i.e. content providers). Such support boosts the scalability and expandability of the system
• Independence between presentation and business logic
Fulfilling these requirements brought to the foreground: (a) the XML language for device- and database-independent information representation and (b) the HTTP protocol as a universal communication protocol, supported by all access devices and modalities used for remote access to information. In particular, content is stored in back-end RDBMS systems. There are no restrictions regarding the DBMS or the design of the database schema used for organizing the data. Given the RDBMS repositories, the system uses XML as an adaptation layer for the database content across the different modalities and localities. A special interface to the back end was specified and implemented in order to hide the database details from the business tier of the system. This interface specifies the operations allowed on the data, as well as the structure of the XML document to be returned. Having such an interface presents two distinct advantages:
• The ability to use a common presentation and business logic across different data sources (i.e. database systems and schemas).
• Seamless integration of different data repositories into a common application (i.e. a data service providing access to several content providers).
This interface, which constitutes one of the core components of the CATCH2004 system’s consolidated architecture, was named EIRI (Extensible Information Retrieval Interface) and comprised:
• Specification of allowed operations, as a set of method signatures (i.e. an API)
• Specification of the structure of the returned XML document(s)
• Specification of a set of metadata for conveniently manipulating the XML-ised query results
[Figure 1 diagram labels: HTTP Request; Presentation & Business Logic Elements; EIRI; Data Provider #1; Data Provider #2; SQL/JDBC; Query Results & Metadata (XML)]
Figure 1: The EIRI concept
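To make the role of such an interface more concrete, the following minimal Python sketch hides a relational data provider behind a single operation that returns query results and metadata as XML. It is only an illustration under stated assumptions: the table layout, the element names (result, metadata, provider, count, event) and the function names are hypothetical, and an in-memory SQLite database stands in for the SQL/JDBC access used in the project.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_provider(rows):
    """Create an in-memory SQLite database standing in for one data provider."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (title TEXT, venue TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    return conn


def get_event_list(conn, provider_name):
    """Run the query and wrap the rows, plus simple metadata, in XML."""
    cursor = conn.execute("SELECT title, venue FROM events ORDER BY title")
    columns = [column[0] for column in cursor.description]
    rows = cursor.fetchall()

    root = ET.Element("result")
    # Hypothetical metadata block: provider name and number of records.
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "provider").text = provider_name
    ET.SubElement(meta, "count").text = str(len(rows))
    # One XML record per database row; column names become element names.
    for row in rows:
        record = ET.SubElement(root, "event")
        for name, value in zip(columns, row):
            ET.SubElement(record, name).text = value
    return ET.tostring(root, encoding="unicode")


provider = make_provider([("Marathon", "Stadium"), ("Concert", "Odeon")])
xml_reply = get_event_list(provider, "provider-1")
```

The caller sees only the operation and the XML it returns, so a different database system or schema could be substituted without changes to the presentation and business logic.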
2.2 EIRI Evolution
The EIRI specification underwent several revisions. In the beginning (version 1.0) the allowed operations were described as a Java library of interfaces, while a DTD specification was used to describe the structure of the returned documents. Another DTD represented the structure of the metadata information. Version 1.1 [6] exported an HTTP interface to this library, thus endowing EIRI with a Web services flavor. Nevertheless, early EIRI versions [20] were oriented towards particular applications (i.e. a City Information Retrieval Application [5]). This orientation was not at all desirable, since a single DTD and a specific API cannot cover a wide range of possible applications. Therefore it was thought expedient to revise the EIRI by providing an extension mechanism using external DTD entities. The major objectives of the next EIRI version (1.2) were:
• To develop and use modularized, re-usable DTDs
• To achieve application independence
• To use a formal XML-based application interface definition
The idea of using modular DTDs stemmed from the need to support diverse applications featuring different content structures. This effort took into account concepts and techniques developed within previous versions of the EIRI. The resulting solution was to modularize the existing DTDs and re-use application-independent content structures, in order to support all potential CATCH-2004 applications (i.e. city event information, sport events information, Olympic games information, restaurants, TV and radio information, and others). To this end, application-agnostic DTD subsets were identified by examining DTDs of previous EIRI versions [20]. Each of these DTD modules is designed to describe a small amount of information as accurately as possible. Examples of such subsets are those denoting location/address information, date/time information and price information elements. Apart from identifying existing elements, it was possible to define new ones with a view to using them in a variety of applications. Such new elements were used to construct a common reusable DTD library. Elements from this library can then be linked with specific applications. Based on such a linkage, it is quite easy to specify a DTD covering the needs of a specific application. The latter are application-specific DTDs, which are nevertheless built from a set of reusable components (Figure 2).
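The linkage between library modules and an application-specific DTD can be sketched in a few lines of Python that assemble the DTD text from external parameter entities, which is the standard XML mechanism for including one DTD in another. The module names, the file names and the assumption that each module declares an element of the same name are hypothetical; the actual contents of the EIRI DTD library are not listed here.

```python
# Hypothetical library of reusable DTD modules (module name -> file name).
DTD_LIBRARY = {
    "address": "address.dtd",
    "datetime": "datetime.dtd",
    "price": "price.dtd",
}


def build_application_dtd(root_element, content_model, modules):
    """Return DTD text that links reusable modules and adds one root element."""
    lines = []
    for name in modules:
        system_id = DTD_LIBRARY[name]  # raises KeyError for an unknown module
        # Declare the module as an external parameter entity, then include it.
        lines.append('<!ENTITY % {0} SYSTEM "{1}">'.format(name, system_id))
        lines.append("%{0};".format(name))
    # Application-specific part: a root element built from the module elements.
    lines.append("<!ELEMENT {0} {1}>".format(root_element, content_model))
    return "\n".join(lines)


event_dtd = build_application_dtd(
    "event", "(address, datetime, price?)", ["address", "datetime", "price"]
)
```

Only the final element declaration is specific to the application in this sketch; everything else is pulled in from the shared library, so a second application could reuse the same modules with a different root element.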
[Figure 2 diagram labels: DTD Library (application-agnostic DTDs); DTD; DTD; DTD; Modular Application-specific DTD]
Figure 2: Modular DTDs
Application independence was further enhanced by extending the API defined on the back end system. Such an enhancement took into account the philosophy behind previous versions of the EIRI. In particular, EIRI 1.1 for a specific application consists of a set of Web-accessible methods (e.g., getEventList() and getStatusList() for the City Events Information Application), along with a DTD template specifying the format of the reply for each of these methods. Note that each of the methods is associated with a list of allowable parameters, each of the latter having a specific type, format and semantics. Recall also that EIRI 1.1 supports a Web interface to the Java interface. EIRI 1.2 aimed at generalizing the properties of EIRI 1.1 to any CATCH2004 application. Specifically, EIRI 1.2 provides a methodology for producing and using an application-specific EIRI 1.1. According to this methodology, each application has a specific EIRI instance featuring a certain specification, which is nevertheless produced according to a set of given rules. These rules are described in a separate DTD specification comprising the EIRI 1.2 mandatory methods (Figure 3). Taking into account the latter specification, a specific EIRI 1.2 instance can be produced by advertising a set of application-specific available methods. Such an advertisement allows the business logic to select and use the appropriate method. As a result, building an EIRI 1.2 compliant application demands that:
• An EIRI (1.2 compliant) instance identifies itself, using an EIRI 1.2 mandatory method (getEIRIId()). This method is part of the EIRI 1.2 specification and returns a string (identifier string) of the format: http://www.catch2004.org/spec/. The identifier string (URI) points at an XML file that contains the method signatures for the target application-specific EIRI. Note that this XML file conforms to the DTD specification that is used (in the scope of EIRI 1.2) to describe EIRI 1.2 compliant interfaces. Note also that this XML file is sufficient to identify a particular EIRI instance.
• The identifier XML file provides a list of the methods implemented by the specific EIRI. Each of these methods is described in a way that exposes: (a) a list of mandatory parameters for the method, (b) a list of optional parameters, (c) a DTD or data type corresponding to the returned result(s). The back end system should always ensure that full records are returned, which means that optional fields that were not requested appear as empty XML tags.
As an example, consider a simple web-based service that reverses a given string. The following XML file is a sample application specification for such a service: