GSIP – An Alternative Web Service Invocation Protocol

Ondřej Krajíček, Petr Holub
Institute of Computer Science, Masaryk University Brno, Botanická 68a, 602 00 Brno, Czech Republic
{krajicek,hopet}@ics.muni.cz
Abstract

The Generic Service Invocation Protocol (GSIP) aims to provide an alternative communication infrastructure for applications based on Web Services technology. It is designed for environments where the traditional SOAP-based approach limits performance and scalability or wastes resources and/or processing power. Such problems may emerge in various environments, e. g. large-scale information systems, Grid computing, and high-performance or mobile computing applications. GSIP is designed to address these shortcomings by introducing a service-specific invocation model with a corresponding infrastructure and by using alternative data encoding schemes. This also allows for extensibility and for the implementation of several more advanced modes of operation, such as publisher/subscriber and producer/consumer. We present the GSIP design and architecture, some performance studies, and some implementation and deployment scenarios.

1 Introduction

Today, Web Services are becoming a more and more established technology in the area of multi-tier applications, mainly in the middleware tier [6]. The initial technical specification has proven to be open and scalable enough for most applications of moderate complexity in various fields of computing. However, for large-scale applications, such as university or enterprise information systems and Grid or high-performance computing applications, the current Web Services technology may impose significant bottlenecks in terms of performance, scalability and robustness. The traditional Web Services architecture also enforces the pull model, which is clearly not suitable for a whole class of applications. A publisher/subscriber model is often more appropriate, as already implemented by Grid services specified in OGSA [1].
To address real and potential drawbacks of the Web Services technology, especially those introduced by the SOAP messaging protocol [2], we are designing an alternative communication infrastructure for Web Services: GSIP, the Generic Service Invocation Protocol. GSIP is meant to be an alternative Web Services invocation infrastructure and protocol for environments where using XML-based SOAP for communication imposes intolerable limits on scalability or wastes networking resources and processing power [3]. GSIP uses binary-encoded messages and is designed from scratch with the existing performance and scalability problems in mind.
The rest of the paper is organized as follows: Section 2 briefly reviews related approaches, Section 3 gives an overview of the GSIP architecture, Section 4 highlights advanced operation scenarios supported by GSIP, Section 5 describes the prototype implementation and gives its performance evaluation results, Section 6 summarizes directions of future development, and Section 7 finishes with some concluding remarks.

2 Related Work

In the last four years, we have witnessed a broad adoption of Web Services technology. However, in various applications, potential and real drawbacks (mainly performance issues) emerged as the technology deployment base grew. This led to the development of several projects and technologies aiming to provide a solution. We list some examples with brief descriptions and references to relevant sources.

The first group of approaches tries to optimize SOAP manipulation to make it as efficient as possible. For example, gSOAP is a high-performance SOAP toolkit for C/C++ [8] used in high-performance and Grid computing applications. It generates SOAP parsers based on deterministic finite state automata generated specifically for each application.

The second group comprises approaches that are no longer bound to SOAP. The REST (REpresentational State Transfer) architecture is an example of this approach, as it is based solely on existing WWW technologies, such as the HTTP protocol. It was created by Roy Fielding and published in [7].

Common Language Infrastructure (CLI) Remoting is a toolkit for remote object invocation, standardized in [9]. Although functionally similar to Java RMI, it allows the use of multiple communication protocols and message encoding schemes. CLI Remoting can be used for Web Services invocation.

DIME (Direct Internet Message Encapsulation) is a new binary format for messages and their encapsulation, similar to MIME. The DIME standard was proposed in an IETF draft¹, which has expired.
Nevertheless, several implementations of DIME are available.

¹ http://msdn.microsoft.com/library/en-us/dnglobspec/html/draft-nielsen-dime-02.txt

3 GSIP Architecture and Design

The GSIP design is object oriented, centered around the concept of a service invocation model. The service invocation model identifies the components which take part in service invocation and clearly states their interfaces, interactions and responsibilities. The service invocation model is service specific by design; nevertheless, its fundamental parts are common to all services. The service invocation model is generated by a specialized tool, which should be part of every GSIP implementation.
For implementation, GSIP neither dictates nor requires a particular (object oriented) programming language, environment or libraries. Our intention is to provide a standard invocation infrastructure which may be interoperably implemented in the majority of existing environments. GSIP is designed with the following goals:

• Performance – GSIP is designed for high performance; efficient data representation, low transmission overhead and low memory requirements are therefore priorities.
• Extensibility – The GSIP design and architecture have to be easily extensible.
• Simplicity – The GSIP architecture should be kept simple, as CORBA is available for more complex cases.
• Scalability – GSIP implementations should scale well in terms of the number of requests and concurrent users.
• Smooth coexistence with other technologies – GSIP should coexist effortlessly with other technologies (particularly SOAP) and be used only where appropriate.

The GSIP architecture is divided into logical layers, as shown in Fig. 1. The design is inspired and motivated by traditional service-oriented architectures which employ remote invocation of services (such as Web Services, Remote Procedure Call, CLI Remoting, etc.) to enable easy integration into and interoperability with existing applications.

Figure 1: GSIP Web Services Architecture [layer labels: Web Service Application; WSDL; Service Invocation Model; Communication Channels; HTTP[SG]?, ...; TCP; UDP; IP]

The design of the service invocation model is based on the web service operation scheme shown in Fig. 2. The flow of control and data between service invocation model components is shown in Fig. 3.

Figure 2: Traditional Web Service operation [diagram labels: Client; Service; Service Container; Request; Response]

Figure 3: GSIP Service Invocation Model [diagram labels: Proxy; Session; Invocation; Channel; Message; Data Unit; connected by associations with multiplicities]

3.1 Service Interaction

GSIP provides a message-based communication infrastructure: service consumers (clients) and services (servers) interact by exchanging messages from a predefined set. Service interaction is implemented by means of the Service Invocation Model, the client stub and the server skeleton.

3.2 Service Invocation Model Components
The Service Invocation Model defines the following components: Service Proxy, Service Dispatcher, Session, Invocation, Channel, Message and Data Unit. At runtime, the components of the Service Invocation Model naturally form a tree structure. This tree-based approach to service invocation makes it easy and convenient to implement more precise and robust error handling and recovery.

Session. The Session component represents a particular interaction between a service consumer and a particular service instance. We may see such an interaction as a sequence of messages (requests and responses) exchanged between the two entities. The session is usually initiated by the consumer, but this is not compulsory.

Invocation. The Invocation component represents a particular request (service invocation) made by the consumer. It is used to construct GSIP messages (via the Data Unit and Message components) and to handle received responses. An Invocation component (class) exists for each operation the service provides (as specified in its WSDL contract), and it is instantiated for each particular call made to that operation; this means that a particular Invocation instance is bound to exactly one Session component.

Message. The Message component represents a transmission unit; the communication between the two parties is done solely using Messages. All information contained in a Message is represented using Data Units.
Data Unit. A Data Unit represents a particular piece of data, such as data from messages, but also e. g. digital signature information, in-band control data, etc. A Data Unit component (class) exists, for instance, for each service message definition (as specified in the WSDL contract of the service). All Data Unit classes share a consistent data encoding implementation, which defines the message encoding scheme.

Channel. The Channel component is an abstraction over the transport mechanism. A Channel provides services to send and receive messages and hides all other implementation-dependent details. The Channel interface is highly cohesive, so that it is generic enough for various transport mechanisms, such as the TCP/IP protocol family, higher-layer protocols (such as SMTP and HTTP), etc.

Multiple communication channels may exist in the Service Invocation Model for a particular Session. Also, the underlying (network) connection is not required to be persistent (e. g. it may be established on demand and closed when idle).

Service Binding. The Service Binding reflects the fact that the client is bound statically to a particular type of service (i. e. the client expects a certain service interface). However, at application design time, it is not important which particular instance of the respective service will be invoked and where it runs. At runtime, the client may (and often does) obtain the service reference dynamically and then connect to this particular service. By introducing the Service Binding component into the Service Invocation Model, GSIP could handle service reachability problems transparently and completely hide such error handling logic from the application logic itself. However, this component is not available in the current prototype (Sec. 5).

4 Example Operation Scenarios

To demonstrate various advantages of the GSIP design compared to the traditional Web Services approach based on SOAP, several possible operation scenarios are provided in this section.

4.1 Migration Transparency

The GSIP design allows the implementation of Web Services based systems which are capable of migration transparency. The protocol stack with its Service Invocation Model is naturally able to detect or negotiate service migration (by means of the Service Binding and Session components) and to reconstruct the invocation tree up to the appropriate service consumer binding. Such reconstruction may be directed by the service or controlled using client configuration information.

4.2 Disconnected Operation

The same mechanism as in the migration transparency scenario allows the implementation of disconnected client operation. When the client and server operate in disconnected mode, they store the messages locally instead of attempting to deliver them immediately. The locally stored messages are transmitted from the client to the server (and vice versa) when the network connection becomes available. Disconnected operation may be an elegant solution for distributed applications in environments where bandwidth or network connectivity is expensive, limited, unstable or not available at all, such as mobile computing environments.

4.3 Publisher-Subscriber Model

Traditional Web Services operate in the pull model: a client sends a request to a server, and the server processes the request and sends a response in return. This communication model is also used e. g. in the traditional World Wide Web. However, it may not be sufficient for all applications. The GSIP protocol design allows the implementation of both pull and push communication models, such as publisher/subscriber.

In the publisher/subscriber model, the client registers for some event notification and disconnects. When the particular event occurs, the client is called back with a notification. When this is implemented using the traditional pull model, the client has to poll the server for the event occurrence at regular intervals, which leads to wasted computing resources and bandwidth and also to unavoidable delays between the event occurrence at the server side and the notification processing at the client side.

The GSIP protocol design also introduces the notion of persistent channels. The client stack is able to negotiate a persistent channel with the server stack, which may be used for pushing data from the server side to the client side. Such a channel may be transparent to the application and handled completely by the protocol stub; the application just provides a callback function to be called when the notification arrives.

4.4 OGSA Integration

To address some shortcomings of the Web Services model, the Grid Services based on the Open Grid Services Architecture (OGSA) [1] introduced several extensions to the WSDL language and to the Web Services technology itself, such as notification interfaces similar to our publisher/subscriber model. OGSA also provides soft-state operation to support unreliable environments, which can be easily integrated into GSIP as well.

One of our primary goals for the GSIP protocol is to fit into the OGSA framework and to be a viable replacement of the SOAP protocol for the Open Grid Services Infrastructure (OGSI)². GSIP should provide efficient communication between OGSA-compliant Grid services, allowing the OGSA architecture to be applied to services for which it currently means an unacceptable communication overhead.

² The OGSI has recently been obsoleted by WSRF, which is planned to be implemented in Globus Toolkit version 4. However, we believe that GSIP integration will remain applicable for WSRF-enabled Grid Services as well.
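The disconnected operation of Sec. 4.2 can be illustrated with a minimal Python sketch of a store-and-forward channel. All names here (StoreAndForwardChannel, on_connected, on_disconnected) are hypothetical and chosen for illustration only; they are not part of the GSIP design or of the prototype, which is written in C#. The sketch assumes that the transport is represented by a simple callable.

```python
from collections import deque
from typing import Callable, Deque, List


class StoreAndForwardChannel:
    """Hypothetical channel wrapper that queues messages while offline."""

    def __init__(self, transmit: Callable[[bytes], None]) -> None:
        self._transmit = transmit            # underlying transport (assumed)
        self._pending: Deque[bytes] = deque()
        self.connected = False

    def send(self, message: bytes) -> None:
        # Deliver immediately when connected, otherwise store locally.
        if self.connected:
            self._transmit(message)
        else:
            self._pending.append(message)

    def on_connected(self) -> None:
        # Flush locally stored messages in their original order.
        self.connected = True
        while self._pending:
            self._transmit(self._pending.popleft())

    def on_disconnected(self) -> None:
        self.connected = False

    @property
    def pending(self) -> int:
        return len(self._pending)


# Usage: two requests are issued offline, one after the connection is up.
delivered: List[bytes] = []
channel = StoreAndForwardChannel(delivered.append)
channel.send(b"request-1")
channel.send(b"request-2")
queued_while_offline = channel.pending
channel.on_connected()
channel.send(b"request-3")
```

In this sketch the application code calls send() in the same way whether or not a connection exists, which is the kind of transparency the scenario aims for; a real implementation would additionally need persistent storage and failure handling during the flush.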
5 Prototype Implementation

GSIP testing and evaluation have been done by prototyping. The current GSIP prototype, code-named Gondor, which was also used for the performance testing in this paper, is implemented in C#. Performance tests were run on the Microsoft Shared Source CLI, called Rotor [5], and on the Microsoft .NET Framework version 1.1.

The Gondor prototype consists of an invocation framework (called the stub on the client side and the skeleton on the server side) and the Service Invocation Model Generator (the SIMGEN utility). The invocation framework implements the Session and Channel components of the GSIP invocation model. The SIMGEN utility is used to produce a service invocation model implementation in the form of source code. This source code complements the above-mentioned invocation framework and provides an implementation of all invocation model components for a particular service.

The SIMGEN utility operates by constructing an object-oriented representation of the service WSDL contract, called the Service Model. This Service Model is then processed and transformed into an object-oriented representation of the Service Invocation Model source code (the Service Code Model). The Service Code Model is used to generate the actual source code for the service. The SIMGEN utility is implemented using the Microsoft CodeDom technology (which is part of Rotor and the Microsoft .NET Framework).

5.1 Prototype Performance Evaluation
For the prototype performance evaluation, two simple Web Services have been implemented using GSIP. To compare the prototype performance, each service has also been implemented using CLI Remoting with SOAP transport. SOAP-based remoting was chosen since it better exposes the overhead of SOAP-based message encoding by eliminating the overhead of Web Service containers (such as Apache, Internet Information Services, etc.). Both implementations (GSIP and Remoting/SOAP based) have been executed in two environments, Rotor and Microsoft .NET Framework v1.1, on a single-processor machine with Windows XP. The time per handled request and the total spatial overhead of messages have been measured.

Echo Service is the Hello World of the Web Services world, at least when it comes to simplicity. The Echo service description is shown in Fig. 4. The service implementation echoes the received number (echoRequest) in the response. Because of its trivial functionality, this service demonstrates the SOAP and GSIP overhead of message encoding.

Newton Service implements numerical solving of polynomial equations with polynomial order up to nine using Newton's method. For demonstration purposes, a polynomial of order nine for which the method converges was chosen. Each request runs ten iterations of the method. The purpose of this service is to demonstrate the SOAP and GSIP overhead on non-trivial messages which contain floating point numbers. The service description is shown in Fig. 5.
Figure 4: Echo service description (simplified and shortened)
Figure 5: Newton service description (simplified and shortened)
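As a rough illustration of the computation the Newton Service performs per request, the following minimal Python sketch runs ten Newton iterations on a polynomial of order nine. The polynomial x^9 - 2, the starting point 1.0 and the function names are assumptions chosen for illustration; the paper does not specify the polynomial used in the measurements, and the prototype itself is written in C#.

```python
from typing import Sequence, Tuple


def horner(coeffs: Sequence[float], x: float) -> Tuple[float, float]:
    """Return (p(x), p'(x)); coefficients are given highest degree first."""
    p = 0.0
    dp = 0.0
    for c in coeffs:
        dp = dp * x + p   # derivative accumulates the previous value of p
        p = p * x + c
    return p, dp


def newton_root(coeffs: Sequence[float], x0: float, iterations: int = 10) -> float:
    """Run a fixed number of Newton iterations starting from x0."""
    x = x0
    for _ in range(iterations):
        p, dp = horner(coeffs, x)
        if dp == 0.0:
            raise ZeroDivisionError("derivative vanished; choose another x0")
        x -= p / dp
    return x


# Assumed example polynomial of order nine: x^9 - 2.
COEFFS = [1.0] + [0.0] * 8 + [-2.0]
root = newton_root(COEFFS, x0=1.0, iterations=10)
```

A request to such a service would carry the polynomial coefficients and the response a floating point root, which is why the Newton messages exercise the encoding of floating point data.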
Measurement Results. The measurement results and the comparison between GSIP and Remoting/SOAP based services are summarized in Tab. 1. The summary table gives results only for services running on the Microsoft .NET Framework, as it far outperforms the Rotor CLI implementation. All measurements were gathered using 10,000 requests to provide a large enough statistical sample for a reasonable evaluation. For the Echo Service, the detailed results for both MS .NET and Rotor hosted services are given in Fig. 6. For the Newton Service, the corresponding detailed results are shown in Fig. 7.

It follows from the results that GSIP is almost an order of magnitude better than Remoting/SOAP in per-request processing time for both the Echo and Newton services (roughly a factor of six in Tab. 1). The advantage of GSIP becomes even more obvious when comparing spatial overhead (Fig. 8): GSIP performs almost two orders of magnitude better than Remoting/SOAP, and it never drops below one order of magnitude of improvement (in Tab. 1, 24 versus 1,154 bytes per Echo request/response pair and 112 versus 1,694 bytes for Newton).

Table 1: Selected results – summary (Microsoft .NET Framework 1.1)

Metric                                       GSIP               SOAP
Echo – avg. processor ticks per request      376,600±6,100      2,375,000±11,000
Echo – avg. time per request                 222.2±3.5 µs       1,401.9±6.5 µs
Echo – request message size                  12 bytes           570 bytes
Echo – response message size                 12 bytes           584 bytes
Newton – avg. processor ticks per request    472,500±6,500      2,767,000±12,000
Newton – avg. time per request               278.82±3.8 µs      1,632.8±7.3 µs
Newton – request message size                92 bytes           1001 bytes
Newton – response message size               20 bytes           693 bytes

Figure 6: Echo service – CPU ticks per request [plot "GSIP - SOAP Performance Measurement, Echo Service": processor ticks per request (logarithmic scale) versus request number, 0 to 10,000; series: MS .NET GSIP, MS .NET SOAP, Rotor GSIP, Rotor SOAP]

Figure 7: Newton service – CPU ticks per request [plot "GSIP - SOAP Performance Measurement, Service for Solving Roots of Polynomial": processor ticks per request (logarithmic scale) versus request number, 0 to 10,000; series: MS .NET GSIP, MS .NET SOAP, Rotor GSIP, Rotor SOAP]

Figure 8: Spatial overhead of SOAP compared to GSIP. [plot "Message Encoding Spatial Overhead": data transferred in bytes (logarithmic scale) versus number of requests; series: SOAP Newton Service, SOAP Echo Service, GSIP Newton Service, GSIP Echo Service]

6 Future Work

At the time of writing, a functional prototype written in C# is available. Currently, we are testing and evaluating its performance in comparison with various remote invocation toolkits, particularly Web Service toolkits. The current prototype represents an experimental branch written in C# and hosted by the CLI (Rotor Shared Source CLI and Microsoft .NET Framework v1.1). Another branch of prototypes, written in portable plain C, is planned. This stable prototype line will merge in features tested and confirmed in the experimental branch to produce release candidates suitable for production evaluation and use.

Currently, message encoding in GSIP is generic. We plan to replace this encoding with an ASN.1 based mechanism using BER (and possibly DER and PER) encoding rules [4]. The Service
Invocation Model Generator would transform the message patterns found in the WSDL contract into ASN.1 type notation and use an ASN.1 compiler to produce data encoding routines, which would in turn be used by the Data Unit implementations.

7 Conclusions

The GSIP infrastructure seems to provide an efficient replacement for SOAP as a Web Service transport protocol. It offers functionality which is not provided by current SOAP-based Web Services and promises superior performance in comparison to SOAP. The main goal now is to provide a production-class implementation (in C) and deploy it for production evaluation.
References

[1] Foster I., Kesselman C., Nick J., Tuecke S., “The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration,” Open Grid Service Infrastructure WG, Global Grid Forum, June 22, 2002, http://www.globus.org/research/papers.html#OGSA.
[2] Mitra N. (editor), “SOAP Version 1.2 Part 0: Primer,” W3C Recommendation 24 June 2003, http://www.w3.org/TR/2003/REC-soap12-part0-20030624/.
[3] Chiu K., Govindaraju M., Bramley R., “Investigating the Limits of SOAP Performance for Scientific Computing,” Proceedings of The Eleventh International Symposium on High Performance Distributed Computing, IEEE Computer Society Press, pp. 246-254, Edinburgh, Scotland, 23-26 July, 2002.
[4] Larmouth J., “ASN.1 Complete,” http://www.oss.com/asn1/bookreg.html.
[5] Stutz D., “The Microsoft Shared Source CLI Implementation,” Microsoft Corporation, March 2002, http://msdn.microsoft.com/library/default.asp?url=/library/en-us/Dndotnet/html/mssharsourcecli.asp.
[6] Booth D., Haas H., McCabe F., Newcomer E., Champion M., Ferris C., Orchard D., “Web Services Architecture,” W3C Working Draft 8 August 2003, http://www.w3.org/TR/ws-arch/.
[7] Fielding R. T., “Architectural Styles and the Design of Network-based Software Architectures,” PhD dissertation, University of California, Irvine, 2000, http://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf.
[8] van Engelen R. A., “gSOAP Documentation,” Department of Computer Science and School of Computational Science and Information Technology, Florida State University, Tallahassee, FL 32306-4530, http://www.cs.fsu.edu/~engelen/soap.html.
[9] “Common Language Infrastructure – CLI,” Ecma International, http://www.ecma-international.org/publications/standards/ecma-335.htm.