
Enabling Post-Invocation Parameter Transmission in Service-Oriented Environments

Markus Mathes, Steffen Heinzl, Thomas Friese, Bernd Freisleben
Dept. of Mathematics and Computer Science, University of Marburg
Hans-Meerwein-Str., D-35032 Marburg, Germany
{mathes, heinzl, friese, freisleb}@informatik.uni-marburg.de

Abstract— This paper addresses two drawbacks associated with using SOAP RPC to invoke services in service-oriented environments. First, overlapping of parameter production, parameter transmission and service execution is not possible, since all parameters of a service call have to exist prior to service invocation; until all parameters are available, a service caller has to defer service invocation. Second, the XML format of SOAP is not suitable to transfer large binary parameters, because encoding consumes a considerable amount of time. In this paper, a new approach to invoke web services is presented, which enables post-invocation parameter transmission and efficient transmission of binary parameters, thus enabling the overlapping of parameter production, parameter transmission and service execution to reduce the overall processing time. To realize post-invocation parameter transmission, an extension of WSDL is proposed. It is shown how post-invocation parameter transmission enables the efficient implementation of stream-based production/consumption of parameters and pipelining. Furthermore, measurement results are presented demonstrating a noticeable performance gain.

This work is partially supported by the German Research Foundation (DFG) (SFB/FK 615, Project MT), the German Ministry of Education and Research (BMBF) (D-Grid Initiative, In-Grid Project), Siemens AG (Corporate Technology, München) and IBM (Eclipse Innovation Award).

I. INTRODUCTION

SOAP RPC [1] is the standard way to invoke web services (WS) in service-oriented environments. The parameters of a service call have to be encoded, put into a SOAP message which must be transferred completely, and decoded before service execution starts. Overlapping of parameter transmission and service execution is not possible, and encoding/decoding of parameters consumes a remarkable amount of time. The following two examples suffer from this lack of flexibility:

(1) The scientific analysis of digital videos includes shot boundary detection [2], face and text detection [3] and camera motion estimation [4]. These tasks rely on basic algorithms, such as video decoding, wavelet transformation, or feature extraction, which typically are executed as part of filter chains. Each algorithm in the chain takes the output of its predecessor as its input and forwards its own output to the succeeding algorithm. If each algorithm is able to work sequentially on its input data, overlapping of parameter transmission and service execution would save a significant amount of time.

(2) In Grid computing [5], sometimes huge amounts of scientific data have to be transferred. A Grid service is normally invoked via a SOAP message containing the name of the target service and all necessary parameters. Often, the parameters are matrices with floating point numbers, which consume a high amount of time for encoding [6]. Therefore, many Grid service developers choose an alternative approach to transfer binary parameters, as shown in figure 1. A Grid service, which processes binary parameters, is designed as a wrapper service. This wrapper service calls a simple batch processing script, consisting of two steps: parameter transmission (typically based on GridFTP [7]) and parameter processing (done in an application specific manner). Alternatively, Reliable File Transfer (RFT) [8] can be used, which acts as service-oriented front-end for GridFTP. This approach reduces the encoding time, but still all parameters have to be available prior to the actual service invocation.

Fig. 1. Transferring binary parameters in a Grid service invocation.

In this paper, we present a novel approach to transfer parameters of arbitrary size and type to a WS after its invocation. Post-invocation parameter transmission is based on outports at the client side and inports at the service side, which are connected through an external channel. A client uses an outport to send parameters to the remote service, whereas a service uses an inport to receive parameters from a client. Post-invocation parameter transmission solves the problems mentioned above: parameters can be sent to a service after its invocation, and the time otherwise spent on encoding/decoding is saved.

Clients and services can be divided into two groups: those that support post-invocation parameter transmission and those that do not. Services that do not support it are simple WS and are handled as before, whereas services that support it have to be marked as such. Clients must be able to distinguish between the two kinds of services. For this reason, we propose an extension of the Web Services Description Language (WSDL) [9] by a new attribute for message parts.


The main advantage of post-invocation parameter transmission – reducing the overall processing time – is investigated by means of two use cases: stream-based production/consumption of parameters and pipelining. Additionally, implementation issues and some performance measurements will be presented.

The paper is organized as follows. Section 2 describes the basic ideas of post-invocation parameter transmission. An extension of WSDL to mark the usage of post-invocation parameter transmission is proposed in section 3. Two use cases are described in section 4. Implementation issues are discussed in section 5. In section 6, performance measurements are presented. Section 7 discusses related work. Section 8 concludes the paper and outlines topics for future work.

II. PARAMETER TRANSMISSION IN SERVICE-ORIENTED ENVIRONMENTS

In this section, the difference between traditional parameter transmission in service invocations and post-invocation parameter transmission is discussed.

A. Traditional parameter transmission

Traditional parameter transmission is shown in figure 2. A WS client first composes a request message, containing the remote operation and all necessary parameters and sends it to the WS (1). The WS processes the request, i.e. the desired operation is invoked with the received parameters (2). After the processing of the request is finished, the WS composes a response message containing a return value. This response message is sent to the WS client (3).

Fig. 2. Traditional parameter transmission.

To invoke a service, the client needs to know the endpoint where the service resides. The endpoint can be obtained from the WSDL description of the service, as shown in figure 3 for a service named ExampleService. The address element specifies a URI in its location attribute, which describes the endpoint of the service.

Fig. 3. Simple WSDL service description.

B. Post-invocation parameter transmission

The goal of post-invocation parameter transmission is to make parameter transmission in service-oriented environments more flexible and efficient. It relaxes the correlation between service invocation and service execution. Since post-invocation parameter transmission enables the overlapping of parameter production, parameter transmission and service execution, the overall processing time is reduced.

Post-invocation parameter transmission is based on inports, outports and external channels. An outport enables a client to send parameters to a WS after its invocation, whereas an inport enables the service to receive parameters from its clients. Typically, outports and inports are related 1 : 1 and are connected through a so-called external channel, since a client only invokes one WS. However, it would be possible to relate outports and inports 1 : n, if a client wants to invoke n WS concurrently. An external channel is an abstraction of a transmission channel which can use arbitrary transfer protocols like HTTP, FTP, byte streams, etc.

After invoking a service, the client is able to obtain an outport from the runtime environment. This outport can be used to repeatedly send parameters to the remote service. As soon as the client has sent all parameters, it closes the outport, and the interaction with the service ends. At the service-side, the WS obtains an inport from the runtime environment to receive parameters from the client.

In figure 4, the general correlation between outports, inports and the external channel is shown. The WS client uses a SOAP RPC request message to invoke the remote WS (1). The request message only has to contain the operation to invoke, but not the parameters for the operation. After the WS obtains an inport and the client obtains an outport from the runtime environment, the external channel is established (2). Now the client is able to send parameters to the WS. After all parameters have been transferred, the WS sends a SOAP RPC response message to the client (3). The external channel is no longer needed and can be released.

Fig. 4. Post-invocation parameter transmission.
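The three steps of post-invocation parameter transmission (invoke, send parameters over the external channel, receive the response) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the authors' implementation: the class PortSketch and its methods are hypothetical, the external channel is modelled by an in-process pipe (java.io.PipedOutputStream and PipedInputStream) rather than a network protocol, and the SOAP request and response are reduced to starting a task and waiting for its result.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PortSketch {

    /** Hypothetical service operation: consumes parameters from its inport as they arrive. */
    static double echoSum(DataInputStream inport) throws IOException {
        double sum = 0.0;
        try {
            while (true) {
                sum += inport.readDouble(); // blocks until the next parameter arrives
            }
        } catch (EOFException endOfParameters) {
            // The client closed its outport: all parameters have been received.
        }
        return sum;
    }

    /** Invokes the service first, then sends the parameters one by one. */
    static double invokeThenSend(double[] parameters) throws Exception {
        // The pipe stands in for the external channel (an assumption of this sketch).
        PipedOutputStream channelOut = new PipedOutputStream();
        PipedInputStream channelIn = new PipedInputStream(channelOut);
        ExecutorService runtime = Executors.newSingleThreadExecutor();
        try {
            // (1) Invocation: the service starts before any parameter has been sent.
            Callable<Double> serviceTask = () -> echoSum(new DataInputStream(channelIn));
            Future<Double> response = runtime.submit(serviceTask);
            // (2) The client writes parameters to its outport.
            try (DataOutputStream outport = new DataOutputStream(channelOut)) {
                for (double p : parameters) {
                    outport.writeDouble(p);
                }
            } // Closing the outport ends the parameter stream.
            // (3) The response is available once all parameters have been processed.
            return response.get();
        } finally {
            runtime.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(invokeThenSend(new double[] {1.5, 2.5, 4.0}));
    }
}
```

Because the service task is started before the first parameter is written, sending and consuming parameters can overlap; a network-backed channel would keep the same structure.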


III. EXTENDING WSDL

To create an outport for the client, the runtime environment needs to know the endpoint where the WS would like to receive the parameters, i.e. the endpoint of the inport. The client needs to know which parameters to embed in the request message and which parameters to send via an external channel. Following the service-oriented approach, a WSDL description is used to publish all necessary information about a WS.

A first approach to embed the information in the WSDL description is to use the documentation element. A documentation element is used to embed human readable information in a WSDL description. The content of a documentation element is arbitrary data or XML elements. Hence, it can be used to embed endpoint information. A main advantage of this approach is compatibility with clients that are unaware of post-invocation parameter transmission. A client aware of post-invocation parameter transmission parses the service description, encounters the documentation element and uses the specified endpoint to establish the external channel. All clients unaware of post-invocation parameter transmission simply ignore the element.

Figure 5 shows the WSDL description of the ExampleService mentioned above. The ExampleService offers an operation named echo, which takes two parameters: an integer and a string. Each parameter sent via an external channel is marked with a documentation element. The integer parameter i is sent via an external channel to the endpoint defined in the location attribute. The string parameter str is not marked with a documentation tag and therefore embedded in the SOAP RPC message.
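The client-side decision described above (embed a parameter in the SOAP RPC message, or send it over an external channel) can be sketched as follows. The sketch deliberately starts from already-parsed message parts and does not reproduce any WSDL syntax; the class PartRouting, the Part type and the endpoint URI are hypothetical names chosen for illustration only.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PartRouting {

    /** Hypothetical, already-parsed view of one message part of an operation. */
    static final class Part {
        final String name;
        final URI location; // null: no endpoint given, the part travels inside the SOAP message

        Part(String name, URI location) {
            this.name = name;
            this.location = location;
        }
    }

    /** Names of the parts that are embedded in the SOAP RPC request message. */
    static List<String> embeddedParts(List<Part> parts) {
        List<String> names = new ArrayList<>();
        for (Part p : parts) {
            if (p.location == null) {
                names.add(p.name);
            }
        }
        return names;
    }

    /** Names of the parts that are sent over an external channel after invocation. */
    static List<String> externalParts(List<Part> parts) {
        List<String> names = new ArrayList<>();
        for (Part p : parts) {
            if (p.location != null) {
                names.add(p.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The echo operation of the ExampleService: i is marked with an endpoint, str is not.
        List<Part> echoRequest = Arrays.asList(
                new Part("i", URI.create("tcp://service.example.org:9000")), // made-up endpoint
                new Part("str", null));
        System.out.println("embedded: " + embeddedParts(echoRequest));
        System.out.println("external: " + externalParts(echoRequest));
    }
}
```

A client that does not know about the marking would treat every part as embedded, which corresponds to the compatibility argument made above.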

Fig. 5. WSDL description using documentation element.

The re-use of the documentation element violates its intended semantics to encapsulate human readable information. Therefore, we propose an extension of WSDL to specify whether a parameter is embedded into the SOAP RPC request message or sent via an external channel. If the parameter is sent via an external channel, an endpoint must be provided. Hence, a location attribute is introduced for each parameter sent via an external channel. Figure 6 shows the proposed extension. The integer parameter i is still sent via an external channel. The location attribute added to the part element specifies its endpoint.

Fig. 6. Proposed WSDL extension.

IV. USE CASES

The advantages of the proposed invocation approach will be demonstrated by investigating two use cases: stream-based production/consumption of parameters and pipelining. Stream-based production/consumption of parameters is based on the use of a streaming protocol for data transmission. Pipelining enables concurrent data transmission and computation.

A. Stream-based production/consumption of parameters

If a streaming protocol like the Real Time Streaming Protocol (RTSP) [10] is used, the transmission of a parameter stream during parameter production is enabled. The parameters transferred can be partially received and processed by the target service. Consider, for example, a matrix consisting of the columns $\vec{c}_1$ to $\vec{c}_n$. Each column $\vec{c}_1$ to $\vec{c}_n$ can be transferred in its own data package $d_1$ to $d_n$ one after another. Figure 7 shows the situation when the data packages $d_1$ to $d_{n-3}$ have already been processed by the WS, while $d_{n-2}$ and $d_{n-1}$ are currently transferred and $d_n$ is just being produced by the client application.

Fig. 7. Example of stream-based production/consumption of parameters.

Since SOAP messages must be transferred completely before service execution starts, the overall processing time ($t_{total}$) of a message is the time for producing parameters ($t_{prod}$), transmitting parameters ($t_{trans}$) and executing the service ($t_{exec}$):

$$t_{total} = t_{prod} + t_{trans} + t_{exec}$$

Stream-based production/consumption of parameters allows sending parameters partially (e.g. only one column of a matrix at a time) and processing of already received parameters. Thus, receiving parameters overlaps with service execution, and the overall processing time reduces to

$$t'_{total} = t_{prod} + \alpha_t \cdot t_{trans} + \alpha_e \cdot t_{exec} < t_{total}$$

where $\alpha_t, \alpha_e \in (0, 1)$.
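A small worked example may make the inequality concrete. All numbers below are assumed for illustration and are not measurements from this paper; one possible reading of $\alpha_t = \alpha_e = 0.5$ is that half of the transmission time and half of the execution time are hidden by overlapping.

```latex
% Assumed values: t_prod = 10 s, t_trans = 4 s, t_exec = 6 s, alpha_t = alpha_e = 0.5
\begin{align*}
t_{total}  &= t_{prod} + t_{trans} + t_{exec} = 10 + 4 + 6 = 20\ \text{s} \\
t'_{total} &= t_{prod} + \alpha_t \cdot t_{trans} + \alpha_e \cdot t_{exec}
            = 10 + 0.5 \cdot 4 + 0.5 \cdot 6 = 15\ \text{s} < t_{total}
\end{align*}
```

Under these assumed values the overlap saves 5 s; since $t_{prod}$ enters both formulas unchanged, the relative saving depends on how large $t_{trans}$ and $t_{exec}$ are compared to $t_{prod}$.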


B. Pipelining

The stream-based production/consumption of parameters enables building a pipeline between different algorithms [11]. Consider $n$ algorithms $a_1, a_2, \dots, a_n$ with processing times of $t_1, t_2, \dots, t_n$. Sequential processing leads to an overall processing time of

$$t_{seq} = t_1 + t_2 + \dots + t_n = \sum_{i=1}^{n} t_i$$

To reduce processing time, an algorithm can send partial results to its successor, which already processes them. Parallel processing of algorithm $a_i$ saves $(1 - k_i) \cdot t_i$ units of time, $k_i \in (0, 1)$, $2 \le i \le n$. The overall processing time is reduced to

$$t_{pipe} = t_1 + \sum_{i=2}^{n} k_i \cdot t_i < t_{seq}$$
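The two formulas can be evaluated with a few lines of Java. The following is a minimal sketch; the class name PipelineTiming and the sample values in main are assumptions made for illustration, not data from the paper.

```java
final class PipelineTiming {

    /** t_seq: the sum of all processing times t_1 ... t_n. */
    static double sequential(double[] t) {
        double sum = 0.0;
        for (double ti : t) {
            sum += ti;
        }
        return sum;
    }

    /**
     * t_pipe = t_1 + sum over i = 2..n of k_i * t_i.
     * k[0] is ignored because the first algorithm has no predecessor to overlap with.
     */
    static double pipelined(double[] t, double[] k) {
        if (t.length == 0 || t.length != k.length) {
            throw new IllegalArgumentException("t and k must be non-empty and of equal length");
        }
        double total = t[0];
        for (int i = 1; i < t.length; i++) {
            if (k[i] <= 0.0 || k[i] >= 1.0) {
                throw new IllegalArgumentException("k_i must lie in (0, 1)");
            }
            total += k[i] * t[i];
        }
        return total;
    }

    public static void main(String[] args) {
        double[] t = {4.0, 6.0, 8.0};   // assumed processing times
        double[] k = {0.0, 0.5, 0.25};  // assumed overlap factors, k[0] unused
        System.out.println("t_seq  = " + sequential(t));
        System.out.println("t_pipe = " + pipelined(t, k));
    }
}
```

With the assumed values, the sequential time is 4 + 6 + 8 = 18 and the pipelined time is 4 + 0.5 · 6 + 0.25 · 8 = 9.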

There are two important prerequisites for pipelining: First, each algorithm has to produce partial results which can be processed by its successor. Second, an algorithm has to know where to send its partial results to, i.e. where its successor is located in the filter chain. This is achieved by modifying the WSDL description as shown in figure 6.

V. IMPLEMENTATION

This section presents details of a first implementation to realize post-invocation parameter transmission. After presenting the general architecture of Apache Axis, data transfer after service invocation is discussed. Since post-invocation parameter transmission is based on custom serialization/deserialization, a subsection shows how to implement these mechanisms.

A. Apache Axis

Our implementation of post-invocation parameter transmission is based on Apache Axis 1.2.1. Axis’ main purpose is to handle the transfer and processing of SOAP messages. It provides a server and a client both consisting of a set of handler chains. Each handler is capable of changing an incoming or outgoing message and passing it to the next handler in the chain. This enables pre- and postprocessing of incoming or outgoing messages. Handlers pass messages in a so-called message context that holds additional information. A chain is a composition of handlers and other chains. When a message arrives at the server, it is passed through the transport, global and service chain. First, it is put into a message context which is forwarded to the transport chain. The transport chain handles transport specific issues, like the protocol being used to send the SOAP message (HTTP by default). Then, the message context is forwarded to the global chain implementing, for example, security policies. If the processing in the global chain took place without errors, the message context is passed to the service specific chain. Handlers in this chain may manipulate the message before it is passed to the actual service.
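The request path through the transport, global and service chains can be pictured with a small plain-Java model. This is a simplified stand-in written for illustration only: it does not use the Axis API, and the names ChainSketch, Context, Handler and Chain are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChainSketch {

    /** Simplified stand-in for a message context: the message plus a processing trace. */
    static final class Context {
        final String message;
        final List<String> trace = new ArrayList<>();

        Context(String message) {
            this.message = message;
        }
    }

    /** A handler may inspect or change the context before it is passed on. */
    interface Handler {
        void invoke(Context ctx);
    }

    /** A chain is itself a handler, composed of handlers and other chains. */
    static final class Chain implements Handler {
        private final List<Handler> members;

        Chain(Handler... members) {
            this.members = Arrays.asList(members);
        }

        @Override
        public void invoke(Context ctx) {
            for (Handler h : members) {
                h.invoke(ctx);
            }
        }
    }

    /** Handler that only records its name, to make the processing order visible. */
    static Handler named(String name) {
        return ctx -> ctx.trace.add(name);
    }

    /** Request path: transport chain, then global chain, then service chain. */
    static List<String> processRequest(String message) {
        Handler transport = new Chain(named("transport"));
        Handler global = new Chain(named("global"));
        Handler service = new Chain(named("service"));
        Context ctx = new Context(message);
        new Chain(transport, global, service).invoke(ctx);
        return ctx.trace;
    }

    public static void main(String[] args) {
        System.out.println(processRequest("echo"));
    }
}
```

Modelling a chain as a handler is what allows chains to be nested, which matches the statement that a chain is a composition of handlers and other chains.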
A reply message from the service is passed along a response handler chain to the client. Axis provides a Call object to invoke a service. The Call object handles creation of an invocation message and contains the message context passed through the chains.

Moreover, Axis offers a standard serialization/deserialization facility, which enables serialization/deserialization of primitive data types (e.g. int, float, etc.) and JavaBeans. Additionally, arrays of primitive data types and JavaBeans are serializable/deserializable. The process of serialization/deserialization is based on the factory pattern. To realize an individual serialization/deserialization mechanism, the implementation of a serializer/deserializer factory is necessary. These factories produce specialized serializers/deserializers for a specific class. To associate a class with a serialization/deserialization mechanism, a so-called type mapping is needed. All type mappings are managed by a type mapping registry. If the Axis engine encounters an unknown class during serialization/deserialization, it looks up the matching type mapping in the type mapping registry and retrieves the needed factories from this type mapping.

B. Implementation example

The implementation of post-invocation parameter transmission is shown by means of a concrete example. Consider a client that wants to send the elements of a Vector object to a service. Not using post-invocation parameter transmission, the client may act as follows: (1) calculate each element, (2) add all elements to the Vector and (3) pass the whole Vector as argument during service invocation. These steps have a main disadvantage: the whole Vector has to be constructed before service invocation. Using post-invocation parameter transmission, a client is able to send data to a service after its invocation. Consider again the ExampleService mentioned above. The ExampleService offers an echo method, which takes a Vector parameter v.
To enable post-invocation parameter transmission for the ExampleService, three steps are necessary: (1) deploy the ExampleService with a modified deployment descriptor; (2) deploy the PortHandler to the global request flow; (3) deploy the WSDLHandler to the response flow of the Axis transport chain. The modified deployment descriptor defines a parameter named location, which marks this service to enable post-invocation parameter transmission. Additionally, a custom type mapping is registered. The PortHandler checks the deployed service parameters to determine whether the post-invocation parameter transmission is supported or not. If not, the PortHandler does nothing. Otherwise, a server is concurrently started, which receives the parameters from the client and processes them. The WSDLHandler is plugged into the response flow of the transport chain and modifies the WSDL description of each service supporting post-invocation parameter transmission.

An ExampleClient supporting post-invocation parameter transmission retrieves the WSDL description for the ExampleService. This description contains the endpoint of a server which reads parameters transferred over the external channel. The endpoint to be used after service invocation is registered at the EndpointRegistry. Then, a Call object is constructed, the type mapping for the Vector is registered and the WS is invoked either in one-way or RPC style. During serialization at the client, an Outport is registered at the OutportRegistry for each Vector serialized. In parallel, the client may start to obtain an Outport from the OutportRegistry, and send data over it to the server’s Inport. At service-side, the PortHandler starts the server to receive parameters from the client’s Outports. Next, the VectorDeserializer registers a new Inport for each Vector deserialized. After deserialization, the server is informed that it may initialize each Inport with connection information obtained from the client Outports’ connection requests. After initialization of the Inports, the service may start to read from the Inports. Figure 8 shows the course of action for the client and the service.

Fig. 8. Implementation of client (left) and service (right) supporting post-invocation parameter transmission.

C. Deployment of a custom type mapping

The implementation of post-invocation parameter transmission depends on Axis’ custom serialization/deserialization mechanism. The mechanism requires implementation of a custom serializer factory, serializer, deserializer factory and deserializer. For each component, Axis offers an interface defining all necessary methods. To associate a custom serializer/deserializer with a specific class, a new type mapping has to be defined. Defining a type mapping differs slightly between client-side and server-side. At client-side, a TypeMappingRegistry is used to register a new type mapping; at server-side, a deployment descriptor is used. Figure 9 shows a deployment descriptor which associates the HashableVector class with the VectorSerializerFactory and the VectorDeserializerFactory. Since in Java an empty Vector always has the same hashcode, we used a HashableVector which simply overrides the toString() method. By this customization, it is easily possible to distinguish different Vector objects in the registries.

Fig. 9. Deployment descriptor for a custom type mapping.

VI. PERFORMANCE EVALUATION

To investigate the performance of our proposal, we measured the overall processing time for 100 double values, i.e. the time needed to produce, transfer and process the parameters. Production and processing time were varied (1 s, 3 s, 5 s per parameter). An Intel Pentium M notebook with 1.5 GHz and 1024 MB of RAM running MS Windows XP Professional SP2 acted as the client. The WS resides on an Intel Pentium 4 with 2.8 GHz and 2048 MB of RAM running MS Windows XP Professional SP2. We used Apache Tomcat 5.5.15 in combination with Apache Axis 1.2.1 as SOAP engine. To connect client and server, a standard 100 Mbps fast-ethernet LAN was used.

Using traditional parameter transmission, the client produces the parameters, puts them in an array and sends the array to the remote WS. The WS iterates over the array and processes each parameter. Hence, parameter production, transmission and processing take place sequentially. In contrast, using post-invocation parameter transmission leads to an overlapping of parameter production, transmission, and processing.

Figure 10(a) shows the overall processing time for traditional parameter transmission under different parameter production and parameter processing times. Figure 10(b) shows the same results for post-invocation parameter transmission. The measurement results demonstrate that parameter transmission after service invocation can save a remarkable amount of time. This advantage will increase with an increasing amount of data and production/processing time.

Fig. 10. Results of experimental comparison of traditional and post-invocation parameter transmission.

Our performance evaluation does not investigate Quality of Service (QoS) criteria, since QoS of post-invocation parameter transmission is mainly associated with the transfer protocol used by the external channel.

VII. RELATED WORK

There are several approaches leading to more flexibility and better performance compared to standard SOAP. Allcock et al. [12] introduce GridFTP as a high performance data transfer protocol. GridFTP starts an extra process to efficiently transfer data from one node to another. It is completely decoupled from the service and thus violates service-orientation. Furthermore, GridFTP is not very flexible, since data cannot be dynamically transferred after service start. RFT [8] is a front-end Grid service which executes GridFTP. RFT is service-oriented but lacks the flexibility of dynamically transferring data during data production or service execution, which our approach provides.

Abu-Ghazaleh et al. [13], [14] introduce differential serialization/deserialization to improve the performance of serializing and deserializing SOAP messages. Since we avoid transferring large amounts of data over SOAP altogether, we circumvent the performance loss resulting from the serialization and deserialization of large amounts of data.

Ying et al. [15] measure the performance of SOAP Messages with Attachments (SwA). Though SwA is significantly faster for large amounts of data than SOAP, it is not as flexible as our approach, since dynamic transfer of data during data production or service execution is not possible.

VIII. CONCLUSIONS

In this paper, we have presented a novel approach to invoke web services, which enables post-invocation parameter transmission. Our approach offers more flexibility and efficiency for communication in service-oriented environments, by enabling parameter transmission after service invocation via an external channel. We have shown how the WSDL description of a web service has to be modified to embed information about which parameters can be communicated via the external channel and which endpoint(s) should be used. Furthermore, we have presented two use cases for post-invocation parameter transmission: stream-based production/consumption of parameters and pipelining. For stream-based production/consumption, performance measurements were presented showing that the use of post-invocation parameter transmission significantly reduces the overall processing time.

There are four main areas for future work: (1) the post-invocation parameter transmission implementation has to be generalized to work with arbitrary Java collections; (2) we plan an integration of post-invocation parameter transmission into the Marburg Ad-hoc Grid Environment (MAGE) [16]; (3) further performance experiments varying the transfer protocol used are planned; (4) investigation of QoS.

REFERENCES

[1] M. Gudgin, M. Hadley, N. Mendelsohn, J.-J. Moreau, and H. F. Nielsen, “SOAP Version 1.2 Part 2: Adjuncts,” June 2003. [Online]. Available: http://www.w3.org/TR/soap12-part2/
[2] R. Ewerth and B. Freisleben, “Video Cut Detection without Thresholds,” in Proc. of the 11th Workshop on Signals, Systems and Image Processing, Poznan, Poland, 2004, pp. 227–230.
[3] J. Gllavata, R. Ewerth, and B. Freisleben, “Tracking Text in MPEG Videos,” in Proc. of the 12th ACM Conf. on Multimedia, 2004, pp. 240–243.
[4] R. Ewerth, M. Schwalb, P. Tessmann, and B. Freisleben, “Estimation of Arbitrary Camera Motion in MPEG Videos,” in Proc. of the 17th Int’l Conf. on Pattern Recognition, 2004, pp. 512–515.
[5] I. Foster, C. Kesselman, and S. Tuecke, “The Anatomy of the Grid: Enabling Scalable Virtual Organizations,” The International Journal of High Performance Computing Applications, vol. 15, no. 3, pp. 200–222, 2001.
[6] K. Chiu, M. Govindaraju, and R. Bramley, “Investigating the Limits of SOAP Performance for Scientific Computing,” in Proc. of the 11th IEEE Int’l. Symposium on High Performance Distributed Computing, 2002, pp. 246–254.
[7] W. Allcock, J. Bester, J. Bresnahan, S. Meder, P. Plaszczak, and S. Tuecke, “GridFTP: Protocol Extensions to FTP for the Grid,” April 2003. [Online]. Available: http://www.gridforum.org/
[8] The Globus Alliance, “GT Data Management: Reliable File Transfer (RFT),” November 2005. [Online]. Available: http://www.globus.org/toolkit/data/rft/
[9] E. Christensen, F. Curbera, G. Meredith, and S. Weerawarana, “Web Services Description Language (WSDL) 1.1,” W3C Note, March 2001. [Online]. Available: http://www.w3.org/TR/wsdl
[10] H. Schulzrinne, A. Rao, and R. Lanphier, “Real Time Streaming Protocol (RTSP),” April 1998. [Online]. Available: http://www.ietf.org/rfc/rfc2326.txt?number=2326
[11] M. Spencer, R. Ferreira, M. Beynon, T. Kurc, U. Catalyurek, A. Sussman, and J. Saltz, “Executing Multiple Pipelined Data Analysis Operations in the Grid,” in Supercomputing ’02: Proceedings of the 2002 ACM/IEEE conference on Supercomputing. Los Alamitos, CA, USA: IEEE Computer Society Press, 2002, pp. 1–18.
[12] W. Allcock, J. Bester, J. Bresnahan, A. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, and S. Tuecke, “Data Management and Transfer in High Performance Computational Grid Environments,” Parallel Computing Journal, vol. 28, no. 5, pp. 749–771, May 2002.
[13] N. Abu-Ghazaleh, M. J. Lewis, and M. Govindaraju, “Differential Serialization for Optimized SOAP Performance,” in Proc. of the 13th IEEE International Symposium on High Performance Distributed Computing, 2004, pp. 55–64.
[14] N. Abu-Ghazaleh and M. J. Lewis, “Differential Deserialization for Optimized SOAP Performance,” in Proc. of the Int’l. Conference for High Performance Computing, Networking, and Storage, 2005, p. 21.
[15] Y. Ying, Y. Huang, and D. W. Walker, “A Performance Evaluation of Using SOAP with Attachments for e-Science,” in Proc. of the UK eScience All Hands Meeting, 2005, pp. 796–803.
[16] M. Smith, T. Friese, and B. Freisleben, “Towards a Service-Oriented Ad Hoc Grid,” in Proceedings of the 3rd International Symposium on Parallel and Distributed Computing, Cork, Ireland. IEEE Press, 2004, pp. 201–208. [Online]. Available: http://ds.informatik.uni-marburg.de/de/publications/index.php
