Draft of paper that appeared in the Proceedings of the IEEE 2007 International Service Computing Conference
A Service Discovery Framework based on Linear Composition
Andrea Zisman Department of Computing City University London, EC1V 0HB, UK
[email protected]
Khaled Mahbub Department of Computing City University London, EC1V 0HB, UK
[email protected]
Abstract
Service discovery has been recognised as an important aspect of service oriented computing. This is even more the case when developing service centric systems, in which software systems are constructed based on the identification and composition of web services that together can fulfil the functionality of the system being developed. In this paper we present a framework that supports the discovery of services that can provide the functionality and satisfy the properties and constraints of service-based systems during their design phase. Our framework makes use of linear composition of service operations, in which more than one web service operation can be combined to fulfil a functionality of the system when no single operation can be identified. The discovery process is based on a graph-matching algorithm. A prototype tool has been developed to demonstrate and evaluate the framework.
1. Introduction
Service discovery is considered an important aspect of service oriented computing. Many approaches and techniques have been proposed to support service discovery. These approaches range from semantic matchmaking approaches based on logic reasoning of terminological concepts [1][12][13][16][18][22], to approaches in which queries are checked based on string matching [14], and approaches that use behavioural signatures [30] and full behavioural models [11]. Another group of service discovery approaches has been proposed to assist with the engineering of service centric systems [15], including requirements-based [39], architecture-based [19], and run-time [34] service discovery. The work presented in this paper is part of a large programme of research in an integrated European project focusing on service centric systems engineering (SeCSE [31]). In this paper, we present a framework to assist with the design phase of service centric systems (SCS), i.e. software systems that are composed of web services. The work described in this paper extends the work in [19] and [40]. In contrast to this previous work, which supports the identification of a single service operation that can provide a certain functionality of a service centric system being developed, the framework presented here considers service composition. More specifically, it supports the identification of various service operations from distinct web services that together can fulfil a functionality of the
George Spanoudakis Department of Computing City University London, EC1V 0HB, UK
[email protected]
system, when no single service operation can be identified. The framework is based on an iterative process for designing service centric systems in which the discovered services are used to amend and reformulate design models of the system and the design models are used to trigger new service discovery iterations.
The work described in this paper advocates a specific form of service composition that we call linear composition. In this form of service composition, candidate service operations are composed sequentially by a search process with the objective of identifying a sequence of service operations such that (a) the input parameters of the identified service operations collectively match the input parameters of the service operation request, (b) the output parameter of each service operation in the composition can be matched to input parameters of the subsequent service operation in the composition, and (c) the output parameter of the last service operation in the composition subsumes the output parameter of the service operation request. In addition, the number of service operations to be composed for a certain service operation request can vary. This number is specified in the service operation request. Our work is based on best matches between the input and output parameters of the service operation request and the service operations in the composition. These best matches are calculated based on a graph-matching algorithm that verifies graph-subgraph isomorphism between two graphs representing the data types of the input and output parameters of the service operation request and the service operations in the composition. The approach returns more than one set of candidate services with service operations that can be composed to match a service operation request.
The sets of candidate services are identified based on an extension of the similarity analysis algorithm presented in [33] that calculates distances between (i) the names of the service operation request and the identified service operations, and (ii) the data types of the input and output parameters of the service operation request and the identified service operations. The system designer may select a set of services from the returned candidate sets or accept the best candidate set (the one with the smallest distance measure). The selected or accepted set of services is used to reformulate the design models of the system.
The remainder of this paper is structured as follows. In Section 2 we present an overview of our service discovery framework based on linear service composition. In Section 3 we describe the linear service composition approach and present its algorithms. In Section 4 we illustrate our work through some examples. In Section 5 we discuss some related work. Finally, in Section 6, we summarise our work and discuss future work.
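As an illustration of conditions (a) to (c) of linear composition stated in this introduction, the following is a minimal sketch and not the framework's algorithm. It assumes that data types are plain strings compared by equality, whereas the paper matches data types by graph matching and uses subsumption for the final output. The `Operation` class, the function name, the reading of "collectively match" as set equality, and the example operation names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Operation:
    """A service operation (or operation request); types are plain strings (assumption)."""
    name: str
    inputs: Tuple[str, ...]
    output: str


def is_linear_composition(request: Operation, chain: Sequence[Operation]) -> bool:
    """Check conditions (a)-(c) using exact type-name equality (a simplification)."""
    if not chain:
        return False

    # Inputs that must be supplied by the request: all inputs of the first operation.
    external_inputs = set(chain[0].inputs)

    for previous, current in zip(chain, chain[1:]):
        # (b) the output of each operation feeds an input of the next operation.
        if previous.output not in current.inputs:
            return False
        # Inputs of `current` not supplied by `previous` must come from the request.
        external_inputs.update(t for t in current.inputs if t != previous.output)

    # (a) the externally supplied inputs collectively match the request inputs
    # (read here as set equality, which is an assumption).
    if external_inputs != set(request.inputs):
        return False

    # (c) the last output matches the requested output (equality stands in for subsumption).
    return chain[-1].output == request.output


# Hypothetical request and candidate operations.
request = Operation("getForecast", ("CityName",), "Forecast")
to_coordinates = Operation("findCoordinates", ("CityName",), "Coordinates")
to_forecast = Operation("forecastAt", ("Coordinates",), "Forecast")

print(is_linear_composition(request, [to_coordinates, to_forecast]))
```

In this sketch the order of the operations matters: reversing the chain fails condition (b), because the output of `forecastAt` is not an input of `findCoordinates`.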
2. Service Discovery Framework
Our framework is based on an iterative process in which the discovery process is triggered by the design models of a service centric system being developed and the discovered services are used to amend and reformulate the design models of the system. New versions of the design models are created based on the discovered services. These new versions may trigger new iterations in the process. The result of the process is a specification of the structural and behavioural design models of the system. The process can be terminated at any time by the designers of the system, or when further requests cannot identify services that can match the design models.
In the framework, we propose to represent the structural and behavioural models of the system being developed as UML class and sequence diagrams, respectively. The rationale for adopting a UML-based approach is that UML is, in general, the de facto standard for designing software systems and can be used to support the design of service centric systems, as argued in [9][10][21]. In addition, UML can represent the necessary types of design models and can provide a basis for specifying service requests. Service requests are derived from the design models and are operations specified in the sequence and class diagrams with their respective signatures. More specifically, the interactions of the sequence diagrams and the classes and interfaces in the structural diagrams are used to specify these requests. We use a UML 2.0 profile to specify a service operation request and its results. The profile defines a set of stereotypes for different types of UML elements found in the design models and in the results of the discovery process. The profile also defines stereotype properties that are used to specify parameters and constraints for the elements to which the stereotypes containing these properties are applied.
These parameters and constraints are used to limit the search space, to limit the number of services returned by the discovery process, and to define selection criteria for choosing services based on their characteristics (e.g. services from a specific provider). A detailed description of the profile can be found in [19].
The discovery process is executed by searching for services in different registries. The search is based on a graph-matching linear composition algorithm that computes similarities between service operation requests and service operations in the service specifications. We adopt an approach in which the service specifications are composed of parts, named facets, that describe different aspects of the services. These facets include service interface specifications expressed in WSDL [35], behavioural service specifications expressed in BPEL4WS [3], context service information expressed in context ontologies [6], semantic service specifications expressed in OWL [25], WSMO [37], or WSML [36], quality of service information, and other information types described in XML (e.g. textual descriptions). The linear composition framework presented here uses only WSDL facets.
Figure 1 presents an overview of the framework. As shown in the figure, the discovery process is executed by the query execution engine, which is implemented in Java and available as a web service. The execution engine accepts service requests (queries) expressed as UML models annotated with stereotypes from the UML profile and represented in XMI. The results of the engine are represented in XML. These results can be incorporated in the UML design models and displayed by using UML CASE tools. Details of the framework implementation and instructions on how to use it can be found in [32].
[Figure 1 diagram labels: UML 2.0 Design Model (XMI); ASD Web Service Query Execution Engine; UML CASE tool; Result (XML); Service Registries; Service Specifications]
Figure 1: Overview of the Framework
Figure 2 shows a graphical view of the XML schema used to represent the results of service discovery based on linear composition. As shown in the figure, the result consists of one or more composition elements, each of which includes (a) the operation required by the request (element queryOperation); (b) the sequence of service operations that constitutes a candidate linear service composition for the service request (element serviceOperations); and (c) the input, output, and total distances between the operations in the service composition and the service request (elements inputDistance, outputDistance, and totalDistance, respectively). The input distance and the output distance are the distances between the input parameters and between the output parameters, respectively, of the service operation request and the service operations in the composition. The total distance is the sum of the input and output distances.
Figure 2: XML Schema of Service Discovery Results
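To make the result structure concrete, the sketch below parses a small, invented result document and picks the composition with the smallest total distance, which the introduction describes as the best candidate set. Only the element names composition, queryOperation, serviceOperations, inputDistance, outputDistance, and totalDistance come from the text. The root element `results`, the child element `serviceOperation`, the nesting, the operation names, and the distance values are assumptions made for illustration and may differ from the actual schema in Figure 2.

```python
import xml.etree.ElementTree as ET

# Invented sample; the real schema may nest or name elements differently.
SAMPLE_RESULT = """\
<results>
  <composition>
    <queryOperation>getForecast</queryOperation>
    <serviceOperations>
      <serviceOperation>findCoordinates</serviceOperation>
      <serviceOperation>forecastAt</serviceOperation>
    </serviceOperations>
    <inputDistance>0.25</inputDistance>
    <outputDistance>0.5</outputDistance>
    <totalDistance>0.75</totalDistance>
  </composition>
  <composition>
    <queryOperation>getForecast</queryOperation>
    <serviceOperations>
      <serviceOperation>lookupCity</serviceOperation>
      <serviceOperation>weatherReport</serviceOperation>
    </serviceOperations>
    <inputDistance>0.5</inputDistance>
    <outputDistance>0.5</outputDistance>
    <totalDistance>1.0</totalDistance>
  </composition>
</results>
"""


def parse_compositions(xml_text):
    """Return one dictionary per composition element in the result document."""
    root = ET.fromstring(xml_text)
    compositions = []
    for element in root.findall("composition"):
        compositions.append({
            "query": element.findtext("queryOperation"),
            # Iterating an element yields its children in document order.
            "operations": [op.text for op in element.find("serviceOperations")],
            "input": float(element.findtext("inputDistance")),
            "output": float(element.findtext("outputDistance")),
            "total": float(element.findtext("totalDistance")),
        })
    return compositions


def best_composition(compositions):
    """Pick the candidate composition with the smallest total distance."""
    return min(compositions, key=lambda c: c["total"])


compositions = parse_compositions(SAMPLE_RESULT)
best = best_composition(compositions)
print(best["operations"], best["total"])
```

A designer-facing tool could equally list all compositions sorted by total distance and leave the choice to the designer, as the framework allows.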
3. Service Composition
The linear composition approach used in our framework identifies different service operations that, when combined, can fulfil the functionality of a requested service operation. The maximum number of service operations to be composed for a certain service request can vary and is specified in the request. We call the number of service operations to be composed the composition level. Intuitively, given a service operation request with an arbitrary number of input parameters and one output parameter, and a composition level l, a linear composition is defined by l service operations such that: (a) the linguistic distance between the name of the service operation request and the name of each operation in the composition is less than 1 (i.e. the operations in the composition have names which are not totally dissimilar to the name of the requested operation), (b) the union of the input parameters of the l service operations represents a best match with the input parameters of the service operation request, (c) the output parameter of the last service operation in the composition may have the best match with the output parameter of the service operation request, and (d) the output parameter of each operation Oj() in the composition can be matched to an input parameter of service operation Oj+1(), where 1