Towards Self-Managed Integration of Web Services
Davide Tosi
University of Milano-Bicocca
Via Bicocca degli Arcimboldi, 8
I-20126 - Milano, Italy
{tosi}@disco.unimib.it
ABSTRACT
The integration of third-party web services helps software architects solve complex business problems and reduce risks, costs, and time-to-market. However, developers find it difficult to guarantee interoperability with target web services for two reasons: first, web services that are maintained by different organizations can evolve dynamically and autonomously; second, little information is available about the interaction protocol of dynamically discovered web services. This may lead to unexpected client-side failures. My thesis investigates a novel approach to designing self-managed service-oriented architectures in which clients are both self-healing with respect to potential integration mismatches and self-adaptive to sets of compatible web services that expose different interaction protocols. The solution we are investigating is based on a runtime infrastructure that automatically tests web services and adapts client applications to overcome potential problems without user intervention.
Keywords
Self-managed Systems, Web Services, Service-Oriented Architecture, Testing, Integration Faults
1. WEB SERVICE INTEGRATION CHALLENGES
Web services and service-oriented architectures are emerging technologies for integrating enterprise applications, leveraging electronic B2B and B2C solutions, and extending the life of legacy software. In a nutshell, web services are remote programs invoked over the Internet that allow enterprises to export functionality beyond the enterprise boundaries, thus enabling stakeholders in different domains to rapidly and seamlessly integrate third-party expertise into their applications. A service-oriented architecture is characterized by brokers, providers, and clients. A broker is responsible for matching client requests with available services published by providers. Providers publish machine-readable interface descriptions of their services to the broker using a suitable language such as the Web Service Description Language (WSDL) [5]. Clients use WSDL descriptions to look up the available services that match their needs.
Service-oriented architectures face the difficulty of guaranteeing interoperability between clients and dynamically discovered web services. The integration of third-party web services is challenged by the difficulty of maintaining consistency between software systems that are maintained by different organizations and may evolve dynamically and independently, because of both changes in the services and the dynamic discovery of new services. Service providers may change the implementation independently of clients, and clients may use different web services in different invocations depending on the dynamic choice of the service broker.
In principle, different services or service implementations that can be invoked to satisfy a given request must comply with the same WSDL contract. In practice, contracts tend to specify little more than the service syntax and parameters, leaving many semantic details unspecified and thus implementation-dependent. For example, we have been using web services for obtaining temperatures in US districts on the basis of a contract that required the target location to be indicated by its zip code and the temperature to be returned as a floating-point value, but did not indicate the measurement unit of the returned temperature (e.g., Fahrenheit or Celsius), thus leading to client-side failures. Moreover, WSDL descriptions generally leave unspecified the sequences in which the operations of a target service can be invoked (i.e., the interaction protocol).
Whenever a client tries to use a service with an interaction protocol different from the expected one, the interaction may fail, even if syntax and parameters are correct. Since clients may use different (though compatible) web services in distinct invocations, they may be connected to web services that provide the same functionality but work with different interaction protocols. As an example, we have been using a web service provided by Amazon to handle user shopping carts during online buying sessions: the interaction protocol of this web service requires at least one selected item for creating a shopping cart. However, other shopping cart web services may rely on different assumptions, e.g., they may create an empty shopping cart before the first item is added.
This thesis focuses on integration problems that derive from both dynamic changes of the invoked services and unknown interaction protocols of the services. High availability requirements and dynamically discovered web services exclude the possibility of traditional stop-update-test-redeploy-restart approaches to the integration of new or modified services. Self-managed applications have been recognized as viable solutions for dealing with systems whose size and complexity increase beyond the ability of humans to respond manually, coherently, and in a timely manner to environmental and system changes [7]. Self-managed solutions are being explored in several application domains, but not yet in service-oriented applications. The massive reuse of services and the frequent updates of implementations corresponding to compatible interfaces are typical of service-oriented applications, and make it possible to define efficient, domain-specific self-managed solutions for the service-oriented domain.
This thesis investigates a solution that exploits ideas of self-managed systems. First, we are working on a mechanism for revealing possible mismatches between requested and provided services, and for dynamically adapting the client application accordingly. We are defining an integration fault taxonomy that helps service integrators identify possible integration mismatches, generate test cases for revealing integration problems, and design recovery actions. Integrators can code test cases and adaptation strategies in separate modules, which we refer to as adaptation aspects. Adaptation aspects are automatically woven into the client applications, so that the modified client application executes test cases whenever a new service implementation is detected, in order to reveal possible mismatches, and triggers suitable adaptation mechanisms accordingly. We refer to the proposed approach as Self-Healing Integration of Web Services (SHIWS).
Second, we are working on an approach that enables clients to automatically adapt their behavior to compatible web services with different interaction protocols. We define an enhanced service-oriented architecture (SOA+), in which the interaction of service requestors, brokers, and providers is facilitated by a new entity, which we refer to as Interaction Protocol Service Extension (IPSE). An IPSE automatically and incrementally monitors the interactions of a web service with clients, derives an approximated model of the interaction protocol, and delivers it to the clients that want to use the web service. A client can then use the model to check whether the web service fulfils the client's interaction requirements and, if so, adapt its own interaction behavior to the interaction protocol supported by the web service.
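The shopping-cart mismatch described above can be made concrete with a minimal sketch. It assumes that an interaction protocol is available as a small finite state machine and checks whether a client's intended call sequence is a legal run of that machine. The operation names (`create_cart`, `create_cart_with_item`, `add_item`, `checkout`), the state names, and both protocol models are hypothetical illustrations; they are not the actual interfaces of Amazon or of any other provider.

```python
# Hypothetical protocol models, written as: state -> {operation: next_state}.
ITEM_FIRST_CART = {   # a cart can only be created together with a first item
    "start": {"create_cart_with_item": "open"},
    "open": {"add_item": "open", "checkout": "done"},
    "done": {},
}
EMPTY_FIRST_CART = {  # an empty cart is created before any item is added
    "start": {"create_cart": "empty"},
    "empty": {"add_item": "open"},
    "open": {"add_item": "open", "checkout": "done"},
    "done": {},
}


def accepts(protocol, calls, start="start", final="done"):
    """Return True if `calls` is a complete, legal run of `protocol`."""
    state = start
    for op in calls:
        if op not in protocol[state]:
            return False          # operation not allowed in the current state
        state = protocol[state][op]
    return state == final


# The client was written against the "empty cart first" assumption.
client_calls = ["create_cart", "add_item", "add_item", "checkout"]
print(accepts(EMPTY_FIRST_CART, client_calls))  # True
print(accepts(ITEM_FIRST_CART, client_calls))   # False
```

The same client sequence is accepted by one model and rejected by the other, even though both services would expose syntactically valid operations. This is the kind of mismatch that a WSDL description alone does not reveal.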
2. THE APPROACH
In this thesis we propose a self-healing, self-adaptive approach to the development of service-oriented applications that combines novel techniques into a traditional sense-plan-act control loop, where the subject system is connected to a controller that in turn feeds commands back into the subject system. We suggest instantiating feedback control loops at two different abstraction layers: first, a specialized sense-plan-act control loop must be instantiated for each set of services that comply with a specific contract, to address problems that arise from dynamic changes of the invoked services, thus guaranteeing self-healing abilities; second, a specialized sense-plan-act control loop must be instantiated for each web service, to address problems that arise from unknown interaction protocols, thus guaranteeing self-adaptive abilities.
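A generic sense-plan-act loop of the kind described above can be sketched as follows. This is only an illustration of the control structure: the `ControlLoop` class, its fields, and the toy subject are hypothetical names introduced here, not part of the SHIWS or IPSE implementations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ControlLoop:
    """One sense-plan-act loop attached to one monitored subject."""
    sense: Callable[[], Optional[str]]      # returns a detected problem, or None
    plans: Dict[str, Callable[[], None]]    # problem -> corrective action
    log: List[str] = field(default_factory=list)

    def step(self) -> str:
        problem = self.sense()              # sense: observe the subject system
        if problem is None:
            outcome = "ok"
        elif problem not in self.plans:     # plan: no known corrective action
            outcome = f"unhandled:{problem}"
        else:
            self.plans[problem]()           # act: feed the command back
            outcome = f"adapted:{problem}"
        self.log.append(outcome)
        return outcome


# Toy subject: the client has validated implementation "v1" of a contract,
# but the provider has silently deployed "v2".
state = {"validated": "v1", "deployed": "v2"}


def sense_change() -> Optional[str]:
    return "new_implementation" if state["deployed"] != state["validated"] else None


def retest_and_adapt() -> None:
    # Stand-in for running test cases and applying an adaptation strategy.
    state["validated"] = state["deployed"]


loop = ControlLoop(sense=sense_change, plans={"new_implementation": retest_and_adapt})
outcomes = [loop.step(), loop.step()]
print(outcomes)  # ['adapted:new_implementation', 'ok']
```

Under this reading, the two abstraction layers would correspond to two collections of such loops: one loop per contract (set of interchangeable services) and one loop per individual web service.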
3. SHIWS DETAILS
The invocation of a web service triggers monitoring mechanisms. At this abstraction layer, such mechanisms identify changes in the invoked services that may depend on server-side implementation updates or dynamically discovered services. The detection of a new service implementation triggers diagnosis mechanisms that run test cases on the target web service to reveal possible mismatches. If the diagnosis mechanisms reveal any mismatches, the associated adaptation strategies update the structure and the behavior of the client application to solve the identified problems and optimize the interaction with the target service.
The customization of the control loops consists of defining a set of test cases that can reveal mismatches between different implementations of the same contract, and a set of adaptation strategies for the classes of possible mismatches. The control loop instantiated with automatic test cases and adaptation strategies enhances all client invocations of the target web service. Enhanced client applications intercept all calls to the web service, dynamically check for potential service mismatches, and adapt their local behavior accordingly.
The customization of the sense-plan-act control loop is based on a methodology and a design framework that help designers identify the customization requirements for control loops. The methodology is composed of three steps that are executed when a new web service category is selected for integration in the target application.
First, in the set-up phase, software architects analyze the WSDL description of the new category to identify ambiguities that may result in implementation-dependent mismatches. The analysis is driven by domain-specific fault taxonomies. The proposed taxonomy currently comprises 7 mismatch categories, and we are refining it with new entries.
Second, in the diagnosis phase, software architects design test cases to reveal the occurrence of the potential mismatches identified in the set-up phase. Test cases are defined according to guidelines that describe possible test strategies for different fault categories, together with example applications that refer to the samples of the taxonomy.
Finally, in the adaptation phase, software architects design adaptation strategies. When the application senses possible changes in the services, it automatically executes the test cases defined in the second phase. If the tests reveal service mismatches, the application tries to adapt to the new service implementation by executing the adaptation strategy corresponding to the identified problem, and checks whether the adaptation succeeds. Adaptation strategies are defined according to domain-specific adaptation patterns for recurrent types of mismatches. Currently, we are considering a new adaptation strategy based on code relocation.
To experiment with our approach and identify its limitations and open issues, we developed the SHIWS framework, which facilitates the instantiation of the control loop and the implementation of self-healing service-oriented applications. The current implementation of SHIWS automatically builds the runtime infrastructure that links the client application to the target web services, incorporating the diagnosis and adaptation strategies provided by designers. The runtime infrastructure both handles the logic of the customized sense-plan-act control loops at runtime and allows different adaptation strategies to be dynamically selected and enacted.
We implemented a SHIWS prototype as a plugin of the Eclipse open platform, an extensible development IDE widely used in many domains. Since Eclipse is increasingly used to support software development throughout all phases of the development process, it is an ideal entry point for introducing our novel approach into the software process of service-oriented applications. The current SHIWS prototype works for web applications written in Java.
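The diagnosis and adaptation phases can be illustrated on the temperature-unit ambiguity mentioned in Section 1. The sketch below is a simplification under stated assumptions: it uses an explicit wrapper class instead of woven adaptation aspects, it assumes that a trusted reference temperature in Celsius is available for one probe location, and all names (`detect_unit`, `SelfHealingClient`, the two stand-in services) are hypothetical.

```python
from typing import Callable, Optional

Service = Callable[[str], float]  # zip code -> temperature, unit unspecified


def celsius_service(zip_code: str) -> float:     # stand-in implementation A
    return 20.0


def fahrenheit_service(zip_code: str) -> float:  # stand-in implementation B
    return 68.0


def detect_unit(service: Service, probe_zip: str,
                reference_celsius: float, tolerance: float = 1.0) -> Optional[str]:
    """Test case: compare the service answer with a trusted reference value.

    Assumes the reference is far from -40, where both scales coincide.
    """
    value = service(probe_zip)
    if abs(value - reference_celsius) <= tolerance:
        return "C"
    if abs(value - (reference_celsius * 9 / 5 + 32)) <= tolerance:
        return "F"
    return None  # mismatch not covered by this test case


class SelfHealingClient:
    """Intercepts calls to the service and always returns Celsius."""

    def __init__(self, probe_zip: str, reference_celsius: float) -> None:
        self._probe_zip = probe_zip
        self._reference = reference_celsius
        self._service: Optional[Service] = None
        self._convert: Callable[[float], float] = lambda v: v

    def bind(self, service: Service) -> None:
        """Run the test case on a newly detected implementation, then adapt."""
        unit = detect_unit(service, self._probe_zip, self._reference)
        if unit == "C":
            self._convert = lambda v: v
        elif unit == "F":
            self._convert = lambda v: (v - 32) * 5 / 9
        else:
            raise RuntimeError("no adaptation strategy for this mismatch")
        self._service = service

    def temperature_celsius(self, zip_code: str) -> float:
        if self._service is None:
            raise RuntimeError("no service bound")
        return self._convert(self._service(zip_code))


client = SelfHealingClient(probe_zip="10001", reference_celsius=20.0)
client.bind(fahrenheit_service)             # a new implementation is detected
print(client.temperature_celsius("10001"))  # 20.0
```

Keeping the test case (`detect_unit`) and the adaptation (the conversion installed by `bind`) apart from the rest of the client mirrors the idea of coding them as separate modules.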
4. IPSE DETAILS
In SOA+, the interactions of service requestors, brokers, and providers are facilitated by new entities, the IPSEs. An IPSE automatically and incrementally monitors and records the interactions of a web service with clients, derives an approximated model of the service interaction protocol, and delivers it to the clients that want to use the web service. A client can then use the model to check whether the web service fulfils the client's interaction requirements and, if so, adapt its own interaction behavior to the interaction protocol supported by the web service.
The customization of the control loops consists of defining the interaction requirements of the client application in the form of needful operations (i.e., the operations the client wants to perform while interacting with a web service) and a set of constraints. Constraints are given as a partial order over service invocations. The control loop instantiated with the client-side interaction requirements enhances the adaptability of the client applications. Client applications interact with the IPSE, retrieve the model, and use it in two ways. First, based on the model, they can verify whether the web service can satisfy their interaction requirements, thus rejecting non-satisfying web services. Second, they can dynamically reconfigure their interaction sequences to comply with the model, thus achieving adaptation.
The instantiation of the control loop starts every time a new web service publishes its description. At this point, the broker spawns a new IPSE and associates it with the web service. The new IPSE starts to monitor all the interactions between client applications and the associated web service, and incrementally synthesizes the model of the web service interaction protocol. In particular, the model is captured in a finite state machine, hereafter called interactionFSM. Our approach generates the interactionFSMs by using the k-Tail algorithm.
This algorithm was proposed by Biermann and Feldman [4] and infers an FSM from a set of positive samples. The initial FSM is built by considering all the sample traces and creating a branch for each trace. Then, the algorithm merges pairs of states that have the same k-future, i.e., pairs of states that can generate the same set of substrings of length k. The algorithm iterates until no further states can be merged.
Client applications then interpret and use the model by applying an algorithm that we developed, called the PO-Safe algorithm. The PO-Safe algorithm takes as input the interactionFSM synthesized by the IPSE and the partial order of the client application. As output, the algorithm delivers a boolean value that tells whether the interactionFSM fulfils the partial order, and an annotated FSM in which transitions are marked as safe or unsafe, depending on whether the transition is part of a path that fulfils the partial order. In this way, clients receive support to enact local adaptation strategies, with which they become able to use different, previously unknown web services.
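The state-merging idea behind k-Tail can be sketched in a few lines. The code below is a simplified, single-pass variant written for illustration, not the implementation used in the thesis: it builds a prefix tree from the observed traces, computes for every state the set of operation sequences of length at most k that can follow it, and merges states whose sets are identical. Accepting states are not tracked, the merged machine may be nondeterministic, and the function names and example traces are assumptions introduced here.

```python
from collections import defaultdict


def build_prefix_tree(traces):
    """Return {state: {op: next_state}} for a prefix tree; state 0 is the root."""
    tree = {0: {}}
    for trace in traces:
        state = 0
        for op in trace:
            if op not in tree[state]:
                new_state = len(tree)
                tree[new_state] = {}
                tree[state][op] = new_state
            state = tree[state][op]
    return tree


def k_tails(tree, state, k):
    """All operation sequences of length <= k that can follow `state`."""
    tails = {()}
    if k > 0:
        for op, nxt in tree[state].items():
            tails |= {(op,) + rest for rest in k_tails(tree, nxt, k - 1)}
    return frozenset(tails)


def k_tail_merge(traces, k=2):
    """Merge prefix-tree states with identical k-tails.

    Returns (initial_state, transitions), where transitions maps
    (state, op) to the set of possible successor states.
    """
    tree = build_prefix_tree(traces)
    tail_of = {s: k_tails(tree, s, k) for s in tree}
    class_ids = {}
    for s in sorted(tree):  # number equivalence classes in a stable order
        class_ids.setdefault(tail_of[s], len(class_ids))
    merged = defaultdict(set)
    for s, edges in tree.items():
        for op, nxt in edges.items():
            merged[(class_ids[tail_of[s]], op)].add(class_ids[tail_of[nxt]])
    return class_ids[tail_of[0]], dict(merged)


# Hypothetical monitored traces of a shopping-cart service.
traces = [
    ["create", "add", "checkout"],
    ["create", "add", "add", "checkout"],
    ["create", "add", "add", "add", "checkout"],
]
initial, fsm = k_tail_merge(traces, k=1)
for (state, op), targets in sorted(fsm.items()):
    print(state, op, sorted(targets))
```

With k = 1, the states reached after one and after two `add` operations are merged, so the inferred machine contains an `add` self-loop and generalizes beyond the three observed traces. A larger k merges fewer states and yields a more conservative model.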
5. FIRST EXPERIMENTS
We applied our approach to a simple industrial benchmark: a virtual store application. VirtualStore is a web portal for a set of e-commerce applications. VirtualStore enables the search for best offers through the catalogs of backend e-stores, provides the abstraction of a shopping cart that can collect products from different backend e-stores, and handles credit card payments to multiple backend e-stores in a single user transaction. We tested VirtualStore with three backend applications: the e-commerce web application of Amazon¹, the PetStore, a web application provided by Sun to demonstrate J2EE technology², and the ComputerStore, a web application developed in our lab. VirtualStore manages the communication with the backend applications through a common set of web services for browsing product catalogs and handling remote shopping carts. Moreover, VirtualStore uses third-party web services for credit card validation.
We applied the SHIWS methodology to the VirtualStore application. For the target categories of web services, we identified 11 possible mismatches based on the mismatch taxonomy. Next, we identified a total of 24 test cases that are able to disambiguate all mismatches. Finally, we implemented 11 adaptation modules, one for each mismatch, to satisfy the adaptation requirements.
We also applied the IPSE approach to the VirtualStore application. For the shopping cart web services, the approach is able to synthesize the different models of the interaction protocols for the three shopping cart implementations. The experiments highlighted important differences (e.g., only the Amazon shopping cart requires a selected item in order to create a cart, and only the ComputerStore shopping cart always requires an item search before adding an item).
This experience with VirtualStore points out important aspects of our approach:
• Separation of concerns: in the SHIWS approach, the adaptation issues of each web service are handled in separate modules. They are designed and developed independently of other services and of the main functionality of the client application;
• Automatically generated infrastructure: much of the infrastructure for controlling the dynamics of sense-plan-act loops is automatically generated by SHIWS. This eliminates many of the technical difficulties in the integration problem;
• Fault taxonomy support: the integration fault taxonomy guides designers through the analysis of the adaptation requirements. It works as a checklist that focuses on typical mismatches, and suggests candidate test cases and adaptation patterns;
• Reusable adaptation modules: the loose dependency between adaptation modules and client applications seems to indicate that the same adaptation modules can be (at least partially) reused across applications that integrate similar web services;
• Tool support: the experiments show that the process of identifying mismatches, defining test cases, and writing adaptation modules is well supported by SHIWS;
• Loosely coupled SOA+: developing the IPSE as a new entity relieves providers of the task of instrumenting their web services with inference abilities;
• Algorithms support: the experiments show that the synthesis of the models and the adaptation of the clients are well supported by our implementation of the k-Tail and PO-Safe algorithms;
• Adaptation granularity: the IPSE approach provides two levels of adaptation. The models of the web service interaction protocols get updated and refined over time, and clients have adaptation abilities to use different, previously unknown web services.
¹ www.amazon.com/aws
² www.java.sun.com/developer/releases/petstore
6. RELATED WORK
Feedback loops that monitor the characteristics of a computer system and keep application parameters under control relate to recent initiatives on autonomic computing and self-managed systems [7]. This thesis exploits these ideas to safely guarantee the integration of third-party web services.
Unwanted web service interactions can be monitored and diagnosed by means of assertions. For example, Baresi and Guinea [1] use assertions (in combination with orchestration languages) to monitor whether or not a web service complies with a predefined functional contract: the developers of client applications specify the expected functional contracts with suitable assertions; the assertions are checked at runtime; the identified contract violations trigger the adaptation strategies. Their approach aims to increase fault tolerance to contract violations, while our approach allows the application to adapt to the diverse valid implementations that result from underspecified elements in the service description.
Other approaches to resolving semantic ambiguity in web service descriptions focus on ontologies and semantic web technologies, e.g., [8]. Ontologies and semantic web technologies aim to build common reference frameworks to share and reuse data across application, enterprise, and community boundaries. Currently, the main obstacle to using these approaches in practice is the difficulty of defining generally agreed-upon domain ontologies. In addition, these ontologies do not help disambiguate cases in which several services have the same interface but different implementations.
Approaches using automata to describe the interaction of a web service with client applications are proposed in [6, 2, 3]. Automata are used to reason about an interface by checking whether its protocols fulfil a given set of temporal formulas. Additionally, these approaches check whether two interfaces are compatible, and whether one application can be substituted by another one without affecting the properties fulfilled by the overall system. While we infer the interaction model by observing the web service during everyday use, these approaches assume that the automata for a web service are published by the service developer.
7. FUTURE PLANS
During the first two years of my PhD, we investigated the problems of integrating third-party web services, and we proposed a self-healing, self-adaptive solution to address these problems. Preliminary experiments indicate that the approach is effective at overcoming non-trivial mismatches and unknown interaction protocols. However, the experiments also suggest a real need for new and more powerful adaptation mechanisms. Currently, on the SHIWS side, we are designing a first set of general, standard adaptation modules to populate an early adaptation library, and we are also investigating synergies with complementary approaches based on assertions and ontologies. On the IPSE side, we are extending the approach by adding input/output invariants as parameters of the FSM transitions, to enrich the behavioral information of a service. Moreover, we are evaluating new adaptation strategies to adapt the client behavior to unsupported interaction protocols. We are currently conducting extensive experiments on industrial benchmarks to investigate in depth the benefits of our approach, and we are working on the integration of the SHIWS and IPSE approaches, to provide a general framework able to support the development of self-managed service-oriented architectures.
8. ACKNOWLEDGEMENTS
I would like to thank Mauro Pezzè, my thesis advisor, and Giovanni Denaro for the time spent in giving important advice and suggestions for my work.
9. REFERENCES
[1] L. Baresi and S. Guinea. Towards dynamic monitoring of WS-BPEL processes. In ICSOC, pages 269–282, 2005.
[2] D. Beyer, A. Chakrabarti, and T. A. Henzinger. An interface formalism for web services. In Proceedings of the 1st International Workshop on Foundations of Interface Technologies. Elsevier, 2005.
[3] D. Beyer, A. Chakrabarti, and T. A. Henzinger. Web service interfaces. In Proceedings of the 14th International World Wide Web Conference (WWW), pages 148–159. ACM Press, 2005.
[4] A. W. Biermann and J. A. Feldman. On the synthesis of finite-state machines from samples of their behaviour. IEEE Trans. Computers, 21:591–597, 1972.
[5] E. Christensen, F. Curbera, G. Meredith, and S. Weerawarana. Web Services Description Language (WSDL) 1.1. Technical report, World Wide Web Consortium, March 2001.
[6] H. Foster, S. Uchitel, J. Magee, and J. Kramer. Compatibility verification for web service choreography. In Proceedings of the IEEE International Conference on Web Services (ICWS'04), June 6-9, 2004, San Diego, California, USA, pages 738–741, 2004.
[7] J. O. Kephart and D. M. Chess. The vision of autonomic computing. IEEE Computer, 36(1):41–50, January 2003.
[8] W3C. Web Ontology Language (OWL) - Reference Version 1.0, 2002. Available at http://www.w3.org/TR/2002/WD-owl-ref-20021112/.