Automated Functional Conformance Test Generation for Semantic Web Services∗

Amit M. Paradkar†, Avik Sinha, Clay Williams
IBM Thomas J. Watson Research Center

Robert D. Johnson, Susan Outterson, Charles Shriver
EIS, IBM Software Group

Carol Liang
IBM Software Group, Toronto

e-mail: {paradkar, avisinha, clayw, susano1, robertdj, shriverc}@us.ibm.com, [email protected]

Abstract

We present an automated approach to generate functional conformance tests for semantic web services. The semantics of the web services are defined using the Inputs, Outputs, Preconditions, Effects (IOPEs) paradigm. For each web service, our approach produces testing goals which are refinements of the web service preconditions using a set of fault models. A novel planner component accepts these testing goals, along with an initial state of the world and the web service definitions, to generate a sequence of web service invocations as a test case. Another salient feature of our approach is the generation of verification sequences to ensure that the changes to the world produced by an effect are implemented correctly. Lastly, a given application incorporating a set of semantic web services may be accessible through several interfaces, such as 1) direct invocation of the web services, or 2) a graphical user interface (GUI). Our technique allows generation of executable test cases which can be applied through both interfaces. We describe the techniques used in our test generation approach. We also present results which compare two approaches: an existing manual approach without the formal IOPEs information, and the IOPEs-based approach reported in this paper. These results indicate that the approach described here leads to substantial savings in effort, with comparable results for requirements coverage and fault detection effectiveness.

Keywords: Semantic Web Services Test Generation, Model-Based Test Generation, Services Fault Models.

∗ The opinions expressed herein are the collective ones of the authors and do not represent any official position of IBM. † Contact Author


1 Introduction

Service oriented architectures (SOA), and web services in particular, offer the promise of easier system integration by providing standard protocols for data exchange. Semantic Web Services are the foundations that enable the dynamic discovery and binding of web services. Semantic Web Services are specified using standards such as OWL-S and WSDL [6] and consist of service descriptions in terms of Inputs, Outputs, Preconditions, and Effects (IOPEs). Testing such web services poses challenges because the persistent state of the world (in terms of domain instances) of the web services needs to be accounted for in order to derive the test suite.

Functional conformance testing is an activity which ensures that an implementation conforms to some functional specification. It is distinct from standards conformance testing, which is concerned with issues of compliance with standards such as WS-I [1]. A functional conformance test case for a web service is a concatenation of three parts: 1) a Set Up sequence of invocations of web service operations from a given initial world state S0 to bring the web service into a suitable world state S, 2) a test invocation of a single web service operation in state S to produce a new world state S′, and 3) a Verification Sequence which verifies that the world state S′ is reflected in the implementation of the web service (a minimal data-structure sketch appears at the end of this introduction). An expected value for each output parameter of the web service operation is associated with each step of the test case. During the test execution phase, each web service operation is invoked with values chosen for each input parameter, and the actual values returned by the implementation of the web service are compared with the expected output to produce a verdict of success or failure.

Testing of semantic web services consists of at least three stages: 1) Development Testing, which is typically performed by the producer of the web services, 2) Repository Testing, which is performed by an agent on behalf of the repository where the services will be made available, and 3) End User Testing, which is the contextual testing of the web services performed by the consumer of the web services to ensure that the services still function correctly. Each test case in a properly developed test suite for the Development and End User testing stages is a concatenation of the three aforementioned subsequences.

Automated test generation for web services for repository testing is an emerging area of research. Heckel and Mariani [7] describe a set of techniques which generate a suitable set of test cases based on web services described using a graph transformation for each operation in the web service. They use typical boundary value testing to generate test cases for a single operation, and define a dependency relationship between two operations based on data flow analysis to derive test cases for a sequence of multiple operations. Bertolino et al. [4] model operations in a web service as a symbolic labeled transition system (STS), and define the conformance between an implementation and its model in terms of traces on the STS. At each step, their technique randomly selects from the available set of operations (along with the data values for its parameters) to apply to the system under test. However, these techniques do not address the issue of generating a verification sequence - a sequence which enables verification of the world state obtained as a result of applying each input test sequence. Furthermore, none of these techniques address the issue of End User Testing and how the information in the semantic web services can be exploited to reduce the effort of consumer testing.

In this paper, we describe a novel test generation technique which addresses these two issues. Our test generation technique exploits the IOPE information in the operations of web services to generate a small yet effective set of test cases. Based on the pairs of precondition and effect in the IOPE, our technique generates testing goals which reflect the best practices in black-box testing - such as boundary values, cardinality of collections, fault sensitization (described later), etc. Each of these testing goals is then input to an AI planner component (which significantly extends the current state of the art in AI planning [9]) to derive a sequence of operations that satisfies the testing goals. To generate the verification sequences, our technique uses mutations of the world state obtained as a result of applying the aforementioned sequence of operations. A sequence which distinguishes each resulting mutant world state from the original world state is generated through a suitable formulation and solution of another planning problem. To address the issue of end user testing, our technique elicits another artifact, called a test template, which represents the manner in which the end user deploys and navigates the web services in its presentation layer.


These test templates are used to transform the test cases generated through the planning process into an executable set of test cases which can be run using available GUI test execution automation tools such as Rational Function Tester [8].

The specific contributions of this paper are:

1. A novel technique to generate a set of effective test cases for web services defined using IOPEs.
2. An approach to leverage the semantic web service descriptions during end user testing.
3. Results of an extensive case study in an industrial environment to illustrate the applicability of our technique. These results indicate substantial effort savings using our approach over the traditional practice, without loss of fault detection effectiveness.

The rest of this paper is organized as follows. Section 2 reviews other work in the area of testing web services. Section 3 describes our test generation technique through an example. Section 4 describes the results of our case study. Finally, Section 5 contains our conclusions and provides directions for future work.
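To make the three-part test case structure defined in this introduction concrete, the following minimal Python sketch models a test case as plain data. The names (TestStep, TestCase) and the dict-based parameter representation are illustrative assumptions, not artifacts of the paper.

```python
# A minimal sketch (hypothetical names, not the paper's implementation) of the
# three-part functional conformance test case described above.
from dataclasses import dataclass
from typing import Any

@dataclass
class TestStep:
    """One web service invocation with chosen inputs and expected outputs."""
    operation: str
    inputs: dict[str, Any]
    expected_outputs: dict[str, Any]

@dataclass
class TestCase:
    setup: list[TestStep]         # brings the world from S0 to a suitable state S
    test_invocation: TestStep     # single operation under test, producing S'
    verification: list[TestStep]  # checks that S' is reflected in the implementation

    def steps(self) -> list[TestStep]:
        """Flatten the three parts into the executed invocation order."""
        return [*self.setup, self.test_invocation, *self.verification]
```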

2 Related Work

Web services are gaining industry-wide acceptance and usage. Consequently, testing of services is also gaining importance. Zhu [11] describes the issues and problems in the testing of web services and proposes a framework to address some of them. This framework enables collaboration between service providers and requestors through the provision of special testing services. Our work is targeted at the conformance testing issues of semantic web services.

Research in conformance testing of web services mainly addresses the issue of Repository Testing. For instance, Heckel and Mariani [7] propose a service discovery framework that automatically generates conformance test cases from the provided service description, runs the test cases on the target web service, and registers the service only if the tests pass. Here the service operations are modeled as Graph Transformation Rules. The precondition of each rule is used as the basis for generating testing goals. Some of the best practices in testing - such as boundary values - are used in deriving the testing goals. However, verification sequences are not generated. The authors also provide results of a preliminary case study on a simple web purchasing service.

Bertolino et al. [4] describe a web service audition framework for repository testing. The authors assume that the web services are modeled as UML 2.0 Protocol State Machines (PSMs), which they translate into a Symbolic Transition System (STS). The authors have previously defined a conformance relationship between an implementation and its STS based on the traces of the STS. To demonstrate such conformance, the authors have developed an on-the-fly testing algorithm, which randomly generates the next input stimulus based on the current state of the STS, applies it to the System Under Test, observes the results, and then starts the process over. This process can potentially run forever, and the authors suggest some well-understood stopping criteria such as transition coverage. However, in the absence of state verification sequences, the fault detection effectiveness of such a process is not known. The authors do not provide any empirical results to demonstrate such effectiveness.

Some other studies suggest the use of simple WSDL specifications, without the semantic information, for automated test case generation. For instance, [10, 3] discuss a method wherein the WSDL file is first parsed and transformed into a structured DOM tree. Then, test cases are generated from two perspectives: test data generation and test operation generation. Test data are generated by analyzing the message data types according to standard XML schema syntax. Operation flows are generated based on operation dependency analysis. Finally, the generated test cases are documented in XML-based test files called Service Test Specifications. However, an evaluation of this approach is not provided.

3 Test Generation for Semantic Web Services

3.1 Overview

Our test generation approach uses the process shown in Figure 1 to generate test cases from the IOPE descriptions of the web services. To facilitate the testing process, an initial state of the world is also specified. First, our test generation approach derives test objectives (called testing goals) for each PE pair of each operation. The derivation of testing goals leverages a set of fault models, each of which represents a best practice in the testing literature. These fault models include boundary values for world state attributes and input parameters; cardinality constraints for each collection-valued world domain model attribute; and the notion of fault sensitization. Fault sensitization is a technique based on the intuition that each testing goal should target exactly one interesting aspect of the operation's behavior; otherwise, potential faults may be masked. In semantic web services, the need for fault sensitization arises because of overlapping preconditions between two PE pairs of a given operation. These testing goals, along with the web service IOPE description and an initial state of the world, are input to the planner component for generation of the set up sequences.


[Figure: the test generation process flows from Start through Goal Generation (consuming the Fault Models and the Initial World State), the Planner (consuming the Goals and the IOPE models), Set-up Sequence Generation, Verification Sequence Generation (consuming the New World State), and Test Sequence Generation (consuming a Template), to Reification of the Logical Test Cases into a Test Script, then Stop. The legend distinguishes control flow, data flow, processes, and input knowledge bases.]

Figure 1. Test Generation Process

The planner and other components of our testing approach also rely on a suite of constraint solvers for various domains, including linear arithmetic, String, Boolean, Enumerated, Set, and List. The planner extends the well-known Graphplan planning algorithm [5] to address the creation and deletion of instances in the world. Other enhancements to the Graphplan algorithm include the ability to handle numeric, string, and collection-valued parameters on the operations (as is typically required in a web service world). This planning step results in the generation of the Set Up subsequence.

Recall that the testing goal also consists of constraints on the input parameters of the corresponding operation (which is the operation under test). At the end of the planning process, the only unresolved variables in the constraint system are the input and output parameters of the operation under test. Solution of this constraint system leads to the generation of the test invocation.

Simulation of the test sequence generated so far results in a modified world state. In the next step, our test generation approach generates a verification sequence to validate that this world state is also reflected in the internal system state of the implementation. To do so, our approach derives a set of mutant world states from the expected world state using various fault models such as 1) No Instance Creation, which negates the effect of instance creation, 2) No Attribute Update, which negates the effect of modifying an attribute of an instance, and 3) Normal Effects in Exception, which augments the effects associated with an exceptional behavior with each of the individual effects in the successful behavior of the same operation. This last mutant enables us to generate test cases which ensure that no unwarranted behavior is exhibited by the implementation. We exploit the planner to generate a sequence which distinguishes each mutant world state from the expected one.

The resulting logical test case is in an execution-harness-independent format and cannot be executed directly. In order to produce an executable test case, we allow the user to specify a template which contains the necessary code fragments for each web service operation. This template contains placeholders for input and output parameters so that the corresponding actual values can be substituted from the logical test case; a sketch of this substitution follows.
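As referenced above, the sketch below illustrates the placeholder substitution idea with a made-up {{name}} placeholder syntax and a hypothetical template fragment; the paper's actual templates would contain harness code for RFT or a SOAP execution engine.

```python
import re

# Hypothetical per-operation template fragments (illustrative only); real
# templates would contain GUI- or SOAP-harness code for each operation.
TEMPLATES = {
    "CreateOrder": 'invoke("CreateOrder", accNo={{accNo}}, pID={{pID}}, '
                   'n={{n}}); expect(oID == {{oID}})',
}

def reify(operation: str, bindings: dict) -> str:
    """Substitute actual values from a logical test step into its template."""
    fragment = TEMPLATES[operation]
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: str(bindings[m.group(1)]), fragment)

print(reify("CreateOrder", {"accNo": 999, "pID": 1234, "n": 3, "oID": 1}))
# invoke("CreateOrder", accNo=999, pID=1234, n=3); expect(oID == 1)
```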

3.2 Example Semantic Web Service

As an example, let us consider a web service used to manage orders of an E-trading site [2], called OrderMgmt. OrderMgmt has operations such as Create Order, Read Order, and Delete Order. In the OWL-S description, each operation is defined using Inputs, Outputs, and a set of Precondition-Effect pairs. An abstract IOPE description for the operations in OrderMgmt is given below:

Create Order:
• I = {accNo :: Integer, pID :: Integer, n :: Integer};
• O = {oID :: Integer};
• {(P, E)} = {(∃c : c :: Customer ∧ c.acctNo == accNo ∧ ∃p : p :: Product ∧ p.productId == pID, ∃o : o :: Order ∧ o.acc == oID)}

Read Order:
• I = {oID :: Integer};
• O = {ord :: Order};
• {(P, E)} = {(∃o : o :: Order ∧ o.orderId == oID, ord == o), (∀o : o :: Order ∧ o.orderId ≠ oID, ord == NULL)}

Delete Order:
• I = {oID :: Integer};
• O = {result :: String};
• {(P, E)} = {(∃o : o :: Order ∧ o.orderId == oID, (∀o : o :: Order ∧ o.orderId ≠ oID) ∧ result == "Success"), (∀o : o :: Order ∧ o.orderId ≠ oID, result == "Failure")}

For example, operation Create Order takes three inputs to identify the Customer, the Product, and the number of items for the chosen product. It produces one output: the reference number for the Order created. The only PE pair for Create Order states that if the specified customer and product exist, then a new order is created (we do not describe the PEs which handle the situations when the customer or the product does not exist). Similarly, Read Order accepts an order id and returns the complete order details if one exists. The error situation, when the order does not exist, is also modeled for Read Order. The domain ontology for OrderMgmt can be illustrated using a UML class diagram (with appropriate attributes) as shown in Figure 2. The world state for OrderMgmt consists of instances of this class diagram, along with suitable values for attributes.

[Figure: UML class diagram of the OrderMgmt domain model, with Customer, Product, and Order classes and their attributes.]

Figure 2. Domain Model for DPSpec
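To make the IOPE notation above concrete, the sketch below encodes the Create Order operation as plain data, with the PE pair expressed as callables over a simple dict-based world state. This encoding, including all names, is a hypothetical illustration, not the paper's OWL-S representation.

```python
# Hypothetical encoding of the Create Order IOPE (illustrative only; not the
# paper's OWL-S form). A world state is a dict: class name -> list of
# instances, where each instance is a dict of attribute values.
def create_order_precondition(world, inputs):
    """P: a Customer with accNo and a Product with pID must exist."""
    return (any(c["acctNo"] == inputs["accNo"] for c in world["Customer"]) and
            any(p["productId"] == inputs["pID"] for p in world["Product"]))

def create_order_effect(world, inputs, outputs):
    """E: a new Order instance is created in the world state."""
    world["Order"].append({"orderId": outputs["oID"], "n": inputs["n"]})

create_order = {
    "inputs":   {"accNo": int, "pID": int, "n": int},
    "outputs":  {"oID": int},
    "pe_pairs": [(create_order_precondition, create_order_effect)],
}
```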

3.3 Generating Testing Goals

The IOPE model is analyzed to generate testing goals for each PE pair of each operation. Our approach applies several refinements to the precondition of a PE pair in order to accomplish this task. These refinements are based on fault models which represent common programming errors and best testing practices. Here we provide an example of applying the cardinality refinement to the preconditions. The principle behind the cardinality refinement derives from the quantifiers (both existential and universal) present in the preconditions of an operation. A universal quantifier implies iteration over a collection-valued domain model entity D (such as the instances of a class or the members of an association) to check for a certain property. An interesting testing goal in this situation is to check the behavior when the collection D is empty to begin with. Another commonly practiced testing goal requires the collection to have a cardinality of one. Lastly, to keep the testing process manageable, a third testing goal treats all other cardinalities (≥ 2) as the same. We encode this practice in our testing goal generation and derive, from each universally quantified expression over a collection D, three independent testing goals by conjoining the original precondition with one of the following: COUNT(D) = 0, COUNT(D) = 1, COUNT(D) ≥ 2. Similarly, for an existentially quantified precondition, we derive two independent testing goals by conjoining one of COUNT(D) = 1 or COUNT(D) ≥ 2. Note that the case where the collection is empty is not relevant in this situation, since at least one instance of the collection is expected to satisfy the property being checked. A sketch of this refinement follows.
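Here is a minimal sketch of the cardinality refinement as a goal-generation rule, as referenced above. The tuple-based formula constructors (AND, COUNT) and the function name are assumptions for illustration; the paper's generator operates on the actual precondition formulas.

```python
# Sketch of the cardinality refinement (illustrative names and formula
# representation, not the paper's actual API).
def cardinality_goals(precondition, collection, quantifier):
    """Refine `precondition` with COUNT constraints over collection D."""
    if quantifier == "forall":
        # empty, singleton, and "two or more" collections are all interesting
        bounds = [("==", 0), ("==", 1), (">=", 2)]
    elif quantifier == "exists":
        # an empty collection cannot satisfy the existential: COUNT = 0 is skipped
        bounds = [("==", 1), (">=", 2)]
    else:
        raise ValueError(f"unknown quantifier: {quantifier}")
    # conjoin the original precondition with one COUNT constraint per goal
    return [("AND", precondition, ("COUNT", collection, op, k))
            for op, k in bounds]
```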

Goal # | WS Operation | Testing Goal Predicate | [Set-up sequence] + Testing Goal Invocation
-------|--------------|------------------------|--------------------------------------------
1 | CO | count(o:Order) = 0 | [], CO(999, 1234, 3, 1)
2 | CO | count(o:Order) = 1 | [CO(999, 1234, 2, 1)], CO(999, 1234, 1, 2)
3 | CO | count(o:Order) ≥ 2 | [CO(999, 1234, 2, 1), CO(999, 1234, 3, 2)], CO(999, 1234, 1, 3)

Table 1. Partial List of Testing Goals for OrderMgmt (CO = Create Order)

In certain cases, the precondition may not have any quantification, yet the web service operation effect may include an update which creates an instance of a class. For example, the precondition for Create Order is a quantifier-free expression in Order. However, Create Order includes an update effect which leads to the creation of a new instance of Order. From a testing point of view, one would like to test such creation behaviors under different conditions of pre-existing instances of Order. In such circumstances, our analysis derives three independent testing goals obtained by conjoining the original precondition with one of the following: COUNT(D) = 0, COUNT(D) = 1, COUNT(D) ≥ 2. The resulting testing goals for operation Create Order are shown in Table 1 (we ignore the cardinality constraints required by the quantified Customer and Product classes).

3.4 Generation of Setup Sequence

The testing goals derived in this manner are input to the planner to derive a Set Up sequence of operation invocations. For example, Goal #1 in Table 1 requires that there be no Order instances in the system state prior to performing the operation under test (Create Order for this goal). This condition is satisfied by an empty sequence, indicated by [] in the last column of Table 1 (we assume that the initial world state contains one instance each of Customer and Product, but does not contain any instance of Order). A testing goal invocation corresponding to the operation under test is appended to the set up sequence to derive the test case. On the other hand, Goal #3 requires that there be at least two instances of the Order class before performing the testing goal invocation. Our planner returns a set up sequence consisting of two invocations of operation Create Order, as shown in Table 1. Once a valid sequence of test invocations (including the testing goal invocation) that satisfies the testing goal is found, our planner simulates this sequence from the given initial world state. This simulation evolves the world state and can be depicted in the form of a UML Object Diagram. Figure 3(a) shows the object diagram that results after applying the test sequence for Goal #3 in Table 1 (three instances of the Order class). A toy sketch of this search appears after Figure 3.

[Figure: (a) World state after a call to Create Order with 2 existing orders: c1:Customer (accountNumber = 999), p:Product (productID = 1234, count = 10), o1:Order (orderID = 1, n = 2), o2:Order (orderID = 2, n = 3), o3:Order (orderID = 3, n = 1). (b) Mutated state as if Create Order failed: c1:Customer, p:Product, o1:Order, o2:Order.]

Figure 3. World State Object Diagram, and its mutant diagram as a result of deleting the Create Update of Create Order
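The paper's planner extends Graphplan [5] with instance creation and deletion and with constraint solving over rich parameter types; the toy breadth-first search below merely illustrates how a set-up sequence satisfying a COUNT goal can be found. All names, and the tuple-of-order-ids state abstraction, are illustrative assumptions.

```python
from collections import deque

# Toy forward search for a set-up sequence (illustrative only; the paper's
# planner is a Graphplan extension, not a blind breadth-first search).
def find_setup_sequence(initial_state, actions, goal, max_len=4):
    """actions: list of (name, apply) where apply(state) returns a new state.
    goal: predicate over a state. Returns a list of action names or None."""
    frontier = deque([(initial_state, [])])
    while frontier:
        state, plan = frontier.popleft()
        if goal(state):
            return plan
        if len(plan) < max_len:
            for name, apply_ in actions:
                frontier.append((apply_(state), plan + [name]))
    return None

# Example: reach a world with >= 2 Order instances via Create Order,
# abstracting the world state to a tuple of order ids.
def create_order(state):
    return state + (len(state) + 1,)

plan = find_setup_sequence((), [("CreateOrder", create_order)],
                           goal=lambda s: len(s) >= 2)
print(plan)  # ['CreateOrder', 'CreateOrder'] -- cf. Goal #3 in Table 1
```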

3.5 Generation of Verification Sequences

To increase confidence in the testing process, it is essential to validate that the System Under Test (SUT) also reaches an internal system state which is consistent with the world state resulting from the simulation process. To do so in an ideal manner, we would need to distinguish the obtained world state from all other world states. Obviously, this is an impractical objective, since there are potentially infinitely many world states. We need to prune the space of world states to a manageable number. We adopt another set of fault models, this time operating on the world states, to accomplish this task.

We illustrate the No Instance Creation fault model for the world state in Figure 3. Note that object o3:Order was created as a result of the testing goal invocation. We assume that a faulty implementation does not create its analog in the implementation state. Thus, we would be left with an object diagram with objects o1 and o2, as shown in Figure 3(b). We need to distinguish the two object diagrams from each other through the application of a suitable operation sequence (called a verification sequence). Inspection reveals that an invocation of operation Read Order(3, 3, 1) distinguishes the two object diagrams (the faulty version will lead to execution of the exceptional condition in operation Read Order, returning a NULL order object). Note that Read Order is an observer operation of the world state - it returns the values of an order instance in a world state to the calling environment. Such observer operations, if present, are convenient for verification. However, our approach does not require that an observer operation be present. For example, operation Delete Order(3, SUCCESS) would also distinguish the mutant with the same result - the faulty implementation will return Failure. A sketch of this mutation-based scheme follows.
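As referenced above, here is a minimal sketch of the mutation-based verification scheme: a fault model mutates the expected world state, and an operation qualifies as a verification step if its observable output differs between the expected and mutant states. The helper names and the dict-based world representation are illustrative assumptions.

```python
import copy

# Sketch of the "No Instance Creation" fault model (illustrative names only).
def no_instance_creation(world, cls, key_attr, key):
    """Drop the instance the testing goal invocation should have created,
    yielding a mutant world state like that of Figure 3(b)."""
    mutant = copy.deepcopy(world)
    mutant[cls] = [i for i in mutant[cls] if i[key_attr] != key]
    return mutant

def distinguishes(observe, world, mutant):
    """An operation distinguishes the states if its observable output differs."""
    return observe(world) != observe(mutant)

# Read Order(3) returns the matching order if present, else NULL (None here);
# it therefore distinguishes the expected state from the mutant.
read_order_3 = lambda w: next((o for o in w["Order"] if o["orderId"] == 3), None)
```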

4 Case Study

In this section, we report the findings from a case study conducted during the end user testing stage of an industrial application. The subject of the case study is an SOA application for the online management of re-usable assets. For proprietary reasons, we refer to the application only as ProductA. ProductA is a consumer of five web services, each serving a purpose as described in Table 2. For comparison, the existing manual practice of a test design team is used as a benchmark. The team used Rational Function Tester (RFT) for test automation and is composed of expert testers with at least 5 years of professional experience in testing applications. The testing team recorded the identified bugs through Bugzilla and assigned severity categories to them. Independently, we created IOPE models for these web services and used our approach to generate test cases.

For the purposes of comparison, we define a set of metrics as described in Table 3. The terms used in Table 3 are defined as follows: CFR: coverage in %; FR1: number of functional requirements covered by the test suite produced by our approach; FR2: number of functional requirements covered in traditional testing; FR12: number of requirements covered both by our approach and by the traditional testing technique; FF: effectiveness in %; D1: net number of faults detected by our approach; D2: net number of faults detected by the traditional testing technique; D12: net number of faults detected by both techniques; EC: coverage efficiency of the test generation process; T: time taken to produce the test cases; EF: fault detection efficiency of the test generation process.

We divided the study into two iterations. The first iteration consisted of modeling the Asset Management web service; in the second iteration, all the remaining four services were modeled. The requirements coverage, coverage efficiency, and fault detection efficiency of the test suite thus generated were measured against our benchmark. The first iteration was used to calibrate the measurement process and also to identify whether our test generation approach itself had any missing capabilities. Results of the first iteration indicated that our approach lacked the capability of handling NULL values for optional variables. During iteration two, we created test templates for each service operation using Rational Function Tester (RFT), which was the test execution automation tool chosen by the testing team performing the manual test design. The test cases created using our approach were imported into RFT.

The case study was designed for an "in vivo" application, and therefore, as in most such situations, one could not have access to the entire set of development artifacts. Furthermore, data regarding the developer testing of the individual services was also not available. Our limitation was that we had access only to a "very stable" version of the application and its WSDL description. Thus we had to allow various approximations in order to estimate the actual measures. For instance, the requirements for the application were measured by counting the operations and by identifying their various failure modes. The assumption was that the application did not have any missing requirements (since it is "very stable", it may be safe to assume so). Thus, if we are able to test for all the failure-causing scenarios, we are able to test for all behavioral requirements. Also, one would ideally measure effectiveness by running the two sets of test cases on an identical version of the application, but such an experiment could not be conducted since older, buggy versions of ProductA were not available. Therefore, as a workaround, the generated test cases were evaluated against the bugs recorded in the Bugzilla database. The database recorded a total of 74 bugs, out of which only 16 were functional. Thus the test cases were evaluated to identify whether they would uncover any of the 16 functional bugs (9 for the first iteration, 7 for the second). For a test case to uncover a functional bug, we say that it must drive the system to a failure-causing system state and must have a subsequent verification sequence to confirm the expected behavior. The results from the comparison are presented in Table 4.

During the first iteration, our approach did not support NULL-valued optional variables. Consequently, the automated test suite had smaller requirements coverage, and it found one less defect than the benchmark. However, the improvements in effort spent using the two approaches are encouraging for both iterations. The results from the second iteration indicate significant improvements in effort without any loss of requirements or fault detection coverage.

Name | Number of operations | Purpose | Example operations
-----|----------------------|---------|--------------------
Authorization | 12 | Manage roles and permissions for users of ProductA | addRole, addPermissionToRole
User Management | 28 | Add users and user groups; manage accessibility of the groups | searchGroup, addUserToGroup
Asset Management | 11 | Add, delete, modify assets; add comments and ratings for assets | createRating, createAsset
Change Management | 4 | Create, browse and manage defect reports and feature requests for assets | createDefect, createFeatureRequest
Repository Access | 39 | Access assets, populate views, and search assets for the repository | getAssetAttribute, getRepositoryView

Table 2. Description of Web Services in ProductA

No. | Metric | Measurement Model | Description
----|--------|-------------------|------------
1 | Coverage | CFR = FR1 / (FR1 + FR2 − FR12) × 100 | Relative functional requirement coverage measures the ratio between the net numbers of functional requirements covered by the test cases.
2 | Effectiveness | FF = D1 / (D1 + D2 − D12) × 100 | Relative fault detection effectiveness measures the ratio between the net numbers of faults detected by the test cases.
3 | Coverage Efficiency | EC = CFR / T | Coverage efficiency denotes the coverage achieved per unit time spent on generating the test cases.
4 | Fault Detection Efficiency | EF = FF / T | Fault detection efficiency denotes the number of faults detected per unit time spent on generating the test cases.

Table 3. Case Study Measurements
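As a worked instance of the effectiveness metric: in iteration 1, our approach detected D1 = 8 faults, the benchmark detected D2 = 9, and all 8 were common to both (D12 = 8), giving FF = 8/(8 + 9 − 8) × 100 ≈ 88.89%, which matches the value reported in Table 4. A one-line check (Python, illustrative):

```python
def effectiveness(d1: int, d2: int, d12: int) -> float:
    """Relative fault detection effectiveness FF from Table 3."""
    return d1 / (d1 + d2 - d12) * 100

print(round(effectiveness(8, 9, 8), 2))  # 88.89 -- iteration 1, our approach
```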

Measure | Iteration 1, Benchmark | Iteration 1, Our Approach | Iteration 2, Benchmark | Iteration 2, Our Approach
--------|------------------------|---------------------------|------------------------|---------------------------
Coverage | 100% | 92% | 100% | 100%
Effectiveness (no. of bugs hit) | 100% (9) | 88.89% (8) | 100% (7) | 100% (7)
Effort | 3.5 days (28 hours) | 1.25 days (12.25 hours) | 4 days + 8 days for RFT scripting | 2 days + 2 days for RFT + 2 days SOA
Coverage Efficiency | 28.58/day (3.57/hour) | 73.60/day (7.51/hour) | 25/day (3.5/hour) | 50/day (6.6/hour)
Fault Detection Efficiency | 28.58/day (3.57/hour) | 71.11/day (7.26/hour) | 25/day (3.5/hour) | 50/day (6.6/hour)

Table 4. Case Study Measurements


Out of the seven faults detected during iteration 2, four were logical in nature: the implementation not handling incorrect parameters, not checking for the creation of instances with duplicate keys, or not ensuring that deleted entities were indeed deleted from the system state. The last defect underscores the importance of state verification sequences - without one, this defect may have gone unnoticed. The remaining three faults were due to GUI navigation - for example, a Cancel button leading to an infinite loop. Even though the IOPE model of the web service does not contain any information about GUI navigation, the test template portion exercised the typical GUI navigation scenarios (such as pressing the Cancel or Back buttons). Furthermore, we were able to simulate the developer testing scenario by creating a test template to produce executable test cases consisting of SOAP messages, which could be applied directly to the web service implementation through an execution engine (as indicated by the "+2 days SOA" value in the last column of the "Effort" row in Table 4). Thus, we could reuse the same IOPE information for both developer and end user testing and obtain even more significant effort reductions. Further, these results also imply that if our test generation approach had been used during development testing under the IOPE paradigm, several logical faults that were encountered by the end user testing team would have been caught earlier in the life cycle.

5 Summary and Future Work

We have presented an automated approach to generate functional conformance tests for semantic web services which are defined using the Inputs, Outputs, Preconditions, Effects (IOPEs) paradigm. For each web service, our approach produces testing goals using a set of fault models. A novel planner component accepts these testing goals to generate a sequence of web service invocations as a test case. Another salient feature of our approach is the generation of verification sequences. Lastly, our technique allows generation of executable test cases which can be applied to the various interfaces through which the web service may be accessed. We have described our technique through an example web service. We also presented results which compare two approaches: an existing traditional approach without the formal IOPEs information, and the IOPEs-based approach reported in this paper. These results indicate that the approach described here leads to substantial savings in effort, with comparable results for requirements coverage and fault detection effectiveness during both development and end user testing.

We would like to extend our work in several directions. For example, our current work assumes only atomic web services; we would like to extend our approach to composite web services (also defined using the IOPE paradigm). We are interested in comparing our approach to those developed for testing BPEL4WS descriptions (which represent compositions of atomic web services). We would also like to move toward a more on-the-fly approach, where the test cases are not produced a priori but are generated in an adaptive manner.

References

[1] WS-I. URL: http://www.ws-i.org/
[2] Available at: http://lsdis.cs.uga.edu/projects/meteor-s/wsdls/examples/purchaseOrder.wsdl
[3] X. Bai, W. Dong, W.-T. Tsai, and Y. Chen. WSDL-based automatic test case generation for web services testing. In Proc. IEEE International Workshop on Service-Oriented System Engineering (SOSE '05), pages 215-220, 2005.
[4] A. Bertolino, L. Frantzen, A. Polini, and J. Tretmans. Audition of web services for testing conformance to open specified protocols. In R. Reussner, J. Stafford, and C. Szyperski, editors, Architecting Systems with Trustworthy Components, number 3938 in LNCS. Springer-Verlag, 2006.
[5] A. L. Blum and M. L. Furst. Fast planning through planning graph analysis. Artificial Intelligence, 90(1-2):279-298, 1997.
[6] The OWL Services Coalition. OWL-S: Semantic markup for web services, 2003. Available at: http://www.daml.org/services/
[7] R. Heckel and L. Mariani. Automatic conformance testing of web services. In Proc. Fundamental Approaches to Software Engineering (FASE '05), pages 34-48, 2005.
[8] IBM. Rational Function Tester product overview. Available at: www.ibm.com/software/awdtools/tester/functional/index.html
[9] T. Klinger, C. Yilmaz, and A. Paradkar. Graphplan-CD: Object creation and destruction in a Graphplan planner. Submitted to AAAI, 2007.
[10] W.-T. Tsai, Y. Chen, R. Paul, H. Huang, X. Zhou, and X. Wei. Adaptive testing, oracle generation, and test case ranking for web services. In Proc. IEEE COMPSAC, pages 101-106, 2005.
[11] H. Zhu. A web services framework for testing web services. In Proc. IEEE COMPSAC 2005, pages 34-39, 2005.