Int J Softw Tools Technol Transfer DOI 10.1007/s10009-009-0123-4

SPECIAL SECTION ON WEB SYSTEMS EVOLUTION

From objects to services: toward a stepwise migration approach for Java applications

Alessandro Marchetto · Filippo Ricca

© Springer-Verlag 2009

Abstract Migrating legacy applications toward service-oriented systems is a hard task complicated by the lack of appropriate approaches and tools. In this paper, a stepwise approach is proposed to migrate a Java application into an equivalent application composed of a set of Web services invoked by an orchestrator. In each migration step, a portion of the target application is identified and migrated into a Web service. In this approach, the role of testing is central since after each migration step the new service-oriented application is tested with the aim of checking “its equivalence” with the original version. An experiment based on four Java applications has been conducted to tune the approach and evaluate applicability and effort involved in the migration process. The obtained results confirm the viability of the proposed approach and highlight some encountered SOA migration difficulties.

Keywords SOA migration · Stepwise approach · Web services · Axis

A. Marchetto (corresponding author)
Fondazione Bruno Kessler, IRST, 38050 Povo, Trento, Italy
e-mail: [email protected]

F. Ricca
CINI Unity at DISI (Laboratorio Iniziativa Software FINMECCANICA/ELSAG spa, CINI), 16146 Genova, Italy
e-mail: [email protected]

1 Introduction

According to a recent InfoWorld research report (footnote 1), companies that have successfully implemented service-oriented applications have experienced many benefits, e.g., more agile IT systems that can be adapted to change faster, a more flexible and reusable architecture, integration of new and existing applications, faster custom application development and reduced application maintenance costs. However, many companies adopting SOA have struggled to achieve a positive return on investment (ROI), and a recent Gartner survey on SOA adoption, use, benefits and practices shows that many organizations fail to realize their SOA adoption plan (footnote 2). Why? The reasons vary: e.g., lack of skills or training, lack of mature standards, lack of best-practice guidelines, and IT and business organizational barriers.

Footnote 1: http://www.s2.com.br/s2arquivos/403/multimidia/197Multi.pdf.
Footnote 2: http://www.gartner.com/it/page.jsp?id=790717.

We believe that one of the main reasons delaying SOA adoption is the integration/migration of legacy applications. We share the view that, once completed, the migration brings many well-known advantages (e.g., reusability and flexibility), but that reaching that point requires a considerable amount of effort [21].

Migrating an application toward SOA is a difficult and error-prone task in which tools and approaches play a fundamental role [22]. During this transformation, important decisions have to be taken and several questions answered. For example: which approach and tools are used to execute the entire migration? Which tools, languages and technologies are used to create and deploy a service? How can potential services be discovered in legacy systems (service mining)? How can code fragments that can be transformed into services be identified?

The main contribution of this paper is the definition of a stepwise approach based on testing that can help a developer to migrate an existing Java system into an equivalent service-oriented system. To this end, some practical guidelines and a well-defined process are proposed to identify and migrate candidate services of the original system (e.g., functionalities, relevant business entities, non-business-centric entities) into a set of Web services invoked by an orchestrator. The orchestrator is a portion of Java code extracted and reengineered from the original system that invokes the Web services to realize the same business process and the same set of functionalities as the original system.

The approach proposed here is the result of an iterative, mixed top-down and bottom-up process. An initial approach was defined by analyzing the available literature (e.g., [21]). Then, it was tailored to Java and refined through several iterations of empirical validation performed on four Java applications.

The architectural migration that we propose is quite different from the concept of migration usually investigated in the literature. Several approaches [3,8,21] propose to expose parts of the original (legacy) system as Web services. In these works, an accessory software layer called a “wrapper” is added to the original system. The wrapper externally exposes some system functionalities so that they are available to, and reusable by, other software systems and users. By applying a wrapper, the system is enriched with an extra piece of software, but its internal architecture remains the same. In fact, a Java system wrapped as a Web service remains an object-oriented system, just enriched with an additional “interface”. Conversely, the architectural migration investigated in this work aims at changing the internal structure of the original system by performing a “complete” transformation of it. A benefit of our migration approach is that the newly built services can contribute to an enterprise service inventory (footnote 3) and thus be reused, individually or in groups, in different software systems and projects.

The paper is organized as follows.
Section 2 gives some background regarding SOA, presents a useful service categorization and introduces notions used in the rest of the paper. Section 3 specifies the goal of this work and introduces our stepwise approach. Section 4 reports two activity diagrams that visually present our approach and discusses the migration phases in detail. Section 5 shows the applicability of our approach in practice by means of four case studies. Finally, Sect. 6 presents some related work and Sect. 7 concludes the paper.

Footnote 3: From Erl [7], a service inventory is an independently standardized and governed collection of complementary services within a boundary that represents an enterprise or a meaningful segment of an enterprise.

2 Background

A number of definitions (sometimes contrasting) and standards have been proposed in the literature to clarify the notion of SOA and its components, i.e., the services. Following the W3C Web service Glossary [11], for us “a service is an abstract resource that represents a capability of performing tasks that form a coherent functionality from the point of view of providers entities and requesters entities”. A Web service [1], instead, is a possible implementation of a service with specific characteristics. For example, the service description is published in WSDL, the exchange of messages takes place via the Web and the effects of executing a service are possibly persistent.

In this work we have adopted:

– a simplified and revised version of the “UML4SOA” profile [15] to describe the service-oriented architecture of a software system; and
– the categorization of services presented by Erl [7] to better identify and classify different kinds of services in the migration approach.

To briefly introduce both of them, we consider a simple service-oriented Automated Teller Machine (ATM) application (footnote 4). It is composed of three classes (Cash Dispenser, Screen and Keypad) and four main components (service providers): the Authentication, which authenticates the users; the Accounts handler, which accepts the user transactions (e.g., deposit and withdraw); the Phone Cards handler, which handles the phone card acquisition process with external services (i.e., the managing authorities); and the BankDB, which stores the business data (i.e., Current Account) persistently and provides operations to execute database queries. In this simple example, the ATM component orchestrates the other services to realize the business process, and the logger persistently “traces” each user request.

UML4SOA provides model elements, i.e., UML stereotypes (e.g., provider), for describing both structural and behavioral aspects of a service-oriented system. Figure 1b shows a portion of the ATM architecture described by applying a simplified version of UML4SOA.
The BankDB component provides the BankDB service and implements two interfaces: HandleAccount, with several operations (e.g., debit), and Authenticate, with a single operation (i.e., authenticate). The Accounts handler component has two required interfaces (socket symbols not represented in the picture) to communicate with the other components (BankDB and Logger) and one provided interface to communicate with the Business Unit ATM.

Footnote 4: ATM is one of the four considered case studies.

Fig. 1 ATM architecture (portion). a Before. b After

Following Erl’s service classification, there are three kinds of services with different meanings [7]:

– Entity services. They define the organization’s relevant business entities. An Entity service represents a business-centric service that bases its functional boundary and context on one or more related business entities. It is considered a highly reusable service since it is agnostic to most parent business processes. As a result, a single Entity service can be used to automate several parent business processes. In our example, we have only one Entity service, named BankDB, that handles the business entity Current Account.
– Task services. They define the “process logic” that spans multiple entity domains and does not fit cleanly within a functional context associated with a business entity. This type of service tends to have less reuse potential than Entity services and is generally used as the controller responsible for composing other, more process-agnostic, services. In our example there are four Task services: Authentication, Accounts handler, Phone Cards handler and ATM. Actually, ATM is a special kind of Task service since it orchestrates the other services with the aim of realizing the entire business process. It is called an orchestrated Task service [7] and we mark it with the BU stereotype, which is not part of UML4SOA (see Fig. 1b). Orchestrated Task services can be developed as standalone Web services implemented using traditional languages (e.g., Java), or they may represent a business process definition hosted within an orchestration platform (e.g., written using BPEL [6]).
– Utility services. They establish a functional context that is non-business-centric. They are usually dedicated to providing reusable, cross-cutting utility functionality, such as event logging, notification, data transformations and exception handling. In our example there is only one Utility service: the Logger.

3 Stepwise approach

The goal of this work is to propose an approach that helps a developer migrate an existing desktop Java application, i.e., an application running on a desktop or laptop computer (footnote 5), toward SOA, where the services are implemented as Web services [1]. While the tools and problems presented in this paper are specific to Java Web services, the stepwise approach can also be applied when the services are implemented using different technologies (e.g., distributed objects, software agents, grid services) and languages (e.g., C++, .NET, PHP).

In contrast to the “big bang” approach, where the migration takes place in a single step, we propose a stepwise approach. In each migration step only one candidate service, e.g., a functionality of the original system that is a candidate for migration, is migrated into a Web service (i.e., Wsx). The starting point is a Java application called P and the result is a Web service-oriented Java application equivalent to P (i.e., with the “same semantics”), called Pn, in which some “portions of code” are implemented by means of Web services. More precisely, Pn is a piece of software composed of two logical entities: (a) the newly created Web services (Ws1, ..., Wsn) and (b) the Business unit (BUn), or orchestrator, which invokes the Web services Ws1, ..., Wsn to realize the same set of functionalities as the original system (see Fig. 2).

This approach is iterative and it stops when all the candidate services have been migrated (see Fig. 2). At that point the migration can be considered complete and the role of BUn consists only in invoking the migrated services; if desired, it may later be replaced with a business process definition given, e.g., in BPEL (this is outside the scope of our work). Otherwise, BUn will contain a mix of invocations to Web services and portions of “old code” that realize the candidate services not yet migrated. We arrived at this stepwise approach after having ascertained, in some real migration projects, the difficulties inherent in migrating toward SOA.

Footnote 5: In contrast to client/server and/or Web-based applications.

Fig. 2 Stepwise approach. The full arrows represent a migration step while the dotted arrows indicate Web service invocation

This stepwise approach has three advantages: (a) it is agile and flexible, since some prescriptions are optional and the developer can stop the migration at any step he/she wants; (b) each step is simpler to carry out than the complete “big bang” migration, so the likelihood of introducing new errors is smaller, particularly when, as in our case, a test suite is executed against the code before and after each migration step to detect potential divergences between the two implementations; and (c) the partially migrated application (i.e., one where not all the candidate services have been migrated) can be delivered to the customers and thus used transparently in the user environment.
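The iteration described in this section can be sketched in a few lines of Java. This is only an illustrative model, not the authors' tooling: the class names AppVersion and StepwiseMigration are hypothetical, a version Px is reduced to the list of services migrated so far, and the choice to keep the previous version when the regression suite fails is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model of one application version P_x: the services migrated so far. */
final class AppVersion {
    final List<String> webServices; // Ws_1 .. Ws_x

    AppVersion(List<String> webServices) {
        this.webServices = List.copyOf(webServices);
    }

    /** Returns the next version, in which one more candidate is realized as a Web service. */
    AppVersion withService(String candidate) {
        List<String> next = new ArrayList<>(webServices);
        next.add(candidate);
        return new AppVersion(next);
    }
}

final class StepwiseMigration {
    /**
     * Migrates the candidates one at a time. A step is kept only if the
     * regression suite still passes on the new version (assumption of this
     * sketch); otherwise the previous version is retained and the candidate
     * is left in the "old code".
     */
    static AppVersion migrate(AppVersion p,
                              List<String> candidates,
                              Predicate<AppVersion> regressionSuitePasses) {
        AppVersion current = p;
        for (String candidate : candidates) {
            AppVersion next = current.withService(candidate);
            if (regressionSuitePasses.test(next)) {
                current = next;
            }
        }
        return current;
    }
}
```

Because every intermediate value of `current` is a complete application version, the loop can be stopped after any step, which mirrors advantages (a) and (c) above.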

4 Migration activities

The UML activity diagram MAIN depicted in Fig. 3 visually presents our approach in detail. It represents the activities that a developer has to perform in order to migrate a Java application toward SOA. The activities (action nodes) are represented by rounded-corner rectangles, while object nodes are depicted by means of rectangles. The object nodes represent the inputs and the outputs of each activity. Notes and comments are placed in rectangles with a folded upper-right corner. An icon in the lower-right corner means that the activity represents a nested activity diagram (i.e., the activity is further described by means of an auxiliary activity diagram). Finally, decisions are represented in the diagram by means of diamonds.

The migration activities are summarized below; the most relevant ones are then detailed in the rest of this section. The first step consists in analyzing the Java application P, by looking at the documentation and code and by executing it, with the purpose of understanding its architecture and identifying its main functionalities. The second step is selecting the list L of candidate services to migrate. The third step is building a testsuite Ts of executable testcases (footnote 6) for P. The suite is used for two purposes: (a) to identify the fragments of code that implement the candidate services in the original application and (b) for regression testing, i.e., for finding potential divergences between P and its subsequent versions (in the activity diagram a generic Px is used to identify them).

Footnote 6: An executable testcase contains inputs and expected results and its execution against the code compares outputs with expected results.

Given the list of candidate services L and the testsuite Ts, the next steps are choosing the candidate service to migrate (represented in the activity diagram with Csx) (footnote 7) and executing the nested activity one step migration. This activity, further detailed below, executes a complete migration step in which some fragments of code are transformed into a Web service and, accordingly, the target application is modified to invoke it. The execution of one step migration can have the side effect of modifying L (e.g., when the developer splits a highly “coarse-grained” candidate service into two candidate services) and/or Ts. These last two activities (“choose a candidate service” and “execute one step migration”) can be iterated until the migration satisfies the developer. The overall migration approach terminates with an optional refactoring (and re-testing) step whose purpose is to improve the granularity of some Web services. In fact, the result may not be satisfactory, and some services may have to be glued together because they are very fine-grained (e.g., they expose only one operation each). Finally, the migrated application can be delivered to the customer (optional).

The one step migration activity is detailed in an additional UML activity diagram (see Fig. 4). The candidate service Csx to migrate is the input of the one step migration. In this activity, the first step is identifying, in the current program, the fragments of code (“core code”) that will be used to develop the future Web service Wsx. If the identified core code is wrapable (a piece of code is wrapable if it can be separated from the rest of the system code by spending a “reasonable” amount of effort [20]), the process continues with the code extraction activity, i.e., copying the identified fragments of code into a separate unit and making it compilable and executable. Otherwise, the developer has two options: writing the code implementing Csx by hand (i.e., from scratch) or postponing its migration (e.g., the migration step could become feasible or simpler later by aggregating Csx with another candidate service). If the Csx migration is not postponed, the following steps are wrapping the code identified in the previous phase and deploying it as a Web service (Wsx). The last step, before testing the new application with Ts, consists in refactoring the Business unit BUx. This means “stripping out” the previously identified code from the application and replacing all the method invocations to the “stripped” code with invocations to the new Web service Wsx.
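The Business unit refactoring at the end of a migration step can be illustrated with a small, self-contained sketch. The interface name HandleAccount comes from the ATM example, but its operations (deposit, balance) and the classes LocalAccounts, AccountsStub and BusinessUnit are hypothetical. In particular, AccountsStub is only a stand-in: a real stub would be generated from the WSDL and would exchange messages with the deployed service, whereas here the remote side is simulated in-process so that the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

/** Operations of the candidate service as seen by the Business unit (hypothetical signatures). */
interface HandleAccount {
    void deposit(String account, int cents);
    int balance(String account);
}

/** "Old code": the in-process implementation that is stripped out of the application. */
final class LocalAccounts implements HandleAccount {
    private final Map<String, Integer> balances = new HashMap<>();

    public void deposit(String account, int cents) {
        balances.merge(account, cents, Integer::sum);
    }

    public int balance(String account) {
        return balances.getOrDefault(account, 0);
    }
}

/**
 * Hypothetical stand-in for a generated client stub. The deployed Web
 * service is simulated by an in-process object; no network call is made.
 */
final class AccountsStub implements HandleAccount {
    private final HandleAccount remoteSide = new LocalAccounts(); // simulated service

    public void deposit(String account, int cents) {
        remoteSide.deposit(account, cents);
    }

    public int balance(String account) {
        return remoteSide.balance(account);
    }
}

/** Business unit: its own logic is unchanged, only the collaborator it invokes changes. */
final class BusinessUnit {
    private final HandleAccount accounts;

    BusinessUnit(HandleAccount accounts) {
        this.accounts = accounts;
    }

    int depositAndReport(String account, int cents) {
        accounts.deposit(account, cents);
        return accounts.balance(account);
    }
}
```

Constructing `new BusinessUnit(new LocalAccounts())` corresponds to BUx before the step and `new BusinessUnit(new AccountsStub())` to BUx after it; running the same testcases against both is what the testsuite Ts is used for.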

Footnote 7: It could be useful to identify an order in which to migrate the candidate services. We suggest starting with candidate Entity services because they are highly reusable, agnostic and usually used by other services of the system (e.g., Task services).

Fig. 3 Activity diagram representing the phases of the migration approach

Fig. 4 Activity diagram: one step migration

4.1 Application understanding

The first step of the approach consists in analyzing the target application with the aim of identifying its main functionalities and understanding the underlying system architecture. Applications may come with documents describing the requirements at the user level, such as textual use cases, functional requirements in natural language and user manuals. From these, the list of implemented functionalities can be recovered by document inspection. Alternatively, when the documentation is lacking, this task can be carried out manually by executing the target application with appropriate inputs and annotating the discovered functionalities. Finally, each functionality is described by a set of (textual) scenarios that a user can accomplish by exercising the target application. We have chosen to represent the application functionalities and their relationships (e.g., extend and include) using a UML Use Case diagram. This is useful in the next phases: testsuite generation and candidate services selection.

Knowledge of the architecture is important throughout the migration process, especially for selecting the candidate Entity services and the non-functional parts of the original system (i.e., Utility services) that are candidates to be reimplemented as Web services. Reverse engineering techniques and tools can be used to recover and represent the architecture of the target application. They are particularly useful when the application is large or when the architectural documentation of the target system is out of date or completely absent.

4.2 Selecting the candidate services

The second step consists in selecting the initial set of candidate services L to migrate. Three types of candidate services (Task, Entity and Utility) can be identified, and each of them is selected in a specific manner:

– candidate Task services. They are identified by considering the functionalities of the application. In particular, we have considered each main functionality of the target application as a possible candidate Task service.
– candidate Entity services. They are identified by analyzing the persistent objects and the classes using them. To identify candidate Entity services we looked at the DB tables of the application (if any) and at the classes of the system implementing the CRUD operations (Create, Read, Update and Delete), i.e., the main functions of persistent storage. The reason is that relevant business entities are often stored in databases or persistent storage [13].
– candidate Utility services. They are identified in portions of non-business-centric code, i.e., by looking at cross-cutting functionalities, such as event logging, notification, data transformations and exception handling.

It is important to highlight that this step requires all the knowledge gained in the previous phase (in particular the functionalities identified and the architecture of the system) and produces an initial set of candidate services L. This set can be modified during each migration step for several reasons, e.g., because (a) a candidate service is migrated into a Web service (deleting it from L), (b) a candidate service is very difficult and expensive to migrate (deleting it from L), (c) a candidate service is highly coarse-grained (splitting one candidate service of L into two or more services) or highly fine-grained (gluing together two or more potential services of L), or (d) a new candidate service is discovered (adding it to L).

4.3 Creating the testsuite

The third step is generating a testsuite Ts for the target application. If the use cases of the target application are available, e.g., from the documentation, we can apply the three-step approach proposed by Heumann [12] to generate the testcases. It can be summarized as follows:

– (step F1) for each use case, the whole set of scenarios is generated;
– (step F2) for each scenario, a testcase and the conditions that will make it “executable” are identified;
– (step F3) for each testcase, data values able to exercise the corresponding execution of the system are selected.

Otherwise, we can derive the testsuite using the Use Case diagram of the application and the scenarios determined in the application understanding phase. The generated testsuite is used for identifying the fragments of code that implement the candidate Task services (see below) and for regression testing. In the latter case, the suite is applied after each migration step to find potential divergences between the (initial) target code and its subsequent version (obtained after the migration step).
In fact, it is well known that regression testing allows developers to refactor, restructure and re-engineer a system safely: if the new code “passes” a complete and well-defined testsuite, then it is likely that no new errors have been introduced. For these reasons we stress the importance of this phase. The testsuite should consist of executable testcases (e.g., JUnit testcases) and should be as complete as possible; at the very least, it should exercise all the main system functionalities.
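A minimal sketch of how such a suite can be used for regression testing is given below. It deliberately avoids any testing framework so that it is self-contained; in practice the testcases would more likely be JUnit tests. The names TestCase and RegressionCheck are hypothetical, and an application version is modeled simply as a function from an input to an output, which is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** An executable testcase: an input together with the expected result. */
final class TestCase<I, O> {
    final String name;
    final I input;
    final O expected;

    TestCase(String name, I input, O expected) {
        this.name = name;
        this.input = input;
        this.expected = expected;
    }
}

final class RegressionCheck {
    /**
     * Runs every testcase of the suite against one application version
     * and returns the names of the testcases whose output differs from
     * the expected result.
     */
    static <I, O> List<String> failures(List<TestCase<I, O>> suite,
                                        Function<I, O> version) {
        List<String> failed = new ArrayList<>();
        for (TestCase<I, O> tc : suite) {
            O actual = version.apply(tc.input);
            if (!tc.expected.equals(actual)) {
                failed.add(tc.name);
            }
        }
        return failed;
    }
}
```

Running `failures` with the same suite first on P and then on Px gives two lists; a testcase that appears only in the second list signals a divergence introduced by the migration step.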

4.4 Code identification

The fourth step consists in identifying the code fragments that will be used to develop the future Web service. Identifying the code fragments that implement an application functionality (i.e., the candidate Task service) is a hard task. Sneed [21] claims that the key to discovering services in existing code is recovering the business rules. A business rule is basically “a requirement on the conditions or manipulation of data expressed in terms of the business enterprise or application domain”. One approach to recovering a business rule in a system is to identify the names of the resulting data and trace how they are produced in the system [23]. This technique, which is essentially slicing on the target variable, is used to identify the system elements that a functionality impacts, i.e., the classes, methods and lines of code used to implement that functionality.

However, in our experience, “a simplified version of feature location” (see below) is more adequate and effective for code identification. One limit of the techniques based on slicing is that they demand a lot of manual effort from the developer to identify the interesting portions of code. In fact, the developer has to identify the outputs of interest and locate the statements of the system (the starting points for slicing) in which the outputs are produced. Another limit is that the code obtained by slicing is often very large, since the business logic that we want to separate is usually mixed with large portions of code implementing GUIs, utilities and controllers, and slicing cannot select only the business logic statements. Hence, an additional and heavy refactoring phase is often required to refine and clean the sliced code before using it as a new Web service. For these reasons, the following (simplified) version of feature location can be adopted as a basis to identify the code of the candidate service:

– tracing the code (only classes and methods). This is usually supported by specific tools (e.g., EMMA (footnote 8));
– executing the testcases of Ts related to the target candidate service;
– analyzing the execution traces with the aim of collecting information about the executed system elements (i.e., methods and classes).

The result of this phase is a non-compilable Java class composed of a list of system elements (i.e., methods and classes) useful to implement the considered candidate service. These system elements are what we call the “core code” of the target candidate service.

Footnote 8: http://emma.sourceforge.net/intro.html.

4.5 Code extraction

Before starting with the extraction phase we have to assess the “core code” associated with the candidate services. Some questions have to be answered:

– Are the identified fragments of code “wrapable”? Following Sneed [20], a piece of code is “wrapable” if it can be separated from the rest of the system code with a “reasonable” effort. One way to evaluate the wrapability of a code fragment is to analyze (a) its coupling with the rest of the system (e.g., method calls, field accesses) and (b) its required input and output objects.
– Are the fragments of code “good” enough? In particular, is the code maintainable, testable and reusable in other contexts?

This phase is strongly subjective but very important. In fact, the outcome could be rewriting a piece of code that implements a service rather than extracting it (e.g., because the code is not wrapable or its quality is poor), or deciding to postpone or avoid the migration of a candidate service.

Several operations need to be performed to make the “core code” a compilable unit:

– copying the fragments of code into a separate unit (e.g., a class);
– cleaning the code, i.e., removing from it the portions not implementing the business logic (e.g., input/output statements);
– duplicating portions of code (if necessary). Some classes, such as the datatypes (e.g., Euro, Address), are used by the service being migrated as well as by the Business unit, so they have to be consistently copied into both;
– eliminating potential dependencies between the code actually implementing the target functionality and its surroundings (code isolation);
– applying class flattening and method inclusion [21] to handle inheritance.

We are aware that some of these operations (e.g., code duplication and flattening) could degrade the overall quality of the system code. However, in our approach they are key operations that simplify the code extraction phase. Only by using them can a developer obtain a compilable and executable set of Java classes implementing the candidate service. In this phase, we have not yet modified the Business unit; we have worked only on a copy of the considered code. The “real” code extraction is executed in the Business unit refactoring phase (see below).
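The trace analysis of Sect. 4.4 and the coupling-based part of the wrapability assessment of Sect. 4.5 can be sketched as follows. This is a simplified illustration under stated assumptions: the execution traces are assumed to be already available as plain sets of element names (how a tracing tool exports them is not modeled), the class name CoreCode and both method names are hypothetical, and counting the calls that cross the boundary of the core code is just one possible coupling indicator, not a measure prescribed by the approach.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class CoreCode {
    /**
     * Union of the system elements (e.g., "Class.method") executed by the
     * testcases associated with one candidate service.
     */
    static Set<String> identify(Map<String, Set<String>> traceByTest,
                                List<String> testsOfCandidate) {
        Set<String> core = new TreeSet<>();
        for (String test : testsOfCandidate) {
            core.addAll(traceByTest.getOrDefault(test, Set.of()));
        }
        return core;
    }

    /**
     * Rough coupling indicator: the number of calls {caller, callee} that
     * cross the boundary of the core code, i.e., exactly one of the two
     * elements belongs to it.
     */
    static long boundaryCalls(Set<String> core, List<String[]> calls) {
        return calls.stream()
                .filter(c -> core.contains(c[0]) != core.contains(c[1]))
                .count();
    }
}
```

A low number of boundary calls suggests that the fragment can be isolated with modest effort, while a high number is a hint that rewriting the code or postponing the candidate may be the better option; where to draw the line remains a subjective decision, as noted above.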


4.6 Wrapping and deployment

5 Experimental study

Important steps to complete the migration are:

The proposed approach has been applied to four (simple) Java applications with the aim of iteratively refining it and showing its applicability in practice. More precisely, the first application that we have considered (LaTazza) was migrated by a master student [16]— under our constant supervision—with the aim of tuning the approach. Instead, the other three Java applications have been migrated by one of the authors for collecting some data and showing the viability of our approach. In this section we present the four Java applications used as case studies, the main difficulties encountered during the migration and some data that we have collected (e.g., LOCs before and after the migration and time spent) to give an idea of the effort spent during the migration.

– (optional) refactoring the set of Java classes obtained in the previous step for improving its quality; – “wrapping” the set of classes by adding a proxy/mock object that simplifies creation, invocation and use of the future Web service; – deploying it on a Web server for generating the new Web service; – creating the Java clients (called stubs) for invoking/using the Web service from the rest of the system. This means supplying the set of Java classes with a WSDL interface, publishing the service and creating the stubs able to invoke the Web service (starting from the WSDL). Fortunately, in Java these steps can be automated using the “code generation components” of Axis29 . To the best of our knowledge, one of the better ways to automatically realize a Web service starting from its Java code is to deploy it (e.g., a Java class or a set of classes) on Axis2 so that it can automatically create the correspondent WSDL interface description and make the code available as a Web service. Alternatively, the command java2wsdl of Axis2 can be used to manually customize the generation of the WSDL interfaces. Furthermore, the command wsdl2java of Axis2 permits to generate the stubs to invoke the Web services. However, to have a Web service working, some not-simple configuration files (e.g., services.xml) have to be written and some prescriptions have to be followed (e.g., building Web services with user-defined datatypes implies writing the correspondent serialization class [2]). 4.7 Business unit refactoring and testing The last migration step consists in refactoring the Business unit and testing the migrated system. After this step the Business unit will invoke the previously created Web service in order to realize the functionalities of the candidate service. The steps are:

– “stripping out” from the current application the fragments of code just transformed in a Web service; – converting all the methods invocations to the stripped code to invocations to the generated service-stubs; – running the testsuite Ts to be sure that the transformation has not introduced new faults in the whole application. 9

http://ws.apache.org/axis2/.


5.1 Desktop applications used as case studies

ATM is a small desktop application that simulates an Automated Teller Machine. Its main functionalities are: authentication, deposit, withdrawal, funds moving and phone cards handling. This application was realized by a master student of the University of Genova (Italy) by enriching an application provided in the book “Java How to Program, 7/e”¹⁰ with a GUI (Swing technology) and a persistence mechanism (Hibernate). The system consists of 26 Java classes for a total of 4,072 Lines of Code (LOCs). A portion of the ATM class diagram (before migration) is reported in Fig. 1a while the architecture after the migration is depicted in Fig. 1b.

LaTazza, used also for other purposes (e.g., [18]), is a simple POJO application for a hot drinks vending machine realized by master students of the University of Genova. LaTazza supports sale and supply of small-bags of beverages (Coffee, Tea, Lemon-tea, etc.) from the Coffee-maker. The LaTazza user can: sell small-bags to clients, buy boxes of beverages, manage credit and debt of employees, check inventory and cash account. The application consists of 17 Java classes for a total of 6,184 LOCs. In LaTazza, persistent objects are implemented with Entity-Relationship tables using the H2 Database Engine¹¹ (JDBC connection). The GUI is realized using the SWT technology.

MTAC¹² is an open source and easy-to-use symbolic math program written in Java. It supports complex numbers, symbolic differentiation, numerical integration and plotting; in other words, it performs both conventional arithmetic and scientific operations, such as derivatives and symbolic integrals. The application GUI is mainly realized using the AWT technology while the overall system consists of 108 Java classes for a total of 10,095 LOCs.

Jmove¹³ is a medium, open-source, Java Swing-based application that provides a framework and an extendable set of tools to ease the understanding of Java applications. It is based on a model-centric approach which allows different kinds of measurements and analyses for Java-based systems, e.g., dependency analysis and impact analysis. The application consists of 685 Java classes for a total of 30,515 LOCs.

10 http://www.pearsonhighered.com/educator/academic/product/0,3110,0132222205,00.html.

11 http://www.h2database.com/.

12 http://sourceforge.net/projects/mtac/.

Fig. 5 Refactoring for testing the method move
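A minimal sketch of the kind of refactoring that Fig. 5 depicts for the method move is given below. Only the names move, moveBL, keypad.getInput(), screen.displayMessage(...), accountFrom and accountTo come from the text; the interfaces Keypad and Screen, the class Transfer, the balances map, the amount parameter and the message strings are assumptions made for illustration, not the actual ATM code. The helper MoveBLCheck follows the four testcase parts (Initialize, Input creation, Execution, Check) without depending on a testing framework; with Junit it would be a test method using assertions.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-ins for the ATM input and output devices. */
interface Keypad { int getInput(); }
interface Screen { void displayMessage(String message); }

final class Transfer {
    private final Keypad keypad;
    private final Screen screen;
    private final Map<Integer, Integer> balances; // account number -> balance (assumed model)

    Transfer(Keypad keypad, Screen screen, Map<Integer, Integer> balances) {
        this.keypad = keypad;
        this.screen = screen;
        this.balances = balances;
    }

    /** After refactoring: only input and output remain here. */
    void move() {
        int accountFrom = keypad.getInput();
        int accountTo = keypad.getInput();
        int amount = keypad.getInput();
        screen.displayMessage(moveBL(accountFrom, accountTo, amount));
    }

    /** Extracted business logic: inputs are parameters, output is a returned String. */
    String moveBL(int accountFrom, int accountTo, int amount) {
        Integer from = balances.get(accountFrom);
        Integer to = balances.get(accountTo);
        if (from == null || to == null || accountFrom == accountTo) {
            return "Invalid account";
        }
        if (amount <= 0 || amount > from) {
            return "Invalid amount";
        }
        balances.put(accountFrom, from - amount);
        balances.put(accountTo, to + amount);
        return "Transfer completed";
    }
}

final class MoveBLCheck {
    static boolean transferMovesFunds() {
        // Initialize: create the environment the test expects to run in
        Map<Integer, Integer> balances = new HashMap<>();
        balances.put(1, 100);
        balances.put(2, 50);
        Transfer transfer = new Transfer(null, null, balances); // no I/O devices needed
        // Input creation
        int accountFrom = 1;
        int accountTo = 2;
        int amount = 30;
        // Execution
        String message = transfer.moveBL(accountFrom, accountTo, amount);
        // Check
        return message.equals("Transfer completed")
                && balances.get(1) == 70
                && balances.get(2) == 80;
    }
}
```

Since moveBL neither reads from the keypad nor writes to the screen, it can be exercised with testcase-generated inputs, and it is also the natural fragment to extract later as service code.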

13 http://jmove.sourceforge.net/.

5.2 Encountered difficulties

During the migration of the above-presented case studies, some practical difficulties/problems have been encountered. Here we summarize the most relevant ones:

5.2.1 Determining granularity

The problem of deciding the “proper” granularity of a Web service (i.e., how much functionality a Web service should provide) is well known in the literature [7,21]. The granularity level is context dependent and usually it must be determined in every specific situation. We faced this problem in all the considered applications. For example, in the ATM application a unique Web service could be designed for maintaining a bank account with all the functionalities that go along with that: making deposits, making withdrawals, transferring funds, computing interest, creating balance notices, making phone cards acquisitions and handling authentication. This will, of course, require a highly complex interface for these kinds of Web services. Alternatively, one Web service for each functionality could be designed. For instance, a Web service could be restricted to simply making withdrawals. That would lead to a very simple interface for this type of Web services. As suggested by Sneed, often, the IT technicians would prefer to offer the first solution because it is easier to manage. Conversely, the business managers would prefer the latter solution because it gives them the maximum flexibility [21]. “By construction”, our approach tends to privilege the second solution (the smallest granularity); however, at the end of the process an (optional) refactoring and re-testing step is devoted to glue together highly fine-grained services (see Fig. 3).

5.2.2 Testsuite construction

Creating the testsuite is a difficult task that requires knowledge about the application and its underlying architecture. This task is further complicated when the code under test contains input and output statements (whether console or GUI) intermixed with the business logic that we want to test (our case in all the considered applications). To better explain the problem and a possible solution (that we have called refactoring for testing) we can consider an example taken from the ATM application (see Fig. 5), but first we have to precisely identify the relevant components of a typical executable testcase written using an existing testing framework (e.g., Junit or Fit [17]). A testcase can be logically subdivided into four parts:

– Initialize: create the environment (e.g., creating object instances) that the test expects to run in;
– Input creation: create the inputs to use;
– Execution: call the code that is being tested passing the arguments and capturing any output;
– Check: use assert statements to ensure that the code worked as expected.

If the method under test (e.g., method move, left Fig. 5) contains some input statements (e.g., the method keypad.getInput() that requires the user to provide the input values) intermixed with the business logic, then some refactoring interventions (e.g., extract method) will be required to separate such statements from the rest (right Fig. 5). Only


after doing that refactoring (Fig. 5) will it be possible to execute the testcase with its own inputs (i.e., the inputs generated during the testcase Input creation part). In our example, the testcase will set the variables accountTo and accountFrom, and will call the method moveBL passing them. Output statements [e.g., screen.displayMessage(...)] also require refactoring the original code (e.g., method move, left Fig. 5) before the testcases can be executed. In those cases, the output has to be wrapped into an object (a String in our example) and the object has to be returned from the method under test (moveBL, right Fig. 5). Only in this way is it possible to write a testcase that can automatically check, by means of assertions (Check part), whether the method works as expected. In our migration approach this type of refactoring (refactoring for testing), essential for testing, is also useful to simplify the subsequent phases of code extraction and Business unit refactoring. In fact, during refactoring for testing we modularize each method separating input, output and business logic; the latter is what is used for developing a service.

5.2.3 Scattering and tangling

At times, the code implementing a candidate service was scattered among several classes of the application under migration (code scattering). For instance, the code that realizes the business logic of the supply functionality in LaTazza was scattered between two Java classes: Supply and Main. On other occasions, the code implementing a candidate service was mixed with other code such as GUIs, utilities and controllers (code tangling). That happened especially for applications developed without following the Model-View-Controller style (e.g., MTAC).

In case of code scattering and tangling, a lot of effort was required to (a) identify in several classes the code fragments that realize the business logic (rationale) of the candidate services; (b) separate that code from the contour (e.g., GUIs); and (c) develop a unique wrapper for the recovered spread code.

5.2.4 Local resources access

For security reasons, resources and files stored in a local system cannot be easily accessed by a remote system (such as a Web service). We faced this difficulty during the Jmove migration, where the newly generated Web services have to read the Java files to analyze, and we managed it by refactoring the original system code. In that case, we replaced the local access to those files with a more adequate Web upload system in which the files are sent to the service via an HTTP connection. In our opinion, that solution can be generalized to those cases in which a local resource needs to be accessed by a remote Web service (clearly some problems can arise when the size of the resources increases a lot).
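The refactoring described for local resources can be sketched as follows, under the assumption that the analysis itself only needs the file content. The class SourceAnalysis, its two methods and the toy "count class declarations" analysis are hypothetical and are not taken from Jmove; the point is only the change of signature from a local path to uploaded content, which the client would send to the service over HTTP.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class SourceAnalysis {

    /** Before: the analysis opens a file on the local file system. */
    static int analyzeLocal(Path localFile) throws IOException {
        return analyzeUploaded(Files.readAllBytes(localFile));
    }

    /**
     * After: the service operation receives the file content itself,
     * so it no longer depends on the caller's file system.
     */
    static int analyzeUploaded(byte[] uploadedSource) {
        String text = new String(uploadedSource, StandardCharsets.UTF_8);
        int classDeclarations = 0;
        for (String line : text.split("\\R")) {
            // Toy analysis: count lines that start a class declaration.
            if (line.trim().matches("(public\\s+|final\\s+|abstract\\s+)*class\\s+\\w+.*")) {
                classDeclarations++;
            }
        }
        return classDeclarations;
    }
}
```

Keeping the path-based method as a thin wrapper lets the desktop code continue to work while the service exposes only the content-based operation.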


5.2.5 Code duplication

A service is an autonomous, independent entity that, to be used in a distributed context (and re-used in other applications/projects), needs to contain “internally” all the resources it uses (e.g., classes and libraries). Often, in our case studies, some classes of the original systems were used by a set of candidate services (i.e., common classes). To resolve the “common classes” problem, in this work, we adopted the following code duplication strategy¹⁴: (a) such classes have to be consistently copied in all the developed Web services that use them and (b) in the service that invokes the new Web services, if any, such classes have to be copied and some conversion operations implemented. That happens, for instance, in LaTazza where the class Euro is defined in the two entity candidate services: Cash and Depository. The sell-beverages Task service, which uses these services by means of their stubs, has access to both implementations of Euro (i.e., DepositoryStub.Euro and CashStub.Euro). Hence, to have a working sell-beverages Web service a conversion between the two implementations of Euro is required. Clearly the same problem can also be found in the Business unit (i.e., the portion of the application that invokes all the Web services).

5.2.6 JAX-RPC specification conformity

Axis2 and its tools (e.g., wsdl2java) work preferably with Java code written according to the JAX-RPC¹⁵ specification. To automatically convert a Java class into a Web service, Axis2 requires that (a) the class implements a public constructor without parameters, (b) methods are declared public and (c) parameters and return types of the methods are Java primitive types (e.g., int and char) or simple datatypes (e.g., String and array) that can be automatically serialized. The reason is that complex user-defined datatypes (e.g., composition of java.util.List and java.util.Enumeration) cannot be automatically managed and serialized by Axis2.

In our applications several classes violated that specification, e.g., Supply and PersonalRegistry in LaTazza and DumpRCMartinMetrics in Jmove. Therefore, a lot of effort has been required to make the code JAX-RPC compliant¹⁶ before the application migration. As an example, a method of the class PersonalRegistry of LaTazza returned a LinkedList. Thus, to make it conform to the JAX-RPC specification, we had to convert the LinkedList into an array.
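The three conditions listed above, and the LinkedList-to-array conversion applied to PersonalRegistry, can be illustrated with a minimal sketch. The class below is a hypothetical reduction: the real PersonalRegistry of LaTazza is not shown in this paper, so the field, the method names and the use of String elements are assumptions. The service class is nested in JaxRpcSketch only to keep the example in a single compilation unit.

```java
import java.util.LinkedList;
import java.util.List;

final class JaxRpcSketch {

    /** Hypothetical, simplified service class shaped to satisfy conditions (a)-(c). */
    public static class PersonalRegistryService {
        private final List<String> employees = new LinkedList<>();

        /** (a) public constructor without parameters. */
        public PersonalRegistryService() {
        }

        /** (b) public method, (c) simple parameter type. */
        public void addEmployee(String name) {
            employees.add(name);
        }

        /**
         * Before the migration this kind of method returned the LinkedList itself;
         * returning an array keeps the return type automatically serializable.
         */
        public String[] getEmployees() {
            return employees.toArray(new String[0]);
        }
    }
}
```

The internal representation can stay a LinkedList; only the types that cross the service boundary need to change.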

14 An alternative solution could be using XSD files shared by more services [7].

15 See http://java.sun.com/webservices/jaxrpc/docs.html.

16 An alternative was manually writing Serializers/Deserializers for each input/output.
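The conversion between the two stub-side implementations of Euro mentioned in Sect. 5.2.5 can be sketched as follows. Only the names DepositoryStub.Euro and CashStub.Euro come from the text; the cents field, its accessors and the EuroConverter class are assumptions, since the actual structure of Euro in LaTazza and of the generated stubs is not shown here.

```java
/** Hypothetical stand-in for the stub generated for the Depository service. */
final class DepositoryStub {
    static final class Euro {
        private long cents; // assumed representation of the amount

        long getCents() { return cents; }
        void setCents(long cents) { this.cents = cents; }
    }
}

/** Hypothetical stand-in for the stub generated for the Cash service. */
final class CashStub {
    static final class Euro {
        private long cents; // same data, but a distinct Java type

        long getCents() { return cents; }
        void setCents(long cents) { this.cents = cents; }
    }
}

/** Conversion code needed in the service (or Business unit) that uses both stubs. */
final class EuroConverter {
    static CashStub.Euro toCash(DepositoryStub.Euro source) {
        CashStub.Euro target = new CashStub.Euro();
        target.setCents(source.getCents());
        return target;
    }

    static DepositoryStub.Euro toDepository(CashStub.Euro source) {
        DepositoryStub.Euro target = new DepositoryStub.Euro();
        target.setCents(source.getCents());
        return target;
    }
}
```

Although the two Euro classes carry the same data, they are unrelated types for the compiler, which is why a value obtained from one service cannot be passed directly to the other.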

Table 1 Number of services migrated per application (N), number of testcases composing the testsuite (Ts), LOCs of the two applications before and after the migration

Appl.      N   Ts   Before   After    Diff
ATM        5   29    4,072    5,523   +1,451 (+36.6%)
LaTazza    8   21    6,184    9,711   +3,527 (+57%)
MTAC       1    5   10,095   10,342     +247 (+2.4%)
Jmove      3   20   30,515   34,176   +3,661 (+11.9%)

Diff is equal to (After−Before) with the increment per cent in parentheses

Table 2 LOCs before and after the migration divided by candidate services (only main)

Appl.      Candidate serv.       Before   After    Reused
ATM        Authentication           550      598      494 (82.6%)
           Accounts Handler         802      828      725 (90%)
           Phone Cards Handler      543      589      520 (88%)
           BankDB                   482      482      482 (100%)
LaTazza    Sell beverages         1,834    2,920      752 (25.7%)
           Pay debt               1,178    2,130      551 (25.8%)
           Supply                 1,250    1,734      660 (38%)
MTAC       Expr computation       7,008    7,144    6,985 (97%)
Jmove      Dependency analysis   17,001   17,375   16,531 (95%)
           Metrics computation   17,148   17,608   16,531 (93.8%)
           Statistics            17,576   18,036   16,681 (91%)

5.3 Collected data

Table 1 presents, for each considered Java application, some data regarding the two implementations: Before (i.e., the original object-oriented system) and After (i.e., the Web service-oriented version obtained after the migration). Precisely, for each application it reports the number of services migrated (N), the number of testcases composing the testsuite (Ts) and the number of LOCs of the two versions (Before and After; the service-stub classes automatically built by Axis2 are not counted in After, since they are generated and managed by the tool in a transparent way). The last column of the table reports the difference in LOCs between the two versions (i.e., After−Before) and, in parentheses, the percent increment that Before underwent to be transformed into After. That increase is mainly due to the code written for: (a) overcoming some obstacles encountered during the migration (see Sect. 5.2) and (b) implementing the code wrappers.

From Table 1 it is apparent that LaTazza undergoes the largest increment (+57%) with respect to the other applications, where the increase is smaller. That is mainly due to the fact that the student who performed the LaTazza migration applied a lot of refactorings directly to the original code instead of creating wrappers (as done in the other three applications). For instance, for solving code scattering and tangling problems some Java classes of the original LaTazza were mainly rewritten (by combining the scattered code and splitting the tangled code) rather than wrapped with ad hoc proxy/mock objects. That type of activity clearly improves the architectural quality of the final Web service-oriented system but requires a lot of effort.

Table 2 details the LOCs of the code fragments that implement the main candidate services of each application, Before and After the migration. For instance, the code that implements the sell beverages functionality in the original LaTazza counts 1,834 LOCs while the Web service providing that functionality counts 2,920 LOCs. Furthermore, the last column of Table 2 reports the number of LOCs that are shared (i.e., reused) between each initial candidate service and its

Web service counterpart. The percentage of shared code is given in parentheses and can be used to evaluate the effort required to carry out the migration (e.g., a high percentage means not much effort). For example, in LaTazza only 28.9% of the Web service-oriented system code (1,963 out of 6,784 LOCs) was reused while the rest of the code (4,821 LOCs) was written from scratch (see Table 2). This constitutes an exception; in the other cases (e.g., Jmove) the percentage of reuse is higher, on average 93%. As already explained in this section, this is due to the “strong” refactoring performed by the student on the original LaTazza code.

Table 3 shows the time expressed in hours, subdivided per activity, that has been spent to migrate the applications. As expected, the activities of understanding and selection of the candidate services were fairly demanding. This is particularly true for Jmove (16% of the total time), whose size complicated the program understanding phase. In LaTazza and ATM, the main problem was deciding the “proper” granularity of the services. The testsuite generation activity deserves a special comment. It was demanding in all applications but essential for the migration. In the time computed for this activity we have also included the time spent for refactoring the original application (refactoring for testing). This last operation was particularly demanding but it simplified the subsequent migration phases. Surprisingly, a huge amount of time (and effort) has been spent during the Wrapping and Deployment activities (e.g., 42.1% of the total time in LaTazza and 41.8% in Jmove). In particular, most of the time has been spent to make the applications conform to the JAX-RPC specification. Instead, the code identification/extraction activity was less demanding than expected (e.g., 18% of the total time in LaTazza and 13.1% in Jmove). LaTazza is an exception also considering the total time (last column of Table 3) since the overall migration time is higher with respect to the other applications. This difference is due to the fact that this migration was executed by a master student without much experience in SOA development.

Table 3 Time (h) spent to migrate the applications

Application   Understand/select   Testsuite   Identify/extract   Wrap/deploy   Refactor/test BU   Total
ATM           2.5                 6           5                  9             5.6                28.1
LaTazza       9                   18          15                 35            6                  83
MTAC          3                   1.5         1.3                4             6                  21.8
Jmove         11                  13          9                  28.7          6.9                68.6

5.4 Threats to validity and general considerations

The main causes that might limit the possibility of successfully applying our approach to other applications (i.e., generalization) are analyzed in this section. Moreover, we report here some critical considerations regarding the proposed approach and discuss the architectural quality of the migrated service-oriented applications.

The selected applications are small/medium Java applications belonging to different domains. This makes the context quite realistic, although further studies with bigger applications are necessary to confirm or refute the applicability of our approach. In particular, problems could arise when size and complexity of the applications increase, given that the approach is mainly manual and heavily based on testing. Moreover, we do not know whether the case of Java applications totally produced using frameworks and libraries is compatible with our approach. The selected applications are simple desktop applications implemented without using particular frameworks and libraries (exceptions are ATM that uses Hibernate and Jmove that partially uses ASM¹⁸ and Piccolo¹⁹).

We are conscious that the application of the proposed stepwise approach is demanding because, as we have seen, our approach privileges rigor over agility.
Even if testing is a particularly demanding ingredient of our approach, we consider it essential because it facilitates the “semantic equivalence” checks among the various versions, and hence improves the final quality of the result.

We are well aware that our approach tends to produce applications with fine-grained and functionality-centric services (i.e., services that provide a functionality) that in some cases could be unsuitable for a specific target (e.g., building a system based on stateful Web services containing part of the flow of the application). Moreover, we know that the applications produced using our method do not fully satisfy all the design principles beneath SOA (e.g., discoverability, dynamic service invocation [7]). However, it is important to remark here that the goal of the proposed migration approach

18 http://asm.objectweb.org/.

19 http://www.cs.umd.edu/hcil/jazz.

is to obtain “a preliminary” service-oriented implementation of the original system, not the “best one”. Thus, it is accepted that the migrated system could be “non-perfect” from a service-oriented perspective (e.g., granularity of the built services too fine). In some sense this work can be considered as a first step toward a complete SOA migration process. To complete the migration toward SOA other steps could be accomplished (we plan to consider them in our future work):

– an additional refactoring step could be applied to the system after the migration process for improving its architectural quality (e.g., services reusability, services loose coupling, services composability [7]);
– the Java orchestrator obtained with our approach could be substituted with a business process definition given in BPEL.

6 Related works

In this section we briefly summarize some approaches and methodologies described in the literature for migrating legacy applications into service-oriented systems. In the literature the following types of approaches can be identified:

– Business-oriented. Approaches that use information about the business process realized by the target application as starting point to perform its migration;
– Functionality-based. Approaches that are mainly focused on (a) identifying the functionalities implemented by the target system and (b) locating them in the code, e.g., using slicing or feature location techniques;
– Model-based. Approaches that build the new service-oriented system by starting from a model of the target system. This model is often extracted by applying reverse engineering techniques, e.g., starting from the analysis of system artifacts, such as code, execution traces and documentation;
– Interaction-based. Approaches using as a starting point for the migration the interactions between the user and the target system or between system components (e.g., database and business logic).

6.1 Business-oriented

The meet-in-the-middle approach, proposed by Inaganti and Behara [14], combines the bottom-up with the top-down approach to migrate existing applications. The steps of the approach are: (a) establishing the business process underlying the target application and identifying its business activities (top-down); (b) identifying the points of functionality in the existing system (bottom-up); (c) exposing as services the points of functionality; (d) mapping the identified services to the business process activities and (e) refining


the generated map with the aim of achieving a one-to-one map between activities and services (new services are created whenever a correspondence is lacking). The approach proposed is totally manual. Cetin et al. [5] introduce another business-oriented approach. This “Mashup approach” refines [14] in several directions and it does not re-engineer the existing source code. The Business model is expressed using BPMN (Business Process Modeling Notation) [24].

6.2 Functionality-based

Harry Sneed [21] proposes a research framework and several guidelines for migrating to Web services. The introduced framework is useful to recover Web services from existing legacy applications. It involves four main activities. First, discovering potential Web services (i.e., fragments of code realizing the desired functionality) from the existing application via inverse data flow analysis. Second, evaluating the potential discovered Web services, i.e., evaluating maintainability, testability, interoperability and reusability of the code as well as the “business value” of the functionalities implemented by this code. Third, extracting the Web service code and, finally, wrapping and deploying it to be reused as a Web service.

In a companion paper [16] we described a preliminary version of our stepwise approach and briefly summarized some data regarding the LaTazza case study. The present work refines the previous one in several directions (e.g., in the phase of code identification we replaced slicing with a “simplified version of feature location”) and reports two UML activity diagrams that better detail and explain the approach. Moreover, it extends our previous work by adding three further case studies that have been used to evaluate the applicability of the approach. Consequently some new SOA migration difficulties have been addressed. Our proposal can be considered an instantiation in the Java world of the framework [21] with new aspects (e.g., our approach is stepwise and the role of testing is central) and details on how to face practical problems involved in the migration.

Zhang et al. [25] present another service-oriented re-engineering process applicable to legacy systems that can be classified as functionality-based. Its distinguishing feature is the use of clustering during the code identification phase. Basically, the target application is clustered with the aim of identifying cohesive modules that can be wrapped together and become a service.

6.3 Model-based

Cao in his PhD dissertation [4] proposes a model-driven approach for re-engineering a legacy system to a Web service-oriented application. The approach involves three main activities. First, a UML model of the target system is recovered and then converted into an intermediate model format (expressed in ER). Second, the ER model is reinterpreted by a generic modeling environment (GEM). Finally the model is translated into a service-domain specific language based on WSDL.

Shimin et al. [19] present a framework (Jcomp) for extracting reusable components from a Java system. The steps are: (a) extraction of models (e.g., dependency graphs) from the target system using a source code parser; (b) identification and extraction of reusable components by applying a semiautomatic approach based on graph transformation; and (c) system conversion according to the identified components.

Guzman et al. [10] describe a model-based approach to re-engineer a relational database and integrate it into a SOA application. It is composed of four steps. First, the target database is reverse engineered and its schema is used to generate a Platform Specific Model (PSM). Second, according to the recovered schema and applying a model-to-model transformation, a Platform Independent Model (PIM) is obtained and a service candidate is identified. Third, the abstract objects contained in the PIM-based model are identified and analyzed for enriching the model with OCL constraints. Finally, using (a) the recovered models and (b) a WSDL metamodel, and (c) applying a model-to-model transformation, the WSDL description of the service is generated to expose the database as a service.

6.4 Interaction-based

Canfora et al. [3] present a “wrapping methodology” to make interactive functionalities of legacy systems accessible as Web services. The idea is to develop a Web service that is used as a proxy-module able to interact at run time with the legacy application. This module is built by analyzing the interactions between the user and the legacy application. The result of this analysis is a User Interaction Model, described using a Finite State Automaton, that is then used to develop the Web service. Del Grosso et al. [9] propose a semi-automatic approach for mining services in database oriented applications. The approach identifies “pieces of functionality” to be potentially exported as services by observing the interactions between the application and the database.

7 Conclusions

In this paper we have presented an approach to migrate a “traditional” Java application into an equivalent Web service-oriented application. The peculiarities of our approach are:


– it is stepwise; in fact the final Web service-oriented application is obtained step by step, transforming one candidate service at a time;
– the role of testing is central since after each migration step the new service-oriented application is tested with the aim of checking “its equivalence” with the previous version;
– it is tailored for Java and hence several free tools of the Java world, such as Axis2 and Junit, can be employed.

Our approach was successfully applied to four Java applications. This experimentation was useful for two main reasons: (a) for verifying that our approach is workable and applicable in practice and (b) for highlighting some difficulties associated with real SOA-migration projects.

Future work will be devoted to applying our migration approach to bigger and more complex applications for evaluating its scalability. Furthermore, we are also considering the possibility of developing an Eclipse plug-in implementing a SOA migration tool supporting our approach.

Acknowledgments

We would like to thank Prof. Egidio Astesiano and Dr. Paolo Tonella for their useful comments and suggestions. Filippo Ricca is supported by the project “Iniziativa Software” Finmeccanica (Research unity of Software Engineering coordinated by Prof. Egidio Astesiano, DISI, Genova, Italy) funded by Elsag-Datamat.

References

1. Alonso, G., Casati, F., Kuno, H., Machiraju, V.: Web Services: Concepts, Architectures and Applications. Springer, Heidelberg (2004)
2. Arumugan, N.: Building web services with user-defined data types. In: Developer.com (2007)
3. Canfora, G., Fasolino, A.R., Frattolillo, G., Tramontana, P.: Migrating interactive legacy systems to web services. In: Proceedings of 10th Conference on Software Maintenance and Reengineering (CSMR 2006), pp. 24–36. IEEE Computer Society Press, USA (2006)
4. Cao, F.: Model Driven development of Web Services and Dynamic Web services composition. Ph.D. dissertation, University of Alabama at Birmingham, USA (2005)
5. Cetin, S., Altintas, N., Oguztuzun, H., Dogru, A.H., Suloglu, O.T.S.: A mashup-based strategy for migration to service-oriented computing. In: IEEE International Pervasive Services. IEEE Computer Society Press, USA (2007)
6. Curbera, F., Goland, Y., Klein, Y., Leymann, F., Roller, D., Weerawarana, S.: Business process execution language for web services. Web page. Version 1.0, July 31 (2002)
7. Erl, T.: SOA principles of service design. In: Erl, T. (ed.) Service-Oriented Computing Series. The Prentice Hall, USA (2007)
8. Glatard, T., Emsellem, D., Montagnat, J.: Generic web service wrapper for efficient embedding of legacy codes in service-based workflows. In: Grid-Enabling Legacy Applications and Supporting End Users Workshop (2006)
9. Grosso, C.D., Penta, M.D., de Guzman, I.G.-R.: An approach for mining services in database oriented applications. In: IEEE European Conference on Software Maintenance and Reengineering (CSMR). IEEE Computer Society Press, USA (2007)
10. Guzman, G.R., Polo, I., Piattini, M.: An adm approach to reengineer relational database towards web services. In: Working Conference on Reverse Engineering (WCRE). IEEE Computer Society, USA (2007)
11. Haas, H., Brown, A.: Web Services Glossary. Technical report, W3C Working Group Note, World Wide Web Consortium (W3C). http://www.w3.org/TR/ws-gloss/ (2004)
12. Heumann, J.: Generating test cases from use cases. Technical report, Rational Software. http://www.ibm.com/developerworks/rational/library/content/RationalEdge/jun01/GeneratingTestCasesFromUseCasesJune01.pdff (2002)
13. Hung, M., Zou, Y.: Extracting business processes from three-tier architecture systems. In: International Workshop on Reverse Engineering to Requirements (wREtoR) (2005)
14. Inaganti, S., Behara, G.: Service identification: Bpm and soa handshake. In: BPTrends (2007)
15. Koch, N., Mayer, P., Heckel, R., Gonczy, L., Montangero, C.: UML for Service-Oriented Systems. Technical Report D1.4a, Sensoria, Munich, Germany. http://www.pst.ifi.lmu.de/projekte/Sensoria/del_24/D1.4.a.pdf (2007)
16. Marchetto, A., Ricca, F.: Transforming a java application in an equivalent web-services based application: toward a tool supported stepwise approach. In: IEEE International Symposium on Web Site Evolution (WSE). IEEE Computer Society Press, USA (2008)
17. Mugridge, R., Cunningham, W.: Fit for Developing Software: Framework for Integrated Tests. Prentice Hall, USA (2005)
18. Ricca, F., Penta, M.D., Torchiano, M., Tonella, P., Ceccato, M., Visaggio, C.: Are fit tables really talking? a series of experiments to understand whether fit tables are useful during evolution tasks. In: International Conference on Software Engineering (ICSE), pp. 361–370. IEEE Computer Society Press, USA (2008)
19. Shimin, L., Tahvildari, L.: Jcomp: A reuse-driven componentization framework for java applications. In: IEEE International Conference on Program Comprehension (ICPC). IEEE Computer Society Press, USA (2006)
20. Sneed, H.: Measuring reusability of legacy software systems. Softw. Process Improv. Pract. 4(1), 43–48 (1998)
21. Sneed, H.: Migrating to web services. a research framework. In: Workshop at CSMR 2007. Service-Oriented Architecture Maintenance (SOAM) (2007)
22. Sneed, H.: Cob2web, a toolset for migrating to web services. In: IEEE International Symposium on Web Site Evolution (WSE). IEEE Computer Society Press, USA (2008)
23. Sneed, H., Erdos, K.: Extracting business rules from source code. In: 4th International Workshop on Program Comprehension (WPC’96), pp. 240–250. IEEE Computer Society, USA (1996)
24. White, S.A.: BPMN Modeling and Reference Guide. Future Strategies Inc., Lighthouse Pt, FL (2008)
25. Zhang, Z., Liu, R., Yang, H.: Service identification and packaging in service oriented reengineering. In: The Seventeenth International Conference on Software Engineering and Knowledge Engineering (SEKE 2005), pp. 620–625 (2005)
