The Journal of Systems and Software 54 (2000) 99–110

www.elsevier.com/locate/jss

Decomposing legacy programs: a first step towards migrating to client-server platforms

Gerardo Canfora a,*, Aniello Cimitile a, Andrea De Lucia a, Giuseppe A. Di Lucca b

a Facoltà di Ingegneria - Università del Sannio, Palazzo Bosco Lucarelli, Piazza Roma, 82100 Benevento, Italy
b Dipartimento di Informatica e Sistemistica - Università di Napoli "Federico II", Via Claudio 21, 80125 Naples, Italy

Received 8 June 1999; received in revised form 22 November 1999; accepted 26 December 1999

Abstract

A solution to the problem of salvaging the past investments in centralised, mainframe-oriented software development, while keeping competitive in the dynamic business world, consists of migrating legacy systems towards more modern environments, in particular client-server platforms. However, a migration process entails costs and risks that depend on the characteristics of both the architecture of the source system and the target client-server platform. We propose an approach to program decomposition as a preliminary step for the migration of legacy systems. A program slicing algorithm is defined to identify the statements implementing the user interface component. An interactive re-engineering tool is also presented that supports the software engineer in the comprehension of the source code during the decomposition of a program. The focus of this paper is on the partition of a legacy system, while issues related to the re-engineering, encapsulation, and wrapping of the legacy components and to the definition of the middleware layer through which they communicate are not tackled. © 2000 Elsevier Science Inc. All rights reserved.

1. Introduction

Many large organisations have invested in the past 30 years in centralised, mainframe-oriented software development. These software systems, sometimes called legacy systems, have reached a crisis point because of two main factors: high maintenance costs and scarce flexibility. Legacy system maintenance often monopolises the effort of software organisations: whilst figures vary, there is general agreement that software maintenance, including error corrections, adaptations, and enhancements, consumes between 50% and 70% of the budget of a typical software organisation (Lientz and Swanson, 1980; Nosek and Prashant, 1990). Among the main causes of the high maintenance costs of a legacy system are a poor design, the size, the degradation of the structure due to the frequent maintenance operations, the inadequacy of the documentation, and the use of obsolete technologies, languages, and tools. A poor design, the inability to anticipate the changes desirable today, and the obsolescence of technologies, languages, and tools are the main reasons for the scarce flexibility of legacy systems and, in particular, for their inability to evolve together with the continuously changing needs of an organisation: legacy systems often keep an organisation's business from staying on top (Bennett, 1995; Brodie and Stonebraker, 1995).

* Corresponding author. Tel.: +39-0824-305804; fax: +39-082421866.

0164-1212/00/$ - see front matter © 2000 Elsevier Science Inc. All rights reserved. PII: S0164-1212(00)00030-3

A solution to the problem of salvaging the past investments, while keeping competitive in the dynamic business world, is migrating legacy systems towards more modern environments, in particular client-server platforms (Butler, 1996). A key concept of client-server computing is the organisation of software systems into a set of loosely coupled components. In particular, a typical client-server system comprises a client component (sometimes called a front-end) that is in charge of managing and co-ordinating the processing components, and one or more server components (also called back-ends) which actually process information based on the client's requests. Client and server components do not necessarily have to be executed on different machines, even if this is the most frequent case.

The process of migrating a legacy system towards a client-server platform entails costs and risks that depend on the characteristics of both the architecture of the source system and the target client-server platform. A software system can be considered as having three types of components: interface components, application logic components, and database components. Depending on how well separated and identified these components are, the architecture of a legacy system can be decomposable, semidecomposable, or nondecomposable (Brodie and Stonebraker, 1995). In a decomposable system, the application logic components are independent of each other and interact with the database components and potentially with their own user and system interfaces. In a semidecomposable system, only interfaces are separate modules, while application logic components and database services are not separated. A nondecomposable system has the worst architecture: the system is a black box with no separated components.

The architecture of the target client-server platform can vary depending on how the system components are distributed between the client and the server machines. For example, the Gartner Group identifies five styles of client-server computing depending on where the border between the client and the server is placed (Butler, 1996): at one end is the case when the user interface is split between the client and the server and the remaining components run on the server, while at the other end is the case when data management is split between the server and the client, which also hosts the other components. To address the complexity of building client-server systems, a middleware layer is very often placed at the border between clients and servers; it consists of specialised communication software that manages the interaction between the various components of the system.
Notable examples of middleware include the Open Software Foundation's (OSF's) Distributed Computing Environment (DCE),¹ the Object Management Group's (OMG's) Common Object Request Broker Architecture (CORBA),² and Microsoft's Distributed Component Object Model (DCOM).³

¹ http://www.opengroup.org/dce.
² http://www.omg.org, http://www.corba.org.
³ http://www.microsoft.com/com.

In this paper, we deal with software systems composed of programs each of which may comprise all types of components and propose a technique to decompose them. Our decomposition technique is based on slicing an interprocedural control dependence representation of a system. More specifically, slicing is used to identify all the statements and predicates that implement, or control, I/O operations; these statements define the minimal user interface component of the system. Accordingly, each program is reengineered to a client-server style by extracting and encapsulating in separate subroutines code fragments implementing database components and application logic components. As a consequence of this re-engineering, all calls are directed from the user interface component to subroutines that implement database or application logic components. This constitutes a preliminary step for migration to a client-server platform, as the extracted subroutines may be wrapped and moved to the server. The focus of this paper is on the partition of a legacy system, while issues related to wrapping of the legacy components, to the definition of the middleware layer through which they communicate, and to the re-implementation of the user interface with modern, event-driven GUI languages, are not tackled.

The remainder of the paper is organised as follows. Section 2 sets the goals and discusses different strategies for program decomposition depending on the target client-server styles. Section 3 presents our program decomposition slicing algorithm for identifying the set of statements that implement the user interface component of a system. In Section 4, an interactive tool for re-organising the procedural logic of programs is presented; the tool implements the slicing algorithm shown in Section 3 and produces a program that can be more easily migrated to client-server platforms than the original one. Section 5 discusses related work, while concluding remarks are given in Section 6.

2. Program decomposition strategies

Decomposing a program entails the identification and the reorganisation of different program components. Basically, these components can be divided into user interface components, application logic components, and database components. User interface components correspond to code fragments clustered around I/O statements on WORKSTATION files, where the term WORKSTATION file refers to any stream read/written through the user interface. A database component might be any piece of code clustered around I/O statements on DISK files, including chunks of embedded SQL or 4GL code. Application logic components correspond to code fragments implementing business rules. The user interface components include statements that directly or indirectly control the execution of I/O statements on WORKSTATION files, in addition to the I/O statements themselves. Indeed, in a client-server system the client should mainly implement the user interface, have the control of the system, and request the services needed (mainly database accesses) from the server. Accordingly, user interface components are typically migrated to the client, while database components are migrated to the server. Application logic components can be migrated either to the client or to the server, or they can even be split between the server and the client, depending on the chosen style (see Fig. 1).

Fig. 1. Client-server styles.

For a program to be ready to be migrated to a client-server platform, the following conditions should be satisfied:
1. the program is structured in subroutines each implementing a different type of component;
2. each subroutine implementing a database component must not reach, on an interprocedural path, programs or subroutines implementing user interface components.

One of the main difficulties that prevent migration is the interleaving in the same subroutine of statements implementing user interface and database components; a particular case is when subroutines implementing database components (which should then be migrated to a server) contain calls to subroutines implementing user interface components (which should be migrated to the client). Fig. 2 shows the call graph of a typical procedural program visualised through VCG (Lemke and Sander, 1993), a tool that displays a graph from a textual specification. Different colours (levels of grey in this printed version) are used to distinguish between different classes of subroutines: subroutines that do not contain statements of user interface components are white, while subroutines containing statements of both user interface and database components are light grey; subroutines that contain statements of user interface and do not contain statements of database components are grey. Fig. 3 shows the call graph of the same program in a version prepared for migration: here no subroutine contains statements of both user interface and database components. The new call graph can be divided by an ideal border line into two parts: the subroutines of the client component (in grey) are in the higher part of the call hierarchy, while the lower part contains the subroutines of the server components (in white). The border line crosses all the edges corresponding to calls between subroutines in the client component and subroutines in the server components; these calls will have to be converted into service requests.

Preparing a program for migration requires the decomposition of the subroutines that contain statements of both user interface and database components. The complexity of the decomposition techniques depends on the target client-server style chosen. In fact, if the application logic components are to be allocated to the client (case (c) in Fig. 1), the decomposition simply requires that all the I/O statements on DISK files are identified and replaced by requests to the server. In the other cases, more complex analyses are required to identify the set of statements that will certainly be included in the user interface components. In Section 3, we present a slicing algorithm for identifying the set of statements contributing to implement the user interface components; this defines the minimal set of statements to be included in the client application.

If the application logic components have to be migrated to the server (case (a) in Fig. 1), then automatic techniques can be defined to extract the maximal sequences of statements including I/O statements on DISK files but not including any statement of the user interface component. Each such sequence of statements will implement a service on the server and must be encapsulated in a separate subroutine to meet the requirements set in points 1 and 2. However, completely automatic decomposition techniques might produce meaningless and too fragmented subroutines which would be difficult to understand, thus compounding the problem of future maintenance. An alternative approach consists of supporting the software engineer with semiautomatic decomposition tools that facilitate the comprehension of the code fragments to be extracted and encapsulated in subroutines.
This is also the only approach usable when the application logic components are to be split between client and server (case (b) in Fig. 1). Section 4 presents an interactive program decomposition tool which allows notable code fragments to be selectively extracted and encapsulated in subroutines. The tool prevents the extraction of the statements of the user interface components (identified by program slicing) and forces the extraction of all the I/O statements on DISK files.
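The two readiness conditions stated in this section can be checked mechanically once each subroutine has been classified. The following is a minimal sketch, not part of the original tool: it assumes a call graph given as a dictionary from caller to callees and a hypothetical classification `kinds` that maps each subroutine to the component types ("ui", "db", "logic") its statements implement. All names are illustrative.

```python
from collections import deque


def reachable(call_graph, start):
    """Return every subroutine reachable from `start` through one or more calls."""
    seen = set()
    queue = deque(call_graph.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(call_graph.get(node, ()))
    return seen


def migration_violations(call_graph, kinds):
    """List (subroutine, reason) pairs that break condition 1 or condition 2."""
    violations = []
    for sub, sub_kinds in kinds.items():
        # Condition 1: a subroutine should implement a single type of component.
        if len(sub_kinds) > 1:
            violations.append((sub, "mixes component types " + ", ".join(sorted(sub_kinds))))
        # Condition 2: database code must not reach user interface code.
        if "db" in sub_kinds:
            for callee in sorted(reachable(call_graph, sub)):
                if "ui" in kinds.get(callee, set()):
                    violations.append((sub, f"reaches user interface subroutine {callee}"))
    return violations


# Hypothetical program: LOAD is database code but calls ERRMSG, which writes to the screen.
call_graph = {"MAIN": ["SHOW", "LOAD"], "LOAD": ["READDB", "ERRMSG"]}
kinds = {"MAIN": {"ui"}, "SHOW": {"ui"}, "LOAD": {"db"},
         "READDB": {"db"}, "ERRMSG": {"ui"}}
print(migration_violations(call_graph, kinds))
```

In this example the only violation reported is that LOAD reaches ERRMSG; moving the call to ERRMSG up into MAIN yields an empty list, which corresponds to the situation depicted for the decomposed program.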

3. Identifying user interface components

Most program decomposition algorithms exploit graph representations of programs at different levels of abstraction. The decomposition algorithm presented in this paper uses representations carrying information about the control flow of programs at the intraprocedural and interprocedural levels.


Fig. 2. A VCG visualisation of the call graph of a program.

Fig. 3. A VCG visualisation of the call graph of a decomposed program. The border line separates the subroutines of the client (the higher part) from the subroutines of the server (the lower part).

3.1. Background

We refer to software systems written in procedural languages, such as COBOL, FORTRAN, and RPG. Such systems are typically composed of a set of programs related through external calls and can be represented by a graph whose nodes represent the programs and whose edges depict the call relation between programs; we will refer to such a representation as the system call graph. Each program in a software system might in turn be structured in a set of subroutines related through internal calls. Several languages, such as RPG and FORTRAN, provide primitives to explicitly define subroutines and to express internal calls; in the case of COBOL, subroutines correspond to (sequences of) sections or paragraphs and internal calls are expressed as PERFORM (THRU) statements. The call relation on the subroutines of a program can be represented by a graph whose nodes correspond to the program subroutines and whose edges depict the internal calls; we will refer to such a representation as a program call graph. A program call graph always has a main, i.e. a subroutine reaching any other subroutine on the call graph. Conversely, a system call graph does not necessarily have a main, as a software system might comprise different programs activated through the operating system shell or within a job.

Finally, each program subroutine can be depicted by low level representations such as a control flow graph and a control dependence graph. A control flow graph is a directed graph, where each node represents a predicate or a statement and each edge represents a transfer of the control between statements. Nodes corresponding to predicates have two outgoing edges representing conditional transfers of the control; these edges are labelled "T" (true) and "F" (false). Nodes corresponding to statements have one unlabelled outgoing edge. Conditional transfers of control generate control dependencies (Ferrante et al., 1987). Informally, a node m is control dependent on a node n if and only if n has two outgoing edges; following one of the edges always results in m being executed, while taking the other edge may result in m not being executed. If the edge which always causes the execution of m is labelled with true (false, respectively) we say m is control dependent on the true (false) branch of n. Control dependencies can be represented by a control dependence graph (Ferrante et al., 1987); this is a directed graph containing the same set of nodes as the control flow graph and whose edges depict the control dependence relation.⁴

⁴ Ferrante et al. (1987) also introduce region nodes to summarise control dependencies; this is not required for our purposes.
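The informal definition of control dependence given above can be made concrete with post-dominators, in the spirit of Ferrante et al. (1987). The sketch below is an illustration under stated assumptions rather than the construction used by the authors: the control flow graph is a plain dictionary from node to successor list, every node is assumed to reach a single exit node, and the "T"/"F" edge labels are omitted for brevity. A node m is reported as control dependent on a predicate n when m post-dominates one successor of n without strictly post-dominating n itself.

```python
def post_dominators(cfg, exit_node):
    """Map each node to the set of nodes that post-dominate it (itself included)."""
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:  # iterate until the sets stop shrinking
        changed = False
        for n in nodes - {exit_node}:
            succs = cfg.get(n, [])
            if not succs:
                continue
            new = {n} | set.intersection(*(pdom[s] for s in succs))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def control_dependences(cfg, exit_node):
    """Map each node m to the set of predicates n on which m is control dependent."""
    pdom = post_dominators(cfg, exit_node)
    deps = {}
    for n, succs in cfg.items():
        if len(succs) < 2:  # only predicates generate control dependencies
            continue
        for s in succs:
            for m in pdom[s]:  # m is executed whenever the edge n -> s is taken
                if m == n or m not in pdom[n]:  # ...but not on every path from n
                    deps.setdefault(m, set()).add(n)
    return deps


# Hypothetical subroutine: predicate p selects statement a or b, then c always runs.
cfg = {"entry": ["p"], "p": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["exit"], "exit": []}
print(control_dependences(cfg, "exit"))
```

For this graph only a and b depend on p; c is executed on both branches and therefore has no control dependence.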

The decomposition algorithm proposed in this paper exploits a unified representation combining the four aforementioned graphs; we call such a representation the system graph. In a system graph, the nodes corresponding to I/O statements are annotated with information about the WORKSTATION or DISK file involved, while nodes corresponding to call statements are linked to the node representing the called subroutine or program. Fig. 4 shows the skeleton of a software system and its system graph. The system graph is a variant of the unified interprocedural graph proposed by Harrold and Malloy (1993) as a core representation for a maintenance environment, with the aim of saving on both storage space and access time.

Fig. 4. A software system skeleton and its system graph.

3.2. The slicing algorithm

Identifying the user interface component of a software system can be achieved through program slicing starting from the I/O statements on WORKSTATION files in the interactive programs and backward searching for all the call statements and predicates which directly or indirectly control their execution. Program slicing has been introduced by Weiser (1984) as a program decomposition technique based on the analysis of the control and data flow. Different versions of program slicing have been proposed and used for different applications, including software maintenance, program comprehension, testing, function recovery, program parallelization and optimization, program integration, and software metrics. Surveys about program slicing techniques and their applications can be found in (Tip, 1995; Binkley and Gallagher, 1996; Gallagher and Harman, 1998).


Fig. 5. The procedure ComputeUserInterface.

Fig. 6. The procedure ComputeSlice.

Our interprocedural slicing algorithm for identifying the set of statements contributing to implement user interface components is based on the analysis of the control flow information of a software system summarised in a system graph. The algorithm is shown in Fig. 5: it takes a system graph G as an input and returns the set UserInterfaceSet of nodes corresponding to statements and predicates which contribute to implement user interface components. The algorithm also returns the set InteractiveProgramSet of the programs containing such statements.

The algorithm analyses all the subroutines of all the programs in the system. The programs are visited according to the partial order induced by the reverse external call relation on the system call graph:⁵ in this way each program is analysed only after the programs it calls have been analysed (the list ProgramList contains the program nodes in such order). Similarly, the subroutines of a program are visited according to the partial order induced by the reverse internal call relation on the program call graph (the list SubroutineList contains the subroutine nodes in such order). For each subroutine, the algorithm computes the initial set StartSet of control flow graph nodes that correspond to I/O statements on WORKSTATION files, or call statements to programs or subroutines that might cause the execution of I/O statements on WORKSTATION files (contained in the sets InteractiveProgramSet and InteractiveSubroutineSet, respectively). The slice SliceSet is actually computed from StartSet using the procedure ComputeSlice shown in Fig. 6: the control dependencies are backward traversed, starting from the initial set of nodes; all nodes reached during this transitive closure are included in the slice. If the set SliceSet returned for a subroutine is not empty, the subroutine, and the program that contains it, might cause the execution of I/O statements on WORKSTATION files and therefore they are inserted in the sets InteractiveSubroutineSet and InteractiveProgramSet, respectively. The set UserInterfaceSet is also updated with the nodes in the current slice SliceSet.

⁵ This partial ordering is only defined in the absence of recursion, as in the case of legacy systems written in RPG, COBOL, and FORTRAN.

The time complexity of the algorithm depends on the number $n_p$ of programs in the software system, on the maximum number $n_s$ of subroutines contained in a program and on the maximum number $n_n$ of statements in a subroutine. It is worth noting that, using a suitable data structure for implementing a call graph, the construction of the ordered lists ProgramList and SubroutineList can be obtained in linear time with respect to the number of nodes in the graph. Therefore, the complexity of the algorithm is $O(n_p \cdot n_s \cdot C_s)$, where $C_s$ is the time complexity for computing the slice for a single subroutine. Note that the time complexity for building the control dependence graph is $O(n_n^2)$ and the time complexity of the procedure ComputeSlice is $O(n_n)$ (Ferrante et al., 1987). Finally, the time complexity for updating the set UserInterfaceSet is linear with respect to the number of nodes contained in SliceSet (because the two sets are disjoint). Therefore, the overall complexity of the algorithm is $O(n_p \cdot n_s \cdot n_n^2)$. If the control dependence graphs have previously been constructed, the time complexity becomes $O(n_p \cdot n_s \cdot n_n)$.
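Since only the captions of Figs. 5 and 6 are available here, the following sketch restates the procedure described in the text; it is a reading of the prose, not a transcription of the authors' pseudocode. The encoding of the system graph is an assumption made for the example: each subroutine is a dictionary with its control dependence edges (`cdg`, node to the predicates it depends on), its WORKSTATION I/O nodes, and its call nodes. The callee-first visiting order is obtained with the standard `graphlib.TopologicalSorter`, which raises an error on recursive call graphs, in line with footnote 5. The set of interactive subroutines is kept per program, a simplification of InteractiveSubroutineSet.

```python
from graphlib import TopologicalSorter


def compute_slice(cdg, start_set):
    """Backward transitive closure of control dependencies from start_set."""
    slice_set = set(start_set)
    worklist = list(start_set)
    while worklist:
        node = worklist.pop()
        for predicate in cdg.get(node, ()):
            if predicate not in slice_set:
                slice_set.add(predicate)
                worklist.append(predicate)
    return slice_set


def callees(sub, kind):
    """Targets of the call nodes of one subroutine, filtered by call kind."""
    return {target for (k, target) in sub["calls"].values() if k == kind}


def compute_user_interface(system):
    """Return (UserInterfaceSet, InteractiveProgramSet) for the assumed encoding."""
    user_interface, interactive_programs = set(), set()
    external = {p: set().union(*(callees(s, "program") for s in subs.values()))
                for p, subs in system.items()}
    for prog in TopologicalSorter(external).static_order():  # called programs first
        if prog not in system:
            continue  # called program whose source is not part of the system
        subs, interactive_subs = system[prog], set()
        internal = {name: callees(s, "subroutine") for name, s in subs.items()}
        for name in TopologicalSorter(internal).static_order():  # callees first
            if name not in subs:
                continue
            sub = subs[name]
            start = set(sub["workstation_io"])  # StartSet
            for node, (kind, target) in sub["calls"].items():
                if (kind == "program" and target in interactive_programs) or \
                   (kind == "subroutine" and target in interactive_subs):
                    start.add(node)
            slice_set = compute_slice(sub["cdg"], start)  # SliceSet
            if slice_set:
                interactive_subs.add(name)
                interactive_programs.add(prog)
                user_interface |= {(prog, name, n) for n in slice_set}
    return user_interface, interactive_programs


# Hypothetical system: PGM_A.MAIN tests n1, conditionally calls SHOW (n2), then calls PGM_B (n3).
system = {
    "PGM_A": {
        "MAIN": {"cdg": {"n2": {"n1"}}, "workstation_io": set(),
                 "calls": {"n2": ("subroutine", "SHOW"), "n3": ("program", "PGM_B")}},
        "SHOW": {"cdg": {}, "workstation_io": {"s1"}, "calls": {}},
    },
    "PGM_B": {"MAIN": {"cdg": {}, "workstation_io": set(), "calls": {}}},
}
ui_set, interactive = compute_user_interface(system)
```

In this example the slice contains s1 in SHOW and both n1 and n2 in MAIN of PGM_A, while the call n3 to the non-interactive PGM_B stays outside the user interface component.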

4. Re-organising the procedural logic of legacy programs

This section presents an interactive tool that supports the software engineer in the extraction and encapsulation in separate subroutines of meaningful sequences of statements implementing application logic and/or database components. The tool gives the software engineer the possibility of deciding whether application logic components have to be allocated to the client or to the server. Moreover, it prevents the extraction of statements contributing to implement user interface components, while forcing the extraction of database accesses. The tool is part of a larger environment for reverse engineering and migrating legacy systems mainly written in RPG on the IBM AS/400 machine towards object-oriented platforms (De Lucia et al., 1997). Fig. 7 shows the decomposition tool architecture.

Fig. 7. The architecture of the decomposition tool.


Static analysers for the RPG/400, SQL and DDS (Data Description System) languages⁶ have been implemented using the YACC compiler-writing facility and Visual Age C++ (IBM, 1995). The analysers produce both fine-grained and coarse-grained representations, including the system graph described in the previous section, and store them into a relational database, implemented with the DB2 relational database management system. Each node of the control flow graph is annotated with information about the variables it defines and/or uses and is linked to the corresponding RPG statement, also stored in the database. Embedded SQL is treated as a single node of the control flow graph and is linked to fine-grained information stored in other representations. Information about the structure of records, files, arrays, key and parameter lists is also contained in the database. Finally, the tool provides facilities for exporting graphical views of the software system in the form of VCG specifications.

The program slicer implements the decomposition algorithm discussed in Section 3. It builds the control dependence graph of a subroutine, based on the control flow information extracted by the RPG analyser; the control dependencies are exploited, together with information about external and internal calls and I/O statements on WORKSTATION files, to identify the nodes corresponding to statements and predicates contributing to implement the user interface component. The identified slice is stored in the database and is exploited during the reorganisation of the program subroutines.

The browser consists of a graphical user interface which visualises an interactive program (these programs correspond to the system graph nodes in the set InteractiveProgramSet returned by the slicing algorithm) chosen by the software engineer and allows sequences of statements to be selected, extracted, and encapsulated into new subroutines.
⁶ A typical RPG software system on AS/400 is composed of programs mainly written in RPG/400, with possible embedded SQL code, and the OS/400 operating system control language CL; moreover, the OS/400 environment provides a Data Description System (DDS) language, for the description of file records and tables of the native database.

The software engineer can select any sequence of statements that does not contain statements of the user interface component (see Fig. 8(a)). The tool visualises the latter statements (corresponding to nodes included in the slice UserInterfaceSet returned by the slicing algorithm) with right arrows (→) on the left side and prevents their selection by displaying an error message. The tool also marks with left arrows (←) DISK file operations, which conversely should be extracted from the client component. A warning message is displayed at the end of the session if there are subroutines containing both DISK file operations and statements of the user interface component; the tool provides facilities for localising these subroutines.

Fig. 8. The decomposition tool: selecting a sequence of statements (a), encapsulating the extracted sequence in a subroutine (b), and replacing the extracted sequence by a call (c).
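The two checks the browser is described as performing (rejecting a selection that touches the user interface slice, and warning about subroutines that still mix DISK operations with user interface statements) can be sketched as follows. This is an illustrative model only: the `Statement` record, the function names, and the RPG-like statement texts are assumptions made for the example and do not reflect the tool's internal data model.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    text: str
    in_ui_slice: bool = False  # would be shown with a right arrow by the tool
    disk_io: bool = False      # would be shown with a left arrow by the tool


def can_extract(statements, first, last):
    """A selection [first, last] is allowed only if it avoids the user interface slice."""
    return not any(s.in_ui_slice for s in statements[first:last + 1])


def mixed_subroutines(program):
    """Names of subroutines that still hold both DISK I/O and user interface statements."""
    return sorted(
        name for name, stmts in program.items()
        if any(s.in_ui_slice for s in stmts) and any(s.disk_io for s in stmts)
    )


# Hypothetical program with illustrative statement texts.
program = {
    "MAIN": [
        Statement("EXFMT SCREEN1", in_ui_slice=True),
        Statement("CHAIN CUSTFILE", disk_io=True),
        Statement("ADD 1 COUNT"),
    ],
    "CALC": [Statement("MULT RATE AMOUNT")],
}
print(can_extract(program["MAIN"], 1, 2))  # selection without user interface statements
print(can_extract(program["MAIN"], 0, 1))  # selection that includes the screen operation
print(mixed_subroutines(program))          # subroutines that would trigger the warning
```

Once the DISK operation in MAIN is extracted into its own subroutine, MAIN no longer appears in the list returned by `mixed_subroutines`, which mirrors the end-of-session condition described above.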


Extracting and encapsulating a selected sequence into a subroutine (see Fig. 8(b)), and replacing it with a subroutine call (see Fig. 8(c)), is performed automatically by the subroutine reorganiser; the software engineer is in charge of assigning a name to the new subroutine according to the meaning of the code fragment extracted. A description of the business rule the subroutine implements can be optionally added; this will be automatically included in the reorganised program in the form of a comment.

5. Related work

Other approaches have been presented in the literature addressing the problem of migrating legacy systems to client-server architectures. Sneed and Nyary (1994) propose three different approaches (procedural, functional, and data type) for re-modularising and downsizing legacy systems from a centralised, mainframe environment to a distributed client-server platform. The procedural approach views a program as a directed graph with the decisions as nodes and the branches as edges; complex graphs are split into subgraphs by finding the points of minimum interconnections. Other procedural approaches exploit program slicing techniques for program restructuring (Kim and Kwon, 1994) or to identify re-usable functions (Lanubile and Visaggio, 1997). The functional approach assumes that a program has been designed according to a functional decomposition; the program can be viewed as a hierarchy of superordinate control nodes and subordinate elementary nodes and each business rule can be mapped onto one or more control nodes. Examples of remodularisation techniques based on the functional approach can be found in Cimitile and Visaggio (1995), Markosian et al. (1994) and Müller et al. (1993). The data type approach views a program as a set of cooperative processing objects; modules are constructed clustering together the set of operations upon a given data type or entity.
Several methods have been proposed in the literature for identifying object-oriented architectures in legacy systems; see Breuer et al. (1993), Canfora et al. (1996) and Lindig and Snelting (1997) for examples. Whilst all these approaches are suitable for migrating the architecture of legacy systems composed of batch programs, they do not address the problem of separating user interface components from database and application logic components. This is a preliminary step for migrating interactive programs to client-server platforms. This issue is addressed in Sneed (1997), where the need to separate and encapsulate in object wrappers the business logic and the data accesses is pointed out. However, unlike our approach, the paper does not present a method to achieve such a decomposition.

Further work related to the problem of migrating legacy systems to client-server architectures concerns user interface re-engineering. Merlo et al. (1995) propose a technique for re-engineering CICS based user interfaces in COBOL programs into graphical interfaces for client-server architectures. Character based interface components are extracted and converted into specifications in the Abstract User Interface Description Language (AUIDL); then they are reengineered into graphical AUIDL specifications and used to generate the graphical user interfaces. The authors outline the need for investigating slicing and dependence analysis to integrate the new interface code into the original system. In this sense, our approach is complementary: it can be used to reengineer the program control flow logic, before converting the character based user interface into a graphical user interface.

Other approaches have been presented for migrating user interface components from one platform to another, although they do not address the problem of migrating legacy systems from a mainframe to a client-server platform (Moore et al., 1994; Van Sickle et al., 1993). Van Sickle et al. (1993) propose a technique for converting user interface components in large minicomputer applications, written in COBOL, to run under CICS on an IBM mainframe. Moore et al. (1994) propose a knowledge-based approach for user interface migration. Although generally interesting, this approach has only been tried out for migrating graphical user interfaces from an MS-Windows based PC environment to a Motif based POSIX workstation environment.

6. Experiences and concluding remarks

The work described in this paper has addressed the problem of decomposing a program into different types of components for migrating legacy systems to client-server platforms.
The proposed approach exploits static analysis and program slicing techniques for identifying the set of program statements that contribute to implementing database and user interface components. An interactive tool has also been presented for extracting database and application logic components to be allocated to the server. The method and the tool presented in this paper have been tried out in a pilot migration project conducted by two software engineers on a medium-sized software system for the management of the administrative services of a peripheral government organisation. The system consisted of 124 RPG programs, 89 DDS modules and 65 CL procedures (jobs). The overall size of the system is about 100 KLOC. Most RPG programs were well structured in subroutines; about 70% were interactive programs, coded according to a procedural style. Using the tool described in Section 4, these programs were decomposed into user interface and application domain components and reorganised according to a client-server logic.

Whilst the limited extent of the experiences does not allow definitive conclusions, there are a number of useful considerations that can already be drawn:
· the identification of the statements implementing the minimal user interface component of a system can be fully automated. We recorded very good performance for our slicing tool, which required less than 5 min to analyse 100 KLOC;
· the identification of application logic and database components is a human-intensive task. The effort required to decompose the interactive programs was two man-months. In general, the effort required to decompose a legacy system into its components depends on the quality of the system and on the number of interactive programs;
· highly interactive legacy systems may lead to a considerable decomposition of procedural programs into finer grained code components. This may increase re-engineering risks and costs and may also degrade the performance of the reengineered client-server system. In general, the re-engineering risks and costs and the performance of the resulting system should be evaluated when extracting server components from the user interface component, to avoid excessive fragmentation of the original program.

To better understand the strengths and the limitations of our approach, we have conducted a further re-engineering step: we have extracted and encapsulated in separate programs the previously generated subroutines implementing application logic and database components. Accordingly, calls to these subroutines in the original programs have been replaced with program calls. This re-engineering task required intensive use of data flow analysis to identify the interface data of the new programs (Cimitile et al., 1998). Then we have wrapped the resulting programs and migrated them to the server (Canfora et al., 1999).
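As an illustration of what identifying the interface data of an extracted program involves, the following Python sketch computes the inputs and outputs of a subroutine from the variables each statement defines and uses. It is not the authors' algorithm (see Cimitile et al., 1998 for that): it assumes a straight-line subroutine body with no branches or loops, and the names `Stmt`, `interface_data`, and the RPG-like variables `PRICE`, `QTY`, `RATE`, `TOTAL` and `TAX` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """One statement, reduced to the variables it assigns and reads."""
    defs: frozenset
    uses: frozenset


def interface_data(body, live_after):
    """Return (inputs, outputs) for a straight-line subroutine body.

    inputs  : variables read before the body assigns them, so the caller
              must pass them to the extracted program.
    outputs : variables assigned by the body that the caller still reads
              after the call (given here as the set live_after).
    Assumption: no branches or loops; a real analysis would work on the
    control flow graph instead of a flat statement list.
    """
    defined = set()
    inputs = set()
    for stmt in body:
        # A use counts as an input only if nothing earlier in the body set it.
        inputs |= set(stmt.uses) - defined
        defined |= set(stmt.defs)
    outputs = defined & set(live_after)
    return inputs, outputs


# Hypothetical subroutine body:
#   TOTAL = PRICE * QTY
#   TAX   = TOTAL * RATE
body = [
    Stmt(defs=frozenset({"TOTAL"}), uses=frozenset({"PRICE", "QTY"})),
    Stmt(defs=frozenset({"TAX"}), uses=frozenset({"TOTAL", "RATE"})),
]

if __name__ == "__main__":
    inputs, outputs = interface_data(body, live_after={"TAX", "PRICE"})
    print(sorted(inputs))   # ['PRICE', 'QTY', 'RATE']
    print(sorted(outputs))  # ['TAX']
```

In this example `TOTAL` is neither an input nor an output: it is assigned inside the body before being read and is not used by the caller afterwards, so it can stay local to the new program.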
A team of software engineers is currently re-implementing the client (user interface component) in Visual Age C++ on top of Microsoft Windows. The results we have obtained from a preliminary testing activity confirm that keeping fragmentation to a minimum during the decomposition of interactive programs is key to keeping the performance and the reliability of the final system in line with user expectations. Our approach can be extended in two main directions. A first direction consists of defining techniques to support the software engineer in the decisions to be made when distributing application logic components between client and server. A second direction consists of combining existing methods for user interface re-engineering with middleware technology to enable the re-engineering of legacy systems to client-server architectures. We are currently extending our approach to data-intensive legacy systems written in languages other than RPG, such as COBOL. This work is part of a recently started project conducted with small/medium enterprises to migrate to the Web legacy systems written in COBOL for the management of internal and external services of peripheral government organisations.

Acknowledgements

The research described in this paper has been funded by CORINTO (Consorzio di Ricerca Nazionale per Tecnologia ad Oggetti) within the project ERCOLE (Encapsulations Re-engineering and Coexistence of Objects with Legacy Systems). The authors would like to thank Fabio Castiglioni and Silvia Petruzzelli for their contribution to the development of the ideas presented here. Patrizia Angelini, Michele De Leo, Maria Pia Dicuonzo, Patrizia Guerra, Maria Nella Palese, and Mary Tafuri have contributed to the implementation of the decomposition tool as outlined in this paper.

References

Bennett, K.H., 1995. Legacy systems: coping with success. IEEE Software 12 (1), 19-23.
Binkley, D., Gallagher, K., 1996. Program slicing. In: M. Zelkowitz (Ed.), Advances in Computers, vol. 43. Academic Press, San Diego, CA.
Breuer, P.T., Haughton, H., Lano, K., 1993. Reverse-engineering COBOL via formal methods. Journal of Software Maintenance: Research and Practice 5, 13-35.
Brodie, M.L., Stonebraker, M., 1995. Migrating Legacy Systems - Gateways, Interfaces and Incremental Approach. Morgan Kaufmann, San Francisco, CA.
Butler, J.G., 1996. Mainframe to Client/Server Migration: Strategic Planning Issues and Techniques. Computer Technology Research Corporation, Charleston, SC.
Canfora, G., Cimitile, A., Munro, M., 1996. An improved algorithm for identifying reusable objects in code. Software: Practice and Experience 26 (1), 24-48.
Canfora, G., De Lucia, A., Di Lucca, G.A., 1999. An incremental object-oriented migration strategy for RPG legacy systems. International Journal of Software Engineering and Knowledge Engineering 9 (1), 5-25.
Cimitile, A., De Carlini, U., De Lucia, A., 1998. Incremental migration strategies: data flow analysis for wrapping. In: Proceedings of the Fifth IEEE Working Conference on Reverse Engineering. Honolulu, Hawaii, IEEE Computer Society Press, Los Alamitos, CA, pp. 59-68.
Cimitile, A., Visaggio, G., 1995. Software salvaging and the call dominance tree. The Journal of Systems and Software 28 (2), 117-127.
De Lucia, A., Di Lucca, G.A., Fasolino, A.R., Guerra, P., Petruzzelli, S., 1997. Migrating legacy systems towards object-oriented platforms. In: Proceedings of the IEEE International Conference on Software Maintenance. Bari, Italy, IEEE Computer Society Press, Los Alamitos, CA, pp. 122-129.
Ferrante, J., Ottenstein, K.J., Warren, J., 1987. The program dependence graph and its use in optimization. ACM Transactions on Programming Languages and Systems 9 (3), 319-349.
Harrold, M.J., Malloy, B., 1993. A unified interprocedural program representation for a maintenance environment. IEEE Transactions on Software Engineering 19 (6), 584-593.
IBM, 1995. Visual Age C++ - User's Guide, IBM Canada Ltd Laboratory, North York, Ontario, Canada.
Gallagher, K.B., Harman, M. (Ed.), 1998. Program Slicing (Special issue). Information and Software Technology 40 (11/12).
Kim, H.S., Kwon, Y.R., 1994. Restructuring programs through program slicing. International Journal of Software Engineering and Knowledge Engineering 4 (3), 349-368.
Lanubile, F., Visaggio, G., 1997. Extracting reusable functions by flow graph-based program slicing. IEEE Transactions on Software Engineering 23 (4), 246-259.
Lemke, I., Sander, G., 1993. VCG: A Visualization tool for Compiler Graphs, The COMPARE consortium, 1993, available from: ftp.es.uni-se.de (134.96.254.254) :/pub/graphics/cdg/.
Lientz, B.P., Swanson, B.E., 1980. Software Maintenance Management. Addison-Wesley, Reading, MA.
Lindig, C., Snelting, G., 1997. Assessing modular structure of legacy code based on mathematical concept analysis. In: Proceedings of the 19th International Conference on Software Engineering. Boston, MA, ACM Press, pp. 349-359.
Markosian, L., Newcomb, P., Brand, R., Burson, S., Kitzmiller, T., 1994. Using an enabling technology to reengineer legacy systems. Communications of the ACM 37 (5), 58-70.
Merlo, E., Gagne, P.Y., Girard, J.F., Kontogiannis, K., Hendren, L., Panangaden, P., Mori, R., 1995. Reengineering user interfaces. IEEE Software 12 (1), 64-73.
Moore, M., Rugaber, S., Seaver, P., 1994. Knowledge-based user interface migration. In: Proceedings of the IEEE International Conference on Software Maintenance. Victoria, Canada, IEEE Computer Society Press, Los Alamitos, CA, pp. 72-79.
Müller, H.A., Orgun, M.A., Tilley, S.R., Uhl, J.S., 1993. A reverse-engineering approach to subsystem structure identification. Journal of Software Maintenance: Research and Practice 5, 181-204.
Nosek, J.T., Prashant, P., 1990. Software maintenance management: the change in the last decade. Journal of Software Maintenance: Research and Practice 2 (3), 157-174.
Sneed, H.M., Nyary, E., 1994. Downsizing large application programs. Journal of Software Maintenance: Research and Practice 6 (5), 105-116.
Sneed, H.M., 1997. An object oriented migration strategy for host-based online processing systems (a report from the field). In: Proceedings of the ICSE'97 Workshop on Migration Strategies for Legacy Systems. Boston, MA.
Tip, F., 1995. A survey of program slicing techniques. Journal of Programming Languages 3, 121-189.
Van Sickle, L., Liu, Z.Y., Ballantyne, M., 1993. Recovering user interface specifications for porting transaction processing applications. In: Proceedings of the Second IEEE Workshop on Program Comprehension. Capri, Italy, IEEE Computer Society Press, Los Alamitos, CA, pp. 71-76.
Weiser, M., 1984. Program slicing. IEEE Transactions on Software Engineering SE-10 (4), 352-357.

Gerardo Canfora received the Laurea degree in Electronic Engineering from the University of Naples "Federico II", Italy, in 1989. He is currently an associate professor of Computer Science at the Faculty of Engineering of the University of Sannio in Benevento, Italy. From 1990 to 1991, he was with the Italian National Research Council (CNR). During 1992 he was at the Department of "Informatica e Sistemistica" of the University of Naples "Federico II", Italy. From 1992 to 1993, he was a visiting researcher at the Centre for Software Maintenance of the University of Durham, UK. In 1993 he joined the Faculty of Engineering at Benevento, Italy. He has served on the program committees of a number of international conferences; he was program co-chair of the 1997 International Workshop on Program Comprehension and will be program co-chair of the 2001 International Conference on Software Maintenance. His research interests include software maintenance, program comprehension, reverse engineering, reuse, reengineering, and migration.

Aniello Cimitile received the Laurea degree in Electronic Engineering from the University of Naples, Italy, in 1973. He is currently a full Professor of Computer Science at the University of Sannio in Benevento, Italy. Previously, he was with the Department of "Informatica e Sistemistica" at the University of Naples "Federico II". Since 1973 he has been a researcher in the field of software engineering and his list of publications contains more than 100 papers published in journals and conference proceedings. He serves on the program and organising committees of several international conferences and on the editorial and reviewer committees of several international scientific journals in the fields of software engineering and software maintenance. Prof. Cimitile was program co-chair of the Workshop on Program Comprehension in 1993, 1994, and 1996, and program co-chair and general co-chair of the International Conference on Software Maintenance in 1996 and 1997, respectively. He is a co-editor in chief of the Journal of Software Maintenance: Research and Practice. He has been responsible for many international and national applied research projects. His research interests include software maintenance and testing, software quality, reverse engineering, and reuse reengineering.

Andrea De Lucia received the Laurea degree in Computer Science from the University of Salerno, Italy, in 1991, the M.Sc. degree in Computer Science from the University of Durham, UK, in 1995, and the Ph.D. degree in Electronic Engineering and Computer Science from the University of Naples "Federico II", Italy, in 1996. He is currently an assistant professor of Computer Science at the Faculty of Engineering of the University of Sannio in Benevento, Italy. Previously, he was with the Department of "Informatica e Applicazioni" of the University of Salerno, Italy, and with the Department of "Informatica e Sistemistica" of the University of Naples "Federico II", Italy. From 1994 to 1995, he was a visiting researcher at the Centre for Software Maintenance of the University of Durham, UK. He serves on the program and organising committees of several international conferences and he will be program co-chair of the 2001 International Workshop on Program Comprehension. His research interests include software maintenance, reverse engineering, reuse, reengineering, migration, program comprehension, and visual languages.

Giuseppe A. Di Lucca received the Laurea degree in Electronic Engineering from the University of Naples "Federico II", Italy, in 1987 and the Ph.D. degree in Electronic Engineering and Computer Science from the same University in 1992. He is currently an assistant professor of Computer Science at the Department of "Informatica e Sistemistica" of the University of Naples "Federico II". His research interests include software engineering, software maintenance, reverse engineering, software reuse, software reengineering, and software migration.
