METAMORPHOS: MEthods and Tools for migrAting software systeMs towards web and service Oriented aRchitectures: exPerimental evaluation, usability, and tecHnOlogy tranSfer Andrea De Lucia(1) , Massimiliano Di Penta(2) , Filippo Lanubile(3) , Marco Torchiano(4) (1) Dipartimento di Matematica e Informatica - Universit`a di Salerno, Italy (2) Dept. of Engineering - University of Sannio, Benevento, Italy (3) Dipartimento di Informatica, University of Bari, Italy (4) Politecnico di Torino, Torino, Italy
[email protected],
[email protected],
[email protected],
[email protected]
Abstract The project METAMORPHOS is a two-year Italian research project, funded by the Ministry of University and Research, aimed at facilitating the selection and the adoption of reverse engineering and migration techniques and tools in industry. To pursue such an objective, the project aims at empirically evaluating techniques and tools that can potentially fulfill industry needs. The project focuses in particular on migration activities towards distributed architectures— such as Web-based and service-oriented—and mobile devices that are nowadays gaining an increasing diffusion and central role. Keywords: software migration, empirical studies.
1. Introduction In recent years, many advances have been made in the development of distributed—Web-based in particular—and service oriented software systems, that can be accessed through the Internet using mobile devices. However, the software portfolio of many organizations consists mostly of legacy systems, resulting from a huge amount of investments. Thus, developing a new system from scratch is not feasible or at least risky and expensive. Even when replacement systems or standard packages are available, it is mandatory to salvage the data of legacy systems [5]. The effort devoted to reverse engineering and migration activities could be greatly reduced by adopting proper support tools. The adoption of reverse engineering and migration tools in industry cannot be made without a systematic and quantitative evaluation, in particular concerning usability. Although this need has been widely discussed in the
past, and guidelines to define, plan and conduct empirical studies have been defined [3, 29], very few studies have been conducted for the evaluation of reverse engineering and migration methods and tools [16, 26, 27]. The objective of the project METAMORPHOS (MEthods and Tools for migrAting software systeMs towards web and service Oriented aRchitectures: exPerimental evaluation, usability, and technology tranSfer) is to empirically evaluating reverse engineering and migration techniques to facilitate their selection and adoption in industry. The project is a PRIN (Italian Relevant National Interest Projects), funded by the Ministry of University and Scientific Research (MiUR) under grant PRIN-2006098097. It started in February 2007, and ended up in January 2009. It involved four Italian Universities: University of Salerno (Andrea De Lucia, project coordinator), University of Sannio (Massimiliano Di Penta), University of Bari (Filippo Lanubile), Polytechnic of Turin (Marco Torchiano). Other than the principal investigators mentioned above, the project involved 14 faculty members and 8 among postdoctorals, PhD students, and research assistants. The budget is of 271,430 Eur, of which 190,000 funded by MiUR and 81,430 co-funded by the Universities. Further information can be found on the project website1 .
2. Project Organization As described by Linkman and Rombach [17], the evaluation of a technology can be pursued through different studies, comprising limited risks and small scale studies (i.e., surveys), limited-size studies performed in highly controlled environments (i.e., controlled experiments), and higher-risks/large scale studies (i.e., case studies). 1 http://www.sesa.dmi.unisa.it/metamorphos/
uation. Finally, the empirical studies were performed in an environment properly supported by tools for cooperative work, artifact management and by an environment that supports the execution of controlled experiments. Figure 1 shows how activities belonging to different WPs contributed to the life-cycle of all the empirical studies foreseen by the project.
3. Achievements
Figure 1. Project WPs mapped to empirical study process.
Accordingly, the project has been organized in three empirical assessment phases, and consists of 9 work packages (WPs). The first phase aimed at selecting techniques and tools be evaluated in the second phase. This goal has been pursued by surveying the state-of-the-art and the state of the practice of the Italian industry to collect their reverse engineering and migration needs. The empirical studies design and planning followed guidelines such as [3, 29], and mainly focused on functional aspects of the technique (i.e., the technique effectiveness and cost). We also took into account usability as a crucial factor that can affect the overall result of a task despite of the effectiveness of the used tools. Before conducting the empirical studies, we identified, acquired and customized of technologies and tools to be experimented, and with the identification of experimental objects (i.e., systems to be analyzed/migrated). The second phase was mainly conducted in a controlled environment, i.e., like University classes or small groups of industrial developers. The project was conceived to allow for experiment replication among partners. Such a replication is necessary to mitigate conclusion and external validity threats [4]. To deal with the limited external validity of results obtained in controlled experiment (very often carried out with students), we foresaw a third phase of industrial case studies, during which the previously evaluated techniques and tools are adopted in small industry projects. Wherever necessary, the artifacts produced during the empirical studies were evaluated using peer reviews. Such a systematic strategy reduced the subjectivity of the eval-
This section summarizes the main results achieved during the first 1.5 year of the project. Further details are in a longer report [11]. Survey on the state of the practice: the survey of the state of the practice has involved more than 50 Italian companies [28]. Results indicated that most of the migration activities the company recently performed targeted the Web, with a small number of migrations towards mobile devices and—less than 10%—towards service oriented architectures (SOA). A large portion of the migration projects regarded Cobol, with Java as target language. In two thirds of the cases the target operating system for the migration was Linux or other Unix systems. Finally, the survey confirmed the very limited use of automatic tools for supporting migration tools, if not for database migration. Empirical Assessment of Reverse Engineering Techniques: in the context of this project, empirical studies were performed to assess a series of reverse engineering techniques that can support migration activities, in particular: • Traceability recovery: controlled experiments have been conducted [13, 21] to assess a tool based on Latent Semantic Indexing (LSI) and integrated with the ADAMS artifact management system [12] in traceability recovery processes. • Design pattern recovery: a design pattern recovery tool developed at the University of Salerno was empirically assessed [10]. • Web clustering: clustering techniques have been experimented to support the re-modularization of web applications [14]. Assessment of notations to support migration activities: series of controlled experiments have been performed to evaluate the usefulness of different kinds of notations to support comprehension and maintenance tasks, in particular tasks that can occur in the context of migration activities: • Data models: controlled experiments have been conducted, showing that UML class diagrams are simpler to understand than Entity-Relationship (ER) diagrams [20].
• Web application documentation: comparing Web applications documented using pure UML models with stereotyped models. Overall, the study [22, 23] indicates that the use of stereotyped diagrams helps to reduce the gap between low ability/experience developers and high ability/experience developers. • Use of Framework for Integrated Test (Fit) tables to improve application comprehension and maintainability: acceptance test cases can be used to improve the comprehensibility of change requirements and— in case they can be made executable—the correctness of the produced code. Results of the experiments we carried out [25, 24] indicate that Fit tables significantly help to improve the comprehension level and the correctness of the maintained code, without however requiring a significant additional time overhead. Empirical assessment of tools to support software migration: based on the results of the survey, a migration tool developed by the University of Salerno in cooperation with a small enterprise has been selected [19] and evaluated in several controlled experiments and case studies conducted both in academic and industrial context [9]. The tool allows the semi-automatic migration of legacy systems developed in COBOL to the web. Results show that the use of the migration tool increases the productivity with respect to the use of traditional development environments used in the subject company and reduces the gap between novice and expert software engineers. Migration towards wireless applications: as regards the migration towards wireless applications, the the Turin research unit conducted a case study aimed at migrating a Web mail application from Java Server Pages (JSP) technology towards Java Platform Micro Edition (Java ME). The Bari research unit replicated the study following a Story Test-Driven Migration (STDM) process, which uses automated acceptance tests both on the legacy and target versions of the system, and foresees a review of story tests done by customers. Migration towards SOA: the Bari research unit conducted an experience of migrating a text-based conferencing (eConference) system towards service-oriented platforms [7]. eConference has evolved over a sequence of releases, aimed at changing the underlying communication framework, and changing the architecture towards a pureplugin system, built on top of the Eclipse Rich Client Platform. Quality assessment by mining software repositories: the University of Sannio unit developed a framework to track the evolution of software artifacts [8]. Using this framework, different kinds of studies have been conducted, among others (i) to investigate the changeability design patterns [1]; (ii) to analyze the maintenance of clones (prelim-
inary results limited to manual analysis are reported in [2]); and (iii) to investigate on the evolution and decay of vulnerabilities in network applications [15]. Experimenting tools for software inspection: since tools supporting cooperative work and code/document inspection can be useful during maintenance (and migration in particular) tasks, experiments have been conducted to assess (i) a distributed inspection tool [18]; and (ii) the effects of synchronous and asynchronous interaction to reach mutual agreements in remote inspection meetings [6].
4. Conclusions The project represented a privileged context to conduct empirical studies addressing the problem of software reverse engineering and migration from several different perspectives. A broad-spectrum investigation of the industrial practices was performed by means of a survey, which provided a useful overview of the phenomenon. The project also produced a large body of knowledge by leveraging different types of empirical studies, from small to large scale, encompassing varying degrees of risks. In the opinion of the investigators, the project really facilitated the execution of a large number of empirical studies in a limited time frame, given the availability of student classes from courses of the involved participants. This, in many cases allowed us to build an empirical evidence from multiple experiment replications, sometimes conducted with subjects having different background and experience. On the other hand, while we were able to involve a large number of companies (59) in a survey study, it resulted harder to involve them in experiments and case studies, with few exceptions [9]. It has to be noticed that this kind of project scheme (PRIN) did not expect the direct participation of companies to the project, whose participation to empirical studies was on a volunteer basis and not funded. Work-in-progress is devoted to better distill guidelines from the empirical studies conducted, to disseminate results in industry and academia, and to make available the replication packages used in our empirical studies to researchers external to the project. This would ease further replication of the studies we conducted, to increase the external validity of the obtained results.
References [1] L. Aversano, G. Canfora, L. Cerulo, C. Del Grosso, and M. Di Penta. An empirical study on the evolution of design patterns. In Proceedings of the 6th ESEC/FSE, Dubrovnik, Croatia, September 3-7, 2007, pages 385–394, 2007. [2] L. Aversano, L. Cerulo, and M. Di Penta. How clones are maintained: An empirical study. In 11th European Confer-
[3]
[4]
[5] [6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
ence on Software Maintenance and Reengineering, CSMR 2007, 21-23 March 2007, Amsterdam, The Netherlands, pages 81–90, 2007. V. R. Basili, R. W. Selby, and D. H. Hutchens. Experimentation in software engineering. IEEE Trans. Software Eng., 12(7):733–743, 1986. V. R. Basili, F. Shull, and F. Lanubile. Building knowledge through families of experiments. IEEE Trans. Software Eng., 25(4):456–473, 1999. M. L. Brodie and M. Stonebraker. Migrating Legacy Systems. Morgan Kaufmann, San Francisco, CA, 2005. F. Calefato, F. Lanubile, and T. Mallardo. A controlled experiment on the effects of synchronicity in remote inspection meetings. In Proceedings of the First International Symposium on Empirical Software Engineering and Measurement, ESEM 2007, September 20-21, 2007, Madrid, Spain, pages 473–475, 2007. F. Calefato, F. Lanubile, and M. Scalas. Evolving a textbased conferencing system: An experience report. In CollaborateCom, pages 427–431, 2007. G. Canfora, L. Cerulo, and M. Di Penta. Tracking your changes: a language-independent approach. IEEE Software, 27(1):50–57, 2009. M. Colosimo, A. D. Lucia, G. Scanniello, and G. Tortora. Evaluating legacy system migration technologies through empirical studies. Information and Software Technology, to appear. A. De Lucia, V. Deufemia, C. Gravino, and M. Risi. A two phase approach to design pattern recovery. In 11th European Conference on Software Maintenance and Reengineering, Software Evolution in Complex Software Intensive Systems, CSMR 2007, 21-23 March 2007, Amsterdam, The Netherlands, pages 297–306, 2007. A. De Lucia, M. Di Penta, F. Lanubile, and M. Torchiano. The project METAMORPHOS: An overview. Technical report, Available online at http://www.sesa.dmi.unisa.it/metamorphos/docs/metamorphosoverview.pdf, 2008. A. De Lucia, F. Fasano, R. Oliveto, and G. Tortora. Recovering traceability links in software artifact management systems using information retrieval methods. ACM Trans. Softw. Eng. Methodol., 16(4), 2007. A. De Lucia, R. Oliveto, and G. Tortora. Ir-based traceability recovery processes: An empirical comparison of ”oneshot” and incremental processes. In 23rd IEEE/ACM International Conference on Automated Software Engineering (ASE 2008), 15-19 September 2008, L’Aquila, Italy, pages 39–48, 2008. A. De Lucia, G. Scanniello, and G. Tortora. Identifying similar pages in web applications using a competitive clustering algorithm. Journal of Software Maintenance and Evolution: Research and Practice, 19(5):281–296, 2007. M. Di Penta, L. Cerulo, and L. Aversano. The evolution and decay of statically detected source code vulnerabilities. In proceedings of the 8th IEEE Working Conference on Source Code Analysis and Manipulation, pages 101–110, 2008. G. C. Gannod and B. H. C. Cheng. A framework for classifying and comparing software reverse engineering and design recovery techniques. In WCRE, pages 77–88, 1999.
[17] S. G. Linkman and H. D. Rombach. Experimentation as a vehicle for software technology transfer-a family of software reading techniques. Information & Software Technology, 39(11):777–780, 1997. [18] A. D. Lucia, F. Fasano, G. Scanniello, and G. Tortora. Assessing the effectiveness of a distributed method for code inspection: A controlled experiment. In Proceedings of 2nd International Conference on Global Software Engineering, Munich, Deutchland, pages 252–261, 2007. [19] A. D. Lucia, R. Francese, G. Scanniello, and G. Tortora. Developing legacy system migration methods and tools for technology transfer. Software Practice and Experience, 38(13):1333–1364, 2008. [20] A. D. Lucia, C. Gravino, R. Oliveto, and G. Tortora. Data model comprehension: an empirical comparison of ER and UML class diagrams. In Proceedings of 16th IEEE International Conference on Program Comprehension, Amsterdam, Netherland, pages 93–102, 2008. [21] A. D. Lucia, R. Oliveto, and G. Tortora. Assessing ir-based traceability recovery tools through controlled experiments, empirical software engineering. Empirical Software Engineering, accepted. [22] F. Ricca, M. Di Penta, M. Torchiano, P. Tonella, and M. Ceccato. How design notations affect the comprehension of web applications. Journal of Software Maintenance and Evolution: Research and Practice, 19(5):339–359, 2007. [23] F. Ricca, M. Di Penta, M. Torchiano, P. Tonella, and M. Ceccato. The role of experience and ability in comprehension tasks supported by UML stereotypes. In 29th International Conference on Software Engineering (ICSE 2007), Minneapolis, MN, USA, May 20-26, 2007, pages 375–384, 2007. [24] F. Ricca, M. Di Penta, M. Torchiano, P. Tonella, M. Ceccato, and C. A. Visaggio. Are fit tables really talking?: a series of experiments to understand whether fit tables are useful during evolution tasks. In 30th International Conference on Software Engineering (ICSE 2008), Leipzig, Germany, May 10-18, 2008, pages 361–370, 2008. [25] F. Ricca, M. Torchiano, M. Di Penta, M. Ceccato, and P. Tonella. Using acceptance tests as a support for clarifying requirements: a series of experiments. Information & Software Technology, 51(2), February 2009. [26] S. E. Sim and M.-A. D. Storey. A structured demonstration of program comprehension tools. In WCRE, pages 184–193, 2000. [27] P. Tonella, M. Torchiano, B. D. Bois, and T. Syst¨a. Empirical studies in reverse engineering: state of the art and future trends. Empirical Software Engineering, 12(5):551– 571, 2007. [28] M. Torchiano, M. Di Penta, F. Ricca, A. De Lucia, and F. Lanubile. Software migration projects in italian industry: Preliminary results from a state of the practice survey. In Proceedings of Evol 2008: 4th Intl. ERCIM Workshop on Software Evolution and Evolvability, September 2008, L’Aquila, Italy. IEEE CS Press, 2008. [29] C. Wohlin, P. Runeson, M. H¨ost, M. Ohlsson, B. Regnell, and A. Wessl´en. Experimentation in Software Engineering An Introduction. Kluwer Academic Publishers, 2000.