An Iterative Reengineering Process Applying Test-Driven Development and Reverse Engineering Patterns

Vinícius H. S. Durelli¹, Rosângela A. D. Penteado², Simone de Sousa Borges², Matheus C. Viana²

¹ USP - University of Sao Paulo, ICMC/USP - Institute of Mathematical Sciences and Computation, 13560-970 – Sao Carlos – SP – Brazil
[email protected]

² UFSCar - Federal University of Sao Carlos, DC - Computer Department, 13565-905 – Sao Carlos – SP – Brazil
(rosangel, simone_borges, matheus_viana)@dc.ufscar.br
Abstract. Nowadays, software technology is evolving quickly and, therefore, software systems built upon certain technologies become outdated even before being released and used. Thus, software systems are in constant evolution in order to adapt to current technologies as well as to users' needs. An approach to revitalize software systems that have already been released is reengineering. In this paper, we propose an iterative reengineering approach that uses reverse engineering patterns and test-driven development to cope with the issues involved in migrating from a legacy system to an equivalent software system implemented with more recent technologies. As a preliminary evaluation of the proposed approach, we contrasted it with an ad-hoc approach during the reengineering of a legacy system from Smalltalk to Java. The results gathered during the case study suggest that our approach produces higher-quality code and that the information obtained by applying reverse engineering patterns is accurate and cost-effective.

Keywords: Reengineering; Reengineering Patterns; Test-Driven Development; Refactoring.
(Received October 30, 2009 / Accepted January 22, 2010)
1 Introduction
Software systems undergo many modifications during their life cycle. The act of either improving or modifying an existing software system without introducing problems is quite a challenge. Reengineering is aimed at revitalizing software systems by fixing existing or perceived problems. However, unlike forward engineering, which is supported by plenty of processes, such as the spiral and waterfall models of software development, no established processes for reengineering are available [14].
Due to the absence of an established reengineering process, patterns and some techniques, which have existed for a long time and are recognized and generally accepted, can be combined and used within a reengineering context. Moreover, agile practices have been widely applied in many forward engineering processes since their introduction, thus there is interest in determining the applicability of some agile practices to reengineering projects as well. The novelty of our approach is that reverse engineering activities use particular reverse engineering patterns (drawn from literature) and forward engineering activi-
ties are performed by applying test-driven development. Furthermore, in the context of our approach, reverse and forward engineering activities are carried out iteratively and incrementally. To assess the effectiveness of the approach, it was applied to a legacy system; the results indicate its effectiveness in terms of the quality of the produced code. In order to describe our approach, the remainder of this paper is structured as follows. Sections 2 and 3 present background on the main concepts and techniques involved in the proposed reengineering approach: reverse engineering patterns and test-driven development, respectively. Section 4 describes our iterative reengineering approach and Section 5 describes a case study in which a large (more than 29 KLOC) legacy system was partially reengineered from Smalltalk to Java in order to compare an ad-hoc, test-last approach with our reengineering approach. Section 6 presents threats to the validity of the described case study and Section 7 concludes the paper with some remarks, limitations of the approach, and future directions.
2 Reverse Engineering Patterns
Patterns describe solutions to recurrent problems [10]. They were first adopted by the software community as a way of documenting solutions and best practices for problems that occur in many phases of a software system's life cycle. Usually, they are documented in a literary form that introduces the problem to the reader, describes the context within which it generally occurs, and presents a possible solution to the underlying problem. Reengineering efforts deal with some typical problems, and there is no single tool or technique able to overcome all of them. In addition, the process of reengineering is, like any other process, one in which many techniques have emerged, each of which entails trade-offs. Reengineering patterns are well suited to describing and discussing these techniques; they help in diagnosing problems, identifying weaknesses that may hinder further development of the system, and finding more appropriate solutions to problems typically faced by developers [15]. Reverse engineering can be regarded as the initial phase in the process of software reengineering [1]. Thus, reverse engineering patterns aim at building higher-level software models and acquiring more abstract information from the source code. Significant research has been done to record patterns that occur in reengineering and other contexts [6, 7]. The reverse engineering patterns used in our process are described in Section 4.1. The next
section outlines the agile practice called test-driven development.
3 Test-Driven Development
Test-Driven Development (TDD) is one of the core practices introduced by the Extreme Programming discipline [3]. TDD is also known as test-driven design or test-first design [11]. Applying TDD fundamentally requires writing automated tests before producing functional code. Using TDD, the implementation of each new functionality starts with the developer writing a test case that specifies how the program should invoke that functionality and what its result should be. The newly written test fails, so the developer implements just enough code to make it pass. Finally, once all previously written tests pass, the developer reviews the code as it now stands and improves it by means of a practice called refactoring. Refactorings are behavior-preserving program modifications that improve a software system's design and its underlying source code [9]. Refactoring as a practice consists in restructuring software systems by applying a series of refactorings without altering their observable behavior. In the context of a TDD cycle, refactorings are carried out in order to make the introduction of new functionalities easier. During this process, all of the previously written tests act as regression tests to make sure the changes have not had any unexpected side effects. Applying TDD, working software is available at every step and the tests validate whether each feature works as expected. A software system developed using TDD is built and improved feature by feature, and the tests ensure that it is still working before the developer moves on to the next feature. Thus, although its name implies that TDD is a testing technique, it is an analysis and design practice [2, 12]. It is considered an analysis technique because, during the creation of the tests, the developer selects what is going to be implemented, thus defining the functionality scope. Moreover, it is regarded as a design technique because, while each test is implemented, the developer makes decisions related to the application programming interface (API) of the software system (e.g., class and method names, number of parameters, return types, and exceptions that are thrown). The next section describes our reengineering process and the fundamental role that reverse engineering patterns and TDD play in it.
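As a purely illustrative sketch (none of the class, method, or test names below come from the original paper or from GREN), the following JUnit 4 fragment shows one pass through the TDD cycle just described: the test is written first and fails, just enough production code is then written to make it pass, and the code may afterwards be refactored under the protection of the test.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Hypothetical example of one TDD cycle: the test below is written before the
// production class exists and initially fails (it does not even compile).
public class ResourceRentalTest {

    @Test
    public void totalPriceIsDailyRateTimesNumberOfDays() {
        ResourceRental rental = new ResourceRental(10.0, 3);
        // Green step: the developer writes just enough code for this assertion to pass.
        assertEquals(30.0, rental.totalPrice(), 0.001);
        // Refactor step: with the test passing, the code can be restructured safely.
    }
}

// Minimal production code written after the failing test.
class ResourceRental {
    private final double dailyRate;
    private final int days;

    ResourceRental(double dailyRate, int days) {
        this.dailyRate = dailyRate;
        this.days = days;
    }

    double totalPrice() {
        return dailyRate * days;
    }
}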
4 Iterative Reengineering Approach
Usually, legacy software systems are complex. Therefore, to cope with this complexity, our reengineering approach deals with parcels of legacy software systems. The software system being reengineered is split into coarse-grained parcels, such as layers and packages, or fine-grained ones, such as classes. After subjectively splitting the legacy system into parcels, two types of activities are iteratively carried out for each parcel. The first type addresses the recovery of the existing design (reverse engineering) and the second one is related to implementing and improving the extracted design (forward engineering and restructuring). In order to accomplish the reverse engineering activities, some of the patterns proposed in [7] are applied. During the forward engineering activities, TDD is applied to implement the information related to the parcel that was reverse engineered. An overview of the approach is shown in Figure 1. The patterns applied in this iterative approach consider the available documentation as well as the source code of the legacy system being reengineered. The reverse engineering activities, as well as the patterns used, are described in the next subsection.
Figure 1: An overview of our iterative reengineering approach.
4.1 Reverse Engineering Patterns Used in the Approach
In our approach, while performing reverse engineering activities, both the documentation and the source code are iteratively consulted in order to understand and validate the information obtained about a software system parcel. The patterns used in this case are Read All the Code in One Hour, Skim the Documentation, and Speculate About Design [6, 7]. These patterns have been chosen because they are well documented and have produced adequate results in our previous studies. Moreover, although these patterns are presented in the context of a major reengineering effort, according to their authors they can also be applied when the reengineering is done in small iterations. The main goal of the Read All the Code in One Hour pattern is to assess the source code quality and complexity. This assessment is done by means of a brief but intensive code review. There is an important difference between traditional code reviews and the ones performed in the context of our approach: the former are mainly meant to detect errors, while the latter are meant to get a first impression of the code quality and to recover information on how the functionality is implemented. This pattern originally suggests that all the source code should be read in an hour. However, in our approach, only the source code related to the parcel being reengineered is examined. Therefore, there is a reduction in the number of relationships among classes that must be comprehended in each iteration. A drawback of applying Read All the Code in One Hour is that the obtained information needs to be complemented with more abstract representations [7]. Thus, to complement this information, other more abstract representations of the legacy system, such as class and sequence diagrams, must be consulted if available. Skim the Documentation is applied in order to evaluate the relevance of the available documentation, and it is applied either before or after Read All the Code in One Hour. In the context of this iterative reengineering approach, it is applied to select the sections of the documentation which contain the most relevant information on the parcel being reengineered. This information is used to validate and complement the low-level information acquired by the use of the Read All the Code in One Hour pattern. Figure 2 contains an activity diagram which represents the activities that have to be carried out during the reverse engineering of each parcel. The mentioned patterns are repeatedly applied to each parcel of the legacy system being reengineered. All acquired information is verified and summarized, since the documentation may not correspond to the implementation. Due to its inherently iterative nature, the proposed approach enables the developer to decide when the obtained information is enough to start carrying out forward engineering activities. Some parcels are harder to comprehend since their documentation does not contain class diagrams or any other abstract representation of the functionality implemented by such parcels. Subsection 4.2 describes how class diagrams which depict the design of the most complex parcels of the legacy system can be produced.
Figure 2: Reverse engineering activities.

4.2 Constructing Class Diagrams of Complex Parcels

Class diagrams assist in understanding some parcels of a legacy system. The Speculate About Design pattern can be applied in order to produce these class diagrams [7]. This pattern suggests the creation of a hypothetical class diagram based on suppositions about how the functionality of those parcels has been implemented. This hypothetical diagram, initially abstract and without any implementation details, is gradually refined, and classes, methods, and attributes may be added to it. Hence, the hypothetical diagram becomes closer to what is implemented by the parcel being considered. If the legacy system already has class diagrams, this pattern can still be applied, but these class diagrams have to be verified in order to check their consistency with the implementation. The pattern steps used to construct and refine the class diagram have been adapted to be more flexible in the context of this iterative process. The modified steps are:

1. The developer creates a class diagram based on his/her understanding of the legacy system parcel being reengineered. This class diagram serves as an initial hypothesis of what to expect in the source code.
2. The names of classes, methods, and attributes are enumerated according to the likelihood that they appear in the source code. After that, the developer searches the parcel code for those names.
3. The developer keeps in the class diagram the classes, methods, and attributes that are found in the parcel code. In this step, it is also possible to rearrange the numbers assigned to the likelihood of each element.
4. The developer renames classes, methods, and attributes whose names do not match the ones found in the parcel code.
5. The hypothetical class diagram is remodeled when it does not correspond to what the parcel code represents. For instance, a method may be turned into a class.
6. The hypothetical class diagram is extended when there are elements in the parcel code that do not appear in the class diagram.

Steps 2 to 6 are repeated until the class diagram adequately represents the functionality implemented by the parcel being reengineered. In this process, class diagrams do not need to be very detailed because implementation details are addressed with TDD during forward engineering; hence, the effort needed to produce such class diagrams is reduced. Subsection 4.3 describes the forward engineering step of this process.

4.3 Forward Engineering Applying TDD and Refactoring
The information previously obtained is used to implement a parcel that is equivalent to the existing one. In this step of the reengineering approach, the implementation of the resulting parcel may contain some improvements relative to the equivalent parcel of the legacy system, due to an improvement in functionality or some technological difference. Furthermore, some modifications may be necessary during the integration of the recently implemented parcel with the parcels already reengineered. If these alterations were not addressed, problems would be inadvertently introduced. In this approach, TDD is adopted in order to mitigate the problems caused by these necessary modifications. A list of test cases is created based on the information that was obtained during the reverse engineering activities. After the creation of this test case list, the steps of the TDD cycle are performed, and the parcel being addressed and a set of automated tests are implemented. These automated tests can be used as regression tests and therefore used to verify whether problems have been introduced in the source code due to modifications
made during the integration of the parcels. If necessary, some refactorings can be performed to improve the parcel code, which facilitates parcel integration. Since parcels are implemented applying TDD, the creation of an up-front design may be unnecessary. This way, developers do not need to be concerned about how some features will be re-implemented when a new programming language is chosen. In order to evaluate the effectiveness of this reengineering approach and the seamless integration of the techniques it comprises, a legacy system with more than 29 KLOC has been reengineered. The next section outlines how the reengineering was conducted using our approach.
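The sketch below is merely illustrative (the QuantifiedResource class and its behavior are hypothetical and not taken from GREN); it suggests how an entry from the test case list derived during reverse engineering might be turned into an automated test. Once it passes, it stays in the suite and acts as a regression test when the parcel is integrated with previously reengineered parcels.

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import org.junit.Test;

// Hypothetical test written from the test case list produced during reverse
// engineering; it documents the expected behavior of the reengineered parcel.
public class QuantifiedResourceTest {

    @Test
    public void withdrawingUnitsDecreasesTheAvailableAmount() {
        QuantifiedResource resource = new QuantifiedResource("DVD", 5);
        resource.withdraw(2);
        assertEquals(3, resource.getAmount());
        assertTrue(resource.isAvailable());
    }
}

// Minimal production code implemented to satisfy the test above.
class QuantifiedResource {
    private final String name;
    private int amount;

    QuantifiedResource(String name, int amount) {
        this.name = name;
        this.amount = amount;
    }

    String getName() {
        return name;
    }

    void withdraw(int units) {
        amount -= units;
    }

    int getAmount() {
        return amount;
    }

    boolean isAvailable() {
        return amount > 0;
    }
}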
5 Case Study: GREN Framework
This case study describes how the GREN framework [5] was partly reengineered from Smalltalk to Java. GREN was built upon a pattern language called Business Resource Management (GRN) [4], which contains fifteen patterns belonging to the business resource management domain. The framework architecture consists of three layers: persistence, business (model), and graphical user interface (GUI). The business layer comprises the implementation of the GRN patterns, and only the code related to this layer was taken into consideration during the case study. The main documentation available for the framework consists of its "cookbook", which conveys information on how the GRN patterns have been implemented, and the GRN pattern language itself, which was also used as a source of information since it contains analysis-level class diagrams for each of its patterns. However, the main information source used was the GREN framework source code. In order to provide evidence of the efficiency of our approach, the first three GRN patterns implemented in the GREN framework were reengineered using an ad-hoc approach to perform the reverse engineering activities, and the information obtained during these activities was implemented using a test-last approach. In contrast, three other patterns were reengineered by applying our proposed approach, in other words, using reverse engineering patterns to support reverse engineering activities and TDD to support forward engineering activities. Table 1 shows information related to the pattern implementations that were reengineered during the case study. Throughout the case study we were interested in investigating whether our approach produces higher-quality code and reduces the amount of time spent on reengineering. Nevertheless, in the context of the proposed case study, quality is simply defined in terms of defect rates from the perspective of the software developer conducting the reengineering. Thus, the measures used to draw conclusions are defect density (i.e., the number of defects per thousand lines of code) and the time spent to reengineer each pattern.

Pattern Name | NC* | KLOC (Smalltalk)
Ad-hoc approach / test-last
Identify the Resources | 7 | 0.652
Quantify the Resources | 4 | 0.339
Rent the Resource | 4 | 0.781
Total | 15 | 1.772
Reverse engineering patterns / test-first
Trade the Resource | 5 | 0.783
Quote the Trade | 3 | 0.350
Check Resource Delivery | 2 | 0.256
Total | 10 | 1.391
* NC stands for number of classes.

Table 1: Patterns implemented in the GREN framework that have been reengineered.

The effect we expect our approach to have is formalized into the following hypotheses.
Null Hypothesis, H0: this hypothesis states that there is no real advantage in applying our approach, i.e., using TDD does not result in lower defect rates and the accuracy of the information retrieved by undertaking reverse engineering applying patterns is not cost-effective.

Alternative Hypothesis, H1: according to this hypothesis, carrying out forward engineering activities using TDD improves code quality, i.e., the resulting code has lower defect rates than the code generated by a test-last approach. Moreover, the accuracy of the information drawn from the source code by applying reverse engineering patterns is cost-effective.

It is worth noting that only one of the authors participated in the case study, and his knowledge of the languages involved can be classified as advanced and intermediate regarding Java and Smalltalk, respectively. The tools used to undertake the case study were the Eclipse Integrated Development Environment (IDE) (Classic 3.5.0) [8] and the JUnit framework (version 4.5) [13].
Minutes | Pattern#1 Estimated | Pattern#1 Spent | Pattern#2 Estimated | Pattern#2 Spent | Pattern#3 Estimated | Pattern#3 Spent
Reverse Engineering | 240 | 192 | 120 | 116 | 240 | 348
Forward Engineering | 480 | 698 | 300 | 345 | 480 | 557
Testing | 120 | 158 | 60 | 45 | 120 | 96
Debugging | 120 | 47 | 60 | 143 | 120 | 98
Total | 960 | 1095 | 540 | 649 | 960 | 1099

Table 2: Estimation and real time in minutes spent reengineering the first three patterns.
During the reengineering of the first three patterns, following the ad-hoc approach, the developer was able to switch around between activities, performing reverse and forward engineering activities at will. Thus, given that the boundaries of each activity performed to reengineer the first three patterns are distinguishable (e.g., reverse engineering, forward engineering, testing, and debugging activities), the developer was requested to keep track of the amount of time spent on each activity, filling out a form with this information. In order to stipulate the time that would be spent on each activity, a code review was performed by one of the authors. Table 2 shows the stipulated times for each activity, as well as the time actually spent carrying them out during the reengineering of these first patterns. As can be noticed in Table 2, more time than previously estimated had to be spent on the reengineering of all patterns. Moreover, the tests revealed that the defect density of the first pattern's code was considerably low (Table 3). However, as the first pattern's code had to be changed in order to be integrated with the other patterns' code, an increase appeared in the defect density of Pattern#2 and Pattern#3 and, therefore, in the amount of time spent in debugging activities.

 | Pattern#1 | Pattern#2 | Pattern#3
DD* | 4.47 | 5.67 | 6.24
KLOC | 1.567 | 0.882 | 1.443
* DD stands for defect density.

Table 3: Defect density and lines of Java code of each pattern (1 through 3).
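As a worked example, and assuming that defect density is computed per KLOC (an assumption that is consistent with the reported figures), Pattern#1's value in Table 3 corresponds to roughly 7 defects found in 1.567 KLOC of Java code, since 7 / 1.567 ≈ 4.47 defects per KLOC.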
The other patterns have been reengineered using our approach. Thus, the time devoted to forward engineering and testing was joined into one activity, namely TDD. Again, a code review was performed by one of the authors, thereby establishing the stipulated times required to reengineer the parcels of code related to the remaining patterns, i.e., Pattern#4 through Pattern#6. These stipulated times, as well as the actual amount
of time spent in carrying out the reengineering of each pattern, are shown in Table 4. The results obtained after reengineering the remaining patterns, shown in Table 5, are very satisfactory. Although the time spent applying the reverse engineering patterns was always longer than previously stipulated, the information retrieved by using these patterns proved to be more accurate than the information obtained with the ad-hoc approach. The defect density was very encouraging for almost all patterns. The number of defects found in Pattern#6 was very close to that of the other patterns, which ranged from 2 to 5 defects. Nevertheless, its defect density was higher than that of the other pattern implementations since it has a reduced number of lines of code. Forward engineering activities carried out applying TDD have shown to be well suited to dealing with the issues involved in translating constructions of one language into similar constructions of another. Moreover, as can be seen in Table 4, the amount of time spent in debugging activities was very short. TDD seemed to reduce the overall debugging effort since the generated unit tests helped the developer in pinpointing faults as soon as they were introduced. Nevertheless, applying TDD took longer and the quantity of unit tests was greater than with the test-last approach. The information gathered during the case study is summarized in the graphs of Figures 3 and 4. The data shown in these figures support our alternative hypothesis, thereby refuting the null hypothesis. Thus, it is possible to state that, under the risks discussed in Section 6, our approach improves code quality as well as the accuracy of the information drawn both from the documentation and the source code.

Minutes | Pattern#4 Estimated | Pattern#4 Spent | Pattern#5 Estimated | Pattern#5 Spent | Pattern#6 Estimated | Pattern#6 Spent
Reverse Engineering | 240 | 312 | 120 | 255 | 120 | 179
TDD (Forward Engineering) | 545 | 509 | 360 | 499 | 300 | 405
Debugging | 120 | 21 | 60 | 23 | 60 | 47
Total | 905 | 842 | 540 | 777 | 480 | 631

Table 4: Estimation and real time in minutes spent reengineering the last three patterns.

 | Pattern#4 | Pattern#5 | Pattern#6
DD* | 2.68 | 3.90 | 9.13
KLOC | 1.498 | 0.771 | 0.438
* DD stands for defect density.

Table 5: Defect density and lines of Java code of each pattern (4 through 6).

Figure 3: Real time spent in each activity during the case study.

Figure 4: Defect density rates presented by each pattern.

6 Threats to Validity
The threats to the validity of the presented case study are the following: the sizes of the parcels that have been reengineered are different, and there are inter-relationships and dependencies (coupling) among the involved classes. This may introduce a lot of variables that were not taken into consideration when evaluating the results. The difference among the parcels also makes it difficult to compare the results in terms of defects, size, or effort required. Social threats, such as an inclination towards one of the development approaches, are another potential threat to the validity of the case study. Given that the author who participated in the case study is an experienced TDD practitioner, this threat cannot be ruled out. A threat to external validity has also been identified, imposing restrictions on generalizing the case study results to more general contexts: the scope of the reengineering effort was not actually comparable to real ones, since it was designed considering time restrictions. Therefore, reengineering efforts more similar to industrial projects shall be considered in future studies.

7 Concluding Remarks

In this paper, we have presented an iterative reengineering approach that consists of selecting a decomposition of the original system in terms of parcels and reengineering each of them by using TDD and reverse engineering patterns. Reverse engineering activities are carried out by using patterns, thus there is reuse of knowledge. During forward engineering, the automated tests created by applying TDD provide the developer with feedback about analysis, design, and implementation decisions. Furthermore, these test cases can also be used as regression tests when future modifications are necessary. Nonetheless, a drawback of this practice is that the developer has to maintain both the functional code and the automated tests.

To present evidence of the approach's efficiency, we have described a case study in which a legacy system was partially reengineered from Smalltalk to Java. Although we have applied our approach to reengineer the aforementioned system, which has more than 29 KLOC, we believe that our approach does not scale well to larger applications, i.e., more than 35 KLOC. The evaluation of the time necessary to perform reverse engineering using the proposed patterns pointed out that applying patterns to accomplish reverse engineering activities is time-consuming and that, in order to apply this approach to larger legacy systems, either reverse engineering or program comprehension tools must be used instead of reverse engineering patterns. According to the results obtained during the presented case study, TDD seems to be an effective practice to deal with the issues related to incrementally reengineering legacy systems. In the context of this case study, creating the tests before implementing the functional code helped to translate the up-front design, created by means of the information drawn from reverse engineering activities, into a suitable design that conforms to the features of the Java language. The assessment of the results indicates that employing our approach implies trading productivity for quality, since more time was generally spent using TDD and reverse engineering patterns than using the test-last and ad-hoc approaches. Several relevant points are not precisely addressed by our approach: (i) the criteria to be used to define the parcels to be submitted to each iteration of the approach and (ii) how the documentation extracted by reverse engineering can be used to define test cases in the forward engineering step. We are currently working on solutions for these limitations. Moreover, the case study was performed by a single developer following both approaches, i.e., the ad-hoc one and our reengineering approach. As future work, we aim at examining the effectiveness of our approach by means of case studies performed by different people or even teams; in such a context, we would require a better way to define quality than defect rates from the perspective of the software developer conducting the reengineering.

References

[1] Arnold, R. S. Software Reengineering. IEEE Computer Society Press, 1993.
[2] Beck, K. Aim, fire. IEEE Software, (5).
[3] Beck, K. and Andres, C. Extreme Programming Explained: Embrace Change (2nd Edition). Addison-Wesley, 2004.
[4] Braga, R. T. V., Germano, F. R., and Masiero, P. C. A Pattern Language for Business Resource Management. Conference on Pattern Languages of Programs, 6, 1999.
[5] Braga, R. T. V. and Masiero, P. A Process for Framework Construction Based on a Pattern Language. Computer Software and Applications Conference, 2002.
[6] Demeyer, S., Ducasse, S., and Nierstrasz, O. A Pattern Language for Reverse Engineering. Proceedings of EuroPLoP 1999, pages 189–208, 1999.
[7] Demeyer, S., Ducasse, S., and Nierstrasz, O. Object-Oriented Reengineering Patterns. Morgan Kaufmann, 2002.
[8] Eclipse. The Eclipse Foundation. Available at: http://www.eclipse.org/, 2009. (Accessed 20 January 2009).
[9] Fowler, M., Beck, K., Brant, J., Opdyke, W., and Roberts, D. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 1999.
[10] Gamma, E., Helm, R., Johnson, R., and Vlissides, J. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[11] Janzen, D. and Saiedian, H. Test-driven development: concepts, taxonomy, and future direction. Computer, (9).
[12] Jeffries, R. and Melnik, G. Guest editors' introduction: TDD, the art of fearless programming. IEEE Software, 24(3):24–30, 2007.
[13] JUnit. JUnit.org Resources for Test Driven Development. Available at: http://www.junit.org/, 2009. (Accessed 21 June 2009).
[14] Mens, T. and Tourwé, T. A Survey of Software Refactoring. IEEE Transactions on Software Engineering, 30(2).
[15] Stevens, P. and Pooley, R. Systems Reengineering Patterns. In SIGSOFT '98: Proceedings of the 6th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pages 17–23, 1998.
RESTRUCTURING AN APPLICATION FRAMEWORK WITH A PERSISTENCE CROSSCUTTING FRAMEWORK

Ivan Botacini Zanon, Valter Vieira de Camargo, Rosangela A. Dellosso Penteado

Departamento de Computação - Universidade Federal de São Carlos
[email protected], [email protected], [email protected]

Abstract: This paper presents the maintenance activities that had been done on the restructuring process of an Application Framework to use a Persistence Crosscutting Framework. Application Frameworks are reuse technologies that support system development, but their complex architecture makes comprehension and maintenance activities very difficult tasks. The restructuring of Application Frameworks to use Crosscutting Frameworks eases the comprehension of their architecture as well as their maintenance. It was observed that the application framework architecture became more modular, showing lower levels of code tangling and scattering.

Keywords: Framework Maintenance, Aspect-Oriented Software, Crosscutting Frameworks, Application Frameworks.

(Received October 30, 2009 / Accepted January 22, 2010)

1. Introduction

Reuse-based software development is currently one of the most widely used development practices, as it facilitates the production of computational systems using previously acquired knowledge and experience [9]. Among the most common reuse techniques are frameworks, which support the development of specific systems in a certain domain, promoting reuse at the analysis and design levels [8]. An Application Framework (AF) is a structure composed of abstract classes that must be specialized in order to obtain systems that belong to a specific domain. Its structure is basically divided into pre-implemented code (frozen spots) and points that can be extended in order to implement specific behaviors of an application (hotspots). This architecture is composed of a great number of concrete and abstract classes, making its comprehension harder. This difficulty leads to problems in the instantiation of new applications and in their maintenance [11]. One alternative for facilitating the comprehension of a software architecture, and consequently its maintenance, is to modularize its concerns in a suitable way. A way of doing that is the use of Crosscutting Frameworks (CF) to address concerns that are scattered over the architecture. A Crosscutting Framework is a special kind of aspect-oriented framework that encapsulates a specific crosscutting concern, like persistence, security
and concurrency. They can be coupled to a base code that uses the concern in question, with no need for invasive modifications in this base code [2]. In the literature, papers that present studies about the coupling of crosscutting frameworks to generic structures, such as application frameworks, are rare. In this paper, an experience with the coupling of a crosscutting framework to an application framework called GRENJ [6], which is based on an analysis pattern language for business resource management, is presented. The maintenance activities performed on both frameworks in order to allow this coupling are also presented. Once these maintenances were finished, it was observed that the application framework architecture became more modular, showing lower levels of code scattering and tangling. We have observed that the coupling of a persistence crosscutting framework to a generic structure, like an application framework, is feasible and enables a higher level of separation of the persistence concern in this structure. However, the coupling of these frameworks is only possible through certain maintenances on both structures. It was also possible to investigate the effort needed to adapt systems instantiated from GRENJ so that they can benefit from the concern isolation obtained with the use of the crosscutting framework. In Section 2 the application framework GRENJ and the persistence layer implemented in its structure are presented. In Section 3 the Persistence CF coupled to the GRENJ architecture is discussed. In Section 4 the
modifications made to allow the CF coupling to the GRENJ framework are presented. In Section 5 case studies developed to verify the proposal are presented. In Section 6 some related studies are presented. Finally, in Section 7 the final considerations are presented.

2. Application Framework GRENJ

GRENJ is a white-box application framework that may be instantiated in order to create systems in the business resource management domain, such as rental or sale management. This framework was developed using the Java language and is the product of the reengineering of the GREN framework, which was implemented in Smalltalk and based on an analysis pattern language called GRN [1].
The GRENJ architecture is composed of two layers: the model layer, which contains the classes responsible for the framework's business concerns, and the persistence layer, which encapsulates the persistence implementation. The latter was developed using the Persistence Layer design pattern [12] in order to isolate the persistence implementation. According to this pattern, every class to be persisted must extend a single class of the persistence layer. Figure 1 presents the packages grenj.persistence, which represents the Persistence Layer implementation in GRENJ, grenj.model, which contains the model classes of the framework, and locadora.model, which groups the application classes of an instantiated system. It can be observed that only the PersistentObject class in the persistence layer is extended by the classes of the model layer. This relationship is enough for the GRENJ inner classes, like StaticObject in Figure 1, to access the persistence methods encapsulated in the persistence package. As the GRENJ classes can access these methods, the classes that extend them, like DestinationParty and Resource, can also use this same functionality. Therefore, the application classes can be persisted using the functionality offered by the persistence layer.

Figure 1: Partial class diagram of an application instantiated from the GRENJ.

Despite the improvement in the modularization of the persistence concern, the Persistence Layer pattern does not remove the persistence concern scattered over the model layer. Two different kinds of methods can be identified as persistence concerns in the GRENJ framework: a) methods responsible for the creation of the SQL statements used to persist the information stored in the classes, which must be overridden by some classes of the framework; b) persistence methods that must be called in the instantiation process in order to inform the persistence layer that the information of an object was modified. Figure 2 shows some examples of code scattered over an application class instantiated from GRENJ.

Figure 2: Tangling code on GRENJ.
In (a), the setChanged() method must be called to inform the persistence layer that some changes have been made in this class. In (b), the updateSetClause() method is overridden, giving the update() method the information about the attributes of this class. In both cases, the implementation of the related persistence concern is placed inside an application class, making the scattering of this concern over its structure evident.
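For illustration only (the class and its members below are hypothetical, merely mimic the kind of code shown in Figure 2, and the real GRENJ signatures may differ), a GRENJ application class tangled with persistence code could look as follows:

// Minimal stand-in for GRENJ's PersistentObject superclass (real API may differ).
abstract class PersistentObject {
    private boolean changed;

    protected void setChanged() {
        changed = true;
    }

    protected abstract String updateSetClause();
}

// Hypothetical application class tangled with persistence code, in the style of Figure 2.
public class Movie extends PersistentObject {
    private String title;

    public void setTitle(String title) {
        this.title = title;
        // (a) Persistence call scattered in the model class: it notifies the
        // persistence layer that this object's state has changed.
        setChanged();
    }

    // (b) Persistence method overridden so that the update operation knows
    // which attributes belong in the SQL UPDATE statement.
    @Override
    protected String updateSetClause() {
        return "title = '" + title + "'";
    }
}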
The GRENJ instantiation process is performed using the GRN pattern language [1]. This process is defined by three main steps, as shown in Figure 3.
Figure 3: GRENJ instantiation process.

In Step 1, Analyze Requirements, the requirements of the system to be developed are analyzed to verify whether the system belongs to the Business Resource Management domain, so that GRENJ can be used to support its construction. In Step 2, Relate Requirements with Patterns, each requirement must be related to the patterns presented by GRN so that they can be designed according to the specifications of each analysis pattern. In Step 3 the application classes are built, using the patterns identified in the previous step as reference. The GRENJ framework must be instantiated by the realization of the abstract classes that represent the identified patterns. In Figure 3, the circles with the letter P represent the presence of persistence concerns in the framework model classes and also in the application classes instantiated from them.

3. Crosscutting Frameworks

Crosscutting Frameworks are aspect-oriented frameworks that encapsulate only one crosscutting concern, such as persistence or security. They aim to support the development of new applications by allowing the reuse of these concerns [2]. Unlike Application Frameworks, this kind of framework does not generate a complete application, but must be coupled to a base code that needs to deal with the concern encapsulated by the CF. Examples of Crosscutting Frameworks are the products of the Crosscutting Framework Family of Persistence developed by Camargo & Masiero [2, 3]. The products of this family, like other frameworks developed for persistence management, aim to facilitate the development of object-oriented systems that use relational databases. The solutions offered by these frameworks try to keep the persistence concern isolated from the base application and can be reused in the development of any application that needs this management. In order to ensure the appropriate operation of this framework, it is important that the base code development follows some structural policies, such as:

- Every persistent class must have a corresponding table with the same name in the database, and each persistent attribute of this class must have a corresponding column with the same name on this table;
- Every persistent class must have access methods for these attributes;
- Every persistent class must have an attribute called ID, of type int, that will be the entity identifier;
- Every attribute of type int or double must be passed to the corresponding set method as Integer or Double, respectively.
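Purely as an illustration of the policies just listed (the class name, table, and attributes are hypothetical and are not part of the Persistence CF's documentation), a persistent class written for the CF might look like this:

// Hypothetical persistent class following the CF's structural policies:
// the class maps to a table with the same name, each attribute maps to a
// column with the same name, an int identifier attribute is provided, and
// numeric setters receive wrapper types.
public class Resource {
    private int id;        // entity identifier (the "ID" attribute/column)
    private String name;   // maps to column "name" of table "Resource"
    private double price;  // maps to column "price"

    public int getId() {
        return id;
    }

    public void setId(Integer id) {       // int value passed as Integer, per the policy
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public double getPrice() {
        return price;
    }

    public void setPrice(Double price) {  // double value passed as Double, per the policy
        this.price = price;
    }
}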
4. Maintenance Activities

In this section the maintenance activities conducted during the GRENJ restructuring are presented. As mentioned before, the coupling of the Persistence CF to the GRENJ framework aims to improve its modularization, separating the model layer from the persistence layer. An adaptive maintenance was made on the GRENJ so that the persistence layer present in its architecture was replaced by a product of the crosscutting framework family of persistence developed by Camargo & Masiero [2, 3].
Figure 4: GRENJ architecture (a) before and (b) after the adaptive maintenance.
During the maintenance process, the GRENJ classes to be persisted must no longer extend the original persistence layer. This happens because these classes must now be crosscut by the Persistence CF, as shown in Figure 4. In order to distinguish the original
GRENJ version and the one that uses the Persistence CF, this new version of the framework is denominated GRENJ-FT. The isolation of the persistence concern was not the only maintenance activity performed in the GRENJ persistence layer. It was noticed that the Persistence CF used in this process did not possess all the functions needed by the GRENJ persistence management. Thus, some improvements were made on this CF, allowing it to be coupled to GRENJ and also to other frameworks that may need this same functionality. Some syntactic differences in the persistence method signatures between the two frameworks were also noticed, which indicated that some modifications should be made to the method calls in the GRENJ model classes. Moreover, some specific persistence demands of GRENJ were implemented in a new layer that crosscuts the Persistence CF, isolating this concern from the model layer and keeping the CF generic. These modifications are described in detail below.

4.1 GRENJ Adaptations

Two kinds of modifications were performed in the GRENJ structure: a) modifications corresponding to adaptations for the coupling of the two frameworks, denominated in this paper "coupling maintenances"; and b) modifications to separate the persistence concern, isolating it from the model layer, denominated here "isolation maintenances". In order to carry out the coupling maintenances, some actions were needed, for example: a) use the standard name ID, of type int, for the attributes that represent class identifiers, which had different names in each framework class; b) change the methods and attributes that manage date values: those of the Date type were adapted to the FrameworkDate type, which is a proprietary standard of the Persistence CF; and c) change every set() method that sets values of type int or double to receive parameters of type Integer or Double, respectively. Regarding the coupling maintenances, it can be highlighted that the uniformity of the identification attribute name (that is, the "ID" attribute) facilitates the comprehension of the framework architecture during the instantiation process. The other modifications performed did not affect the GRENJ instantiation process, keeping its generalization and its scope of work unchanged.
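A small sketch of the coupling maintenances is given below; it is only illustrative, the class is hypothetical, and since the real API of the CF's FrameworkDate type is not shown in the paper, a stub is used in its place.

// Stub standing in for the Persistence CF's proprietary FrameworkDate type
// (its real interface is not described here).
class FrameworkDate {
    private final java.util.Date value;

    FrameworkDate(java.util.Date value) {
        this.value = value;
    }

    java.util.Date asDate() {
        return value;
    }
}

// Hypothetical GRENJ-like class after the coupling maintenances: the date
// attribute was migrated from java.util.Date to FrameworkDate and the numeric
// setter now receives a wrapper type.
public class Rental {
    private FrameworkDate rentalDate;   // was: private Date rentalDate;
    private double dailyRate;

    public FrameworkDate getRentalDate() {
        return rentalDate;
    }

    public void setRentalDate(FrameworkDate rentalDate) {
        this.rentalDate = rentalDate;
    }

    public double getDailyRate() {
        return dailyRate;
    }

    public void setDailyRate(Double dailyRate) {  // was: setDailyRate(double dailyRate)
        this.dailyRate = dailyRate;
    }
}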
Two isolation maintenances were performed. First, the use of the crosscutting framework allows the elimination of persistence methods from the GRENJ model classes. These changes are related to the calls made after updating or saving the persistent classes and to the implementation of methods that inform the persistent classes which fields must be used in the saving and updating instructions. These modifications ensure better obliviousness of the persistence details for the framework inner classes, as well as for the application classes. This also facilitates the framework instantiation process, reducing the effort demanded from the application engineer. The second isolation maintenance was needed because the persistence methods implemented in GRENJ had different signatures from the corresponding methods in the Persistence CF. To adapt the persistence method calls, it was necessary to identify the methods used by GRENJ and relate them to the corresponding methods in the Persistence CF. The adaptation of these methods was made through the implementation of the Adapter design pattern [7] in GRENJ. In this way, the persistence layer has an interface with the signatures of the methods used by the application framework model layer. This interface is realized by a class that contains the implementation of these methods, and this implementation depends on the persistence mechanism used by GRENJ. In the case of the Persistence CF used in this maintenance, this concrete class is crosscut by the CF, allowing it to access the persistence methods that will be used in the implementation of the methods defined by the interface. The use of the Adapter pattern allowed the persistence method calls in the GRENJ code to remain unchanged, while still allowing the generalization of these method signatures, which makes future maintenance of the framework persistence mechanism easier.

4.2 Adaptations on the Crosscutting Framework

So that the Persistence CF could be coupled to the GRENJ code, some modifications were also needed in its architecture. Each maintenance performed can be classified as an improvement implemented in the CF to supply some persistence demand of GRENJ. The first improvement on the Persistence CF was made in order to recognize the Money and Enum types. The Money type is necessary for GRENJ to operate properly, mainly for handling the values of the transactions managed by the system. The Enum type is used to
identify pre-defined multi-valued fields, such as a status attribute, which can be set as Available or Unavailable. The GRENJ framework also demands a way to indicate which attributes of a persistent class should not be persisted. A mechanism was created in order to allow the application engineer to inform these attributes to the CF at coupling time. This mechanism only has to be used if there are attributes that do not represent columns of the table corresponding to the class. Finally, some persistence methods required by GRENJ that could not be supplied by the existing methods of the CF were created, for example, customizable search methods, in which the filter clause is informed, and record deletion based on a pre-defined condition. All the modifications performed on the Persistence CF, even the ones motivated by GRENJ demands, can be seen as improvements of its functionality and can be useful to any other base code that has the same needs. With these modifications, the Persistence CF has its scope of work increased while maintaining the same reuse process.

4.3 Creation of an Intermediate Layer

Another change made to the GRENJ persistence layer was the creation of an intermediate layer for the management of a specific persistence concern. This concern is represented by classes that are hotspots of the framework, which may be extended during the creation of a specific application. Therefore, some attributes of GRENJ classes reference generic classes that are instantiated by the application engineer during the instantiation process. Since the application class has a name different from the framework class, defined by the application engineer, the Persistence CF has no means of knowing the table name that corresponds to that class. To access the names of the application-specific classes, it must call some GRENJ methods that are overridden during the instantiation process. This behavior was implemented in an aspect that crosscuts some methods of the CF, capturing the results of the execution of these methods and feeding these fields according to the specific rules of GRENJ. This implementation generates an intermediate layer, as shown in Figure 5. In this way, the concern remains isolated from the GRENJ model layer, as no specific behavior is directly inserted into the CF structure, keeping its generalization even with the adaptation performed in this study.
Figure 5: Intermediate layer to the coupling of the Persistence CF to the GRENJ.
5. Case Studies

After the changes that resulted in the GRENJ-FT were performed, two systems were instantiated from this new framework. These systems use two of the three main GRN patterns: "Rent the Resource" and "Sell the Resource". For the case study referring to the "Rent the Resource" pattern, an adaptation of a system originally instantiated from the GRENJ was made, in order to verify the effort needed to adapt instantiated systems to the new framework version. The system that uses the "Sell the Resource" pattern was generated directly from the new framework, allowing the comparison of the GRENJ and GRENJ-FT instantiation processes. In addition, both studies allowed the verification of changes in the GRENJ functionality after the performed maintenances. The respective case studies are presented below.

5.1 DVD Rental Management System

This case study presents the instantiation of a DVD Rental Management System (DVDRMS) that was originally obtained from the GRENJ instantiation, so its classes were built according to this framework's instantiation process. As a consequence, the application model classes, as well as the view classes created to support them, present persistence code scattered over their implementation, since the isolation of the persistence concern in GRENJ is weak. These concerns correspond to the persistence implementations discussed in Section 2. The objective of this instantiation process is to verify the use of the "Rent the Resource" pattern of GRENJ in a system which was instantiated by using
GRENJ-FT, ensuring that no changes in the behavior of the GRENJ patterns were caused by the maintenances performed. It is also intended to verify which maintenance activities are needed so that GRENJ-based systems can evolve to use the GRENJ-FT. The maintenances were performed in an iterative process in which each class was adapted to the new framework and then tests were run in order to confirm that the application functionality remained unchanged. In this process, each class of the system was removed from its original project and integrated into a new one containing the GRENJ-FT. This first modification generated some errors referring to methods not found, due to the persistence concerns scattered throughout the system code. Two kinds of maintenances were made to the classes in order to correct these errors. Adaptation 1: the persistence methods scattered over the application code were removed. This modification was made mainly in the model classes of the application and decreased the number of lines of code of the modified classes.
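As an illustrative counterpart to the tangled sketch shown earlier for Figure 2 (the class and its members are again hypothetical), after Adaptation 1 an application model class keeps only its business concern, since the persistence behavior is introduced by the Persistence CF:

// Hypothetical model class after Adaptation 1: no persistence superclass,
// no setChanged() calls, and no overridden SQL-related methods remain.
public class Movie {
    private String title;

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }
}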
DVDRMS Classes | IM PL | IM FT | MC PL | MC FT | LoC PL | LoC FT
Category | 3 | 0 | 2 | 0 | 83 | 59
Client | 4 | 0 | 4 | 0 | 118 | 85
DVD | 0 | 0 | 1 | 1 | 30 | 31
Movie | 4 | 0 | 2 | 0 | 121 | 92
Gender | 0 | 0 | 0 | 0 | 34 | 34
RentItem | 2 | 0 | 0 | 0 | 52 | 43
Rent | 0 | 0 | 0 | 0 | 167 | 170
FineTax | 0 | 0 | 0 | 0 | 18 | 18
TOTAL | 13 | 0 | 9 | 1 | 623 | 532

Table 1: Persistence concern removed from the DVDRMS.

In Table 1 a quantitative comparison of this system using the two frameworks is shown: GRENJ with the persistence layer, represented by PL, and the GRENJ-FT, represented by FT. The acronym MC refers to the number of persistence method calls, IM refers to the number of persistence methods implemented, and LoC refers to the number of lines of code of the instantiated classes. The table shows that the system using the Persistence CF does not implement persistence methods, so this kind of implementation is no longer present in the application model classes. It can also be seen that the number of persistence method calls was reduced expressively. The use of GRENJ-FT thus reduced the presence of persistence concerns in the model classes, which keep only the implementation of the concerns that are under their responsibility. Removing these methods also results in a decrease in the number of lines of code implemented in these classes, making the volume of information in their structures smaller and making these classes simpler and easier to maintain.

Adaptation 2: adaptations related to the policies required by the Persistence CF were performed. The application classes of the GRENJ were changed during the coupling process of the Persistence CF in order to implement these policies in their structure. Since the DVDRMS was originally built from GRENJ, these features were not considered in its implementation.

DVDRMS Classes | AL | LoC | %
Model Classes
Category | 3 | 59 | 5.1
Client | 0 | 85 | 0
DVD | 1 | 31 | 3.2
Movie | 1 | 92 | 1.1
Gender | 1 | 34 | 2.9
RentItem | 0 | 43 | 0
Rent | 3 | 170 | 1.8
FineTax | 0 | 18 | 0
View Classes
CategoryPanel | 11 | 211 | 5.2
ClientPanel | 12 | 222 | 5.4
DVDPanel | 29 | 255 | 11.4
MoviePanel | 26 | 271 | 9.6
GenderPanel | 13 | 202 | 6.4
StandardPanel | 3 | 82 | 3.6
RentPanel | 36 | 481 | 7.4
TOTAL | 152 | 2256 |

Table 2: Adaptations on the DVDRMS system.
Table 2 shows the number of altered lines of code, represented by AL, in the instantiated application, in relation to the lines of code of each class. Few changes can be observed in the model classes, since the number of persistence method calls is small and the alterations affected only the data types of attributes declared in these classes. The view classes present a higher number of alterations because they contain persistence method calls that could not be removed from their structure, such as calls that save or update information. These calls had to be altered to match the methods implemented in the persistence interface developed in this study and presented in Section 4.1. It can also be observed that the proportion of modified lines of code is below 11% in all classes, which shows that only a small part of each class must be changed to perform this adaptation. After the adaptations, the isolation of the persistence concern achieved with the new framework is significantly better than with the original GRENJ. This improvement is obtained by changing at most 11% of each application class, which indicates that the DVDRMS adaptation is feasible and offers a good trade-off between effort and gain.

5.2 Newsstand Manager System

This case study presents the development of a Newsstand Manager System (NMS) that uses the "Sell the Resource" pattern of GRN, responsible for managing the buying and selling of resources. The creation of this system was based on requirements originally used in an application instantiated from GRENJ. In this instantiation, the same process described in Section 2 was used. This process defines that the GRN pattern language can support the development of a system by relating the offered patterns to the system requirements. After the requirements were gathered, the activities defined in Steps 1 and 2 of the GRENJ instantiation process were executed with no changes. After that, the identified classes were implemented, according to Step 3 of the framework instantiation process. As presented in Section 2, GRENJ demands that the developer implement the framework hot spots and persistence-related methods when creating the application classes. With GRENJ-FT, the absence of persistence concerns in the framework model layer is reflected in the applications instantiated from it.
After Step 3 was executed, it could be verified that the GRENJ-FT instantiation process does not require the implementation of the persistence methods demanded by the GRENJ process. Since these concerns are isolated by the Persistence CF present in its architecture, the application engineer deals only with the implementation of the classes defined by the GRN patterns and with the hot spots present in those classes. Because the instantiation process has fewer steps to follow, the GRENJ-FT instantiation process demands less effort than using GRENJ. The elimination of one step directly affects the productivity of the instantiation process, reducing the application engineer's effort when instantiating applications with GRENJ-FT. Moreover, concern isolation is also higher in the application classes of systems instantiated from GRENJ-FT, improving their reusability and maintainability.

6. Related Works

Some studies on framework maintenance have been performed, aiming to define processes and tools that support this activity. Cortes et al. [4] show how refactoring and unification rules can help application framework evolution and maintenance. Dagenais and Robillard [5] present a system that suggests adaptations in programs instantiated from frameworks based on the analysis of alterations made to the framework itself. Although these works focus on framework maintenance, none of them considers the application of crosscutting frameworks during maintenance activities or processes. Lobato et al. [10] present a systematic case study comparing the evolution of an object-oriented and an aspect-oriented version of the MobiGrid framework. Evolutionary modifications were applied in the study, such as the insertion of new functions and the composition with other frameworks. It was observed that the modularization of the aspect-oriented version helped the evolution of this framework. However, neither of the versions considered the use of CFs in its architecture. Still, the modularization observed in the AO version of MobiGrid presents advantages similar to those obtained with GRENJ-FT.

7. Final Remarks

Through this study, it was observed that a crosscutting framework can be used to encapsulate the persistence
crosscutting concern of an application framework, although some adaptive maintenance is needed. It was also noted that, even with the adaptations performed, the frameworks' generic structures and their domain scope remained unchanged.

The adaptations performed to couple the Persistence CF to GRENJ resulted in improvements for both structures. In the case of the Persistence CF, all the improvements can now be used when coupling it to any other base code. In the case of GRENJ, besides improving the isolation of the persistence concern and facilitating the comprehension of its structure and instantiation process, the implementation of an interface according to the Adapter design pattern provided higher flexibility to the framework, making future maintenance of the persistence mechanism used by this framework easier.

Difficulties were faced in identifying all the points that had to be adapted during the maintenance. A deep analysis had to be made of both frameworks, mainly of GRENJ, since this application framework presented persistence code scattered over its business rules. This activity could be facilitated by tools that aid the semantic identification of persistence concerns in a given architecture. As future work, we intend to couple more crosscutting frameworks to the GRENJ structure, in order to better analyze the process used here. We also intend to generalize the intermediate layer between these frameworks, so that it can be used in any context that presents the same characteristics presented here.

References

[1] BRAGA, Rosana Terezinha Vaccare. Um processo para construção e instanciação de frameworks baseados em uma linguagem de padrões para um domínio específico. 172 f. Tese (Doutorado) - ICMC, USP, São Carlos, 2003.

[2] CAMARGO, Valter Vieira de; MASIERO, Paulo Cesar. Frameworks Orientados a Aspectos. In: XIX Simpósio Brasileiro de Engenharia de Software, Uberlândia, p. 200-216, 2005.

[3] CAMARGO, Valter Vieira de; MASIERO, Paulo Cesar. An Approach to Design Crosscutting Framework Families. In: ACM Workshop on Aspects, Components and Patterns for Infrastructure (ACP4IS), Brussels. Seventh International Conference on Aspect-Oriented Software Development (AOSD'08), 2008.

[4] CORTES, Mariela; FONTOURA, Marcus; LUCENA, Carlos. Using refactoring and unification rules to assist framework evolution. ACM SIGSOFT Software Engineering Notes, New York, NY, USA, v. 28, n. 6, 2003.

[5] DAGENAIS, Barthélémy; ROBILLARD, Martin P. Recommending adaptive changes for framework evolution. 30th International Conference on Software Engineering, Leipzig, Germany, p. 481-490, 2008.

[6] DURELLI, V. H. S. Reengenharia Iterativa do Framework GREN. Dissertação (Mestrado) - DC, UFSCar, São Carlos, 2008.

[7] GAMMA, Erich; HELM, Richard; JOHNSON, Ralph; VLISSIDES, John. Design patterns: elements of reusable object-oriented software. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1995.

[8] JOHNSON, Ralph E. Frameworks = (components + patterns). Communications of the ACM, New York, NY, USA, v. 40, n. 10, p. 39-42, 1997.

[9] LIU, Yong; YANG, Aiguang. Research and application of software-reuse. SNPD 2007: Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, Qingdao, China, p. 588-593, 2007.

[10] LOBATO, Cidiane; GARCIA, Alessandro; KULESZA, Uirá; VON STAA, Arndt; LUCENA, Carlos. Evolving and composing frameworks with aspects: the MobiGrid case. ICCBSS 2008: Seventh International Conference on Composition-Based Software Systems, Madrid, Spain, p. 53-62, 2008.

[11] LOPES, Sergio; TAVARES, Adriano; MONTEIRO, João; SILVA, Carlos. Design and description of a classification system framework for easier reuse. ECBS '07: 14th Annual IEEE International Conference and Workshops on the Engineering of Computer-Based Systems, Tucson, Arizona, USA, p. 71-82, 2007.

[12] YODER, J.; JOHNSON, R.; WILSON, Q. Connecting Business Objects to Relational Databases. Proceedings of the 5th Conference on the Pattern Languages of Programs, 1998.
Modeling, Implementation and Management of Business Rules in Information Systems G LAUBER B OFF J ULIANO L OPES DE O LIVEIRA Federal University of Goiás - UFG Informatics Institute - INF Zip 74001-970 - P.O. Box 131 - Campus II - Goiânia (GO)- Brazil (glauber,juliano)@inf.ufg.br
Abstract. Building portable and maintainable Information Systems (IS) software is a challenge for Software Engineering, but there are two essential requirements which, when present in the development process, make it easier to deal with the system complexity. The first requirement states that it should be possible to describe the IS business rules (BR) using a high level conceptual language, generating a single implementation independent model. The second requirement demands the generation of software source code from the abstract BR model, and the integration of this code into the IS. This paper presents an approach to fulfill these requirements. Instead of hardwiring the BR into applications, this approach adopts OCL as a platform independent high-level language to define a single BR model for the IS. Rules are automatically converted into SQL and stored in database systems for later evaluation. This approach improves IS maintainability since it promotes a centralized and abstract description of all BR in the IS. Keywords: System, Business Rules, Object Constraint Language (OCL)
(Received October 30, 2009 / Accepted January 22, 2010)

1 Introduction
Information systems (IS) comprise people, software, infrastructure (machines and communication facilities), and procedural components organized to collect, process, and transmit data in order to generate and disseminate useful information to support a business context. Due to its flexibility and adaptability, the software component handles most of the complexity of IS.

Therefore, the development of IS software involves modeling the business domain to elicit system requirements. Business modeling encompasses the discovery, design, and documentation of business rules (BR). These rules formalize the business concepts, the relationships among these concepts, and the constraints that must be enforced to guarantee the integrity and consistency of business data and processes. The BR of an IS can be considered as statements that define or constrain any business aspect [9]. Since BR constrain business operations, they are also known as application domain rules [13]. The implementation of application programs allows people (the users) to perform business processes and to manipulate business information according to the BR.

Traditionally, BR are represented and implemented in the application program code. Thus, business rules and application programs are analyzed, designed and implemented as a single concept. This approach has several drawbacks, mainly regarding the portability and maintainability of the IS, due to the tight coupling of what the system must do (defined in the BR) and how it does it (coded in the application programs) [3]. To minimize this dependency, business rules should be represented in an abstract way and should contain no implementation detail, such as platform or technology definitions.

In this paper we show that, using an appropriate language, which in our case is OCL (Object Constraint
Language) [15], it is possible to define BR in an abstract model, so that it can be transformed to a specific platform language using an MDA (Model-Driven Architecture) approach [12]. This approach proposes a framework that focuses on design of platform independent models (PIM), containing BR, that are automatically converted to platform specific models (PSM), including software code. OCL is a suitable language for representing the three types of BR [9]:
• structural assertion rules: define concepts or statements about the IS data structure;
• action assertion rules: include conditions or statements that constrain IS actions;
• derivation rules: represent knowledge that can be derived or computed from the data stored in the IS.

In this paper, we focus on action assertion and derivation rules, since structural rules are well supported by current technologies (database systems, for instance). Model transformations convert BR written in OCL (the platform independent model) into a target language which is, in our approach, SQL (the platform specific model), specifically PL/pgSQL, the procedural language of the PostgreSQL database system. The choice of implementing BR in the DBMS rather than in the application code was made to reduce the coupling between BR and applications. This decreases implementation effort and improves system maintenance, for example in the context of multiple distributed systems that share the same database.

The main ideas of the approach adopted in this paper were introduced in [1]. These ideas are refined and detailed in the present paper, which describes the whole software architecture designed to specify, maintain and evaluate business rules (BR) in IS. Moreover, this paper discusses the implementation of the architecture, including the main software components (OCL Editor, Consistency Checker, SQL Code Generator and Rules Engine), the business rules repository and some user interface descriptions, which were not presented in that paper.

The implemented architecture was applied in a software framework to build and maintain IS. This framework, presented in [4], relies on the BR architecture described in this paper to generate and maintain IS application programs, based on ideas from Model-Driven Development. This paper focuses on the BR architecture, describing all of its components, while [4] focuses on the framework and describes only some components of the architecture, such as the SQL Code Generator and the OCL/SQL Mappings.

This paper is structured as follows: Section 2 presents the components of the BR mechanism; Section 3 discusses business rules modeling using OCL, a high-level language; Section 4 presents how BR modeled in OCL are implemented in a DBMS, using mapping patterns and a code generator; Section 5 describes how our approach can be used to evaluate BR in IS; Section 6 analyses related work; and Section 7 presents our conclusions in terms of advantages and disadvantages of the approach described in this paper.

2 Components Architecture Overview

Our mechanism for business rules modeling, implementation and maintenance is organized as an integrated architecture of components, illustrated in Figure 1, where an arrow indicates a dependency between components. These components are separated into two groups according to the moment in which they are used: Definition Time, involving BR modeling and implementation, and Execution Time, involving BR evaluation during the execution of the IS. The Definition Time components are the OCL Editor, the Consistency Checker, the SQL Code Generator and the OCL/SQL Mapping, while Execution Time contains a single complex component: the Rules Engine.

The OCL Editor component is a special-purpose text editor with functionalities to support the definition of OCL expressions. Two of these functionalities (semantic and syntax checking) are provided by the Consistency Checker component, which guarantees that the OCL expressions created within the OCL Editor are consistent with the OCL grammar. Figures 3 and 6 illustrate the user interface of the OCL Editor component.

The SQL Code Generator component translates OCL expressions into SQL procedural code (in PL/pgSQL). This component depends on the Consistency Checker (to avoid translating inconsistent OCL expressions) and on the OCL/SQL Mapping component, which stores XML files containing rule templates to translate OCL expressions into the specific SQL dialect.

The Rules Engine component evaluates business rules at IS execution time. Like the other components, it depends on the DBMS, because all data, business rules and metadata are stored there.

The next sections explain how this whole mechanism is used to model, implement and evaluate BR in IS.
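As a rough illustration of the dependencies just described (a minimal Java sketch with hypothetical interface names, not the authors' implementation), the SQL Code Generator can be seen as a component that consults the Consistency Checker before applying the OCL/SQL mappings:

    // Hypothetical interfaces for the Definition Time components described above.
    interface ConsistencyChecker {
        boolean isConsistent(String oclExpression);   // syntactic and semantic checking
    }

    interface MappingLibrary {
        String templateFor(String oclOperation);      // backed by the XML mapping files
    }

    interface SqlCodeGenerator {
        String generate(String oclExpression);
    }

    // The generator depends on the checker (to reject inconsistent expressions)
    // and on the mapping library (to obtain the SQL templates).
    class DefaultSqlCodeGenerator implements SqlCodeGenerator {
        private final ConsistencyChecker checker;
        private final MappingLibrary mappings;

        DefaultSqlCodeGenerator(ConsistencyChecker checker, MappingLibrary mappings) {
            this.checker = checker;
            this.mappings = mappings;
        }

        @Override
        public String generate(String oclExpression) {
            if (!checker.isConsistent(oclExpression)) {
                throw new IllegalArgumentException("inconsistent OCL expression");
            }
            // a real generator would walk the OCL syntax tree and fill in the templates
            return mappings.templateFor("allInstances");
        }
    }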
Figure 1: Components of the BR Mechanism.
3 Business Rules Modeling
The development of IS software involves understanding all business domain concepts, including business rules, and gathering the requirements that should be implemented. According to the model-driven approach, these concepts should be specified in an abstract model, free of implementation details, that is, a Platform Independent Model (PIM). The main advantage of this separation of concerns (isolating the domain model from the implementation model) is that developers focus their efforts on specifying what must be done (the right information system) instead of worrying about the technologies that will support the system.

The Unified Modeling Language (UML) is a conceptual language standardized by the OMG (Object Management Group) for modeling software structure, behavior, architecture, business processes and data structures. Models created with UML are abstract, but their object-oriented nature allows the definition of real-world concepts and their relationships. However, UML provides appropriate support only for structural assertions. Action assertions and derivation rules are not easily described in UML specifications.

A common and popular way of modeling and implementing BR is to define rules in natural language, which are then translated and hard-coded into application software. This traditional approach has several drawbacks, such as:
• natural language rules can be interpreted in different ways by users, software engineers and programmers;
• the manual translation of natural language rules into code is error prone;
• rule-related code is spread across several applications, making its maintenance more difficult.

These shortcomings can be avoided or minimized if restrictions are expressed in a formal, though conceptual, language, and an efficient translation mechanism enforces the correct translation of these conceptual restrictions into implementation models. Recognizing UML's limitations for expressing BR, the OMG proposed OCL (Object Constraint Language), an additional language to define and express those types of rules that cannot be clearly represented in UML. OCL is a special-purpose modeling language for expressing general rules in object-oriented models. OCL expressions are declarative (describing what a system should do and separating rule specification from implementation) and without side effects (BR execution does not change the system state). They can be used to define:

• invariant conditions that must be satisfied for all system states;
• query expressions over the model elements;
Figure 2: A simple conceptual model for an enterprise.
• constraints over operations that can change the system state (pre- or post-conditions, for example).

Combining the expressive power of conceptual data models with OCL dynamic constraint expressions allows specifying both structural and behavioral constraints in a high-level, abstract model of the IS [4]. In the example illustrated in Figure 2, a UML class diagram represents a simplified enterprise domain. This simple conceptual model contains several structural constraints, such as the following cardinality (or multiplicity) constraints: Rule 1) "An Employee must work in a single Department"; and Rule 2) "An Employee can manage at most one Department".

However, the expressive power of this simple model is very limited, since only structural constraints can be represented. For example, suppose that the following business rule must be enforced: Rule 3) "An Employee can manage only the Department in which he works". This business rule is not represented in Figure 2 and, due to UML's expressiveness limitations, it cannot be stated without modifying the model structure [4].

To solve this problem, this paper proposes using OCL to represent rules that cannot be expressed using only UML modeling primitives, such as action assertions and derivation rules [4]. Thus, Rule 3 could be formally represented in OCL as the query expression shown in Figure 3. In this approach, rules should be specified as an independent aspect, i.e., rules are first-class citizens in the IS conceptual model. The separation of rules, data and functions is not complete, since rules have influence on data and functions; however, there is no subordination
of the rules specification to the data or function specifications. Moreover, rules are formally expressed, but without depending on implementation technologies or specific platforms [4].
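For readers less used to OCL, the fragment below (Java, with hypothetical classes loosely based on the model of Figure 2; it is not the paper's OCL expression) shows the check that Rule 3 amounts to:

    // Hypothetical rendering of Rule 3: an Employee can manage only the
    // Department in which he or she works.
    class Department {
        String name;
    }

    class Employee {
        Department worksIn;   // "work" association: exactly one department
        Department manages;   // "manage" association: at most one department

        boolean satisfiesRule3() {
            // trivially satisfied when the employee manages no department
            return manages == null || manages.equals(worksIn);
        }
    }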
3.1 Business Rules Repository
After specifying the BR, it is important to store them in a secure, efficient and consistent structure. In this approach, we created a business rules repository, implemented in the DBMS, which allows business owners and developers to access rules and change them if necessary. Some characteristics of the repository are:

• BR are stored at different abstraction levels during their lifecycle (natural language, OCL);
• it ensures traceability between BR and the use cases where they are evaluated;
• it ensures traceability between BR and their implementation, as procedural code or stored procedures;
• it ensures security and integrity of BR and related information through an access control mechanism;
• it allows creating new versions of BR as part of a change process, where each version is identified and also stored in the repository.
Figure 3: Editor screen and the OCL code for Rule 3.
To insert or update a BR in the repository, the form illustrated in Figure 4 must be filled in accordingly. It is necessary to specify the BR identification, type (validation or derivation), context, the business operations that trigger the BR, the moment of BR execution (before or after the related operations), status (modeled, implemented or inactive) and the BR definition in natural language. This form does not contain any field for writing OCL expressions, because they must be written using the OCL Editor component.
4 Business Rules Implementation
Modeling BR using OCL assures independence from the implementation technology. However, to be used by applications, BR specified in a high-level language need to be converted to a platform-specific language. To perform automatic model transformation, it is necessary to define mappings between the source and target languages. Section 4.1 describes how the mappings between the OCL and SQL languages are defined, while Section 4.2 describes the implementation of the proposed approach.
4.1 OCL to SQL Mappings
In this work, we decided to implement the BR in the DBMS rather than in the application code to reduce the coupling between BR and application. This decreases implementation effort and improves system maintenance. In order to make it feasible to transform OCL business rules into SQL, it is necessary to define mappings between elements of these languages. [5] proposes some patterns for the transformation of invariants, specifying their general structure, attribute access and navigation across associations. We adapted these patterns in our work to define transformation patterns according to the needs of our IS BR mechanism. In our approach, all mappings are defined in an XML file, based on ideas from the Dresden OCL Toolkit [8]. The XML file contains information to guide the transformation. An example of mappings from OCL to SQL is shown in the following code:
attribute_navigation template: <table1> JOIN <table2> ON <key1> = <key2>
allInstances template: SELECT <column> FROM <table> WHERE <condition>

This example shows how SQL code is generated for the attribute_navigation and allInstances operations defined in OCL. The first operation refers to navigation through the classes of the model, and the second refers to a query over all instances of a class, like a select on a table in SQL. In OCL, navigation operations are represented using dot notation ("."). An example of these operations is illustrated in Figure 3, lines 5 and 6. To map that expression, it is necessary to combine the two mapping patterns (attribute_navigation and allInstances). The first parameter, "column", is set to "id". As the expression refers to a navigation across classes, it corresponds to an attribute_navigation operation, so the second template must be used to replace the value of the parameter "table". In this case, parameter "table1" is set to "Employee" and "table2" is set to "Department". Parameters "key1" and "key2" are used to connect the classes, acting as the join condition in the SQL code. In the allInstances template, the parameter "condition" is used to constrain the query result; in the example, it is set to "Employee.ssn = employee_ssn". Figure 6 shows the whole code generated using these mapping patterns.

Mappings between models should be defined in such a way that changing the target platform can be accommodated in the transformation definition, without modifying application code. Following this idea, we create one XML file for each target platform. In our current implementation, the target platform is PL/pgSQL, the SQL procedural language of PostgreSQL. This way of defining mappings has the advantage that, if we need to change the DBMS (from PostgreSQL to MySQL, for example), only a new XML file must be created, containing the mappings from OCL to the new DBMS procedural language. Thus, it facilitates the maintenance of the IS and improves its portability.

Figure 4: Screen used to register, update or remove a business rule.

4.2 Code Generation

The mappings from OCL to SQL are needed for the code generation process. This process includes the following tasks:

1. perform lexical and grammatical analysis of the rules and validate them according to the OCL grammar;
2. build the abstract syntax tree of the rule;
3. check whether the rule elements are part of the model;
4. perform SQL code generation using the mapping definitions.

To perform the analysis of a rule in OCL, rules are first retrieved from the repository in the database and then parsed. The parser used in this work was built using SableCC [6], an object-oriented parser generator, which generates Java classes that implement the lexical and grammatical analysis of OCL expressions. The parser reads a rule and constructs the abstract syntax tree based on the OCL grammar. The tree contains all tokens of the analyzed expression, and thus it is possible to verify whether the elements in the rule are part of the source model of the transformation. To facilitate this task, a metadata structure was implemented, containing all information about the source model, such as entities, attributes and relationships. As an example of the metadata structure, consider the model illustrated in Figure 2. For that model, the metadata structure must contain information about all business entities (Person, Employee, Department), attributes (name, type, domain, mandatory/optional, derived or not) and relationships (work and manage). This structure is a different representation of the abstract model that allows that information to be manipulated.

If no problem related to the rule definition is found, the next step is to perform code generation using the mapping definitions in the XML file and the abstract syntax tree. The SQL code generation process results in business rules implemented as stored procedures. The name of each procedure is the same as that of the original rule in OCL.

Figure 5 shows the OCLTransformation package, its components and the other external packages needed for the transformation between models. In this figure there is a class named Metadata, which contains the description of the model elements. The MappingLibrary class uses the external package xerces.jar, which is responsible for reading the XML file that contains the mapping definitions. An interface named Translator was created, which contains the essential methods to perform the transformation between models. It uses the OCLParser package to perform the analysis of OCL expressions. The SQLTranslator class contains the implementation of the methods that perform rule translation. The JavaTranslator class is represented in Figure 5 only as an example, to show that, if we decided to move the business rules implementation from the database to the application layer (specifically, the Java language), it would be necessary to write code for this class. It would be responsible
for getting the OCL/Java mappings and translating user-defined OCL expressions into Java code.

After implementation, if any change in the business domain rules occurs and needs to be incorporated into the IS, programmers cannot manually modify any stored procedure and do not need to change any application code. It is only necessary to modify the BR written in OCL to incorporate the new definitions; the stored procedures are then generated automatically by the transformation tool. The old procedure is overwritten by the newly generated one, which can then be used by the application. This is an advantage of the approach, since it facilitates the maintenance of the system. However, changes made to a BR cannot be incorporated into the IS without checking for incompatibilities with existing application data. This is done by invoking the modified BR for each instance of its context entity. If the BR evaluation fails for any of those instances, there is a conflict, and the changes cannot be deployed in the database until all conflicts are resolved.
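A minimal Java sketch of the translation structure just described (hypothetical method signatures and stub types; the actual OCLTransformation package shown in Figure 5 is not reproduced here) could look as follows:

    // Stub types standing in for the classes described in Figure 5.
    class Metadata { /* entities, attributes and relationships of the source model */ }
    class MappingLibrary { /* reads the OCL-to-SQL templates from the XML file */ }

    // One Translator implementation (and one XML mapping file) per target platform.
    interface Translator {
        String translate(String oclRule);   // translates one OCL rule into target code
    }

    class SQLTranslator implements Translator {
        private final MappingLibrary mappings;
        private final Metadata metadata;

        SQLTranslator(MappingLibrary mappings, Metadata metadata) {
            this.mappings = mappings;
            this.metadata = metadata;
        }

        @Override
        public String translate(String oclRule) {
            // 1. parse the rule and build its abstract syntax tree (SableCC-generated parser)
            // 2. check that every element of the rule exists in the metadata
            // 3. fill in the SQL templates from the mapping file and emit a stored procedure
            return "-- generated PL/pgSQL stored procedure would go here";
        }
    }

    // A JavaTranslator implementing the same interface would only be needed if the
    // rules were moved from the database to the application layer.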
5 Evaluating Business Rules
In the previous sections we described our approach to define BR using OCL and to transform these BR into a PSM (SQL). In order to use the BR, it is also necessary to integrate them with the application programs. Figure 7 describes our process to evaluate BR, which is executed by the Rules Engine component of the architecture. The sequence of activities represented in Figure 7 is executed when a predefined operation is performed in the application. Following the Active Database principle, our mechanism is sensitive to operations that modify the state of the Information System: insert, update and delete. When these operations occur within an application program, the rule evaluation process is triggered, performing the following activities [4]:

• Get rule and context metadata: the first thing to do is to identify the BR context (entity or attribute) affected by the operation. Since the application program has to forward the operation to the persistence component (which acts as a façade to the DBMS), it is possible to observe all the communication between application programs and the persistence mechanism and to automatically detect the operations and the affected entities and attributes. After that, it is necessary to retrieve the corresponding metadata containing structure details for the rule context. In this step, we also get BR data that are stored in the BR repository
(parameter names, for instance). With these data in hand, it is possible to create an instance of the BusinessRule class. The evaluation of an imported BR is different from that of a validation or derivation BR, so it is important to identify the BR type: if it is a validation or derivation BR, the process proceeds to the "Manage rule parameters" activity; otherwise, the "Get imported rule" activity is executed.

Figure 5: The OCL transformations package.

• Manage rule parameters: a BR may need parameters for executing operations with model elements. The values of these parameters generally come from the application program data, and it is necessary to retrieve them and convert them to the correct data types, according to what is defined in the OCL BR. This step gets and converts the parameters, making their values available for invoking the stored procedure that corresponds to the OCL rule.

• Verify rule behavior and time compatibility: every business rule must have a related operation and a time for execution, which must be specified when the BR is inserted in the repository. Before executing the rule, it is necessary to check that the behavior and the time of the application operation match those of the BR. For example, suppose there is a rule that must be executed before inserting an element of an entity "A" context type. This rule will be executed only when the application program informs the BR evaluation mechanism that its behavior is exactly that: an insertion on context "A", before the operation is executed. If this condition is not satisfied, the BR will not be executed.

• Execute derivation rule and execute validation rule: in this step, all the information necessary for BR evaluation is encapsulated in the BusinessRule object, notably the name of the operation that must be executed and its parameter values. The BR evaluation mechanism then calls the corresponding stored procedure in the database and handles its result according to the type of rule: a validation rule returns a boolean value, while a derivation rule returns the derived value to the application program. If the evaluation of any BR fails, an error message is presented to the user in a window, explaining what went wrong.

All the activities described here are executed every time a BR is evaluated, but the evaluation process is not explicitly triggered by the application programs. Therefore, BR and application programs can be developed independently. Moreover, the mechanism enforces the business rules, so there is no risk that a failure (accidental or intentional) in the application causes the business rules to be ignored. This is an important advantage of our mechanism: all data in the IS database will be consistent with the business rules, without any intervention from the application programs.
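As an illustration of the last activity, the generic JDBC sketch below (a hypothetical procedure name and parameter handling; this is not the authors' Rules Engine code) shows how the stored procedure that implements a validation rule could be called and its boolean result interpreted:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Types;

    class RuleInvoker {
        /** Calls the stored procedure generated for a validation rule; by convention
         *  it returns a boolean (a derivation rule would return the derived value). */
        static boolean evaluateValidationRule(Connection con, String procedureName,
                                              Object... params) throws SQLException {
            // e.g. procedureName = "rule3_manage_own_department" (hypothetical)
            String call = "{ ? = call " + procedureName + placeholders(params.length) + " }";
            try (CallableStatement stmt = con.prepareCall(call)) {
                stmt.registerOutParameter(1, Types.BOOLEAN);
                for (int i = 0; i < params.length; i++) {
                    stmt.setObject(i + 2, params[i]);   // rule parameters follow the result
                }
                stmt.execute();
                return stmt.getBoolean(1);
            }
        }

        private static String placeholders(int n) {
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < n; i++) {
                sb.append(i == 0 ? "?" : ", ?");
            }
            return sb.append(")").toString();
        }
    }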
6 Related Work

There are several works that investigate the same problem, namely, the automatic conversion of rules expressed in high-level languages (such as UML and OCL) into software that implements an IS. Our solution takes into account some ideas from the Dresden OCL Toolkit, a modular software platform for OCL that provides facilities for the specification and evaluation of OCL constraints [8].
Figure 6: Transformation of the OCL Business Rule of Figure 3 into SQL
Figure 7: Activity Diagram for BR evaluation.
The toolkit performs parsing and type-checking of OCL constraints and generates Java and SQL code [8]. We have reused many ideas from this toolkit, but we had to modify and adapt several features to fulfill the requirements of our mechanism. One important modification is related to the target PSM: our mechanism generates stored procedures when converting business rules from OCL to SQL code, whereas the toolkit generates SQL code in the form of database views.

In [10], the authors propose a framework for generating query code that maps invariant rules written in OCL to sentences in a query language, such as views in a database. The advantage of the approach presented in this article is related to the implementation of the rules in the DBMS, which is performed using stored procedures (instead of views) and provides more expressive power, since it is based on a procedural language. Furthermore, that framework can represent only OCL invariants, while in our approach it is possible to represent query expressions and pre- and post-conditions.

We have also investigated other similar tools, such as AndroMDA and Atenas. In AndroMDA [17], which is an open source framework, it is possible to execute PIM-to-PSM transformations and also to transform OCL to other languages. Its OCL transformation mechanism allows generating code in HQL (Hibernate Query Language) [16] and EJB-QL (Enterprise Java Bean Query Language) [11]. Our software mechanism differs from AndroMDA mainly in the choice of the target technology of the transformation, which, in our case, is SQL code generated as stored procedures for a relational database.

The Atenas tool, presented in [2], was developed for defining and evaluating BR using OCL. It implements a BR repository, an OCL editor that is used to write business rules as invariants, and an SQL code generator that converts the rules from OCL to SQL. The code generator takes OCL invariants as input and transforms them into CHECK constraints in the database. There is a set of database events that are responsible for triggering the BR evaluation, such as insertion, deletion or update commands. Our software mechanism shares some ideas with the Atenas tool, like the adoption of a single BR repository and the OCL editor. The main difference is related to the kind of OCL expressions that can be used to model BR and to the generation of the target SQL code. Our proposal is to use query expressions to model validation and derivation rules and then generate SQL code as stored procedures to implement them, while Atenas represents BR as OCL invariants and implements them as CHECK constraints. Therefore, our mechanism has more expressive power, since it allows modeling derivation rules and it is based on a procedural language.

The OMG created in 2002 a standard to specify model-to-model transformations related to Model Driven Architecture (MDA), named Query/View/Transformation (QVT) [14]. It addresses only models compatible with the MOF 2.0 metamodel, like UML, for example. Transformations from model to text or from text to model are not in the scope of QVT. As our approach proposes SQL code generation from an abstract UML model, it is not possible to define the transformations using QVT. Therefore, we preferred to implement them using an XML file, as was done in the Dresden OCL Toolkit [8].

There is another language used to define model transformations, named Atlas Transformation Language (ATL) [7]. It is a declarative-imperative language used to define transformations between source and target models, according to some mapping patterns. The XML file used in our approach is also used to define model transformations, but in a simpler way. ATL is more structured and allows the definition of transformations based on different metamodels.

7 Conclusions

This article presented an approach for modeling, implementing and maintaining business rules in Information Systems. In this approach, business rules are modeled using OCL and are translated automatically into SQL code using mappings between these languages.

According to [5], using UML together with OCL to define BR has several advantages:

• UML is a standard language for defining object-oriented models;
• OCL is a precise and abstract language for specifying integrity restrictions. The layout of the navigation model is similar to the concepts of databases;
• using OCL to define BR allows achieving independence in relation to the implementation technology;
• the automatic generation of SQL code from the OCL reduces development time and facilitates the maintenance of the IS;
• the specification of operations with pre- and post-conditions is a good starting point for the automatic generation of SQL queries in the database as stored procedures corresponding to those operations.

However, since it is a formal language, using OCL to represent business rules hinders their understanding by business owners. Besides, it is not trivial for a
systems developer to learn the language. Besides, some rules that could be written in a simple sentence in natural language need several lines of definition when converted to OCL. Thus, we decided to keep a natural language description of every OCL rule, as user-oriented documentation for the rule.

The approach presented in this article was empirically validated in a research project carried out from 2005 to 2009 at the Informatics Institute of the Federal University of Goiás, with financial support from CNPq. The final goal of the project is the development of a comprehensive information system for the optimization of agricultural activities. In this system, the application code and the business rules code are specified as individual aspects. The system was implemented in Java and has approximately 67,000 lines of code. There are about 140 BR, all implemented as stored procedures in the PostgreSQL DBMS using a transformation tool from OCL to SQL, which was developed by the authors of this article. We are currently working on the implementation of new mappings from OCL to different DBMS platforms. We are also working on the evolution of the component mechanism described in this paper into a framework architecture, in order to make it easier to reuse our approach to IS implementation.

Throughout the development of the IS, the advantages of using UML and OCL for modeling BR described in [5] were noticed by the development team. The main advantages of the approach are related to the portability and maintainability of the IS, in addition to the automatic generation of SQL code (stored procedures), which reduces the programming effort for the development and maintenance of the IS. Portability is improved because BR are represented in an abstract declarative language (OCL), which is an OMG standard, and are automatically converted to a specific platform using a transformation tool. The separation of business rules from application code improves the maintainability of the IS by applying the principle of separation of concerns. Rules are documented in a single model and are not mixed with the application code. This centralized organization improves the organization of the code and facilitates the assessment of the impact of changes in business rules.

References

[1] Boff, G. and de Oliveira, J. L. Modelagem, implementação e manutenção de regras de negócio em sistemas de informação. In VIII Simpósio Brasileiro de Qualidade de Software - VI WMSWM (Workshop de Manutenção de Software
Moderna). Ouro Preto, Minas Gerais, Brasil, 2009.

[2] da Silva, G. Z., de Souza, J. M., de Almeida, V. T., and Sulaiman, A. Atenas: Um sistema gerenciador de regras de negócio. In Seção Técnica de Ferramentas do XV Simpósio Brasileiro de Engenharia de Software (SBES), pages 338-343, Rio de Janeiro, Brasil, 2001.

[3] Date, C. J. What Not How: The Business Rules Approach to Application Development. Addison-Wesley Professional, April 2000.

[4] de Almeida, A. C., Boff, G., and de Oliveira, J. L. A framework for modeling, building and maintaining enterprise information systems software. In XXIII Simpósio Brasileiro de Engenharia de Software (SBES). Fortaleza, Ceará, Brasil, 2009.

[5] Demuth, B. and Hussmann, H. Using UML/OCL constraints for relational database design. In UML, pages 598-613, 1999.

[6] Gagnon, E. M. and Hendren, L. J. SableCC, an object-oriented compiler framework. In TOOLS '98: Proceedings of the Technology of Object-Oriented Languages and Systems, page 140, Washington, DC, USA, 1998. IEEE Computer Society.

[7] ATLAS Group. Atlas Transformation Language. [Online]. Available: http://www.eclipse.org/m2m/atl/. [Accessed: November, 2009], 2009.

[8] Software Technology Group, Technische Universität Dresden. Dresden OCL Toolkit. [Online]. Available: http://dresden-ocl.sourceforge.net. [Accessed: August, 2009], 2009.

[9] The Business Rules Group. Defining business rules - what are they really? [Online]. Available: http://www.businessrulesgroup.org. [Accessed: August, 2009], 2000.

[10] Heidenreich, F., Wende, C., and Demuth, B. A framework for generating query language code from OCL invariants. ECEASST, 9, 2008.

[11] Sun Microsystems. Enterprise Java Bean Query Language. [Online]. Available: http://java.sun.com/j2ee/1.4/docs/tutorial/doc/EJBQL.html. [Accessed: August, 2009], 2009.
[12] Miller, J. and Mukerji, J. MDA Guide Version 1.0.1. Technical report, Object Management Group (OMG), 2003.

[13] Morgan, T. Business Rules and Information Systems: Aligning IT with Business Goals. Addison-Wesley Longman Publishing Co., Inc., 2001.

[14] OMG. Query/Views/Transformations RFP. Request for proposal, Object Management Group, 2002.

[15] OMG. OCL 2.0 Specification. Formal specification, Object Management Group, 2005.

[16] Red Hat, Inc. Hibernate Reference Documentation v3.3.1, 2008.

[17] AndroMDA Project Team. AndroMDA. [Online]. Available: http://andromda.org/. [Accessed: August, 2009], 2009.
Identifying Collaboration Patterns in Software Development Social Networks Taisa Alves Lacerda dos Santos1,2 Renata Mendes de Araujo1,2 Andrea Magalhães Magdaleno2,3 1
Department of Applied Informatics - Federal University of the State of Rio de Janeiro (UNIRIO), Rio de Janeiro, RJ, Brazil 2 NP2Tec – Research and Practice Group in Information Technology (UNIRIO) , Rio de Janeiro, RJ, Brazil 3 Federal University of Rio de Janeiro (UFRJ), COPPE – System Engineering and Computer Science, Rio de Janeiro, RJ, Brazil {taisa.santos, renata.araujo}@uniriotec.br,
[email protected]

Abstract. Software development is a collaborative activity which involves the effective coordination of groups displaying variations in their skills and responsibilities. This paper argues that, by understanding the way collaboration is performed, participants and managers can better understand the development process in order to conduct their activities. This paper proposes an approach based on social networks analysis to identify collaboration patterns in software development process instances, which can be used as a resource for collaboration awareness and understanding.

Keywords: Collaboration, software process, awareness, social networks.

(Received October 30, 2009 / Accepted January 22, 2010)

1. Introduction

Software development is characterized as a collaborative activity [9]. One of the main challenges of coping with collaboration both in distributed and colocated settings is how to make the work visible to all participants, making them aware of what is happening in the development process [1]. To face this challenge, proposals for collaborative support through computational tools have been suggested [5] [17] wherein collaborative supporting aspects are provided, such as coordination, communication, group memory, and awareness [1] [6] [8].

This work suggests that the social network [21] achieved as a result of software development interactions can provide information about the collaboration existing therein. However, only the view of a social network topology using visualization tools [1] [20] may not be enough to help participants and project managers to understand and analyze the collaboration level of the team. This work proposes the possibility of identifying collaboration patterns through the analysis of social networks properties. According to the collaboration patterns and with the help of social network visualization tools, developers and project managers will be able to interfere, change, redistribute or reflect about the process and work being conducted.

This paper is organized as follows: Section 2 reports the research work on how to provide awareness in
software development processes; Section 3 summarizes social network properties and the tools that can be used to identify collaborative patterns; Section 4 discusses CollabMM as a reference for identifying collaboration levels in business processes; Section 5 presents preliminary essays in identifying collaboration patterns using social network properties based on CollabMM. Section 6 concludes the paper and outlines future work. 2. Awareness processes
in
software
development
In collaborative support, awareness can be defined as being conscious of the presence of other users and of their actions while interacting through applications [6][15][18]. Awareness aims to reproduce or even increase, in a virtual environment, the elements of a real, face-to-face interaction. To achieve this objective, awareness mechanisms can be used to represent, for instance, the presence of a group member, the position of each participant in the shared workspace, or even to distinguish each participant by using different colors [8]. These mechanisms are used to extend user awareness about information they cannot notice alone or information that they would possibly not consider as relevant for the work [16]. Based on a literature review, Araujo and Borges [1] proposed the classification of awareness information in software development in three categories: social, process and collaboration. Social awareness allows
users to recognize the group in which they are included for a possible interaction. Process awareness involves acknowledging the current process enactment state: the activities completed, the activities being performed, which activities are waiting to be performed by an individual and which should be performed by the entire group. Collaboration awareness focuses on collaboration among group members, contributes to the understanding of their interactions inside the group and fosters future improvements in process interactions. All these kinds of awareness have been studied by different researchers and have been implemented in collaborative tools to support group work in software development. The following paragraphs present examples of tool proposals for each type of awareness.

The OpenMessenger tool [5] represents social awareness by means of "tickets" showing a user's photo avatar. Users can rotate their avatar to indicate how busy they are. An avatar in full view indicates that the user is available, and the more the picture is turned away, the busier the user is (Figure 1).
Figure 1 – Avatars and rotation in OpenMessenger [5]
The PIEnvironment [1] explores the possibility of extracting information about participants’ interactions from the software process models defined to be enacted in a workflow system. In this case, process awareness can be understood through the sequence of activities performed by the group; collaboration awareness is presented by modeling user interactions extracted from process enactment (Figure 2).
1), obtained from workflow tools, the social interactions which occurred in the work environment of a particular team can be understood and visualized. Figure 3 shows a social network mined from the event log in Table 1. The first graph (Figure 3a) shows the control-flow structure expressed in terms of a Petri net. The second (Figure 3b) is the organizational structure expressed in terms of an activity-role-performer diagram, and the last one (Figure 3c) is a sociogram based on transfer of work done. The approaches presented above rely on the possibility of collecting data to be presented as awareness information for process participants interacting through computational tools. They focus on how to provide development teams with the resources for being aware of the process they execute. These awareness resources play a fundamental role in helping people recognize and learn the way they actually work, as well as recognize problems and improvement possibilities. Table 1 – Event log [19]
Case     Activity     ID      Timestamp
Case 1   Activity A   John    9-3-2004:15.01
Case 2   Activity A   John    9-3-2004:15.12
Case 3   Activity A   Sue     9-3-2004:16.03
Case 3   Activity B   Carol   9-3-2004:16.07
Case 1   Activity B   Mike    9-3-2004:18.25
Case 1   Activity C   John    10-3-2004:9.23
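To make the handover-of-work idea concrete, the sketch below derives directed "transfer of work" relations from an event log like Table 1: within each case, events are sorted by timestamp and an edge is drawn from the performer of one activity to the performer of the next. This is a minimal illustration in Python (an assumed choice of language), not the mining algorithm of [19]; the log entries are taken from Table 1.

from collections import defaultdict
from datetime import datetime

# Event log from Table 1: (case, activity, performer, timestamp)
log = [
    ("Case 1", "Activity A", "John",  "9-3-2004:15.01"),
    ("Case 2", "Activity A", "John",  "9-3-2004:15.12"),
    ("Case 3", "Activity A", "Sue",   "9-3-2004:16.03"),
    ("Case 3", "Activity B", "Carol", "9-3-2004:16.07"),
    ("Case 1", "Activity B", "Mike",  "9-3-2004:18.25"),
    ("Case 1", "Activity C", "John",  "10-3-2004:9.23"),
]

def parse_ts(ts):
    # Timestamps in the log use day-month-year:hour.minute
    return datetime.strptime(ts, "%d-%m-%Y:%H.%M")

# Group events by case and sort each case chronologically
cases = defaultdict(list)
for case, activity, performer, ts in log:
    cases[case].append((parse_ts(ts), performer))

# Count handovers: the performer of one event passes work to the performer of the next
handovers = defaultdict(int)
for events in cases.values():
    events.sort()
    for (_, src), (_, dst) in zip(events, events[1:]):
        if src != dst:
            handovers[(src, dst)] += 1

for (src, dst), count in handovers.items():
    print(f"{src} -> {dst}: {count}")
# Expected edges: John -> Mike and Mike -> John (Case 1), Sue -> Carol (Case 3)

The resulting weighted edges are exactly the kind of relations drawn in the sociogram of Figure 3c.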
Figure 3 – Some mining results for the process (a) and organizational (b and c) perspectives, based on the event log shown in Table 1 [19]
In this work, we discuss how this information can be used for understanding the levels of collaboration being achieved by a team. We claim that it is possible to identify collaboration patterns from the analysis of process interactions. These collaboration patterns can help participants and project managers to understand the different levels of collaboration that exist and to make decisions about process improvements.
3. Social networks
The concept of a 'network' is simple: a set of links among nodes. A social network is the set of links among people [21], where a node represents an actor and links among actors1 represent possible relationships among them. The semantics of a link depends on which analysis we wish to conduct: it can be communication, relationship, friendship and so on. Social network analysis is a way to understand the interaction and social organization within a group [3].
3.1. Social network properties
Social networks can be examined through the analysis of their properties [21]. For the purpose of this work we have selected an initial set of properties which we believe have the potential to provide information about collaboration patterns. Properties related to actor centrality are based on the links one actor bears with other actors [21]. Therefore, each actor has a value within the network which can be considered when comparing it to the other nodes. These properties render the node more visible to other actors. There are three types of actor centrality [21]:
Degree centrality: the degree centrality of an actor is measured by the inputs and outputs of the node, i.e. it sums the number of its relationships [7]. An actor with high degree centrality will be in direct contact with more actors, occupying a central role in the network. The node with the greatest value is called a central node. Central nodes in a network are called hubs [3][21]. Figure 4 is an example of a social network with four actors: node 3 is the central node because its degree centrality is equal to 3.
Figure 4 – Degree centrality
Betweenness centrality: measured by the number of times a node appears in the paths between other nodes [7]. Actors which are between two nodes which are not neighbors have control over the link between them [21]. To have high betweenness centrality, an actor must be in the path of different actors. In Figure 5, node 2 is the actor with the highest betweenness centrality because it lies on the paths among actors 1, 3 and 4. Although node 4 is between nodes 2 and 5, node 4 has a smaller betweenness centrality than node 2.
1 For the purpose of this work, the terms actor and node will be used as synonyms.
Figure 5 – Betweenness centrality
Closeness centrality: calculated as the inverse of the sum of the distances from one source node to the different destination nodes [7]. This property is based on distance and represents how close or far an actor is to the other nodes [21]. A central node, for instance, can interact quickly with the other nodes and can be highly productive in information sharing with the overall group, as it has a fast communication path to the other nodes. In the star network presented in Figure 6, node 2 is adjacent to all the others. Therefore it has maximum closeness centrality – starting from node 2, any other node can be reached following just one link.
Figure 6 – Closeness centrality
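As an illustration of these three properties, the sketch below computes degree, betweenness and closeness centrality for a star network like the one in Figure 6, assuming Python and the networkx library (the papers cited here do not prescribe any particular toolkit, and the five-node size is assumed for the example).

import networkx as nx

# Star network in the spirit of Figure 6: node 2 is adjacent to all other nodes
g = nx.Graph()
g.add_edges_from([(2, 1), (2, 3), (2, 4), (2, 5)])

degree = nx.degree_centrality(g)             # fraction of nodes each actor is connected to
betweenness = nx.betweenness_centrality(g)   # how often an actor lies on shortest paths
closeness = nx.closeness_centrality(g)       # inverse of the average distance to all others

for node in sorted(g.nodes()):
    print(f"node {node}: degree={degree[node]:.2f} "
          f"betweenness={betweenness[node]:.2f} closeness={closeness[node]:.2f}")
# Node 2 maximizes all three measures, i.e. it is the hub of this network.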
3.2. Social network tools
The properties presented in the last section can be used as a basis for social network mining and visualization tools [2][14][20]. The SVNNAT tool provides evidence of collaboration awareness in software development through the analysis of data extracted from the Subversion (SVN) configuration management system [14]. Figure 7 shows an example of a social network exported from SVNNAT: developers are shown as nodes, and a link between two developers means that they modified the same code artifact. This network shows a technical aspect of the development, focusing on process and collaboration awareness. The OSSNetwork tool [2] extracts, from open source development communities, the interactions which occur among group members through the source code, mailing lists and forums. Figure 8 shows the exported social network and the properties for this network provided by the tool. In this social network, the nodes represent the developers and the links represent their communication to discuss or solve a bug. This network has information that can be used to provide social and collaboration awareness.
The MiSoN tool (Figure 9), which is part of the ProM framework [20], is used to mine social networks extracted from workflow event logs; the members are represented by nodes and the links by the activities performed between them. This network presents mainly process awareness information. The event log (Table 1) is the input data used to generate the social network. The tool allows the analysis of mined networks using the properties mentioned above. The visualization of a social network topology and the availability of its properties, as provided by these tools, are relevant information for understanding relationships in a work group setting. However, they are not enough to understand the level of collaboration therein. In this work, we argue that social network topology and properties can be associated with different levels of collaboration maturity. In order to evaluate that, a collaboration maturity framework – CollabMM – was used; it is detailed in the next section.
Figure 7 – SVNNAT tool [14]
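The kind of network SVNNAT builds can be approximated from any version-control history: developers become nodes, and an edge is added whenever two developers have changed the same artifact. The sketch below is a simplified, hypothetical illustration in Python with networkx; it is not the SVNNAT tool itself, and the commit records are invented for the example.

import itertools
from collections import defaultdict

import networkx as nx

# Hypothetical commit records: (developer, path of the modified artifact)
commits = [
    ("alice", "src/core/Parser.java"),
    ("bob",   "src/core/Parser.java"),
    ("bob",   "src/ui/Window.java"),
    ("carol", "src/ui/Window.java"),
    ("alice", "src/util/Log.java"),
]

# Group developers by the artifact they touched
developers_per_artifact = defaultdict(set)
for developer, artifact in commits:
    developers_per_artifact[artifact].add(developer)

# Link every pair of developers that modified the same artifact,
# accumulating a weight for repeated co-modifications
g = nx.Graph()
for developers in developers_per_artifact.values():
    for a, b in itertools.combinations(sorted(developers), 2):
        weight = g[a][b]["weight"] + 1 if g.has_edge(a, b) else 1
        g.add_edge(a, b, weight=weight)

print(list(g.edges(data=True)))
# e.g. [('alice', 'bob', {'weight': 1}), ('bob', 'carol', {'weight': 1})]

The resulting graph can then be analyzed with the centrality measures of Section 3.1.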
Figure 8 – OSSNetwork tool [2]: (a) social network mined, and (b) properties to analyze social networks
Figure 9 – MiSoN tool [20]
4. The CollabMM model
Magdaleno et al. [11] proposed a collaboration maturity model for business processes – CollabMM – that aims to organize a set of practices which can enhance collaboration in business processes. CollabMM describes an evolutionary path in which processes can progressively achieve higher capability on collaboration, organized in four maturity levels: Ad-hoc, Planned, Aware and Reflexive, as shown in Figure 10. Levels are a way of prioritizing practices for improving collaboration in a process, according to the collaboration support aspects (communication, coordination, group memory and awareness). A specific level comprises a group of related activities which can be executed together, aiming at improving process collaborative capability (Figure 10). The CollabMM collaboration levels can be summarized as follows:
Figure 10 – CollabMM model [11]
• Ad-hoc level: At this level, collaboration is not explicitly represented in a process. However, processes at this first level cannot be characterized by a total absence of collaboration. Collaboration may happen, but it is still dependent on individual initiative and skills, and its success depends on the relationship and/or affinity among people. The aspects of communication, coordination, group memory and awareness are present, but they occur in an ad-hoc manner. Figure 11 presents a metaphor of individual effort, where people do not act as a group.
Figure 11 – Metaphor for Level 1 – Ad-hoc [11]
• Planned level: At this level, business processes start to be modified with the aim of including basic collaboration activities. Coordination is a strong aspect at this level and is mostly centralized: groups need leadership and management in order to work well. Work groups – created to execute a project, process or a specific activity – are formally established. Figure 12 shows a metaphor for this level.
Figure 12 – Metaphor for Level 2 – Planned [11]
• Aware level: At this level, the process includes activities for monitoring and controlling how collaboration occurs. Centralized coordination is no longer highly relevant, since group members are aware of their tasks and responsibilities and are committed to them. Group members understand the process in which they are engaged, its main objectives, their roles and responsibilities, and how their activities relate to the activities of others in achieving these objectives. Additionally, processes at this level are characterized by decentralized coordination and shared knowledge, mainly through the artifacts produced by the group. Figure 13 shows a metaphor for this level.
Figure 13 – Metaphor for Level 3 – Aware [11]
• Reflexive level: At the reflexive level, processes are designed to provide self-understanding, identifying the relevance of the results which have been produced and sharing this knowledge inside the organization; this can be represented by the metaphor of collective, disseminated effort in Figure 14. Considering communication, processes must be formally concluded and their results communicated. Lessons learned can be captured; strengths and weaknesses are analyzed; successes and challenges are shared; ideas for future improvements are collected; and workgroup results are published and celebrated. Group members are aware of the way in which the group collaborates during process execution, while process tacit knowledge is shared through ideas, opinions and experiences, thereby enhancing group memory.
Figure 14 – Metaphor for Level 4 – Reflexive [11]
CollabMM has been used to assist organizations in introducing different levels of collaboration in their business process models [12]. It has also been discussed as a framework for assessing collaboration levels in a business process [11]. Our aim in this work is to use CollabMM as a guide, based on the properties of the social network produced in a development process, for identifying collaboration patterns or levels, as discussed in the next section.
5. Identifying collaboration patterns
The purpose of this section is to discuss our hypothesis on how collaboration patterns can be identified from social networks, reviewing and detailing previous ideas presented in [13]. The main idea is that social network properties can be associated with the characteristics of the different collaboration levels suggested by CollabMM. Before validating our hypothesis, we needed to perform exploratory studies to understand whether the information obtained through the analysis of social network properties could lead us to a preliminary classification of a network into collaboration levels. Further analysis would be necessary to discuss the deeper relationship among social network property values, collaboration patterns and the maturity of the development community or team. All social networks exemplified in this paper were mined, starting in September 2009, from SourceForge.net (http://sourceforge.net/). The software development projects selected had to meet the following criteria: more than 5 years of community activity and a significant number of downloads, characterizing their stability as development communities. In these mined social networks, developers' interactions can be perceived using information obtained from the online source code repository history: the nodes represent the developers, and relationships are established among the individuals who work on, and modify, the same part of the code. The SVNNAT tool [14] was used to mine and visualize these social networks, which represent the intrinsic collaboration.
Degree, betweenness and closeness centralities are social network properties which emphasize the coordination aspect and can support our hypothesis. The CollabMM levels treat coordination as a strong aspect of collaboration, and the collaboration patterns are organized according to these levels. Social networks classified at the planned level fit the collaboration pattern characterized by centralized coordination. In other words, in this type of network there is the presence of hubs, which predominate in the three centrality properties (degree, betweenness and closeness), giving evidence of the "winner takes all" pattern [3]. This pattern describes the idea that a single node becomes so strong that it may dominate the network. At the aware level, social networks show decentralized coordination. This type of network has more than one hub, so new central nodes appear according to degree, betweenness and closeness centralities. For example, in open source projects, the core development team shares the coordination of the project. At the reflexive level, coordination tends to be distributed and the figure of the central node disappears. The degree, betweenness and closeness centrality values are very close to each other, and the existence of hubs is not clear.
5.1. Ad-hoc level
At this level of collaboration, social networks may not show specific collaboration patterns. The relationships among the participants of such a network vary extensively, with instability and, possibly, a lack of patterns for analysis.
5.2. Planned level
As described in the CollabMM model, coordination is an important aspect of the planned level. Coordination at this level is characterized by strong leadership and management in order to guide all the work. The collaboration pattern of this level is characterized by the degree, betweenness and closeness centrality properties. The existence of a strong central node, or hub, may characterize the network at this level as a centralized social network [4]. The social network obtained for the devkitPro project (Revision 3846) is an example. Figure 15 shows the devkitPro social network, and Table 2 details the data analyzed for each of the nodes in this network. As shown by the properties in Table 2, wntrmute is a central node that stands out among the rest. This characterizes this developer as a key node for project and work coordination, like a project administrator.
Figure 15 – DevKitPro Social Network

Table 2 – DevKitPro project data
Actor        Degree    Betweenness   Closeness
wntrmute     828.00    60.00         100.00
shagkur      272.00    0.66          61.90
dovoto       259.00    2.50          61.90
tantricity   256.00    0.66          61.90
5.3. Aware level
At this level, the group members are aware of their tasks and responsibilities and can act more autonomously. The main characteristic of the aware level is therefore the existence of more than one hub: unlike the previous level, a small number of nodes stand out according to betweenness and closeness centrality. Coordination becomes decentralized, given the existence of more than one actor acting as a central node [4]. Figure 16 illustrates this collaboration pattern, and data from the WinMerge software project (Revision 7044) are detailed in Table 3. The nodes kimmov and gerundt are hubs according to degree, betweenness and closeness centrality. This illustrates the collaboration pattern that represents the aware level. If we compare this observation with the information provided by the development community, we find that kimmov is classified as the project administrator and gerundt as the web designer, roles which share the coordination of the development group.
5.4. Reflexive level
At the reflexive level, the main characteristic is knowledge exchange and self-understanding about the group work. The collaboration pattern that represents this level can be perceived by the absence of hubs: different nodes have very close degree, betweenness and closeness centrality values.
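A rough way to operationalize this hypothesis is to count how many actors stand out as hubs on the three centrality measures and map that count to a CollabMM level. The sketch below is only an illustration of the idea under assumed thresholds (the paper does not define a concrete algorithm, and the ad-hoc level is left out because it exhibits no stable pattern); it assumes Python and networkx.

import networkx as nx

def count_hubs(values, ratio=0.8):
    # An actor counts as a hub if its centrality is within `ratio` of the maximum.
    # The 0.8 threshold is an assumption for illustration, not taken from the paper.
    top = max(values.values())
    if top == 0:
        return 0
    return sum(1 for v in values.values() if v >= ratio * top)

def collaboration_level(graph):
    measures = [
        nx.degree_centrality(graph),
        nx.betweenness_centrality(graph),
        nx.closeness_centrality(graph),
    ]
    hubs = max(count_hubs(m) for m in measures)
    if hubs <= 1:
        return "Planned: single central node (centralized coordination)"
    if hubs < len(graph) / 2:
        return "Aware: few central nodes (decentralized coordination)"
    return "Reflexive: no clear hubs (distributed coordination)"

# Example: a star network (one node connected to all others) is classified as Planned
star = nx.star_graph(4)
print(collaboration_level(star))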
Figure 16 – WinMerge Social Network
Figure 17 – NHibernate Social Network

Table 3 – WinMerge project data
Actor     Degree    Betweenness   Closeness
kimmov    732.00    12.92         100.00
gerundt   567.00    12.92         100.00
puddle    547.00    1.75          88.88
elsapo    537.00    1.75          88.88
laoran    535.00    1.75          88.88

Table 4 – NHibernate project data
Actor            Degree    Betweenness   Closeness
fabiomaulo       687.00    4.86          100.00
justme84         644.00    4.86          100.00
ayenderahien     596.00    4.86          100.00
darioquintana    564.00    4.86          100.00
kevinwilliams    561.00    1.16          79.16
fabiomaulo       687.00    4.86          100.00

Figure 17 shows the NHibernate project social network (Revision 4898), which may represent a network at the reflexive level. Table 4 details the values of its node properties and shows that a reflexive-level social network has coordination distributed among the nodes [4]. In the NHibernate project, different developers and administrators have centrality values close to each other.
Table 5 summarizes our hypothesis of the relationship between social network properties and CollabMM maturity levels. Each of the social network properties emphasizes the coordination aspect. The various levels of the CollabMM model address the coordination aspect in different ways, which is also the focus of our proposal.
Table 5 – Collaboration patterns as CollabMM levels
Level       Degree Centrality     Betweenness Centrality   Closeness Centrality
Planned     Single central node   Single central node      Single central node
Aware       Few central nodes     Few central nodes        Few central nodes
Reflexive   No central nodes      No central nodes         No central nodes
6. Conclusion
In this paper, we discuss the potential of social network analysis in the identification of collaboration patterns in software development processes. Our aim is to contribute to research related to the understanding of collaboration in different development models – in-house/distributed, disciplined/agile/open source – arguing that the understanding of collaboration can be a way to promote balance between these different
approaches, as well as a tool for management purposes [10]. As future work, it will be necessary to evaluate and detail our hypothesis by conducting different analyses over different development process settings. Furthermore, the information about collaboration patterns derived from this analysis can be used as input for development process enactment or management tools, in order to help managers and participants to be aware of the collaboration in which they participate.
Acknowledgments
This work is partially funded by CNPq under process nº 142006/2008-4 and by FAPERJ (Carlos Chagas Filho Foundation for Research of the State of Rio de Janeiro) under process nº 101.252/2008.
References
[1] ARAUJO, R. M., BORGES, M. R. S., 2007, "The role of collaborative support to promote participation and commitment in software development teams". Software Process: Improvement and Practice, v. 12, n. 3, pp. 229-246.
[2] BALIEIRO, M. A., SOUSA JÚNIOR, S. F. DE, DE SOUZA, C. R. B., 2008, "Facilitating Social Network Studies of FLOSS using the OSSNetwork". In: Open Source Development, Communities and Quality: International Conference on Open Source Systems, Milan, Italy. Proceedings. NY: IFIP International Federation for Information Processing, a Springer Series in Computer Science, pp. 343-350.
[3] BARABASI, A. L., 2002, Linked: The New Science of Networks. Cambridge, MA, Perseus Publishing.
[4] BARAN, P., 1964, "On distributed communications networks", IEEE Transactions on Communications, v. 12, n. 1, March, pp. 1-9.
[5] BIRNHOLTZ, J. P., GUTWIN, C., RAMOS, G., WATSON, M., 2008, "OpenMessenger: gradual initiation of interaction for distributed workgroups". In: Proceedings of CHI'2008, pp. 1661-1664.
[6] DOURISH, P., BELLOTTI, V., 1992, "Awareness and coordination in shared workspaces". In: Proceedings of the 1992 ACM Conference on Computer Supported Cooperative Work, pp. 107-114, Toronto, Ontario, Canada, November.
[7] FREEMAN, L. C., 1979, "Centrality in Social Networks: I. Conceptual Clarification". Social Networks, 1, pp. 215-239.
[8] GUTWIN, C., PENNER, R., SCHNEIDER, K., 2004, "Group awareness in distributed software development". In: Proceedings of the ACM Conference on Computer Supported Cooperative Work (CSCW 2004), pp. 72-81.
[9] KRISHNAMURTHY, B., NARAYANASWAMY, K., 1994, "CSCW 94 Workshop to Explore Relationships between Research in Computer Supported Cooperative Work & Software Process – Workshop Report", Software Engineering Notes, v. 20, n. 2 (Apr.), pp. 34-35.
[10] MAGDALENO, A. M., WERNER, C. M. L., ARAUJO, R. M., 2010, "Balancing Collaboration and Discipline in Software Development Processes". In: International Conference on Software Engineering (ICSE) Doctoral Symposium, Cape Town (to appear).
[11] MAGDALENO, A. M., ARAUJO, R. M., BORGES, M. R. S., 2009, "A maturity model to promote collaboration in business processes". International Journal of Business Process Integration and Management, v. 4, pp. 111-123.
[12] MAGDALENO, A. M., CAPPELLI, C., BAIAO, F. A., SANTORO, F. M., ARAUJO, R. M., 2008, "Towards Collaboration Maturity in Business Processes: An Exploratory Study in Oil Production Processes". Information Systems Management, v. 25, pp. 302-318.
[13] SANTOS, T. A. L., ARAUJO, R. M., MAGDALENO, A. M., 2009, "Padrões para Percepção da Colaboração em Redes Sociais de Desenvolvimento de Software" (in Portuguese). In: Workshop de Desenvolvimento Distribuído de Software (WDDS), Fortaleza, Brazil.
[14] SCHWIND, M., WEGMANN, C., 2008, "SVNNAT: Measuring collaboration in software development networks". In: Proceedings of the 10th IEEE Joint Conference on E-Commerce Technology and the 5th Enterprise Computing, E-Commerce and E-Services (CEC 2008 and EEE 2008), pp. 97-104.
[15] SOHLENKAMP, M., 1998, "Supporting group awareness in multi-user environments through perceptualization". Paderborn: Fachbereich Mathematik-Informatik der Universität Gesamthochschule.
[16] STOREY, M.-A. D., CUBRANIC, D., GERMAN, D. M., 2005, "On the use of visualization to support awareness of human activities in software development: A survey and a framework". In: Proceedings of the 2nd ACM Symposium on Software Visualization, St. Louis, Missouri, USA.
[17] TEE, K., GREENBERG, S., GUTWIN, C., 2006, "Providing artifact awareness to a distributed group through screen sharing". In: Proceedings of the 20th Anniversary ACM Conference on Computer Supported Cooperative Work (CSCW 2006), pp. 99-108.
[18] TOLLMAR, K., SUNDBLAD, Y., 1995, "The Design and Building of the Graphic User Interface for the Collaborative Desktop". Computers & Graphics, 19(2), pp. 179-188.
[19] VAN DER AALST, W., REIJERS, H., WEIJTERS, A., DONGEN, B., MEDEIROS, A., SONG, M., VERBEEK, H., 2007, "Business Process Mining: An Industrial Application", Information Systems, v. 32, n. 5, pp. 713-732.
[20] VERBEEK, H. M. W., DONGEN, B. F. van, MENDLING, J., AALST, W. M. P. van der, 2006, "Interoperability in the ProM framework". In: Proceedings of the EMOI-INTEROP Workshop at the 18th International Conference on Advanced Information Systems Engineering (CAiSE'06), pp. 619-630, Namur University Press.
[21] WASSERMAN, S., FAUST, K., 1994, Social Network Analysis, Cambridge University Press, New York, NY.
Distributed Software Development with Captive Centers
Rafael Prikladnicki, Jorge Luis Nicolas Audy
Faculdade de Informática (FACIN) Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS) 90.619-900 – Porto Alegre – RS – Brasil
{rafaelp, audy}@pucrs.br
Abstract. In this paper, we describe a capability model that captures patterns of evolution in the practice of distributed software development in internal offshoring projects. In our research we seek to understand how the practices of organizations involved in the internal offshoring of software development evolve over time, from a software engineering perspective, and from the point of view of the subsidiaries (also known as captive centers). We propose a capability model that encompasses the evolution of software development activities within and among several subsidiaries owned by an organization. This model can be useful for companies beginning DSD operations with captive centers, which can benefit from the knowledge and practices that have been applied by other organizations in the past.
Keywords: Distributed software development, global software engineering, internal offshoring
(Received October 30, 2009 / Accepted January 22, 2010)
1. Introduction
The Software Engineering (SE) community has witnessed a significant change in the way software projects have been developed in recent years [1, 20]. Software project teams have become geographically distributed [2, 3], and the term Distributed Software Development (DSD) is frequently used in industry [21, 22]. However, the search for competitive advantage in DSD has forced organizations to seek external solutions in other countries, leading to what has been referred to as Global Software Development (GSD) or Global Software Engineering (GSE) in the literature, also known as offshore sourcing, or offshoring [4]. To embark on a DSD journey, a company usually defines a strategy based on a DSD business model. The two main models in a global development context are offshore outsourcing (contracting services from an external company) and internal offshoring (contracting with a wholly owned subsidiary – also known as a captive center). In our research, we seek to understand and identify patterns in the evolution of the practices of organizations involved in internal offshoring. This is particularly useful for companies starting DSD operations, which would benefit from knowledge of which practices were applied in other organizations. One example is the decision about the type of projects to be developed offshore. Usually, offshore software development centers are responsible for the coding activity when companies begin a DSD operation [4]. As time goes by, trust between the headquarters and
the offshore location is developed, the relationship improves, and offshore activities may become more complex. On the other hand, an inexperienced organization might consider that the entire project lifecycle can be developed offshore, a scenario that could lead to failures and significant problems instead of cost savings and other benefits. In our research we develop a model that describes patterns of evolution in the practice of distributed software development activities within and among several subsidiaries owned by an organization. In our model, this evolution is organized along three dimensions: capability levels, capability attributes, and the type of attributes (people, project, portfolio and subsidiary). To develop our model, an extensive literature review was conducted to develop an interview guide, our main data collection instrument. An exploratory case study was planned and executed, in which several people were interviewed in five companies in order to understand how software engineering activities are performed in different DSD settings, and particularly in the internal offshoring of software development. Using content analysis methods, attributes of evolution were identified and classified. In the subsequent phase, another (confirmatory) case study was carried out in order to evaluate the attributes and their evolution patterns. Based on the results of this case study, as well as a systematic review of studies about models of DSD evolution and the internal offshoring of software development, our model is presented in this paper. We first provide background information and related work in the area of DSD business models, and
in particular internal offshoring, in Section 2. Section 3 describes our research methodology. We describe the capability model in Section 4. In Section 5 we discuss practical implications and limitations, and we conclude the paper in Section 6.
2. Background and related work
Many companies are distributing their software projects both locally and globally, aiming at cost reduction, access to skilled resources, flexibility, and competitive advantages. For these reasons, DSD has attracted a large research effort in software engineering, and the factors that contributed to DSD have been well documented in the literature [3, 4, 5, 6, 7, 8, 9, 10]. According to Herbsleb & Moitra, DSD has diverse effects at many levels: strategic issues (the decision to develop a distributed project); cultural issues; technical issues (technological infrastructure and technical knowledge); and knowledge management issues [8]. When organizations explore DSD, it is important to correctly characterize the distribution. As described by Kumar & Willcocks, DSD options are classified based on the geographic location of the personnel and the relationship of the organizations involved [11]. Later, Robinson & Kalakota referred to these options as DSD business models [12], four of which are summarized in Prikladnicki et al. [13]. Two business models are considered in our research:
- Offshore Outsourcing: involves a relationship with an external company (outsourcing) for software development, and this company is not located in the client's country (offshore).
- Internal Offshoring: a company creates its own software development center (a subsidiary, also known as a captive center) to supply the internal software demand (insourcing). This subsidiary is located in a country other than the company's headquarters.
Our research is focused on the study of the internal offshoring business model.
2.1. Internal offshoring
Offshore outsourcing is a well-known practice adopted by companies to cut operational costs and gain competitive advantage [4, 25, 40]. But according to Ramamani [14], another common practice is setting up wholly owned subsidiaries for software development in low-cost countries (internal offshoring). This way, companies retain their operations "in-house". As an example, of the 900 companies that are members of NASSCOM (National Association of Software Companies) in India, more than 300 are wholly owned subsidiaries [14]. Ramamani says "the internal offshoring is a vertically integrated model where the operation is "in-house" and does not involve dependency on complex contracts with external agents. The basic rationale
behind having an offshore subsidiary is that of vertical integration – where it is desirable to own the residual rights rather than indulge in cumbersome contracts which increases sourcing complexities. However, by integrating the operations firms do not automatically derive the benefits of success. The nature of the capabilities that the subsidiary develops over time and how the subsidiary capabilities fit into the value-chain of the parent company governs the sourcing effectiveness" [14]. According to Herbsleb [15], the processes employed in offshore outsourcing might be different from those employed in internal offshoring, and the characterization in this case could make a difference for the practice of DSD. Moreover, research conducted on one type of distribution is not necessarily valid for all types of DSD. Research, however, has not sufficiently addressed this model in the literature, nor its possible impacts on SE activities [42]. Our research is therefore focused on the study of the internal offshoring of software development from an evolutionary perspective.
2.2. Patterns of evolution
Patterns of evolution, in our study, mean a set of standard steps (or stages) that were successfully followed in the past by individuals, project teams, or organizations, and that were documented and shared so they can be followed by other peers as a successful software development practice. Carmel [16] defines stage models as powerful frameworks for understanding a phenomenon, because they capture evolution and growth, and also reflect learning curves and diffusion. Carmel argues that such models are useful for both research and practice: practitioners can use stage models to understand where they are, where the competition is, and what they can do to evolve. On the other hand, researchers can not only identify and propose the patterns, but also use them to better understand the behaviors behind a given phenomenon. Such evolution patterns (or stages) can also be defined as maturity and capability levels in an evolution model. Chrissis et al. [17] define capability as the predictability of the process and its outcomes, or the range of expected results that can be achieved by following a process. Maturity is defined by the authors as the growth in process capability, a well-defined evolutionary path toward achieving a mature process, where each maturity level provides a layer in the foundation for continuous process improvement. Achieving each level of a maturity framework means an increase in process capability. But despite the utility of such models, they have always been an easy target for criticism, as stated by Carmel [16]. Some criticisms are that they are heuristically developed, usually not validated, incomplete, and assume a linear evolution through each
stage. While these criticisms are valid, the author also states that, in the end, the collective understanding of a phenomenon would be poorer if these patterns were not identified. In addition, the author argues that these models are more useful at early stages of the phenomenon; once the phenomenon is mature, the interest is not so evident. The use of evolution patterns or stage models is not new in Computer Science. They are also common in the Social Sciences, where Bruce Tuckman proposed a well-known model [18]. Tuckman developed a model to describe the stages (or sequences) of group development. In Computer Science, within the Information Systems domain, one of the first stage models was proposed by Richard Nolan, with the purpose of analyzing the evolution of managing the computer resource [19]. In software engineering, Nolan's work has been influential in the development of models such as the SW-CMM and CMMI [17], among many others. In the development of his work, Nolan also argues that stage theories have proved useful for developing knowledge in diverse fields during their formative periods, which is exactly the case of Distributed Software Development [19].
3. Research Methodology
To develop our model, we have followed a two-phase research methodology (Figure 1). The first phase included an ad hoc literature review and an initial case study [13, 35]. The second phase involved a systematic review of the DSD literature [38] and a second case study [35, 41].
Figure 1. Research design
In the first phase specifically, we conducted an informal literature review on DSD, DSD business models, and DSD evolution / maturity / capability models. Few studies were found [12, 23, 31], focusing
mainly on business strategies and decisions, none of them focusing on DSD evolution from a software engineering perspective, which reinforced the necessity of collecting empirical data through a case study. We then designed a case study [32] and executed it with five subsidiaries [13], with the overall goal of understanding the differences among several DSD business models. From this case study, a subset of data from three subsidiaries (those involved in the internal offshoring model) was selected for further analysis. The interviews in Canada were conducted in February 2006, and those in Brazil in April-May 2006. In this paper, we describe our analysis in relation to the development of the model of DSD evolution. Our main purpose was to identify a preliminary list of evolution attributes for the internal offshoring of software development. To do so, the interviews were planned to identify characteristics of the evolution of DSD practices in the organizations studied. The overall research question was: "What are the critical attributes, from an evolutionary point of view, in the internal offshoring model, when it is employed by way of captive centers?" As an example, some of the attributes we identified include cultural differences, risk management, configuration management, and software process improvement (also shown in the model proposed in Figure 3). Once a set of evolution attributes had been identified, the second phase of our research was planned, involving a systematic review of the DSD literature and another case study. The purpose of the systematic review was to gain a deeper understanding of the existing models of DSD evolution, in order to aid the development of our model. The initial literature review, conducted in 2004 and 2005, did not identify useful studies. The rapid development of the DSD area, however, led us to search for newer research useful in the development of the model. In addition, the concept of systematic review was introduced to the area of SE for the first time in 2004 [29]. For this reason, we decided to conduct this systematic review in DSD. A final step in our second phase, and in the overall methodology, was the deployment of a second case study, with the purpose of evaluating the attributes identified in the first phase, together with their evolution patterns. The case study included three subsidiaries: two in Brazil (one owned by an American organization and one owned by an organization from Portugal), and one in India (owned by a German organization). The purpose was the identification of the evolution of each attribute, based on the experience of each subsidiary. Respondents were selected based on their experience within the company. The instrument was a questionnaire organized by attributes and their evolution sequences. For each
attribute, the respondent was asked to order the sequences, justify the reasons for such an evolution, and comment on whether the evolution should be different in the future. As an example, we illustrate how the case study was planned using the attribute "configuration management" (CM) and its possible evolution steps:
- ( ) No CM infrastructure
- ( ) Local CM infrastructure
- ( ) Global CM infrastructure, but not integrated
- ( ) Global and integrated CM infrastructure
The respondents were asked to order the evolution steps reflecting the experience of their own subsidiary, as well as to report their experience within the subsidiary. In total, 41 people agreed to answer the questionnaire (14 from the subsidiary owned by the American organization, 10 from the one owned by the organization from Portugal, and 17 from the Indian one), and we received 31 valid answers. In this paper we describe the model whose dimensions and elements were developed through the analysis of this rich data.
4. The Capability Model - WAVE
The model (named WAVE because of the different waves of capabilities companies should develop) was proposed with the purpose of helping companies initiate their operations in internal offshoring. It explores three of the main differences between this model and offshore outsourcing [39]: the initial investment in training people, the long-term relationship between headquarters and subsidiary, and the integrated work among all subsidiaries [39]. For this reason, the model proposes three dimensions: capability levels, capability attributes, and the type of attributes.

Figure 2. Generic model structure

4.1. Model dimensions and data analysis
In this section we describe the first version of this model, developed from the data we collected, together with the data analysis techniques we employed. The model has three dimensions: type of attributes, capability attributes, and capability levels; Figure 2 shows the layering of its levels. To identify the elements of these three dimensions, we used data from the different sources in our research. Table 1 indicates the data source from which each dimension was identified: initial literature review (ILR), systematic literature review (SLR), case study 1 (CS1), or case study 2 (CS2). We describe these dimensions in detail next.
Type of attributes: based on the interviews and the literature review, it became clear that the evolution was based on different perspectives, roles, experiences, and responsibilities. For this reason, four types of attributes were considered: people, project, portfolio, and subsidiary. We identified that the evolution depends not only on people, but also on the context of the projects, of a set of projects, and even of the entire subsidiary under discussion. We also identified that in the early stages of the evolution the effort is more concentrated on the capability of attributes related to people and projects. This differs from the offshore outsourcing model, where the initial investment is more concentrated at the organizational level.
Table 1. Model structure and source
Dimension               Source
Type of attributes      ILR, CS1
Capability attributes   CS1, CS2
Capability levels       SLR, CS1, CS2
Capability levels: four levels were defined, inspired by the eSCM model [34] and by the results of the interviews, following the same content analysis strategy explained previously. Level 1 is the initial level of every subsidiary and is characterized by ad hoc development; there are no specific practices defined. At level 2, a company usually has one subsidiary, and basic capabilities need to be developed in order to better sustain not only individual capabilities to deal with DSD challenges, but also the development of projects. Generally, this involves local training programs, training for a specific group of project team members, and the improvement of software engineering and project management techniques on demand. Sometimes, due to strategic reasons, more than one subsidiary may be created at the same time but, based on what we found, this is not recommended from a technical perspective, since the organization will need to spend a lot of effort at the beginning synchronizing the work and defining the role of each subsidiary. Our proposal suggests the creation of one subsidiary as the first step, which is corroborated by recent experience reports [24, 33]. Once the organization experiences the challenges of working with a distributed
subsidiary, it can plan the creation of other independent subsidiaries, or of subsidiaries that are dependent on each other; this, however, is recommended for levels 3 and 4. At level 3, the basic capabilities are improved. Training executed for a specific demand might be incorporated into the subsidiary training program, or a specific process might be improved so it can be used by other
projects and portfolios. Finally, at level 4 there is a constant motivation for the development of subsidiary performance at an organizational level, where organizational standards are put in place. Table 2 presents the relationship between the types of attributes and the capability levels.
Table 2. Relationship between capability levels and type of attributes

People:
1 – ad hoc: Isolated improvement of people's capability based on their own initiative
2 – basic capabilities: Local improvement of people's capability
3 – improvement: Sporadic integration among people's capabilities
4 – integration: Global integration and improvement of people's capability

Projects:
1 – ad hoc: Ad hoc development of projects
2 – basic capabilities: Development of basic capabilities within projects
3 – improvement: Sporadic global development of projects
4 – integration: Global and integrated development of projects

Portfolio:
1 – ad hoc: Informal project portfolio management within the subsidiaries
2 – basic capabilities: Local project portfolio management
3 – improvement: Integrated project portfolio management initiated
4 – integration: Global project portfolio management established

Subsidiary:
1 – ad hoc: Lack of standards in the operation of the subsidiaries
2 – basic capabilities: Development of basic capabilities within the subsidiaries
3 – improvement: Informal integration of the operations of the subsidiaries
4 – integration: Global and formal integration of all subsidiaries
Capability attributes: a list of attributes was identified. To identify the attributes, we analyzed the interviews conducted in case study 1. Content analysis was employed, with category identification, until we reached category stability [36, 37]. First, a careful reading of the data was performed, to familiarize the researcher with the data collected before starting the coding. After that, we coded all interviews, following a two-step approach. The first step was the classification of the subsets of all interviews where the respondents referred to evolution. Then, a new text was generated, and a new coding started (second step). Categories were then identified, representing specific attributes. For data analysis, we used Microsoft Excel and Atlas. Table 3 lists the attributes, together with the type of each attribute. In total, 25 attributes were selected to be part of the model. The attributes "levels of dispersion" and "project activities (dependency)" were not used directly in the proposed model, but rather to identify the evolution of subsidiary interdependency. The revised model, showing the identified attributes, their types and the capability levels, is illustrated in Figure 3. It has 73 practices (35 in level 2, 25 in level 3, and 13 in level 4); 29 practices are related to people, 28 to projects, 10 to the portfolio, and 6 to the subsidiary.
Table 3. Capability attributes
#    Attribute                          Type
1    Cultural differences               People
2    Trust acquisition                  People
3    Awareness of activities            People
4    Awareness of process               People
5    Awareness of availability          People
6    Knowledge management               People
7    Levels of dispersion               People
8    Learning                           People
9    Training                           People
10   Perceived distance                 People
11   Requirements engineering           Project
12   Communication tools                Project
13   Collaboration tools                Project
14   Infrastructure                     Project
15   Project mgmt activities            Project
16   Sw development life cycle          Project
17   Risk management                    Project
18   Effort estimation                  Project
19   Configuration management           Project
20   Project activities (dependency)    Project
21   Type of projects                   Portfolio
22   Project allocation                 Portfolio
23   PMO                                Portfolio
24   Sw process improvement             Subsidiary
25   Policies and Standards             Subsidiary
4.2. Evolution of attributes
In the model, the detailed evolution of each attribute is expressed as practices that follow the same pattern found in existing models, such as the CMMI-SW. These practices assure a gradual and coherent evolution of a company and its subsidiaries. Due to space constraints, we cannot present all the practices in detail. For this reason, we have selected two capability attributes as examples: "training" and "collaboration tools". The notation used to document the details of each attribute is as follows:
Attribute (<type>): <name> (<abbreviation>)
Purpose: <purpose>
(<level>) Practice <n>: <description>
In order to define the level at which a given practice is located, we used the data from case study 2. This way, practices found in the least experienced company formed the practices proposed for level 2, while practices identified in the most experienced company formed the practices proposed for level 4.
Figure 3. The proposed model
Attribute (people): Training (Trai)
Purpose: To understand the training needs within the subsidiaries, looking for improvements in the training offered to support distributed activities.
(2) Practice 1: Training is technical and non-technical, on demand
(2) Practice 2: There is a program within each subsidiary that provides technical and non-technical training
(3) Practice 3: There is a global program that supports technical and non-technical training needs
For this attribute the purpose is to improve the training policy for distributed teams. In this case, companies have to invest in technical and non-technical training at level 2, evolving towards a global training program (when more than one subsidiary is created).

Attribute (projects): Collaboration tools (Colb)
Purpose: To leverage the usage of existing collaboration tools, seeking improvements in the activities performed by distributed teams
(2) Practice 1: There are collaboration tools on demand, or communication tools are used to collaborate
(3) Practice 2: There are standard tools to support collaboration within each subsidiary
(4) Practice 3: There are standard tools at a global level to support collaboration among distributed team members

For this attribute the purpose is to leverage the usage of collaboration tools in the context of distributed projects developed within the subsidiaries. In this case, we have identified that at the beginning of an internal offshoring operation companies do not have a formal plan to use such tools, or use them on demand, which characterizes the first practice proposed. The natural evolution of this attribute is the definition of standard collaboration tools within each subsidiary (level 3) and, after that, the usage of the same tools on a global scale (level 4).
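To show how the attribute/practice notation above can be operationalized, the sketch below encodes the two example attributes as a small data structure and derives, for a subsidiary, the capability level reached for each attribute (the highest level whose practices, and all lower-level practices, are satisfied). This is only an illustrative encoding in Python under assumed, simplified practice descriptions; the WAVE model itself does not prescribe any tooling.

# Each attribute maps capability levels to the practices required at that level,
# paraphrasing the two example attributes ("Training" and "Collaboration tools").
WAVE_ATTRIBUTES = {
    "Training": {
        2: ["On-demand technical and non-technical training",
            "Training program within each subsidiary"],
        3: ["Global program for technical and non-technical training"],
    },
    "Collaboration tools": {
        2: ["Collaboration tools used on demand"],
        3: ["Standard collaboration tools within each subsidiary"],
        4: ["Standard collaboration tools at a global level"],
    },
}

def attribute_level(attribute, satisfied_practices):
    """Highest level for which all practices up to that level are satisfied."""
    reached = 1  # level 1 (ad hoc) requires no specific practices
    for level in sorted(WAVE_ATTRIBUTES[attribute]):
        required = WAVE_ATTRIBUTES[attribute][level]
        if all(practice in satisfied_practices for practice in required):
            reached = level
        else:
            break
    return reached

# Hypothetical assessment of one subsidiary
satisfied = {
    "On-demand technical and non-technical training",
    "Training program within each subsidiary",
    "Collaboration tools used on demand",
}
for attribute in WAVE_ATTRIBUTES:
    print(attribute, "-> level", attribute_level(attribute, satisfied))
# Training -> level 2, Collaboration tools -> level 2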
It is important to note that the proposed model, its capability attributes and its practices are based on a limited sample of companies and subsidiaries. It is a description of how certain practices evolve over
time. For this reason, improvements and updates are expected with its usage.
5. Discussion and main contributions
The model proposed in this paper has the purpose of helping subsidiaries of organizations starting global operations under the internal offshoring business model. The main contribution of this model is a set of evolution steps to guide one or more subsidiaries in several tasks related to their daily activities, from a software engineering perspective. The main motivation for the development of this capability model was the lack of studies on the internal offshoring of software development from a technical perspective, and the difficulties existing companies face in dealing with the natural evolution of software development activity in this environment. This is the first model in this context, and it makes important contributions to GSE theory and practice.
5.1. Practical implications
It is important to say that this model has no intention of evaluating an organization's maturity in global development. The main contribution is to help
organizations and their subsidiaries in understanding the several steps and the evolution in such an environment, involving decisions, processes, standards, training, ways of working, and other aspects that more experienced organizations have faced in the past. In addition, there is no predefined goal of reaching the third capability level, or the third phase of subsidiary interdependency. As a capability model, it predicts the range of expected results that can be achieved by following certain processes, but it does not assume that a subsidiary must follow all the processes defined. As an example, one organization might have one subsidiary and want to invest only in people development in the beginning, with no intention of investing in more subsidiaries in the near future. In this case, the model will provide specific guidelines that have worked in the past with other organizations, focusing on the scope defined, as illustrated in Figure 4. As the organization's strategy evolves, the model can guide the subsidiary on the specific software engineering practices it will face, applied to the distributed scenario, and combined with other well-known models in industry, such as the CMMI-SW.
Figure 4. Using the model
5.2. Limitations and future work
This study has some limitations, as does any research project. Regarding the case study conducted in the first phase, the limitations are those typical of qualitative studies, in particular the generalization of the results beyond the companies studied. This was minimized with the planning and execution of a case study in the second phase, in which the attributes identified were evaluated by different respondents.
Regarding the case study in the second phase, one limitation is the small number of organizations (three) and respondents (forty-one). The main difficulty is that research on DSD is not easy, in particular for this study, where empirical data was needed and it is usually hard to find companies that are willing to participate. Regarding the systematic review conducted, there are limitations related to the number of digital libraries searched, the quality of the search engines, and the bias of the researchers in the classification of the papers found in this process. The bias of the researchers in this
_______________________________________________________________________________________________________ INFOCOMP – Special Edition, p. 39–48, fev. 2010
case can also be considered a general limitation, since it was identified in the case study as well. Regarding threats to external validity, some of our results might be based on company culture rather than on software development practice. To minimize this threat, the second phase was intended to evaluate the findings reported in the first one. Future steps are being planned in order to evaluate the model in industry. Focus groups and additional case studies are planned for the coming months, in order to improve the model structure and evaluate the practices defined for each attribute. Future reports will encompass this evaluation and the feedback from practitioners.

6. Conclusions
DSD has been a growing field within the SE domain. Many companies are distributing their software development facilities, looking for competitive advantages in terms of cost, quality, and qualified professionals [12]. According to Carmel & Tija, the DSD phenomenon started in the early 90's, but only during the last ten years has it been recognized as a powerful competitive strategy [4]. Whether development is local (onshoring) or global (offshoring), and whether it occurs within the same company (insourcing) or through a third-party relationship (outsourcing), organizations are facing several important challenges from a SE perspective [27]. The DSD research and practice community will benefit from an understanding of how these practices have evolved over time, and of whether there are patterns that can be learned by organizations starting development in DSD. The model proposed in this research may help companies starting offshore software development operations with wholly owned subsidiaries to better understand some of the steps that can be followed in order to increase the chance of succeeding in this environment.

Acknowledgements
This study was developed in collaboration between the MuNDDoS research group in Distributed Software Development, at PUCRS, Brazil, and the SEGAL Lab, at UVIC, Canada. The study was also partially supported by the Research Group on Distributed Software Development of the PDTI program, financed by Dell Computers of Brazil Ltd. (Law 8.248/91), and partially supported by CAPES (Brazilian Ministry of Education), financed by the CAPES PhD Internship Program, process number 426006-6.

References
[1] Sangwan, R., Bass, M., Mullick, N., Paulish, D. J., Kazmeier, J., "Global Software Development Handbook", Boca Raton, NY, Auerbach Publications, 2007.
[2] Aspray, W., Mayadas, F., Vardi, M. Y., Editors, "Globalization and Offshoring of Software," A Report of the ACM Job Migration Task Force, Association for Computing Machinery, 2006.
[3] Boehm, B., "A View of 20th and 21st Century Software Engineering", In the 28th International Conference on Software Engineering, Shanghai, China, pp. 12-29, 2006.
[4] Carmel, E., Tija, P., "Offshoring Information Technology: Sourcing and Outsourcing to a Global Workforce", UK: Cambridge, 2005.
[5] Damian, D., Moitra, D., "Guest Editors' Introduction: Global Software Development: How Far Have We Come?", IEEE Software, 23(5), pp. 17-19, 2006.
[6] Prikladnicki, R., Evaristo, R., Audy, J. L. N., Yamaguti, M. H., "Risk Management in Distributed IT Projects: Integrating Strategic, Tactical, and Operational Levels", International Journal of e-Collaboration, special issue on Collaborative Project Management, Idea Group Inc., 2(4), pp. 1-18, 2006.
[7] Prikladnicki, R., Audy, J. L. N., Evaristo, J. R., "Global Software Development in Practice: Lessons Learned", Journal of Software Process: Improvement and Practice, Wiley, 8(4), pp. 267-281, 2003.
[8] Herbsleb, J. D., Moitra, D., "Guest Editors' Introduction: Global Software Development", IEEE Software, 18(2), pp. 16-20, 2001.
[9] Carmel, E., "Global Software Teams – Collaborating Across Borders and Time Zones", Prentice Hall, 1999.
[10] Karolak, D. W., "Global Software Development – Managing Virtual Teams and Environments", Los Alamitos: IEEE Computer Society, 1998.
[11] Kumar, K., Willcocks, L., "Offshore Outsourcing: A Country Too Far?," European Conference on Information Systems, pp. 1309-1325, Lisbon, Portugal, 1996.
[12] Robinson, M., Kalakota, R., "Offshore Outsourcing: Business Models, ROI and Best Practices", Mivar Press, 2004.
[13] Prikladnicki, R., Audy, J. L. N., Damian, D., Oliveira, T. C., "Distributed Software Development: Practices and Challenges in Different Business Strategies of Offshoring and Onshoring", In the IEEE 2nd International Conference on Global Software Engineering, Munich, Germany, pp. 262-271, 2007.
[14] Ramamani, M., "Offshore Subsidiary Engagement Effectiveness: The Role of Subsidiary Capabilities and Parent–Subsidiary Interdependence," Conference of Midwest United States Association for IS, pp. 75-80, 2006.
[15] Herbsleb, J. D., "Global Software Engineering: The Future of Socio-technical Coordination," 29th International Conference on Software Engineering, pp. 188-198, Minneapolis, USA, 2007.
[16] Carmel, E., "The Offshoring Stage Model: An Epilogue," Available online at auapps.american.edu/~carmel/papers/epilogue.pdf, April 2005, accessed November 2007.
[17] Chrissis, M. B., Konrad, M., Shrum, S., "CMMI: Guidelines for Process Integration and Product Improvement", 2nd Edition, SEI Series on SE, US: Addison-Wesley, 2006.
[18] Tuckman, B., "Developmental Sequence in Small Groups," Psychological Bulletin, 23, pp. 384-399, 1965.
[19] Nolan, R., "Managing the Computer Resource: A Stage Hypothesis," Communications of the ACM, 16(7), pp. 399-405, 1973.
[20] Damian, D., Zowghi, D., "The Impact of Stakeholders' Geographical Distribution on Requirements Engineering in a Multi-site Development Organization," 10th IEEE International Conference on Requirements Engineering (RE'02), Essen, Germany, pp. 319-328, 2002.
[21] Ramasubbu, N., Balan, R. K., "Globally Distributed Software Development Project Performance: An Empirical Analysis," ACM SIGSOFT Symposium on the Foundations of Software Engineering, pp. 125-134, 2007.
[22] Sengupta, B., Chandra, S., Sinha, V., "A Research Agenda for Distributed Software Development", In the 28th International Conference on Software Engineering, Shanghai, China, pp. 731-740, 2006.
[23] Carmel, E., Agarwal, R., "The Maturation of Offshore Outsourcing of Information Technology Work", MIS Quarterly Executive, 1(2), pp. 65-77, 2002.
[24] Höfner, G., Mani, V. S., "TAPER: A Generic Framework for Establishing an Offshore Development Center," International Conference on Global Software Engineering, pp. 162-172, Munich, Germany, 2007.
[25] Ebert, C., "Optimizing Supplier Management in Global Software Engineering," International Conference on Global Software Engineering, pp. 177-185, Munich, Germany, 2007.
[26] Mirani, R., "Client-vendor Relationships in Offshore Applications Development: An Evolutionary Framework," Information Resources Management Journal, 19(4), pp. 71-86, 2006.
[27] Meyer, B., "The Unspoken Revolution in Software Engineering," IEEE Computer, 39(1), pp. 124, 121-123, 2006.
[28] Ramasubbu, N., Krishnan, M. S., Kompalli, P., "Leveraging Global Resources: A Process Maturity Framework for Managing Distributed Development", IEEE Software, 22(3), pp. 80-86, 2005.
[29] Kitchenham, B., "Procedures for Performing Systematic Reviews," Joint Technical Report SE0401 and NICTA Technical Report 0400011T.1, Software Engineering Group, Department of Computer Science, Keele University, 2004.
[30] Prikladnicki, R., Damian, D., Audy, J. L. N., "A Systematic Review on Patterns of Distributed Software Development Evolution", Technical Report, University of Victoria, Canada, 2008.
[31] Borland, "Putting Your Own House in Order Before Offshoring", White Paper, 2004.
[32] Evaristo, R., Audy, J. L. N., Prikladnicki, R., Avritchir, J., "Wholly Owned Offshore Subsidiaries for IT Development: A Program of Research", In the 38th Hawaii International Conference on System Sciences (HICSS'05), IEEE Computer Society Press, Hawaii, USA, 2005.
[33] Szymanski, C. H., Prikladnicki, R., "The Evolution of the Internal Offshore Software Development Model at Dell Inc.", In the IEEE 2nd International Conference on Global Software Engineering, Munich, Germany, pp. 40-47, 2007.
[34] Hyder, E. B., Heston, K. M., Paulk, M. C., "The eSourcing Capability Model for Service Providers (eSCM-SP) v2.01 – Model Overview", Technical Report CMU-ITSQC-06-006, Carnegie Mellon University, Pittsburgh, Available at http://itsqc.cs.cmu.edu/, 2006.
[35] Yin, R. K., "Case Study Research: Design and Methods", 3rd Edition, USA: Sage Publications, 2003.
[36] Oates, B. J., "Researching Information Systems and Computing", London: Sage Publications, 2006.
[37] Krippendorf, K., "Content Analysis: An Introduction to Its Methodology," Sage, 2004.
[38] Prikladnicki, R., Damian, D., Audy, J. L. N., "Patterns of Evolution in the Practice of Distributed Software Development: Quantitative Results from a Systematic Review", In Evaluation and Assessment in Software Engineering (EASE), Bari, Italy, 2008.
[39] Prikladnicki, R., Audy, J. L. N., "Comparing Offshore Outsourcing and the Internal Offshoring of Software Development: A Qualitative Study," Proc. of the Americas Conference on Information Systems, San Francisco, 2009.
[40] Smite, D., Wohlin, C., Feldt, R., Gorschek, T., "Reporting Empirical Research in Global Software Engineering: A Classification Scheme," International Conference on Global Software Engineering, Bangalore, 2008.
[41] Prikladnicki, R., Damian, D., Audy, J. L. N., "Patterns of Evolution in the Practice of Distributed Software Development in Wholly Owned Subsidiaries: A Preliminary Capability Model", In the IEEE 3rd International Conference on Global Software Engineering, Bangalore, India, 2008.
[42] Madlberger, M., Roztocki, N., "Cross-Organizational and Cross-Border IS/IT Collaboration: A Literature Review," Proc. of the Americas Conference on Information Systems, 2008.
Collaborative and Distributed Software Process Improvement (SPI): Strategies and Infrastructure Viviane Malheiros1,3 Carolyn Seaman2 José Carlos Maldonado1 1 Instituto de Ciências Matemáticas e de Computação – USP Caixa Postal 668 – 13560-970 – São Carlos – SP – Brazil 2
Department of Information Systems – University of Maryland, Baltimore County 1000 Hilltop Circle, Baltimore, MD, USA, 21250 3
Serpro – Serviço Federal de Processamento de Dados Av. Luiz Vianna Filho, 2355, Paralela, Salvador/BA, Brazil 49010-000 {viviane, jcmaldon}@icmc.usp.br,
[email protected]
Abstract. Software Process Improvement (SPI) is a difficult challenge for software development organizations. Scenarios of geographically distributed software development reinforce SPI challenges. This paper describes ColabSPI, a distributed and collaborative strategy and infrastructure to support SPI teams and developers in handling different phases of a typical SPI lifecycle related to process evolution and institutionalization. The ColabSPI environment architecture is presented together with some results and ongoing efforts. Preliminary analysis and field observations indicate that some software development practices may be successfully applied to SPI. In addition, ColabSPI is being considered as a feasible strategy to evolve not only a process but also an open source maturity model.
Keywords: Software Process Improvement
(Received October 30, 2009 / Accepted January 22, 2010)

1. Introduction
Software Process Improvement (SPI) has become an important challenge to software development organizations. Different advances have been made in the deployment of SPI standards and models, for instance CMMI (Capability Maturity Model Integration) [24], SPICE (Software Process Improvement and Capability dEtermination) [10], the IDEAL model [16], MPS.Br (the Brazilian software process improvement model) [26], and the Experience Factory [2]. However, the current problem with SPI is not the lack of a standard or model, but rather the lack of an effective strategy to successfully implement these standards and models [18]. Much attention has been paid to which SPI activities to implement instead of how to implement these activities efficiently. Understanding how to implement SPI successfully is a hard task. From the SPI literature and from field observations, we have been identifying possible factors contributing to diminished SPI process performance and compliance regarding quality and time. So far, we have found that many influences on the success of SPI programs are related to coordination, collaboration, and communication, and mostly to the degree of developers' motivation and participation in SPI initiatives. Such
influences are briefly mentioned in Section 2 and will be detailed elsewhere. Scenarios of geographically distributed software development (distributed software teams) strongly reinforce the need to deal with such influences:
• Processes for distributed software development (DSD) are more complex and challenging, as they are supposed to deal with communication and coordination issues. According to [22], the effect of dispersion can be significantly mitigated through the use of structured software engineering processes, making the development process a critical success factor for DSD [21] and making SPI extremely important in the DSD context. However, the continuous improvement (SPI) of complex processes is more complicated;
• As developers' participation in SPI initiatives is a key success factor, it is important to provide ways for geographically distributed developers to contribute to process improvement.
As DSD becomes more common, the relevance of a distributed and collaborative SPI approach increases. Bearing this in mind, we propose a collaborative and distributed SPI approach aimed at: (a) enhancing the communication and
collaboration among SPI stakeholders; (b) increasing developers' participation in improving the software development process; and (c) allowing coordination of SPI initiatives. Our hypothesis is that SPI programs can benefit from a distributed and collaborative strategy and an infrastructure that may create a knowledge base about a software development process and its improvements, and may allow SPI stakeholders to communicate and organize their work. Our focus is on large organizations that deal with DSD and aim to apply a standard process across distributed development units. By providing structured support, ColabSPI may foster the emergence and progress of a cooperative environment for SPI. It can address major influences on SPI success or failure, such as knowledge exchange and support; staff involvement and motivation; and communication and collaboration. ColabSPI aims to promote geographically distributed SPI initiatives, supporting communication and collaboration, the handling of process improvement proposals, technical support for using the process, as well as process documentation. It contributes to two major aspects of SPI: process evolution and compliance. Important influences on our proposal are [15]: the concepts of (1) DSD, as in the Open Source development paradigm, and (2) Knowledge Management (KM) practices; and software infrastructure tools such as (3) Wikipedia and (4) issue tracking tools.
This paper details the SPI strategy and environment architecture, complementing the introduction to ColabSPI presented in [15]. It also brings initial results and discussions on how software development practices may contribute to SPI. Furthermore, the paper considers applying ColabSPI not only to process evolution, but also to evolving a maturity model. The remainder of the paper is structured as follows: a consolidation of the main SPI success factors and the rationale used to derive key requirements for an SPI infrastructure are presented in Section 2. The ColabSPI strategy and the related environment architecture are presented in Section 3; Sections 4 and 5 respectively present initial analysis and ongoing efforts on observing and applying software development practices to SPI; related works are discussed in Section 6. Finally, our conclusions are presented in Section 7.

2. Main SPI success factors and SPI infrastructure requirements
To define infrastructure requirements, we took key factors as our starting point. We have been collecting factors from primary and secondary studies, comparing them with our own observations in the field, and grouping them according to their nature and relationship. We have conducted a systematic review of reported SPI factors/motivators/barriers and we have
observed an SPI program in practice. More than a hundred factors, motivators, or barriers were reported in different empirical studies. Examples of recurrent factors are: (i) the need for staff motivation and involvement; (ii) the benefits of feedback, support for discussions, and the clear establishment of goals; and (iii) the availability of resources. In addition, the field observations also led to the identification of SPI factors. After a thorough analysis, the impact factors were categorized into five groups: (i) Collaboration and Communication; (ii) Organizational Aspects; (iii) Compliance; (iv) Continuous Improvement; and (v) Staff Motivation and Participation. This grouping has given us a clearer idea of how to convert some of the identified factors into positive influences on SPI. In addition, in previous work, an organizational structure that would improve developers' participation in SPI efforts was mapped [13] based on experimental experiences, a first step towards distributed SPI. Both the literature review and the field observations informed the definition of the infrastructure requirements. Figure 1 presents the basic rationale used to unfold each of the five factor groups, associating them with general functional requirements, a possible approach, and expected results.
Figure 1: Identifying ColabSPI requirements
For each factor group (e.g., Staff Motivation and Participation), the major SPI needs were defined (e.g., to promote developers' participation in SPI initiatives). Such necessities are strongly related to the SPI factors identified both in the literature review and in the field observations. To deal with these necessities, different approaches from software development were selected (e.g., providing an issue tracker for handling process improvement proposals (PIPs)), based on the assumption that software processes are software too [20]. We have also declared the expected result for each factor group (e.g., developers participate in SPI initiatives by submitting suggestions in the PIP tracker). Following this rationale, the major requirements for ColabSPI were established (see Table 1).
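As a minimal illustration of this rationale, the sketch below encodes one factor-group chain as plain Python data; the field names and the derive_requirement helper are our own illustrative assumptions, not part of ColabSPI, and the values simply restate the example given above.

```python
# Illustrative sketch only (not part of ColabSPI): one rationale chain from Figure 1,
# linking an SPI factor group to an infrastructure requirement.
rationale = {
    "factor_group": "Staff Motivation and Participation",
    "spi_need": "Promote developers' participation in SPI initiatives",
    "approach": "Provide an issue tracker for process improvement proposals (PIPs)",
    "expected_result": "Developers participate in SPI by submitting suggestions in the PIP tracker",
}

def derive_requirement(chain):
    """Turn one rationale chain into a requirement statement, Table 1 style (hypothetical helper)."""
    return f"{chain['approach']}, so that: {chain['expected_result']}"

print(derive_requirement(rationale))
```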
3. ColabSPI strategy and the related environment architecture
In order to orchestrate a solution that, if adopted, would contribute to addressing the SPI necessities, following the proposed approaches and achieving the expected results, the ColabSPI conception was organized into four major
elements (Figure 2): (i) principles; (ii) process; (iii) SEPG (Software Engineering Process Group) organization; and (iv) SPI environment architecture (to be implemented as an SPI infrastructure).
SPI Principles
The principles that nurture the evolution of a process supported by ColabSPI are:
• Process users (mainly developers) will participate in process evolution;
• Process definition and evolution will be iterative;
• Process discussions, minutes, deliberations, plans for new features, and other artifacts are public and easily accessible by developers; and
• Process evolution decisions will consider both local (software project) and organizational necessities.

Table 1 – Major requirements for ColabSPI
1. Communication mechanisms that can enable cooperation, such as discussion forums and mailing lists, in addition to the possibility of communicating events, news, or SPI needs, publishing information on a message board, or informing a particular community of interest;
2. Access to SPI information through a unique starting point;
3. Process under version control, with the possibility of changing the process by more than one person from more than one place;
4. Collaborative SPI strategy, with a focus on empowerment and guidelines on how to contribute. Collaborative mechanisms may enhance the availability of resources;
5. PIP handling process (workflow, roles, and functionalities), with the possibility of tracking each proposal's status until its conclusion;
6. Collaborative support request handling;
7. Transparent backlog of PIPs and support requests, allowing anyone to contribute to improvement analysis and to clarify questions; and
8. User spaces filtering information regarding user actions, such as a list of the PIPs submitted by the user.

SEPG Organization
ColabSPI is intended to support and facilitate all stakeholders' involvement (the "SEPG Organization" element in Figure 2 deals with this aspect of SPI). Typical roles for distributed and collaborative SPI were defined based on:
1. Observations and experience in SPI at a large organization [13]. In this organization, developers are distributed across more than 10 sites and are expected to contribute to SPI. SEPG members, engineering specialist groups, and developers are also distributed. A balance between local and corporate necessities regarding the process is constantly required; and
2. Typical Open Source development roles. For instance, in our proposal, software process specialist groups hold rights equivalent to committers' rights. The parallel with open source organization was interesting due to its distributed and flexible development nature.
Considering these two sources, five potential roles (or stakeholders) were identified (Figure 3): SEPG, specialist groups, contributors, users (general developers), and SPI sponsors. Details on each role are available in [28].
Figure 2: ColabSPI strategy elements
Each ColabSPI element has been specified, and aspects of its applicability in practice are being tested. Together, these elements represent the ColabSPI strategy and contribute to most of the CMMI goals from the Organizational Process Focus and Organizational Process Definition process areas. An introductory view of the four elements is presented in this section.
Figure 3: Main roles performing activities in ColabSPI strategy
As developers contribute to SPI, they may be promoted from general developer up to SEPG member, becoming responsible for core decisions on SPI (meritocracy).
SPI Process
ColabSPI supports the major activities of the SPI lifecycle. Guidelines for deploying SPI programs (for example PDCA or IDEAL) usually suggest an iterative SPI cycle, based on a gradual and incremental evolution of the process. They also refer to identifying, developing, and evaluating improvement opportunities for the next cycles of the process. The ColabSPI process is compatible with those guidelines and allows (i) identifying and evaluating process improvement proposals (PIPs); (ii) developing such PIPs according to their priorities; and (iii) creating and deploying new versions of the process. It also allows SPI management (planning, monitoring, and controlling). Handling SPI as a project is a common recommendation in models and guidelines, and ColabSPI implements this recommendation. Each activity is performed by a role defined in the SEPG organization and supported by a mechanism implemented in the ColabSPI infrastructure. Such infrastructure is presented in Section 3.1. "SEPG Organization", "SPI process", and "Environment architecture" are all aligned with the generic principles.
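To make the cycle above concrete, the following sketch outlines one SPI iteration over a PIP backlog; the function name and data shapes are hypothetical and only illustrate the identify/evaluate/develop/deploy flow, not the actual ColabSPI implementation.

```python
# Hypothetical sketch of one ColabSPI improvement cycle (names and data shapes are ours, not the tool's).
def spi_iteration(pip_backlog, process_version):
    """Evaluate PIPs, develop the accepted ones by priority, and release a new process version."""
    accepted = [pip for pip in pip_backlog if pip["accepted"]]            # identify and evaluate PIPs
    for pip in sorted(accepted, key=lambda p: p["priority"], reverse=True):
        pip["status"] = "developed"                                       # develop PIPs by priority
    return {"version": process_version + 1,                               # create/deploy new process version
            "changes": [pip["id"] for pip in accepted]}

backlog = [
    {"id": "PIP-1", "accepted": True,  "priority": 2, "status": "evaluated"},
    {"id": "PIP-2", "accepted": False, "priority": 5, "status": "rejected"},
]
print(spi_iteration(backlog, process_version=3))  # {'version': 4, 'changes': ['PIP-1']}
```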
3.1. An environment architecture for collaborative/distributed SPI
To support a collaborative and distributed SPI strategy, an SPI environment architecture was proposed. Different software development practices and supporting tools were considered to compose ColabSPI:
• Mechanisms to support process maintenance in a controlled manner, such as an open content management system;
• Issue tracking tools adapted to deal with process improvement proposals; and
• Communication, coordination, and collaboration mechanisms to maintain and evolve the process.
Bringing all the SPI factors and current software development practices together led us to explore collaborative development environments (CDEs) as a good starting point for a distributed and collaborative SPI approach, particularly to deal with the SPI factors related to communication, coordination, and collaboration. A CDE is a virtual space wherein all the stakeholders of a project – even if distributed by time or distance – may negotiate, brainstorm, discuss, share knowledge, and generally work together to carry out some task, most often to create an executable deliverable and its supporting artifacts [3]. It provides integrated access to different mechanisms and tools, creating a virtual project space focused on the particular goal of a team. If we consider SPI as this goal, we can imagine a virtual project space for the software development process, wherein Software Engineering Process Group
(SEPG) members and developers can work together to carry out SPI. Likewise, they can negotiate, brainstorm, discuss, and share knowledge about improvements toward a better software development process. Here we focus on software developers, SEPGs, and specialist groups in their tasks of proposing, analyzing, and discussing process improvements, and of implementing and deploying them, when they are physically separated and use the Internet as the medium for their interactions. From a functional point of view, the mechanisms provided by our approach are classified into: (i) collaboration and communication; (ii) PIP handling; and (iii) process documentation and maintenance.
The collaboration and communication module should augment SPI initiatives by fostering the emergence and progress of a cooperative environment for SPI. This module may support requirements "1", "2", "4", and "6" (Table 1). According to [3], there is a spectrum of collaborative mechanisms that may be applied to a Web community, each with its own value. For our purpose, providing an infrastructure for SPI, we refer to some of these mechanisms in particular and add some mechanisms to Booch and Brown's list [3]. Thus, "mailing lists" will be applied for small groups with a common purpose, conversations that wax and wane over time, communities that are just getting started, and newsletters and announcements. "Message boards" will be useful for asking and answering questions, encouraging in-depth conversations, and providing context, history, and a sense of place. "News" will be useful for spreading novelties and communicating events, for instance announcing the publication of new software process releases or announcing that process version validations are taking place.
The PIP handling module is designed to help the SEPG keep track of reported PIPs. It may be regarded as a sort of issue tracking system and was inspired both by issue tracking tools, such as Mantis, and by Serpro's tool for PIP handling, GM-PSDS [13]. This module will tackle all functionalities related to posting, diagnosing, developing, and concluding a PIP, thus supporting requirements "5" and "7" (Table 1). Though PIPs will mainly be handled by one PIP tracker and its implemented workflow, we also intend to allow improvements through Wiki page editing tools on a limited basis. Specialist groups could discuss assets and evolve their content through the Wiki. Also, specific components of the process, such as a guideline, could be evolved directly through a Wiki interface. However, placing the whole software development process into a Wiki could expose the process to the weaknesses of Wikis, such as the lack of a navigational principle, duplication of information, an inconsistent target audience [5], and the lack of meta-data.
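As an illustration of the PIP lifecycle this module supports (posting, diagnosing, developing, and concluding a proposal), the sketch below models it as a small state machine; the state names and transitions are our reading of the text and of typical issue trackers, not the actual Mantis or GM-PSDS workflow.

```python
# Hypothetical PIP workflow sketch; states and transitions are illustrative assumptions.
PIP_TRANSITIONS = {
    "posted":          {"under diagnosis"},
    "under diagnosis": {"accepted", "rejected"},
    "accepted":        {"in development"},
    "in development":  {"concluded"},
}

class PIP:
    def __init__(self, pip_id, summary):
        self.pip_id, self.summary = pip_id, summary
        self.state = "posted"
        self.history = [("posted", None)]          # transparent backlog, visible to any developer (req. 7)

    def move_to(self, new_state, author):
        if new_state not in PIP_TRANSITIONS.get(self.state, set()):
            raise ValueError(f"cannot move from '{self.state}' to '{new_state}'")
        self.state = new_state
        self.history.append((new_state, author))   # track proposal status until conclusion (req. 5)

pip = PIP("PIP-42", "Clarify the code review checklist")
pip.move_to("under diagnosis", author="sepg_member")
pip.move_to("accepted", author="specialist_group")
print(pip.state, pip.history)
```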
Regarding process support (requirement "6", Table 1), any developer may answer requests, and the workflow states for support requests are limited to "submitted", "need information", and "answered". The process documentation and maintenance module supports process documentation, offering mechanisms for process authoring and version control (requirement "3", Table 1). It is related to representing (writing) a software development process in an explicit manner. The process format may vary, though our focus is on making the process available on intranets and on clearly separating process format from process content. The process will be under configuration management control, whatever format is adopted (just as software would be). The infrastructure must allow access to the configuration management system.
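A minimal sketch of the content/format separation described above: process content is kept as structured, versionable data, and a separate step renders it for the intranet. The data layout and the rendering function are illustrative assumptions, not the Atabaque or EPF Composer formats.

```python
# Illustrative only: process content (what is versioned) kept apart from its HTML presentation.
process_content = {
    "name": "Software Development Process",
    "version": "2.1",   # the process stays under configuration management, just like software
    "activities": [
        {"id": "REQ-01", "title": "Elicit requirements", "role": "Analyst"},
        {"id": "TST-03", "title": "Plan system tests",   "role": "Tester"},
    ],
}

def render_html(content):
    """Render the same content to HTML for the intranet; the format can change without touching the content."""
    items = "".join(f"<li>{a['id']}: {a['title']} ({a['role']})</li>" for a in content["activities"])
    return f"<h1>{content['name']} v{content['version']}</h1><ul>{items}</ul>"

print(render_html(process_content))
```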
4. ColabSPI in practice
ColabSPI (both the strategy and the infrastructure) is being incrementally forged from initial hypotheses based on both the literature review and field observations. Our approach is independent of technology and focused on mechanisms rather than on specific tools. Even so, in order to experiment with our ideas in practice and evaluate their real value, we have decided to develop and adapt tools to SPI goals and to customize one of the available open source CDEs. Such implementations are progressively being observed in practice to identify their benefits to SPI and to direct our research efforts. In addition, the usage of the collaborative and distributed SPI infrastructure is being explored to evolve not only software processes, but also an open source maturity model, particularly in the Qualipso competence centers (www.qualipso.org).
Figure 4 – The virtual project space of a software development process
4.1. ColabSPI infrastructure: making the strategy feasible
Figure 4 shows a mockup of the collaborative and distributed SPI infrastructure. The screenshot highlights the virtual space of the ColabProcess "project". ColabProcess is an SPI project, and all information about it and its functionalities can be accessed through one unique URL (Figure 4, id. 1). Accessibility through the Internet can enhance flexibility, distribution, and the easy connection of new tools. Once in the virtual project space, all communication and collaboration mechanisms are available to maintain and evolve the software development process: forums, latest news, reports, information about all contributors, and historical data about the process. From the virtual project space it is
also possible to access the mechanisms of the other modules. For instance, it is possible to access files from the software development process description itself or the PIP handling module. On the main page, it is possible to see the latest news (Figure 4, id. 2) about the software development process. For instance, an SEPG can use this feature to announce when the next conference about SPI will take place or when a new major release of the process will be published. The software development process is under configuration management control (just as software would be) and can be accessed through a CVS (Concurrent Versions System) link (Figure 4, id. 3). CVS keeps track of changes in process files and allows several stakeholders to work independently. As an optimistic concurrency control mechanism, CVS may be less supportive in conflict detection. Even so, it can aid collaboration and communication, as stakeholders can evolve a given version of the process together. In addition, stakeholders can be widely separated in space and/or time. We will monitor process evolution to observe how frequent conflicts are, as process specialists tend to contribute to specific process issues. Also, we expect wiki support for prior PIP discussion to reduce such conflicts. From the main page it is also possible to access all trackers related to the improvement of the software development process. For instance, it is possible to access all previous PIPs, their status, and who is handling each PIP (Figure 4, id. 4).
To author the process, different editing tools can be used. Two examples are Atabaque (a free software tool for process authoring, available at http://sourceforge.net/projects/atabaque/ (Malheiros et al. 2008b)) and EPF Composer (http://www.eclipse.org/epf/downloads/tool/tool_downloads.php). Both are open source tools that support process documentation, allowing all managed content to be published to HTML and deployed to Web servers for distributed usage. They both support the segregation of process content from format, and they both aim at providing a knowledge base of intellectual capital that can be browsed, managed, and deployed. They are designed for process engineers and project managers to author, tailor, and publish methods and processes for development organizations and projects. The main PIP handling workflow presented in [13] was evolved to, among other improvements, allow every developer to see every PIP, even those under evaluation, and to register contributions to a PIP at any stage. The new workflow was implemented as a Mantis customization. This customization was deployed recently, and the new workflow results will be compared to the old workflow results after 3 months of usage. The definition of a PIP handling customization with Mantis was first described and developed in [12].
4.2. Initial results and on-going efforts
For analyzing ColabSPI, we created an SPI project in a GForge tool instantiated at a large software development organization (the selection of GForge is commented on in [15]; details are available in [24]). In this instantiation, ColabSPI contains one general mailing list for the SEPG, one mailing list for all developers, and one mailing list for each software engineering discipline covered in the software development process (e.g.:
[email protected]). All mailing list discussions are recorded and available for later search in the process project environment. The SEPG coordinator is the environment administrator. A forum was created to clarify doubts, with different entries per discipline. A specific forum was created for "New ideas for the process". In order to exercise the feedback mechanism, the SEPG posted a question related to process compliance on the message board: "Does your project recalculate software size when being closed?" Atabaque and EPF Composer [8] were tested for process documentation because they are open source, multi-platform tools that generate the process in a web-based format and can be integrated with a configuration management system for software process change control. Atabaque models the process according to the RUP, while EPF Composer is based on SPEM – the Software Process Engineering Meta-Model [19]. Lessons learned from piloting Atabaque and from case studies were published in [14]. After that, the tool was applied to structure the whole project management process. The security process team has now started using Atabaque to document the security process, demonstrating that the tool can be applied to processes other than software development processes. The security process was previously available in document file format, and the migration to Atabaque is also allowing the security process group to rethink the process. The usage of EPF Composer is currently being tested in the context of the development process. The testing of the new tool was motivated by the aim of restructuring the software development process considering SPEM modeling. So far, both tools have presented satisfactory results in generating new versions of the process and implementing the authoring mechanism. Atabaque is considered easier to use, as the implementation of SPEM components adds some complexity to EPF Composer. On the other hand, EPF Composer seems to offer more flexibility for collaborative work, due to its plug-in elements. In all cases, process evolution was under configuration management and CVS was applied. CVS keeps track of changes in process files (e.g., XML or XMI files) and allows several stakeholders (e.g., developers, testers, SEPG members) to collaborate.
In all pilots (both with Atabaque and with EPF Composer), only SEPG members and specialist group members participated. Regarding process modeling, SPEM is being preferred, particularly due to its configuration elements and to the clear separation of method content and deliverable processes. The prototype of the infrastructure was analyzed by some members of the SEPG. The forge customization and the rationale for developing it were presented, and three users participated simulating SPI activities. It was approved for use in a pilot with all specialist group members. Piloting with specialist groups before making the solution available to all developers is compatible with SPI guidelines and is a possible planned step of the "Develop SPI iteration". In parallel, evaluation criteria are being detailed according to the GQM strategy [1] to measure the approach's value quantitatively. The main factors that influenced our approach (briefly mentioned in Section 2) are the basis for defining the evaluation goals. We foresee three more major experimentation/evaluation opportunities for our approach in practice, where the measures will be collected and analyzed: (i) in a commercial context, applying modules of our infrastructure to improve a current software development process; (ii) in the Demoiselle Process definition and evolution (www.frameworkdemoiselle.gov.br); and (iii) in the Qualipso project (http://www.qualipso.org). In this context, CDEs are being exploited as a means of managing Open Source Factory knowledge. Recently we have been considering that the infrastructure may be useful for documenting and evolving its Open Source process too, following the approach presented here. The Qualipso project aims to promote trust in open source development, and it is particularly interested in trustworthiness elements related to the development process.

5. Related Works
Distributed software development is not a trivial task, and it is becoming common in both national and international organizations. Different solutions have been proposed to deal with its complexity. For instance, some reports were found related to the use of CDEs inside large organizations and to the understanding of networks of communities around the development of software systems. However, their focus is on promoting an environment for developing the software itself, not on supporting the SPI endeavor. To the best of our knowledge, most DSD studies focus on developers and their activities, not on SPI professionals or SPI activities. Even so, the following experiences on using CDEs or fostering software development communities in large
organizations were considered and adapted to the SPI context: [23], [7], [4], [17], and [9]. Vanzin et al. [27] present practices to define a global software process for a distributed environment in a case study. Their work was useful for our approach as it identifies factors, related to distributed development characteristics, that may impact process definition. Our approach, in addition, is more strongly related to SPI key success factors. We could not find similar proposals of collaborative and distributed SPI strategies for large organizations. An initial idea of applying a hypermedia tool to the monitoring level of a software process management model for distributed groups was introduced by [11]. Existing SPI tools typically support assessors in collecting data during assessments and provide reporting capabilities to aggregate the collected results. ColabSPI goes in a different, complementary direction and proposes a web-based project workspace to support SPI teams in handling the different phases of a typical SPI lifecycle. Some works focus on the content of the distributed development process itself; for instance, APSEE-Global [6], in the context of a Process-Based Software Engineering Environment, extends the environment to distributed software development characteristics. Our approach focuses on mechanisms to improve software process definition, evolution, and support.

6. Conclusions
This paper described ColabSPI, an SPI strategy and environment architecture that may support geographically distributed SPI initiatives. It is compatible with the major SPI models and guidelines and supports many phases of the SPI lifecycle. The ColabSPI strategy and infrastructure prototype were presented together with some preliminary results and on-going efforts. The ColabSPI requirements were defined to handle major SPI critical success factors. Some field observations were conducted to analyze ColabSPI's contribution to mitigating some SPI barriers. Currently, the evaluation of ColabSPI in three different contexts is being pursued. Experimental studies are being conducted to validate different mechanisms. For more precise results, the GQM paradigm is being considered. Further information on the strategy and on each element will be available in the first author's PhD thesis, which is under construction [28]. The discussion of distributed and collaborative SPI in the Qualipso project context may open new directions for ColabSPI, for instance the maintenance and administration of a maturity model and not only of the process. An OMM (Open Maturity Model) evolution is under construction, and one major part of the project is defining the OMM evolution process. To define this process, concepts of ColabSPI were applied. In Qualipso, CDEs are being exploited as a means of managing Open Source Factory knowledge. Recently we
have been considering that the infrastructure can be useful for documenting and evolving the Open Source process too. As the ColabSPI requirements were defined based on SPI impact factors identified through a systematic review of the literature, the collaborative and distributed strategy is expected to be useful to different organizations dealing with SPI, particularly large organizations with more than one software development unit.

Acknowledgements
The authors would like to thank Serpro, the CAPES/PDEE program (BEX2979/07-1), FAPESP, CNPq, and Qualipso for partially supporting this work. We are also grateful to Serpro for allowing the SPI observations in the field.

References
[1] Basili, V. R. Software Modeling and Measurement: The Goal/Question/Metric Paradigm. Technical Report UMIACS-TR-92-96, University of Maryland, 1992.
[2] Basili, V., Caldiera, G., Rombach, H. D. Experience Factory. In: Encyclopedia of Software Engineering, v. 1, pages 469-476, 1994.
[3] Booch, G., Brown, A. W. Collaborative Development Environments. Rational Software Corporation, 2002.
[4] Dinkelacker, J., Garg, P., Miller, R., Nelson, D. Progressive Open Source. Tech Report, HP, 2001.
[5] Fogel, K. Producing Open Source Software. O'Reilly Media, CA, USA, 2006.
[6] Freitas, A. V. APSEE-Global: um Modelo de Gerência de Processos Distribuídos de Software. Master Thesis, UFRGS, 2005 (In Portuguese).
[7] Gurbani, V., Garvert, A., Herbsleb, J. A Case Study of a Corporate Open Source Development Model. ICSE'06, May 20-28, 2006, Shanghai, China.
[8] Haumer, P. Eclipse Framework Composer. Available at: http://www.eclipse.org/epf/general/EPFComposerOverviewPart1.pdf, April, 2007.
[9] Hupfer, S., Ross, S., Patterson, J. Introducing Collaboration into an Application Development Environment. Proc. of the ACM Conference on Computer Supported Cooperative Work, 2004.
[10] ISO/IEC 15504, Information Technology – Software Process Assessment, 2003.
[11] Maindantchick, C., Rocha, A., Xexeo, G. Software Process Standardization for Distributed Working Groups. In: Proc. 4th IEEE Int. Symp. and Forum on Soft. Eng. Standards, 1999.
[12] Malheiros, V., Cunha, L., Rehem, S. Mantis-PMP: uma ferramenta livre para gestão de mudanças em processos. ConSerpro, Brazil, 2007 (In Portuguese).
[13] Malheiros, V., Paim, F., Mendonça, M. Continuous Process Improvement at a Large Software Organization. Software Process: Improvement and Practice, Wiley InterScience, 2008a.
[14] Malheiros, V., Rehem, S., Maldonado, J. C. Atabaque: uma contribuição de sucesso na evolução de processos. SBQS, 2008b (In Portuguese).
[15] Malheiros, V., Seaman, C., Maldonado, J. C. An Approach for Collaborative and Distributed Software Process Improvement (SPI). Workshop de Desenvolvimento Distribuído de Software (WDDS), 2009.
[16] McFeeley, R. IDEAL: A User's Guide for Software Process Improvement. CMU/SEI-96-HB-001, ADA305472. Pittsburgh, PA: SEI, Carnegie Mellon University, 1996.
[17] Melian, C., Ammirati, C., Garg, P., Sevón, G. Building Networks of Software Communities in a Large Corporation. Tech Report, HP, 2002.
[18] Niazi, M., Wilson, D., Zowghi, D. A Framework for Assisting the Design of Effective Software Process Improvement Implementation Strategies. Journal of Systems and Software, 78(2), Nov. 2005.
[19] OMG (Object Management Group). Software Process Engineering Meta-Model, version 2.0. Available at http://www.omg.org/technology/documents/formal/spem.htm, 2008. Last access: October, 2009.
[20] Osterweil, L. Software processes are software too (revised). In Proc. of ICSE, 1997.
[21] Prikladnicki, R., Audy, J., Evaristo, R. Global Software Development in Practice: Lessons Learned. Software Process: Improvement and Practice, USA, v. 8, n. 4, p. 267-281, 2004.
[22] Ramasubbu, N., Balan, R. Globally Distributed Software Development Project Performance: An Empirical Analysis. ESEC-FSE'07, Croatia, 2007.
[23] Riehle, D. et al. Open Collaboration within Corporations Using Software Forges. IEEE Software, v. 26, n. 2, 2009.
[24] Rehem, S. Relatório de avaliação de ferramentas livres para Ambiente de Desenvolvimento Colaborativo. Tech Report, SERPRO, 2008 (In Portuguese).
[25] SEI - Software Engineering Institute. Capability Maturity Model Integration (CMMI SM), Version 1.2. Technical Report CMU/SEI-2002-TR-011, SEI, 2002.
[26] Softex. MPS.Br - Guia de Avaliação. 2006 (In Portuguese).
[27] Vanzin, M., Ribeiro, M., Prikladnicki, R., Ceccato, I., Antunes, D. Global Software Processes Definition in a Distributed Environment. In Proc. 29th Annual IEEE/NASA Software Engineering Workshop. IEEE Computer Society, 2007.
[28] Malheiros, V. Uma contribuição para a melhoria colaborativa e distribuída de processos de software. PhD thesis, University of Sao Paulo.
Applying Grounded Theory in Qualitative Analysis of an Observational Study in Software Engineering – An Experience Report Tayana Conte1, Reinaldo Cabral2, Guilherme Horta Travassos2 1
Departamento de Ciência da Computação Universidade Federal do Amazonas (UFAM) Av. Gen. Rodrigo Octávio 3000 – Campus Universitário – 69077-000 – Manaus – AM – Brasil 2 Systems and Computing Engineering Department – COPPE/UFRJ PO Box 68511 – 21945-970 – Rio de Janeiro – RJ – Brasil
[email protected], {cabral, ght}@cos.ufrj.br
Abstract. Research methods used in Software Engineering have increasingly moved closer to those employed in the Social Sciences, which involve research with human subjects in projects. Several knowledge areas have witnessed the development of qualitative research methods, which are widely used to deal with the complexity of human behavior. However, despite their importance, these methods are not commonly used in Software Engineering research. This paper presents the application of Grounded Theory, a qualitative analysis method, in an observational study to evolve a usability inspection technique, aiming to contribute towards the understanding of the use of qualitative approaches in the Software Engineering field.
Keywords: Qualitative Analysis, Grounded Theory, Observational Study, Usability Evaluation, Reading Technique, Software Inspection, Experimental Software Engineering
(Received October 30, 2009 / Accepted January 22, 2010)

1. Introduction
One should note that, in Software Engineering research, both the software development process and the use of the technology created are social-technical processes and, as a result, require "a look that seeks to grasp Software Engineering without fragmenting it into 'technical factors or aspects' on one side and 'nontechnical factors or aspects' on the other" ([8]). According to [8], 'the social-technical look seeks the intent to describe, that is, the desire to describe, in detail; to identify; to locate; to specify; to produce differences'. Suitable research methods are needed to capture this social-technical look and to describe research aspects and issues instead of merely simplifying them and producing similarities. In this sense, qualitative methods are the most suitable, as they allow a wider understanding of the entire phenomenon under observation ([14]). [14] presents several research methods for the collection and analysis of qualitative data, describing how they can be used in experimental studies in Software Engineering. Since then, it has been possible to see the growing use of qualitative methods in research on different Software Engineering topics. Some examples are: [9] scrutinizes a qualitative study on how developers use APIs, presenting important points on collaborative development; [18] presents a qualitative study to understand the practice of software testing; and [13] discusses the results of qualitative analyses aimed at
identifying the critical success factors in initiatives to improve the software process. However, in spite of the examples mentioned, qualitative methods are still not commonly used in Software Engineering research. In several studies labelled as 'qualitative', researchers transformed the qualitative data collected into numbers, accounting for 'numbers of X-type replies' in comparison with 'numbers of Y-type replies'. Further discussion is needed on how qualitative analysis methods can be used in Software Engineering research. This paper presents an application example of a qualitative research method, Grounded Theory, in an observational study in the domain of Software Engineering. The study at hand was carried out to understand the process of applying a usability inspection technique for Web applications, the WDP (Web Design Perspectives-based Usability Evaluation, CONTE et al. 2007). In this study, Grounded Theory was used as a tool to assist the execution of the qualitative analysis. This paper is an experience report aimed at stimulating discussion between peers and contributing to the dissemination of the method within the community of Software Engineering researchers. The remainder of the article is organized as follows: Section 2 presents a brief theoretical reference on the Grounded
Theory qualitative research method. Section 3 describes the execution of the observational study as an example of applying Grounded Theory procedures, besides explaining the execution of the qualitative analysis and the results obtained. Finally, Section 4 presents the conclusions and the lessons learned from this experience.

2. The Grounded Theory Method
Grounded Theory is a qualitative research method that uses a set of systematic data collection and analysis procedures to generate, elaborate, and validate substantive theories about essentially social phenomena, or about broad social processes ([2]). Its authors, Glaser and Strauss, state that there are two basic types of theories: formal and substantive ([4]). The first type comprises conceptual and wide-ranging theories, while the second is specific to a given group or situation and does not aim at generalizing beyond its substantive area. According to [3], a theory is an integrated set of propositions that explains the variation in the occurrence of a social phenomenon related to the behavior of a group or to the interaction between groups. The essence of the Grounded Theory Method is that the substantive theory emerges from the data, that is, a theory derived from systematically collected and analyzed data. Although the purpose of the Grounded Theory Method is the construction of substantive theories, its use does not need to remain restricted to researchers who have this research goal. According to [17], the researcher may use only some of its procedures to meet one's research goals.
The Grounded Theory Method was initially introduced by GLASER and STRAUSS ([11]). Its creators diverged on some points and the method split into two branches. One of them, postulated by [10], emphasizes the emergent characteristic of the method and the inductive processes whose development was pioneered by the Department of Sociology at Columbia University in the 1950s and 1960s. The other line was developed by [16] and consolidated in [17], and aimed at systematizing the method for data collection and analysis. According to the line proposed by Strauss, Grounded Theory is based on the idea of coding, which is the process of analyzing the data. Concepts (or codes) and categories are identified during the coding stage; a concept (or code) names a phenomenon that is of interest to the researcher; it abstracts an event, an object, or an action or interaction that has a meaning to the researcher ([17]). Categories are clusters of concepts joined at a higher degree of abstraction. The coding process can be split into three stages: open, axial, and selective coding. Open coding involves the breakdown, analysis, comparison, conceptualization, and categorization of the data. According to [3], in the early stages of open coding the researcher explores
the data through a detailed examination of what he or she deems relevant based on an intensive reading of the texts. In the open coding stage, incidents or events are grouped into codes via incident-to-incident comparison; that is, all incidents or events are classified and grouped according to their meaning. The codes generated can be classified as first-order codes, directly associated with citations (also referred to as live codes), and abstract or theoretical codes, associated with other codes without necessarily being connected to any citation. Also during open coding, categories are created to cluster the codes and reduce the number of units the researcher has to work with. After the identification of conceptual categories through open coding, axial coding examines the relations between the categories, which form the propositions of the substantive theory ([3]). Causes and effects, new conditions, and action strategies are detailed in propositions that should then be tested again against the data. The relations between codes (connectors, according to [10]) can be defined by the researcher. In the line proposed by [17], these relations form what the authors call a paradigm: causal conditions, new conditions, consequences, and action strategies/interactions. Table 1, adapted from [1], presents a suggestion of connectors based on the line proposed by [17].

Table 1 – Code Connectors, adapted from [1]
Symbol  Label            Description of relation
isa     Is a             The source code is a type, or form, of the target code. It is defined by a pattern of dimensional variation along the properties of the category (target code).
*}      Is property of   The source code is a property of the category (target code).
=>      Is cause of      The source code (causal condition) causes the occurrence of the target code.
[]      Is part of       The source code is a part that, along with other parts, forms the target code.
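To make the coding vocabulary above concrete, the sketch below shows one way quotations, codes, categories, and the Table 1 connectors could be represented if the analysis were supported by a small tool. It is only an illustration under assumed names and structures; it is neither part of the study nor a reproduction of the tool the authors actually used.

# Minimal sketch (illustrative only) of the coding elements used in open and
# axial coding: quotations ground codes, codes are clustered into categories,
# and typed connectors from Table 1 relate codes to one another.
from dataclasses import dataclass, field
from typing import List

CONNECTORS = {"isa": "is a", "*}": "is property of",
              "=>": "is cause of", "[]": "is part of"}

@dataclass
class Code:
    name: str
    quotations: List[str] = field(default_factory=list)  # citations grounding the code

@dataclass
class Relation:
    source: Code
    connector: str   # one of the symbols in CONNECTORS
    target: Code

@dataclass
class Category:
    name: str
    codes: List[Code] = field(default_factory=list)  # open coding groups codes here

# Open coding: a live code tied directly to a quotation, and an abstract code.
live = Code("Difficulty reported by inspector", quotations=["questionnaire #1, excerpt 3"])
abstract = Code("Ease of Application")  # theoretical code, no quotation attached

# Axial coding: relate the codes through a typed connector.
relations = [Relation(live, "isa", abstract)]
category = Category("Ease of Application", codes=[live, abstract])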
Finally, selective coding refines the entire process by identifying the core category of the theory, to which all the others are related. The core category should be capable of integrating all the other categories and of expressing the essence of the social process that takes place among those involved. This core category can be an existing category, or a new category can be created.

3. Data Analysis in an Observational Study

The need for sound Web usability techniques motivated our research goal, which is to define a usability inspection technique tailored to the specific features of Web applications ([6]). We propose the use of Web Design Perspectives (Conceptual, Presentation, and Navigation) as a guide to interpret Nielsen's
heuristics. We call this derived technique Web Design Perspectives-based Usability Evaluation (WDP). Hints were provided for each related Heuristic x Perspective (HxP) pair to guide the interpretation of each heuristic from a perspective's viewpoint. To support the development and validation of the WDP technique, we adopted the evidence-based methodology presented in [12] and [15], which relies on experimental studies as a means to determine what does and does not work in the application of the proposed technology. The application of this methodology to define and improve the WDP technique is described in [6] and [19]. One of the stages of this methodology proposes the execution of an observational study aimed at understanding how inspectors apply the WDP technique during a usability evaluation of a Web application. According to [15], observational studies are useful to obtain a detailed understanding of how a new process is applied, and observation techniques can be used to understand work practices ([14]). Quantitative measures were collected in this observational study to gauge the results regarding the efficacy of the technique (number of defects detected and time spent by each inspector). However, these quantitative measures do not provide sufficient elements to understand the behaviour of individuals during their interaction with the technology at hand. For this reason, we decided to collect qualitative data on the manner in which the inspectors apply the technique and to use the Grounded Theory Method to analyze these data. In this context, the Grounded Theory Method supports the construction of knowledge on how the WDP technique can be applied. After defining the context of the study in the following subsection, we describe the execution of the qualitative analysis through the application of Grounded Theory and, at the end of the section, present the results of the analysis.

3.1. Study Execution Context

The goal of this observational study is described in Table 2.

Table 2 – Goal of the observational study according to the GQM paradigm
To analyze: the application of the WDP technique
With the intent of: understanding
In relation to: the order of application of the WDP technique
From the point-of-view: of the Web application inspectors
In the context of: a usability evaluation of a real Web application by undergraduate students with knowledge of usability evaluation
Subjects were fourteen undergraduate students from UFRJ's Computational and Information Engineering School, enrolled in the Man-Machine Interaction course. These students had been trained in usability evaluation during the course, using different methods in exercises. All were informed about the execution of the observational study (without knowing what would be surveyed) and signed a consent form. The study started in the first half of 2007, using the third version of the WDP technique (WDP v3). All the participants carried out the evaluation using the technique.

Two types of data were collected in this study: observational data and inquiry data. Two techniques were used jointly to collect the observational data: (1) the 'observer-executor' technique, in which the participants were arranged in pairs with two roles, the 'executor' (who carries out the inspection) and the 'observer' (who carefully observes how the executor carries out the inspection and takes notes on the execution); and (2) cooperative evaluation, a variation of the 'Think Aloud' technique. Cooperative evaluation was the interaction protocol used by each observer-executor pair: the executor described (or 'thought aloud') what he or she was doing during defect detection, and the observer could ask questions or request explanations about the executor's decisions or actions. The inspection of the application was split into two parts (I and II) so that all participants could act in both roles: the participants who acted as observers in Part I became executors in Part II and vice-versa. The inquiry data were collected after the inspection, using feedback questionnaires with open questions.

The participants were split into two groups, A and B, with seven participants each. Group A consisted of the seven students with the best marks in the classroom exercises and Group B of the seven remaining students. Every observer-executor pair had one student from each group. The participants already had knowledge of usability evaluation and received training on: (i) the JEMS2 system, presenting the application, the scope of the inspection, and the functions relevant to the Reviewer role (the focus of this usability inspection); (ii) the WDP technique v3, with examples of issues detected through the use of Heuristic x Perspective pairs; and (iii) the observation techniques used in the study, observer-executor and cooperative evaluation. After the training sessions the participants were distributed in pairs for Part I of the inspection. The participants in Group A were the observers and those from Group B were the executors. It was suggested that the executors apply the WDP technique following a set sequence by perspectives.
2 The JEMS (Journal and Event Management System) Web application supports the management process for conferences and journals, including the submission, reviewing, and acceptance of articles in conferences and publications promoted by the Brazilian Computer Society (SBC).
The executors carried out the detection of usability issues in the inspection of the interaction for the use cases assigned to Part I, monitored by the observers, who took notes on the executors' queries and decisions in specific forms. One week was given for the detection related to Part I; at its end the executors sent their discrepancy spreadsheets and the feedback questionnaires with their comments on the inspection, also stating whether or not they had used the suggested sequence. The observers handed in their forms with the observation notes. For Part II, the observer-executor pairs were rearranged, keeping one participant from each group (A and B) but having the participants interact with different peers (by changing the pairs). During Part II, Group A participants became executors and Group B participants took up the role of observers. It is important to point out that Group A participants, as they observed Group B members carrying out the Part I inspection, had the chance to accompany an inspector applying the WDP technique. This can be considered additional 'practical training' which may have increased the Group A participants' knowledge of usability inspections. At the start of Part II, it was pointed out that the executors did not need to follow the suggested application sequence of the technique and that they could use whatever sequence they considered most appropriate. The sequence used by each Group A participant was recorded. At the completion of the detection for Part II, the executors sent the spreadsheet with the discrepancies found and the feedback questionnaire, stating which application sequence had been chosen. The observers also handed in the forms with the observation notes. Both the quantitative data (from the discrepancy spreadsheets) and the qualitative data (from the observation forms and monitoring questionnaires) were analyzed. The use of Grounded Theory procedures for the analysis of the qualitative data is described in the next subsection.

3.2. Qualitative Analysis Using the Grounded Theory Method

The qualitative data extracted from the monitoring questionnaires and from the forms with the observation notes were analyzed using a subset of the coding stages suggested by [17] for the Grounded Theory Method: open and axial coding. Not all three coding stages proposed by Grounded Theory were executed in this observational study, because it was possible to obtain the answer to the study's research question ('How do inspectors apply the WDP technique?') after the execution of the open and axial coding stages. Prior to commencing the data analysis itself, a preliminary check was carried out of the contents of the
monitoring questionnaires (answered by the inspectors/executors) and of the observation forms (filled in by the observers). It was found that the observation forms contained essentially comments on queries, difficulties, or ease found in the application of the Heuristic x Perspective (H x P) pairs in the evaluation of the JEMS system, which probably occurred due to the format of these forms. Only two observers made notes on the executors' difficulties or ease with some specific perspective. The monitoring questionnaires, however, contained detailed feedback on the way the technique was applied as a whole, describing the doubts, difficulties, and ease the inspectors had when carrying out the inspection. The monitoring questionnaires contained the same observations on the H x P pairs that were recorded in the forms prepared by the observers, along with other specific remarks on the application of the perspectives and of the heuristics and on the order of application. For this reason, the monitoring questionnaires were used as the main data source for the analysis. One of the monitoring questionnaires was chosen as the initial source for data exploration. When studying the data from this questionnaire, the researcher responsible for the technique created codes for the text excerpts (citations) related to the learning process and to the application of the WDP by the inspector. Although seed categories can always exist, especially when a research question has been stated, and were tacitly discussed and explored by the researchers, no explicit use was made of 'seed categories' (an initial set of codes to start the coding); instead, live codes were created from the text of the questionnaires. This initial set of codes was reviewed with other researchers, and after that initial review open coding was started for the other thirteen monitoring questionnaires, analyzing their data and associating codes to the text citations. The open coding procedures stimulate the constant creation of new codes and their merger with existing codes as new evidence and interpretations emerge. Figure 1 shows part of the codes associated with the citations in one of the questionnaires studied. The codes found in the questionnaires were grouped according to their properties, forming concepts that represent categories. These categories were studied together with other researchers, and subcategories were identified to provide greater clarity about the phenomenon at hand. Finally, the categories and subcategories were related among themselves in the axial coding stage. In practice the open and axial coding steps overlap and merge due to the iterative nature of the process. The codes and categories identified went through successive revisions; at the end of the present version of the analysis, 89 codes had been produced, associated with three categories: WDP Structure, Ease of
Application, and Application Sequence. Although the goal of the observational study was to identify the manner in which inspectors apply the WDP technique, the monitoring questionnaire also included questions on the adequacy and the ease of application of the technique, which led to the identification of the WDP Structure and Ease of Application categories. Axial coding made it possible to see that aspects related to the Ease of Application of the technique influence the application sequence adopted by the inspectors.
Figure 1 – Code and citation association in a feedback questionnaire (In Portuguese)
The definitions, properties, and subcategories associated with each category are shown in Tables 3, 4, and 5 below. After each table, a figure presents the graphical representation of the associations between the codes, categories, and subcategories. In these figures, each code is followed by two numbers that represent, respectively, the groundedness degree and the theoretical density of the code. The groundedness degree is the number of citations the code is associated with; the theoretical density is the number of relationships of the code with other codes. In these graphical representations, the codes preceded by '[XX]' (where XX is an acronym designating the category) are the ones representing categories and subcategories. These codes have a groundedness degree equal to zero, as they are not associated with citations in the questionnaires. Apart from the codes, the other elements of the graphical representations are the Memos (or Analysis Notes), which describe the history of the interpretation undertaken by the researchers and the results of the codings.

Table 3 shows the properties and Figure 2 shows the respective relationships of the WDP Structure category. This category contains the codes derived from the inspectors' comments on the organization of the WDP elements. Two subcategories, 'Structure of Perspectives' and 'Structure of H x P Pairs', compose the WDP Structure category. No subcategory related to the structure of the heuristics was created, as the heuristics are not an element originally proposed by the WDP technique but by (NIELSEN 1994); consequently, the monitoring questionnaire had no question on the structure of the heuristics.

Table 3 – Properties of the WDP Structure Category
Category: Structure
Concept: Organization of the elements of the WDP technique. The structure was observed in terms of elements with an adequate structure (organization) and elements with an inadequate structure.
Variation: Positive (+): adequately organized elements; Negative (-): inadequately organized elements
Subcategories: Structure of Perspectives; Structure of H x P Pairs
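Since the groundedness degree and the theoretical density described above are simple counts over the coding data, they can be derived mechanically. The sketch below is only an illustration with invented data; the code names, quotation identifiers, and counts do not come from the study.

# Illustrative only: groundedness = number of quotations associated with a code;
# theoretical density = number of relations a code has with other codes.
code_quotations = {
    "Perspectives are well described": ["q01", "q02", "q03"],  # invented quotation ids
    "[ST] WDP Structure": [],  # category code: no quotations, so groundedness 0
}
code_relations = [
    ("Perspectives are well described", "is a", "Structure of Perspectives"),
    ("Structure of Perspectives", "is part of", "[ST] WDP Structure"),
]

def groundedness(code):
    return len(code_quotations.get(code, []))

def theoretical_density(code):
    return sum(1 for source, _, target in code_relations if code in (source, target))

print(groundedness("Perspectives are well described"))   # 3
print(theoretical_density("[ST] WDP Structure"))          # 1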
In order to observe the dimensional variation of the 'Structure of Perspectives' and 'Structure of H x P Pairs' subcategories, two connectors were proposed: 'evidence of adequacy' and 'evidence of inadequacy'. These two connectors are specializations of the connector 'is a' (is a type of). 'Evidence of adequacy' shows a positive variation for the WDP Structure, where the elements are
indicated as adequately organized, and 'evidence of inadequacy' shows a negative variation, pointing to elements considered inadequately organized. As an example of analysis from the graphical representations, one notices in Figure 2, in relation to the 'Structure of Perspectives' subcategory, an evidence of adequacy from the 'Perspectives are well described' code, which is related to comments from ten inspectors in their questionnaires (a code groundedness degree equal to ten). However, there are also evidences of inadequacy, captured via the 'Suggestion to improve description of perspectives' and 'Difficulty to distinguish interface elements and navigation elements' codes. Related to the 'Suggestion to improve the description of the perspectives' code are the specific suggestions from each inspector, recorded through these codes: 'Key-questions should be incorporated into the description of the technique', 'Suggestion to add to the description of the Navigation Perspective that this is not static', 'Suggestion to describe perspectives in a simpler way', 'Problem Perspective Conceptualization described in a very subjective manner', and 'Problem Perspective Presentation overlaps with Navigation Perspective'. These five codes are examples of live codes, related to comments from specific inspectors. However, the groundedness degree equal to two shows that the 'Key-questions should be incorporated into the description of the technique' code was a suggestion made by two distinct inspectors.
Figure 2 – Graphical representation with the associations related to the WDP Structure category
Figure 3 supplements Figure 2, showing the codes that compose those directly associated with the 'Structure of H x P Pairs' subcategory. Although these findings cannot be directly related to the phenomenon under observation (the manner of applying the WDP), they indicated aspects that, according to the inspectors, needed improvement in WDP v3. All the codes related to evidences of inadequacy were later studied by the researchers for the preparation of version 4 of the technique (WDP v4).
Table 4 shows the definition of the Ease of Application category and its properties. This category presents codes derived from inspectors’ comments on aspects that facilitate or hamper the application of the technique. For a better organization of these codes, five subcategories were created: ‘Ease to Apply Technique’, ‘Ease to Apply Perspectives’, ‘Ease to Apply Heuristics’, ‘Ease to Apply the Pairs’ and ‘Inspectors’ Opinion on Ease of Application’.
Figure 3 – Associations related to the Structure of Pairs subcategory

Table 4 – Properties of the Ease of Application Category
Category: Ease of Application
Concept: Comfort (or its lack thereof) to apply the WDP technique. The Ease of Application was observed via points that evidenced the ease to apply the WDP and points that showed a difficulty to apply it.
Variation: Positive (+): ease in the application; Negative (-): difficulty in the application
Subcategories: Ease to Apply Technique; Ease to Apply Perspectives; Ease to Apply Heuristics; Ease to Apply the Pairs; Inspectors' Opinion on Ease of Application
The 'Ease to Apply Technique' subcategory gathers the codes attributed to the inspectors' citations on the ease (or difficulty) of applying the technique as a whole. The codes about the ease or difficulty of applying the H x P pairs, the perspectives, and the heuristics were associated, respectively, with the 'Ease to Apply the Pairs', 'Ease to Apply Perspectives', and 'Ease to Apply Heuristics' subcategories. In this category the Ease to Apply Heuristics was considered because the manner of applying the heuristics in the technique can be customized. Finally, the 'Inspectors' Opinion on Ease of Application' subcategory is composed of two other subcategories: 'Comparative Perception on Heuristic Evaluation', with the codes assigned to comments from inspectors who compared the WDP technique with the Heuristic Evaluation method, and 'Perception on Adequacy and Ease of Use', which gathers the inspectors' opinions on the adequacy of using the technique in usability inspections.

Figure 4 shows the relations between the codes of the Ease of Application category. Two connectors were proposed in order to observe the dimensional variation: 'evidence of ease' and 'evidence of difficulty', which are specializations of the 'is a' (is a type of) connector. 'Evidence of ease' connects codes related to aspects the inspectors considered to ease the application of the technique; similarly, 'evidence of difficulty' binds codes related to aspects considered difficulties in applying the technique. When observing the 'Ease to Apply Perspectives' subcategory in Figure 4, one sees the high groundedness degree of the 'Difficulty to assign issue to perspective' code, associated with comments produced by six different inspectors, although the 'Division into perspectives is adequate' code is an evidence of ease and also presents a high groundedness degree, being related to comments from seven inspectors. The causes of this difficulty in assigning a problem to a perspective were studied, and it was possible to see that this difficulty was bound to a particular way of applying the technique, which will be shown when describing the 'Application Sequence' category; the causal relations related to this aspect are discussed after the presentation of that category. Figures 5 and 6 supplement Figure 4 and show the codes associated with the 'Inspectors' Opinion' and 'Ease to Apply Pairs' subcategories, respectively.
Figure 4 – Associations related to the Ease of Application category
Figure 5 – Associations related to the Inspectors’ Opinion subcategory
Figure 6 – Graphical representation of the Ease to Apply Pairs subcategory
Finally, Table 5 presents the definition and properties of the Application Sequence category and Figure 7 shows its relationships. This category presents the codes attributed to the inspectors' comments on the manner in which they applied the technique. No subcategories were created for this category; each associated code was considered a variation or type of sequence followed in the application of the WDP technique, being related to the category via the 'is a' (is a type of) connector.

Table 5 – Properties of the Application Sequence Category
Category: Application Sequence
Concept: Order followed by the inspector in applying the WDP technique.
Variation: Negative (-): application followed no fixed order; Neutral (0): application in the suggested order; Positive (+): application in an order adapted by the inspector after gaining familiarity
For the purpose of controlling the variation of this category, the following values were attributed to each associated code: negative (-) for the cases where the application of the technique did not follow any fixed order; neutral (0) for the cases where the WDP was applied in the suggested order (first the H x P pairs related to the Presentation Perspective, followed by the H x P pairs related to the Navigation Perspective, and finally the pairs of the Conceptualization Perspective); and positive (+) for the cases where the application of the technique followed a sequence adapted by the inspector after attaining familiarity with it. From Figure 7, it can be seen that the WDP application sequence varied between:
• Switching back and forth: from an H x P pair to identify a defect and from a defect to identify an H x P pair. This case is considered a negative variation, as the application did not follow any fixed order.
• Suggested order (1st Presentation, 2nd Navigation, and 3rd Conceptualization). This variation is considered neutral, as the application followed the order suggested by the researchers.
Figure 7 – Graphical representation with the associations related to the Application Sequence category
• Execution order: 1st Presentation, 2nd Conceptualization, and 3rd Navigation. This is one of the cases of order adaptation after the inspector became familiar with the technique. This order was followed by only one inspector, who justified the change in the order of evaluating the perspectives, making Navigation the third perspective to be evaluated: the Navigation Perspective is not static and, to assess a link, one needs to change pages in the Web application. When evaluating this comment it was noticed that this inspector was right: following this new order reduces the inspection effort, because the inspector does not need to return to the previous page (prior to evaluating the navigation) to assess the pairs of the Conceptualization Perspective. Although this was a remark from only one inspector, it was an important contribution and, as a result of this analysis, it caused the suggested order of application of the perspectives to be changed.
• Execution order: free after familiarity. This is another order adaptation after the inspector became familiar with the technique. This code was attributed to a direct citation from one inspector, but it is related to codes such as 'Familiarity with the technique allows the inspector to rearrange the inspection approach' and 'Familiarity with the technique leads the inspector to detect other heuristics associated to a problem', which are associated with citations from other inspectors.
With the detailed analysis of the codes of the Application Sequence category, it is possible to observe that there was no single answer to the research question of the study ('How do the inspectors apply the WDP technique?'). Different forms of application were identified, and it is possible to see a variation in the degree of ease or difficulty in applying the technique related to the form of application. The results
on the application sequence of the WDP technique are detailed in [7]. The application of the Grounded Theory Method still entails the selective coding stage, in which the core category of the theory is identified, to which all the others are related. The core category should be capable of integrating all the other categories and of expressing the essence of the social process that occurs among those involved. In the present study we decided not to select a core category yet for the phenomenon that explains the manner of application of the WDP technique. This decision is due to the fact that a rule in the use of the Grounded Theory Method is the circularity between the collection and analysis stages ([3]), in which the properties of a conceptual category are validated by the researcher through the analysis of newly collected data. As only one data collection was carried out in this study, it has not yet been possible to validate the properties of the categories identified. It should also be pointed out that not all the codes indicated in the questionnaires were related to the categories created so far. As the coding process is still deemed incomplete, it was decided to wait for the collection and analysis of new data, which may imply the creation of new categories that can be associated with the codes that presently have a low density degree. The rule of thumb in Grounded Theory is to continue the process of systematically collecting and analyzing data until theoretical saturation is reached; according to [3], this final stage takes place when the marginal gain in the explanatory power of the theory from further evidence collected is nearly null. Then, once new data collections are done, the propositions resulting from the open and axial coding may be validated, and new concepts, categories, and relationships may be proposed and verified, until the moment of
integration of a substantive theory in the selective coding stage. This study uncovered details on the possible ways in which inspectors can apply the WDP technique. However, it is not possible to state that the results are conclusive, as only one data collection was performed; it is necessary to collect new data to analyze how other inspectors may apply the WDP in other circumstances.

4. Conclusions and Lessons Learned

[12] states that through an observational study it is possible to collect data on how a technology is applied and to acquire a refined understanding of the technology by witnessing the difficulties the participants may display. From this viewpoint it is possible to see that the Grounded Theory procedures were fundamental to understanding the behaviour of the participants during their interaction with the object of study. It is worth mentioning that quantitative approaches are quite useful in providing indicators related to measures of efficacy, cost, and quality, for example. However, restricting the identification of the causes behind these indicators to quantitative analysis may hide relevant aspects of the behaviour of individuals whose influence cannot be ignored. An important lesson learned in this experience relates to the construction of the data collection instruments. It was expected that the observation forms would serve as the main data source. However, most of the data collected by this instrument was related to the application of each heuristic individually, and not to the use of the technique as a whole. Fortunately, the instruments that were at first considered supplemental were capable of capturing and providing significant data for the analysis. Therefore, special attention should be given to the construction of collection instruments, on pain of not gathering adequate elements to support the qualitative analysis and clarify the research question. To conclude, it is necessary to move forward in the construction of evidence in Software Engineering and, in this sense, the contribution of qualitative methods in dealing with aspects that are intangible yet crucial for the thorough understanding of the issues inherent to the production of software is unquestionable. This paper discussed an example of qualitative analysis used for evolving a usability inspection technique. It is important to observe that there are several other research questions in Software Engineering, such as those mentioned in Section 1. Qualitative analysis methods are needed in software engineering especially in studies involving the observation of processes and the assessment of individual or collective profiles, as in [5]. Only a wider understanding of the phenomena that surround Software Engineering can support the development of effective technologies capable of boosting results both from the business point of view
and from the point of view that contemplates the satisfaction and well-being of society.

Acknowledgements

We thank Cleidson de Souza and Mariano Montoni for their assistance in using Grounded Theory procedures in the observational study, Jobson Massolar and Vinicios Bravo for their assistance during the execution of the work, the participants of the study, and Verônica Vaz and Ulisses Vilela for taking part in the revision of the WDP technique after these results were obtained. We are also grateful for the financial support provided by FAPEAM, CNPq, and FAPERJ. Special thanks go to Professor Emilia Mendes for her constant collaboration throughout the entire research work regarding the WDP.

References

[1] BANDEIRA-DE-MELLO, R., 2006. "Softwares em Pesquisa Qualitativa". In: Godoi, C. K., Bandeira-de-Mello, R., Silva, A. B. d. (eds), Pesquisa Qualitativa em Estudos Organizacionais: Paradigmas, Estratégias e Métodos, Chapter 15, São Paulo, Saraiva.
[2] BANDEIRA-DE-MELLO, R., CUNHA, C., 2003. Operacionalizando o método da Grounded Theory nas Pesquisas em Estratégia: técnicas e procedimentos de análise com apoio do software ATLAS/TI. Encontro de Estudos em Estratégia, Curitiba, Brazil.
[3] BANDEIRA-DE-MELLO, R., CUNHA, C., 2006. "Grounded Theory". In: Godoi, C. K., Bandeira-de-Mello, R., Silva, A. B. d. (eds), Pesquisa Qualitativa em Estudos Organizacionais: Paradigmas, Estratégias e Métodos, Chapter 8, São Paulo, Saraiva.
[4] BIANCHI, E. M. P. G., IKEDA, A. A., 2006. Analisando a Grounded Theory em Administração. IX SEMEAD - Seminários em Administração, São Paulo, Brazil.
[5] CARVER, J., 2004. "The Impact of Background and Experience on Software Inspections." Empirical Software Engineering, v. 9, n. 3, pp. 259-262.
[6] CONTE, T., MASSOLAR, J., MENDES, E., TRAVASSOS, G. H., 2009. "Web Usability Inspection Technique Based on Design Perspectives." IET Software Journal, n. Special Issue: Selected Papers of SBES 2007, pp. 1-18.
[7] CONTE, T., VAZ, V., MASSOLAR, J., MENDES, E., TRAVASSOS, G. H., 2008. "Process Model Elicitation and a Reading Technique for Web Usability Inspections". In: International Workshop on Web Information Systems Engineering for Eletronic Businesses and Governments (E-BAG 2008), v. LNCS 5176 - Advances in Web Information Systems Engineering - WISE 2008 Workshops, pp. 36-47, Auckland, New Zealand.
[8] CUKIERMAN, H., TEIXEIRA, C. A. N., RUBERG, N., 2006. "Apresentação". In: WOSES 2006 - Um Olhar Sociotécnico sobre a Engenharia de Software, pp. iii-iv, Vila Velha, Brasil.
[9] DE SOUZA, C. R. B., REDMILES, D., CHENG, L.-T., MILLEN, D., PATTERSON, J., 2004. "How a good software practice thwarts collaboration: the multiple roles of APIs in software development." ACM SIGSOFT Software Engineering Notes, v. 29, n. 6, pp. 221-230.
[10] GLASER, B., 1992. Basics of grounded theory analysis. Mill Valley, CA, The Sociology Press.
[11] GLASER, B., STRAUSS, A., 1967. The discovery of grounded theory: Strategies for Qualitative Research. New York, Aldine Transaction.
[12] MAFRA, S., BARCELOS, R., TRAVASSOS, G. H., 2006. "Aplicando uma Metodologia Baseada em Evidência na Definição de Novas Tecnologias de Software". In: Proceedings of the 20th Brazilian Symposium on Software Engineering (SBES 2006), v. 1, pp. 239-254, Florianopolis, October.
[13] MONTONI, M., ROCHA, A. R., 2007. "A Methodology for Identifying Critical Success Factors That Influence Software Process Improvement Initiatives: An Application in the Brazilian Software Industry". In: Software Process Improvement - 14th European Conference, EuroSPI 2007, v. 4764/2007, pp. 175-186, Potsdam, Germany.
[14] SEAMAN, C. B., 1999. "Qualitative Methods in Empirical Studies of Software Engineering." IEEE Transactions on Software Engineering, v. 25, n. 4, pp. 557-572.
[15] SHULL, F., CARVER, J., TRAVASSOS, G. H., 2001. "An empirical methodology for introducing software processes." ACM SIGSOFT Software Engineering Notes, v. 26, n. 5, pp. 288-296.
[16] STRAUSS, A., 1987. Qualitative analysis for social scientists. New York, Cambridge University Press.
[17] STRAUSS, A., CORBIN, J., 1998. Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory. 2 ed. London, SAGE Publications.
[18] TAIPALE, O., KARHU, K., SMOLANDER, K., 2007. "Observing Software Testing Practice from the Viewpoint of Organizations and Knowledge Management". In: Proceedings of the First International Symposium on Empirical Software Engineering and Measurement, pp. 21-30.
[19] VAZ, V., CONTE, T., BOTT, A., MENDES, E., TRAVASSOS, G. H., 2008. "Inspeção de Usabilidade em Organizações de Desenvolvimento de Software - Uma Experiência Prática". In: Proceedings of the 7th Brazilian Symposium on Software Quality (SBQS 2008), v. 1, pp. 369-378, Florianopolis.
A Systemic Approach to Software Project Management Guilherme G. de Carvalho1, Hector P. de L. Oliveira1, César A. D. C. do Nascimento1, Antônio C. V. Pereira1, Daniel de A. Penaforte1, Daniel V. S. Julião1, Hermano P. de Moura1 1
Universidade Federal de Pernambuco (UFPE), Centro de Informática, Recife – PE – Brasil (guilherme, hector, cesar, valenca)@portalholon.com, (dap4, dvsj, hermano)@cin.ufpe.br
Abstract. By understanding how the various parties involved in a project are related and integrated, professionals anticipate consequences and evaluate problems and solutions more effectively. Thus, the application of systems thinking in project environments promotes a holistic approach and a greater understanding of most of the influences at play. Quality management must therefore be related to all parties in a project, and the manager is responsible for evaluating individuals, projects, and organizational goals together as a whole. This article explains with practical examples how systems thinking impacts decision-making before, during, and after a project by allowing the identification of implicit assumptions, the incorporation of human factors, a dynamic view of work processes, representations of project behavior, and other main project characteristics, improving the quality of processes and results.

Keywords: Project Management, Systems Thinking, System Archetypes

(Received October 30, 2009 / Accepted January 22, 2010)

1. Introduction

Many activities have similar dynamic characteristics [14]. Software development, presentation preparation, and building a nuclear power plant are some examples of projects. All these activities have the same generic processes, in a greater or lesser degree of detail: goal definition, deadline estimation, development of the activity schedule, resource planning, monitoring and controlling, and planning of the workload to perform these activities [13]. The quality of the final product is directly linked to the good performance of these activities. In addition to traditional functions such as leading, organizing, coordinating, and selecting and allocating staff, software project managers must include systemic aspects in their decision-making process. In this context, this article presents various elements related to the understanding of the reality and effectiveness of project managers in their actions, from both a social and a technical perspective.

The modeling of dynamic systems can be seen (1) as a mapping process that uses graphics, diagrams, words, and simple and friendly algebra to elicit and capture knowledge from groups of people acting as a team, or (2) as a body of systematic knowledge developed to organize, filter, and structure the vast knowledge that a team of individuals shares [8]. It therefore helps the development of learning environments, as people become able to understand, test, challenge, and redefine their ways of thinking. The evolution of these learning environments contributes to increasing the quality of the results [2].
Moreover, System Dynamics stands out as the most appropriate method for modeling dynamic environments [17], such as software project environments. In fact, System Dynamics differs from other analysis approaches in that it combines qualitative and quantitative data analysis, assigning logical and mathematical concepts to subjective relations in the environment and leading individuals to develop the ability to think about their reality systemically. Among the many benefits of applying systems thinking in the project environment are an enhanced ability to observe the prior influences on the project, to identify implicit assumptions and conflicting factors, to understand the impact of quality policies, and to systemically understand the processes and their positive and negative consequences [9]. Methodologies such as the Soft Systems Methodology (SSM) suggest that a process of reflection about the actions to be taken is necessary for the effective realization of the organizational changes needed to recover and improve problematic situations [4] [5] [6] [7]. As shown in Figure 1, SSM assumes that a problematic situation always involves people with different world views and therefore different perspectives on the problem definition, the objectives, and what should be done [7]. The expected result is not necessarily the creation of shared perceptions, but at least an accommodation between conflicting views and interests so that the desired changes can be effectively implemented [4]. The systems approach helps managers and leaders of organizations and projects to understand their
organizational systems in a simple, direct, and connected way. To that end, system archetypes are generic structures composed of cause-effect relations that recur in different contexts, while people are usually not aware of their effects [14].
Figure 1. Soft Systems Methodology as a learning system [5]
Thus, revealing system archetypes in the project environment can inspire effective actions for the problematic situations that they represent. This article therefore aims to promote a systemic analysis of the factors involved in software projects and of how these affect the quality of the final product and the environment as a whole, trying to show, through system archetypes, the dynamic structures that surround software projects. For better contextualization, the following section presents some research results on management styles and on how project managers work and think. In Section 3, the systemic perspective is related to the project environment, demonstrating relationships between some relevant variables. Subsequently, Section 4 explains how the systemic perspective contributes to the decision-making process, while Section 5 gives some final comments on the lessons learned from this work.

2. Stakeholders' Mental Models

Managers plan and monitor projects every day: it is known that 85% of them claim to plan and 82% claim to monitor their projects [12]. The ability to understand how one's routine works and how daily decisions are made is strengthened by the use of causal diagrams combined with reflection on past and current actions; this ability allows managers to become more consistent in their decisions [14]. In fact, project managers are always developing the ability to solve immediate problems, set goals, establish schedules, and allocate activities, among others [11]. Human beings quickly determine a cause for any event that is considered a problem and usually conclude that the cause is another event [14]. In the real world, the way a manager perceives (mental model
and world view) and handles the decision-making process is crucial and directly affects the success of the comprehension, planning, and implementation of his or her goals [18]. Thus, a systemic view of the complex structure that surrounds project decisions makes a great difference to the quality of the decisions taken before, during, and after the project. Managers need to learn how to reflect on their current mental models and expose them in order to become more open to change [14]. Without this reflection, they will not be able to challenge their world view and will remain limited in experimenting with new ways of thinking. Moreover, this inability tends to reflect on the team, which becomes restricted as a collaborative team. Flexibility brings new perspectives, allowing people to imagine creative solutions [3]. In fact, many project managers get stuck in certain mental models and become prisoners of them [1]. For example, the progress of activities is reported to management as the work is carried out with the allocated resources (people, equipment, processes, etc.); the manager monitors and evaluates progress against what was estimated before, and when a difference between the estimated date and the actual date appears, adjustments are made in order to improve the performance of the next activities. From this moment, the cycle restarts with a new sequence of activities. Thus, these managers wait until they notice delays in the schedule to identify problems and only then rethink how they should act on the problematic scenario. Many still evaluate the progress and quality of the project using this procedure. In other words, managers need to reflect on their structures of thought to avoid becoming prisoners of them. The system that involves quality and project management is a complex conglomerate of interdependent variables [1]. When reflecting on some of these variables (such as motivation and communication), it is possible to realize that effective management must consider the system's complexity. That is, the quality of a component, even the smallest one, affects the quality of the product as a whole. An unenthusiastic software engineer, for example, can lead an entire team to failure. To maintain the quality of processes and products, the capacity of the whole team, and especially of the project manager, to view the project from a systemic perspective becomes a basic condition, given the increasing complexity involved in software projects (growing teams, temporary employees, different technologies at each moment, etc.). It is known, for example, that the more participative the management style, the greater the involvement and commitment of project members, thereby increasing the likelihood of project success [2]. However, according to research conducted with project managers in Brazil, about 55%
of managers say that they ignore the concepts, definitions, and practices of motivational theories, and the other 45% say that, even knowing something about them, they rarely apply any technique or practice to improve the team's motivation [15]. In fact, many attitudes and ideas about how to encourage such participation never come into practice because of conflicts with powerful implicit mental models [14]. Managers thus remain trapped in the paradigm that, while things are working, no improvement is needed. This is the conformist mindset associated with environments that promote dynamics such as win-lose or lose-lose; problems are not definitively solved so that they no longer occur. This is just one of the inhibiting variables of the typical model found in most organizations, where conditions are created for unproductive conversations, self-fulfilling prophecies, self-occlusive processes, and successive errors, while the organizations and their managers remain unaware that they are creating these conditions [18]. This model promotes unilateral control and competitiveness. Attached to these error conditions, people want to conceal their mistakes and, as inevitable consequences, effectiveness is undermined, confidence is shaken, the distance between members grows, and injustice becomes common. Project quality is then affected by these behavioral aspects. Finally, in this environment, people and teams will never be open to a true paradigm shift in their own actions and organizational culture [18]. A large part of workers' time is actually spent in the organization's environment, which has implications for the individual's conscious orientation when producing artifacts of all kinds, such as software or buildings [2]. However, although it is correct to say that the system affects people's behavior, it is also true that human complexity does not end at the boundary of the project or the organization: humans are greater than their circumstances. The manager and the others involved in the project are directly influenced by the environment and influence it reciprocally. Therefore, the mental models of each member together form a pattern, establishing a mental model of the whole environment in a dynamic of reciprocal influence. These interactions sometimes result in confluence and sometimes in conflict, because of situations of incompatibility. Over time, however, these models adjust themselves in order to minimize such conflicts. The manager's actions always aim at efficiency in accordance with his or her own internalized mental model, the perception he or she has of the environment, and the perception he or she has of the other people involved. So, when the manager's mental model is well aligned with the group's mental model, the manager's actions tend to have positive results.
3. Systemic Perspective of the Project

Systems thinking is interested in the essential characteristics of the dynamic and integrated whole, which are not fully represented by its pieces but by the dynamic relationships between them, between them and the whole, and between the whole and other wholes [18]. Thus, the systemic perspective encourages those involved in the project to shift their focus of thinking from the parts to the whole. That is, rather than looking at each project area by itself, the manager should be directed to the project as a whole, its variables, and how they relate to and influence one another, the project, and the organization. The software development process is influenced by many tangible and intangible variables [1]. These variables are not independent but related to each other, so that one variable causes several effects on other variables. Understanding the behavior of systems like these from practice is far more complex than human intuition can handle [18]. An individual cannot be aware of the many variables that influence all the activities he or she is involved in, such as management activities (planning, control, etc.) and product development activities (design, coding, reviewing, testing, etc.). Unwittingly, many authors construct analogies between systemic variables of this structure. These, however, are often incomplete because people's common view is linear [14], especially in the Western world. That is, they realize that certain actions cause consequences, but rarely recognize that these consequences lead to other actions and consequences, which in many cases influence, positively or negatively, the original decision or action. The archetypes presented in this article are therefore based on separate excerpts found in different articles, sections, and authors: many articles and books were read in order to extract linear statements and relate them dynamically, and the archetypes were then presented to some project managers for discussion and validation of the structures. People who develop the systems thinking ability understand much better and faster the impact that local actions have on the organization as a whole [16]. While thinking systemically and reflecting on their actions, teams learn as they work, and such learning positively influences the productivity and quality of the results [18]. This originates the virtuous reinforcement cycle shown in Figure 2 below, where the variable "Work" increases the variable "Learning," which in turn increases the capacity for work and understanding. In systemic language, an arrow with "+" indicates that the expansion of one variable increases the other; conversely, the decline of the first variable decreases the second. Therefore, thinking the project systemically is the ability to notice
these and other cycles and to understand how certain key variables influence each other, directly or indirectly.

Figure 2. Reinforcement cycle of work and learning
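To make the notation concrete, the following is a rough numerical sketch of the reinforcing loop in Figure 2; the growth rates are arbitrary assumptions chosen only to show the shape of mutual reinforcement, not values taken from any project.

# Rough numerical sketch of the reinforcing loop in Figure 2 (not from the paper):
# more work done feeds learning, and learning raises the capacity for work.
# The rates 0.10 and 0.05 are arbitrary assumptions used only for illustration.
work, learning = 10.0, 1.0
for week in range(1, 6):
    learning += 0.10 * work      # "+" link: work done increases learning
    work += 0.05 * learning      # "+" link: learning increases work capacity
    print(f"week {week}: work={work:.1f}, learning={learning:.1f}")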
Moreover, while the software is being developed, many factors reduce productivity [11]. These process losses (time spent preparing the environment, rework, bureaucracy, etc.) are caused, for example, by schedule pressure and by late reviews to detect errors [1]. It is quite common in software projects, for example, to increase the weekly workload in order to speed up the execution of project activities and reduce the delay in the short term. This situation can be seen as an example of the so-called balancing cycle, which is illustrated in Figure 3.
Figure 3. Balancing cycle of schedule delay and extra workload
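The self-correcting behaviour of this balancing loop can also be sketched numerically; the constants below are arbitrary assumptions used only for illustration and do not appear in the paper.

# Sketch of the balancing loop in Figure 3 (illustrative assumptions only):
# schedule delay drives additional working time, which works the delay down,
# so the loop is self-correcting as long as no side effects are considered.
delay = 40.0                      # hours behind schedule
for week in range(1, 6):
    extra_hours = 0.5 * delay     # "+" link: more delay -> more overtime
    delay -= 0.8 * extra_hours    # "-" link: overtime reduces the delay
    print(f"week {week}: extra_hours={extra_hours:.1f}, delay={delay:.1f}")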
As Figure 3 shows, the "delay in the calendar" increases the "additional working time", which in turn reduces the schedule delay. In systemic language, an arrow with "-" indicates that the expansion of one variable decreases the other; conversely, the decline of the first expands the other. Analyzing the environment of a project systemically, it is possible to see how the parts are interrelated; in fact, projects have several non-linear relationships [17]. For instance, in the example of the delay resolved by the extra workload, this extra working time is expected to lead to a reduction in team productivity over time, because members tend to feel tired from the extra workload. As the cycle continues, more errors are introduced, which generate more rework; in turn, this rework causes delays in the results and in the schedule of the project [1]. Figure 4 illustrates how these variables interact with each other, showing a practical example of the system archetype known as Fixes that Fail. It shows that hiring extra hours is a temporary and symptomatic solution for the delay in the project, as it does not resolve the real cause of the problem and causes unintended consequences over time (shown in the graphical notation by two strokes on the arrow). Solving problems with short-term solutions can thus generate a Fixes that Fail archetype, where the quick solution causes unintended consequences which, over time, bring the problem back [18]. Thus, project
The solution should be a fundamental one, capable of causing real change in the team involved in the project; in other words, it should not generate unintended consequences that make the problem reappear in the future. In the example above, the preference for the quick solution of hiring additional working hours is understandable when the focus is on the short term, since the fundamental solution of hiring and training new members is difficult to implement and does not bring immediate results. Indeed, estimating the workforce needed to perform a specific project task must consider several variables, such as the experience level of the team members, training for beginners, the time needed to assimilate that training, and the losses in internal communication that grow with the size of the team [13]. Thus, recruiting new members takes time before it positively affects productivity. The management decision to hire more employees in order to accelerate the completion of a project does increase the productivity of the organization in the long run; in the short term, however, productivity may even drop, because experienced workers have to split their time between their own tasks and the training of the new members [13].
Figure 4. Additional working time as temporary solution
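As a complement to Figure 4, the following toy simulation (again our illustration, with invented coefficients, not taken from the paper) shows the Fixes that Fail dynamic numerically: overtime reduces the schedule delay at first, but the fatigue it accumulates feeds errors and rework, and after a few weeks the delay starts to grow again.

def simulate_fixes_that_fail(weeks=20, delay=10.0):
    """Toy reading of Figure 4: the quick fix (overtime) relieves the delay
    in the short term, while fatigue, errors and rework bring it back."""
    fatigue = 0.0
    log = []
    for week in range(1, weeks + 1):
        overtime = 0.5 * delay             # delay -> (+) -> additional working time
        delay -= 0.6 * overtime            # overtime -> (-) -> delay (balancing loop B)
        fatigue += 0.3 * overtime          # overtime -> (+) -> fatigue (delayed effect)
        errors = 0.4 * fatigue             # fatigue -> (+) -> errors
        rework = 0.5 * errors              # errors -> (+) -> rework
        delay = max(delay + rework, 0.0)   # rework -> (+) -> delay (reinforcing loop R)
        log.append((week, round(delay, 2), round(fatigue, 2)))
    return log

if __name__ == "__main__":
    for week, delay, fatigue in simulate_fixes_that_fail():
        print(f"week {week:2d}  delay={delay:6.2f}  fatigue={fatigue:6.2f}")

In this run the delay falls sharply in the first weeks and then climbs back as fatigue accumulates, which is exactly the signature the archetype warns about.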
On the other hand, relying on additional working time leads to process losses, resulting in more disordered work. Even working in this cluttered way, the team does bring the project to its end, but the quality of the result is compromised. Moreover, as the end of the project approaches, the resistance to allocating new people to it increases [1]; thus, the proximity to the project conclusion decreases the chances of hiring and training new members. Figure 5 illustrates an archetype that explains how this structure operates for the problem of schedule delay. The structure corresponds to an archetype strongly associated with the psychology literature and related areas, Shifting the Burden [18]. As the symptom of the problem appears, the simpler solution is faster and relieves the symptom, and the person responsible for resolving the problem stops thinking about it and its causes. However, the
quick solution generates side effects that, over time, decrease the possibility of applying the fundamental solution. That happens either because of the insistence on repeatedly adopting the simplest solution, or because the passage of time makes the implementation of the more complex solution increasingly difficult, as illustrated in the previous example. The best strategy for this structure is to apply the quick fix immediately, in order to solve the problem in the short term, while at the same time taking actions that lead to the fundamental solution, ensuring that the problem does not recur in the future [18].
Figure 5. Shifting the Burden on schedule delay
Another critical factor in project management is the commitment of the manager. In reality, the dedication of the manager to the activity of project management varies considerably across companies [12]. In fact, this variation is constrained by the pressure exerted by top management, which demands full dedication to the project only when the levels of stress, risk, delay and losses are high. The dedication of the project manager affects other variables in the system as a whole, such as staff productivity. Figure 6 shows another archetype, Limits to Growth, which depicts the consequences of these variations in the manager's dedication on other variables that are critical to the success of the project. From the archetype, it is possible to understand how the learning, involvement and commitment of the team, together with the results and productivity of the project, are limited by the dedication of the manager. A large number of variables favorable to the project form reinforcement cycles with team productivity: the higher the productivity, the more results, involvement and learning. Moreover, from the balancing cycle it can be seen that the higher the productivity, the lower the stress factors such as internal risks and delays and, consequently, the lower the need for dedication from the project manager. In this case, the organization needs to understand that reducing the project manager's dedication to a specific project reduces productivity and generates future complications.

Figure 6. Consequences of manager's dedication
In many of the archetypes mentioned, it is possible to perceive the effect of time. The key difficulty when dealing with timing is combining two abilities: understanding how the structure works and exercising patience [18]. In the project management environment several delays can be observed, such as the delay between the time errors are discovered and the time they are fixed, between the time members are recruited and the time they become effective in their new roles, or the time it takes to respond to any change in the environment of the project [17]. For example, after a sudden growth in the size of the project, adjustments to the team or to the calendar are not instantaneous, because time is needed for them to take effect [1]. An archetype in which time is especially relevant is Growth and Underinvestment. In this structure, the one in charge of the project must predict and plan how the necessary
resources will be affected by the increasing demand for performance and capacity. Figure 7 illustrates an example of this archetype in which the company grows by hiring people while accepting new projects, in a reinforcement cycle where both the number of projects the organization is involved in and the size of the team increase together. For almost all organizations this scenario is welcome, since new projects and new employees indicate that the organization is growing. However, the manager must be aware of several limitations that this growth will eventually meet. In the example illustrated in Figure 7, the problem of communication is often overlooked by management. By understanding the archetype, it is possible to see that the growth in the number of projects causes problems in the communication processes, especially in organizations where these processes are poorly defined. The efficiency of the communication channels, in turn, directly affects the productivity of the projects, since a team that communicates better keeps the design consistent and avoids rework [11]. As a team becomes more productive, the number of projects in the company tends to grow. Unfortunately, the opposite is also true, forming a balancing cycle in the archetype (labeled B1 in Figure 7): growth in the number of projects reduces the efficiency of communication, reducing productivity and, eventually, the number of projects in the company. Reading the archetype further, one realizes that a decrease in the number of projects increases the efficiency of communication, which increases productivity and the number of projects again. Thus, this balancing cycle limits the growth of the number of projects and employees in the company.
Figure 7. Growth of the organization and underinvestment in communication
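To give a feel for this structure, the following sketch (an illustration of ours, not part of the paper) approximates the communication limit behind Figure 7: pairwise communication channels grow roughly with n*(n-1)/2 as the team grows, eroding per-person output unless growth is accompanied by investment in integration. The coefficients and the investment parameter are invented for illustration.

def effective_throughput(team_size, base_output=1.0,
                         overhead_per_channel=0.0002,
                         integration_investment=0.0):
    """Total team output after subtracting communication overhead; the
    overhead grows with the number of pairwise channels and is partially
    offset by investment in integration (0.0 = none, 1.0 = full)."""
    channels = team_size * (team_size - 1) / 2
    overhead = overhead_per_channel * channels * (1.0 - integration_investment)
    per_person = max(base_output - overhead, 0.0)
    return team_size * per_person

if __name__ == "__main__":
    for size in (5, 10, 20, 40, 80):
        no_investment = effective_throughput(size)
        with_investment = effective_throughput(size, integration_investment=0.6)
        print(f"team={size:3d}  no investment={no_investment:6.1f}  "
              f"with investment={with_investment:6.1f}")

Without investment, total output peaks and then falls as the team keeps growing; with investment in integration, growth continues much further, which is the reading the archetype suggests.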
The interpretation of this archetype leads to the conclusion that investments in communication and integration, such as the manager's dedication and the allocation of the organization's resources, are fundamental to supporting organizational growth. The efficiency of the communication channels therefore limits the growth of the organization. Because this is an internal factor open to investment, when the manager does not invest in improving communication within his or her team and between teams in the organization, the scenario becomes one of growth with lack of investment. The other balancing cycle illustrated in Figure 7 (B2) shows precisely how the efficiency of the communication channels affects the organization: investments in integration favor continuous growth. Through this archetype, it is possible to see in a systemic way the problems of communication and the factors that lead the manager not to invest in communication for a period of time. In practice, there is a period in which the manager and other members believe that communication is efficient, until they notice losses in the processes. In most companies, investments to address these losses are made only after they are noticed; this delay makes the implementation of the fundamental solution even more difficult once the problem is perceived. In other words, the manager must anticipate the growth and foresee the possible limitations the project might face.

Another extremely important aspect in understanding software project processes systemically is the detection of errors. Working in a disorderly way increases the number of errors found during the project. Hidden errors exist, and when they are found, attempts are naturally made to minimize them by improving the organization of the work; thus, the discovery of errors tends to improve the organization of work. On the other hand, the existence of hidden errors increases the schedule delay over time, consequently increasing the time dedicated to reflecting on the actions taken in order to understand the problems. This reflection on the actions, in turn, decreases the occurrence of hidden errors over time. Solving problems through reflection on the actions is considered a definitive solution, because it addresses the problem in the long term without negative consequences [18]. Considering these effects systemically, one perceives what the quickest solution (continuing the messy work) causes when the manager tries to achieve faster results: a growing need for control and a growing occupation of the manager. Figure 8 illustrates a Shifting the Burden archetype based on this example. The problem of hidden errors has a quick solution, which is to continue the messy work while ignoring the root of the problem. The tendency, then, is for side effects to appear that hinder the capacity to reflect on the actions, occupying the manager and the other members of the project even more and not allowing them to find time to reflect and learn from their actions and mistakes.
With this archetype, it is possible to understand one of the great benefits of reflection on actions. In fact, practicing reflection on actions is extremely important for understanding the parts and their relations with the whole [18]. However, in the daily management of projects, it is common to find practices that inhibit the motivation for this reflection.
Figure 8. Solutions to the problem of hidden errors

The excessive control of activities, for example, may not be immediately seen as such an inhibitor, but when some structures are analyzed systemically, a conflict appears between encouraging learning and controlling and monitoring the team. To understand how this happens, one must realize that a decrease in the dedication of the project manager leads to process losses, increasing the workload of the team and the need for control over it. Moreover, the decrease in commitment leads to losses in communication, affecting learning and inhibiting the growth of the team's experience level. This structure is known as the Escalation archetype, in which two balancing cycles are connected by a variable that tends to favor the results of only one side each time the cycle runs, damaging the other. The result is a behavior that makes the simultaneous and collaborative growth of the two cycles impossible. In the example illustrated in Figure 9, the increase in the control of activities reduces the capacity for learning, and vice versa. In practice, this is seen in the continuous variation of the project manager's dedication, which increases and decreases as the dynamic structure evolves over time.
Figure 9. Controlling and monitoring as inhibitors of learning and reflection
In the Escalation archetype, the solution is to separate the cycles in order to avoid competition between the two activities. The most favorable solution is to encourage self-competition, where each side is concerned only with its own performance rather than with becoming better than the other. Another suggestion is to stimulate cooperative action so that both can achieve a common goal together [18]. In the example, reflection might stimulate the discovery of a different form of control that avoids clutter while still allowing the team to reflect on its mistakes and successes.

Finally, one of the most complex archetypes is Accidental Adversaries. This archetype demonstrates how certain partnerships work and encourages reflection on the partners' real commitment to a common goal. Often, people think they are acting collaboratively when, in fact, there is a slight competition between them that may become tragic [18]. As an example, consider the partnership between the sales team and the planning team in a project-oriented company (Figure 10). Customers demand increasingly faster results, and the sales team has to accept shorter schedules in order to sell more. On the other hand, the efficiency of the project manager makes him or her more effective in planning and thereby increases the guarantee of on-time project completion. At first, these two cycles work individually as reinforcement loops. It then becomes possible to see the great partnership between these teams: as more projects are sold, the management team becomes more efficient by learning from practice [2] or by hiring new, competent managers; likewise, the efficiency of the management team promotes the quality of the projects and gives more
arguments for the sales team to increase sales even further. However, in order to meet its sales goals, the sales team commits to shorter completion times, increasing the pressure from top management to reduce the schedule set by the project manager. This reduces the efficiency of the management team and, consequently, the guarantee that the projects will be finished on time. On the other hand, the efficiency of the management team requires careful planning with comfortable deadlines to avoid delays in the project. These feasible deadlines increase the time requested for projects, which decreases the sales of new projects. The two "partners" need to decide whether they really want to be partners and plan their future together, or whether they want to be competitors [18]. Therefore, the sales team and the management team should work more closely together in order to promote mutual success. From that, the sales team will be able to understand the production capacity of the company, and the management team will understand the needs of the customers.

Figure 10. Sales and management teams in an adversarial partnership

Likewise, many other structures can be found in all processes of project management and quality assurance. Each archetype reveals a structure of action and proposes a new perspective from which to see it. In fact, the archetypes focus on clarifying the understanding of the whole. When considering the parts by themselves, people are unable to quickly identify these complex structures; through systems thinking, it is possible to visualize the parts and their behavior as variables within a whole structure.

4. The Systemic Perspective in the Decision Making Process

A decision cannot be regarded as an arbitrary and ad hoc event. That would be like watching parts of a phenomenon without considering their functions in the whole. The decision is perceived as a network of relationships, understood as an indivisible whole, whose boundary conditions were arbitrarily set in the organization [10]. In fact, factors identified as cultural or related to power, for example, do not belong only to the decision itself but to the entire organization and its external environment; they should not be isolated in the decision, or the context is lost [10]. People should therefore not isolate the phenomenon of the decision, since everything in the organization is related. At the same time, managers need ways to distinguish what is important from what is not, identifying the variables on which they must focus and those that are less relevant. The management process of decision making clearly benefits from the application of systems thinking [14]. In fact, the art of systems thinking is to see through the complexity in order to understand the underlying structures that generate change [18]. Thus, the manager is able to make decisions and solve problems in the long term. Systems thinking links the use of archetypes to the understanding of these complex underlying structures. Through the visualization of these structures, the manager is able to understand the effects that each
alternative in decision making causes in the system as a whole. Each action is reflected in several variables of the system, and through the system archetypes the manager understands which of these should be handled in order to achieve the desired change. Decision making thus becomes more consistent with the reality of the situation, which prevents the emergence of unintended consequences in the future [14]. The system archetypes therefore offer the manager a structured view of the real problems found in project management.

Systemic models provide many benefits, but their main goal is to help the mental process of decision makers in dealing with the behavior of complex systems over time, representing mental models in explicit formulations through causal diagrams. These models should be used as tools to support the decision-making process, so that administrators can learn the consequences of their way of seeing reality, rather than being used only to make predictions about the future [3]. It turns out that decisions in any organization are based on the information actually available to decision makers, which is often inaccurate [1]. Moreover, one of the biggest problems faced by managers is not the lack of information but its surplus, which creates complexity and makes the understanding of reality even more difficult [14].

Decisions are always based on variables that change dynamically during the life cycle of the project [1]. The decisions of project managers, in turn, are based on perceptions of the project status that may differ from the reality of the moment. The difference between perceived progress and real progress arises from the delay in detecting errors during the development of a project, added to the delay between detecting mistakes and taking corrective actions, which gives the manager a false impression of progress relative to the real progress of the project [13]. In reality, some effects that are not initially visible affect the project, positively or negatively, more than many elements easily perceived by the manager or the team [1]. Consciously or unconsciously, management decisions are made based on the observed behavior of the system and result in actions intended to change the actual state of the system. When the results of these actions affect the state of the system, the manager can examine how it has changed, that is, the effectiveness of the management action taken [13]. Each change in the system generates information that can trigger new managerial actions, which in turn trigger further changes in the system. This search for the ideal state of the system is processed through feedback within a set of causal relationships among the variables of the system.
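The gap between perceived and real progress caused by detection delays can also be illustrated with a small sketch (ours, with invented figures, not taken from [13] or [1]): work believed to be complete contains latent defects that only surface after a detection lag, at which point the rework re-enters the backlog and the perception is corrected late.

from collections import deque

def simulate_progress(total_tasks=100, tasks_per_week=10,
                      defect_rate=0.25, detection_lag_weeks=3, weeks=20):
    """Toy model of perceived versus real progress under delayed error detection."""
    remaining = float(total_tasks)
    truly_done = 0.0
    believed_done = 0.0
    pipeline = deque([0.0] * detection_lag_weeks)   # defects waiting to be discovered
    report = []
    for week in range(1, weeks + 1):
        done = min(tasks_per_week, remaining)
        remaining -= done
        defects = done * defect_rate
        believed_done += done            # the manager's perception counts all finished work
        truly_done += done - defects     # only defect-free work is real progress
        pipeline.append(defects)
        discovered = pipeline.popleft()  # defects surfacing after the detection lag
        remaining += discovered          # rework re-enters the backlog
        believed_done -= discovered      # perception is corrected, but late
        report.append((week, round(believed_done, 1), round(truly_done, 1)))
    return report

if __name__ == "__main__":
    for week, perceived, real in simulate_progress():
        print(f"week {week:2d}  perceived={perceived:6.1f}  real={real:6.1f}")

During the detection lag, perceived progress runs ahead of real progress, which is the false impression of progress described above; the longer the lag, the later and larger the correction.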
Actually, the manager is responsible for deciding which actions will be most effective for each state (moment) of the system, considering the possible expected and unexpected effects. Based on these diagrams and on the understanding of the complexity of the whole, and before deciding on a course of action, managers must determine the consequences of each possible action, and the consequences of those consequences, from the perspective of interdependent actions [18]. Sometimes, however, unintended consequences appear only after several simulations or scenario-planning exercises. Fortunately, there are tools that help managers improve their systemic perspective. The archetypes presented in this work were produced with two of these tools, SysMap and SimModel (both used free of charge in this research through a partnership with Holon Systemic Solutions, and available at http://www.portalholon.com). In order to encourage individuals to find effective solutions amid the complexity surrounding their situation, these tools help focus reflective thinking on the critical factors, internal and external, that are systemically most relevant. By learning to understand the structures within which we operate, we begin a process of freeing forces that had not previously been identified and finally master the ability to work with them and change them [14]. The main practical result of systems thinking for the decision maker is the ability to identify how actions and changes in structures can lead to significant and lasting improvements. By providing a systemic view of the structure, archetypes help managers see how these structures operate and then find the best points of leverage and action, especially in situations where there is great pressure on decisions.

5. Conclusion

Effective and efficient management brings significant differentials from the application of the available resources. However, several difficulties are encountered by managers, who are in high demand due to the complexity of the activities considered their responsibility. The manager needs multidisciplinary expertise to be able to see the project as a whole and understand it. The application of systems thinking to project management gives the manager the ability to see the complex system that surrounds the management and the critical factors that threaten and encourage it. In other words, it becomes possible to understand how various factors come together to form a structure that involves all the management processes and practices. For that, the use of archetypes is suggested, because they support the visualization of the mental processes involved in
these structures, which can persist for a long period of time, in a clear and concise way. The archetypes reveal behaviors of which people usually know only the immediate consequences; what is not easily seen are the feedback processes that reinforce or balance the actions taken. In fact, problems are not always easy to identify and often recur in their general form. That is, many of the problems encountered are recurring problems in the day-to-day of project management, whether in the project, in the organization, or in other organizations. Also, in many cases linear, conventional solutions are applied to problems even though they are not suitable for complex situations, as demonstrated in the situations addressed in this article. Future research includes investigating how system archetypes appear beyond the project management literature and how their discovery influences the manager's actions. The manager is responsible for analyzing which variables are most relevant to the situation of his or her team, organization, project, client, and environment as a whole. Developing this holistic perspective tends to dramatically reduce the errors in the decision-making processes currently prevalent. By understanding the structure as a whole, affected by the complexity of the relationships among its variables, the manager is able to anticipate the unwanted and undesirable consequences that emerge from every decision, and to plan short-term, concrete and measurable goals guided by larger, long-term goals.
References

[1] Abdel-Hamid, T. K., Madnick, S. E. Lessons Learned from Modeling the Dynamics of Software Development. Communications of the ACM, December 1989.
[2] Argyris, C., Schön, D. Organizational Learning: A Theory of Action Perspective. Mass: Addison-Wesley, 1978.
[3] Andrade, A. et al. Pensamento Sistêmico: Caderno de Campo. Porto Alegre: Bookman, 2006.
[4] Checkland, P. Systems Thinking, Systems Practice. Chichester, West Sussex, England: John Wiley & Sons, 1981. 330p.
[5] Checkland, P., Scholes, J. Soft Systems Methodology in Action. Chichester, West Sussex, England: John Wiley & Sons, 1990. 329p.
[6] Checkland, P. Soft Systems Methodology: A Thirty Year Retrospective. Systems Research and Behavioral Science, v. 17(S1), p. S11–S58, 2000.
[7] Checkland, P., Poulter, J. Learning for Action – A Short Definitive Account of Soft Systems Methodology and its Use for Practitioners, Teachers and Students. Chichester: Wiley, 2006. 200p.
[8] Figueiredo, R., Zambom, A., Saito, J. A introdução da simulação como ferramenta de ensino e aprendizagem. XXI Encontro Nacional de Engenharia de Produção. Salvador, 2001.
[9] Rodrigues, A. The Application of System Dynamics to Project Management: An Integrated Methodology (SDPIM). PhD Thesis, Department of Management Science, University of Strathclyde, 2000.
[10] Leitão, S. P. Para uma nova teoria da decisão organizacional. Revista de Administração Pública, 31(2): 91–107, March/April 1997.
[11] PMI (Project Management Institute). PMBOK: Um Guia do Conjunto de Conhecimentos em Gerência de Projetos, Terceira Edição. Project Management Institute, 2004.
[12] PMI (Project Management Institute) Brasil. Estudo de Benchmarking em GP 2008 (Relatório). Project Management Institute – Chapters Brasileiros, 2008. Available at . Accessed on 15/03/2009.
[13] Santos, A. M. A Aplicação de um Modelo de Simulação para o Gerenciamento de Projetos: um estudo de caso utilizando a dinâmica de sistemas. Master's Dissertation, Escola Politécnica da Universidade de São Paulo, Departamento de Engenharia Naval e Oceânica, 2006.
[14] Senge, P. M. A Quinta Disciplina: Arte e Prática da Organização de Aprendizagem. 15. ed. São Paulo: Editora Nova Cultural, 2003.
[15] Silva, D. R. D. et al. Um Retrato da Gestão de Pessoas em Projetos de Software: A Visão do Gerente de Projetos vs. A do Desenvolvedor. XXI Simpósio Brasileiro de Engenharia de Software. João Pessoa, 2007.
[16] Solinger, T. The Whole Works. PM Network, September 2004.
[17] Sterman, J. D. System Dynamics Modeling for Project Management. System Dynamics Group, Sloan School of Management, MIT, 1992.
[18] Valença, A. C. Mediação: Método de Investigação Apreciativa da Ação-na-Ação: Teoria e Prática de Consultoria Reflexiva. Recife: Bagaço, 2007.