Maintaining Traceability During Object-Oriented Software Evolution: a Case Study

G. Antoniol, G. Canfora, A. De Lucia
University of Sannio, Faculty of Engineering
Palazzo Bosco Lucarelli, Piazza Roma, I-82100 Benevento, Italy
[email protected], {g.canfora, delucia}[email protected]

Abstract

This paper presents an approach to build and visualize traceability links and properties of a set of OO software releases. The process recovers an "as is" design from C++ software releases, compares the recovered designs at the class interface level, and helps the user to deal with inconsistencies by pointing out regions of code where differences are concentrated. The comparison process exploits edit distance and a maximum match algorithm, and has been experimented with on 9 releases of a library of foundation classes. Results, as well as considerations related to presentation issues, are reported in the paper.

Keywords: traceability, version compliance checking, object orientation

1. Introduction

Large software systems evolve continuously to meet ever-changing user needs. Changes may be driven by market pressure, adaptation to legislation, or improvement needs. Maintaining traceability links between subsequent releases of a software system is important to evaluate version deltas, to highlight effort/code delta inconsistencies, and to assess the change history. This can help in planning the future steps of evolution and in evaluating the reliability and cost of changes before the actual intervention takes place. This paper presents an approach to establish and maintain traceability links between subsequent releases of an Object-Oriented (OO) software system. The activity of checking the compliance of two software versions can be greatly assisted by automatic tools that help a programmer to identify regions of code which do not match between two software releases. A context diff between files may be applied to establish similarity between entities belonging to different software releases (files, classes, methods). However, its results are too coarse-grained and very difficult to summarize, visualize, and interpret.

Our process works on a code intermediate representation that encompasses the essentials of the class diagram in an OO design description language, the Abstract Object Language (AOL). The process recovers an "as is" design from the code in AOL, compares the recovered designs of subsequent software releases, and helps the user to deal with inconsistencies by pointing out regions of code which do not match, i.e., added, deleted, and modified classes and methods. Bunge's ontology [5, 6] has been taken as the conceptual framework to define the similarity criterion. An object is viewed as an individual which possesses properties. Comparing individuals for similarity translates into checking the similarity of the individuals' properties. When instantiated in the context of version traceability, individuals become classes, while properties are mapped into attributes and methods. We adopt a multi-step approach: at the first level, a class interface average similarity is derived from class and attribute/method names and signatures by means of string edit distances [14]. Then, a maximum match algorithm [8] computes the best mapping between releases. The approach has been experimented with on 9 releases of the LEDA library. Support tools have been developed to extract the similarity measure, to compute the matching and, finally, for result visualization. Indeed, the presentation of the extracted information is probably the most critical issue: to communicate and discuss our findings with project managers and programmers, a pair-difference coloring technique was adopted. A colored class diagram graph summarizes traceability relations, assigning different colors to common information and to information present in the old release but absent in the new one.

The paper is organized as follows: Section 2 introduces our traceability link recovery process. In Section 3 we describe our tools, while in Section 4 we present the results of the case study. In Section 5 we discuss visualization-related issues, while Section 6 compares our approach to related work. Finally, in Section 7 we draw some preliminary conclusions and outline future work.

[Figure 1 (diagram): for each pair of versions Vi and Vj, the C++ source code undergoes Code2AOL translation and code metrics extraction; the resulting AOL specifications feed AOL parsing and similarity computation, whose scores drive the comparison Vi - Vj and, finally, result visualization.]

Figure 1. Version Comparison Process.

2. Versions Traceability

The whole version comparison process is represented in Fig. 1. The process consists of the following activities:

1. AOL Representation Extraction: in this phase an AOL system representation is recovered from code through a Code2AOL extractor;

2. Software Metrics Extraction: class level as well as function level software metrics are computed;


3. AOL Parsing and Similarity Computation: for any given class in version Vi and any given class in version Vj a similarity weight is assigned;

4. Version Comparison: by means of a maximum flow algorithm an optimum matching is computed;

5. Result Visualization: computed match and extracted information are organized in a hierarchy of views of increasing detail.

In the following subsections we will highlight the key issues of these elements and the related implications.

2.1 AOL Representation and Software Metrics Extraction

AOL has been designed to capture OO concepts in a formalism independent of programming languages and tools. AOL is a general-purpose design description language, capable of expressing concepts available at the design stage of OO software development. The language resembles other OO design/interface specification languages, such as IDL [17, 24] and ODL [18]. More details on AOL can be found in [11, 1]. Software metrics are extremely appealing when a large software system has to be assessed and no a-priori documentation and/or information is available. Several papers [15, 7, 25, 9, 20] and books [10, 19, 22] have investigated software metrics, attempting to draw conclusions on the relation between measured values and software characteristics (e.g., reliability, testability, maintainability, etc.), with special regard to quality issues [20, 4]. In the present work we extract a suite of software metrics that can be used to enforce coding standards and to evaluate code complexity and size, and that may also be used to refine traceability links; this latter topic will be investigated as future work.
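As an illustration of the kind of class-interface information recovered from code, the sketch below models a minimal class record and two class-level metrics in Python. The record layout and all names are our own assumptions for illustration only; AOL's actual syntax is defined in [11, 1].

```python
# Illustrative sketch only: a minimal class-interface record in the spirit
# of a recovered "as is" design, plus two class-level metrics of the kind
# discussed above. Field names are assumptions, not AOL's concrete syntax.
from dataclasses import dataclass, field

@dataclass
class ClassRecord:
    name: str
    attributes: list = field(default_factory=list)    # (visibility, type, name)
    methods: list = field(default_factory=list)       # (visibility, signature)
    superclasses: list = field(default_factory=list)  # names of direct parents

def number_of_public_methods(c: ClassRecord) -> int:
    """Count methods declared public (a classic class-level metric)."""
    return sum(1 for visibility, _ in c.methods if visibility == "public")

def number_of_direct_superclasses(c: ClassRecord) -> int:
    """Number of direct parents of the class."""
    return len(c.superclasses)

# A toy class interface, in the style of a container library.
stack = ClassRecord("stack",
                    attributes=[("private", "int", "top")],
                    methods=[("public", "push(int)"), ("public", "pop()")])
```

Function-level metrics such as cyclomatic complexity would require parsing method bodies and are omitted from this interface-level sketch.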

2.2 AOL Parsing and Similarity Computation

AOL parsing and similarity computation is a two-step process. The first step parses the AOL representations of the two software releases and assigns weights to each class property. The second step computes an optimum match between the given classes. Chidamber and Kemerer [7] proposed a representation of substantial individuals, objects, as a finite collection of properties:

X = <x, P(x)>    (1)

where the object X is identified by its unique identifier x, and P(x) is its finite collection of properties. In general, two objects X and Y may possess different properties. Thus, a preliminary step in the definition of a similarity measure between them is the introduction of a mapping between a subset of the properties of X and a subset of the properties of Y. The remaining properties from P(x) and P(y) are unmatched properties of X and Y, respectively. If the similarity of two things is defined as the intersection of their sets of properties, we can immediately derive that two individuals are indistinguishable if and only if they share the same name and possess the same collection of properties. However, as software evolves, implementations may deviate due to maintenance interventions: a criterion imposing the identity of substantial individuals is unnecessarily stringent and may lead to unsatisfactory results. Therefore, a less stringent similarity of two things was experimented with in the present work. More precisely, let Vi and Vj be the compared software versions. For any given class Ei,k = <ei,k, P(ei,k)> in version Vi and any given class Ej,l = <ej,l, P(ej,l)> in Vj, we introduce a similarity between individual properties as follows:

σ(x, y) = α · s(ei,k, ej,l) + (1 − α) · s(x, y)    (2)

where x ∈ P(ei,k), y ∈ P(ej,l), α ∈ [0, 1] is the weight associated with class name matching, and s(u, v) is the complemented edit distance [8] between strings:

s(u, v) = 1 − d(u, v) / (|u| + |v|)    (3)

Once the similarity between each pair of properties is available, an optimum match between Ei,k and Ej,l can be inferred by applying the maximum match algorithm [8] to the bipartite graph in which the nodes are, respectively, the properties of Ei,k and Ej,l, and the edges between nodes are weighted by the similarity score (equation 2). The similarity of Ei,k and Ej,l is then defined as the average optimum weight between properties as computed by the maximum match algorithm.

2.3 Version Comparison

The results of the activities carried out in the previous phase may again be thought of as the definition of a bipartite graph in which the nodes are the entities Ei,k and Ej,l, and the edges between entities represent the similarities between them. Clearly, each entity of version Vi may be connected to each entity of version Vj. By analyzing several releases of an OO system, we noticed that dramatic changes or deep restructuring seldom occur. Similarities among entities tend to be significant over a given threshold that may depend on the subject system. Thus we introduce the concept of a pruning threshold to remove edges that are introduced by the approach but that are unlikely to represent a real mapping between entities. Removing these edges may produce a graph with isolated nodes; these nodes represent either items deleted from the old version or items added in the new release. Finally, on the pruned bipartite graph the maximum match algorithm is applied to induce the mapping function between Vi and Vj. The pairs of nodes resulting from this mapping represent items in common in the two releases, i.e., items that evolved from the old to the new release. Up to now, similarity has been measured on the basis of string matching between class names, attributes (including attribute types) and method names (including signatures). Thus, we are not guaranteed that if two classes obtain a similarity of 100%, no modification occurred between releases Vi and Vj. It would be desirable to have a mechanism to identify method changes not involving the semantics of the computation. In other words, we aim not to highlight minor changes; for example, if comments were added to a chunk of code, or the code was indented to increase readability, we would like to hide such a detail unless explicitly required.
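The similarity machinery of equations (2) and (3), together with the maximum match step, can be sketched as follows. This is an illustrative reimplementation, not the authors' C/C++ tools: all names are made up, SciPy's `linear_sum_assignment` stands in for the maximum match algorithm of [8], and the default alpha = 0.5 mirrors the 50%/50% weighting used in Section 4.1.

```python
# Illustrative sketch of the similarity computation (eqs. 2-3) and the
# maximum match step on the weighted bipartite graph of properties.
# Not the authors' implementation.
import numpy as np
from scipy.optimize import linear_sum_assignment

def edit_distance(u: str, v: str) -> int:
    """Levenshtein distance d(u, v), single-row dynamic programming."""
    d = list(range(len(v) + 1))
    for i in range(1, len(u) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(v) + 1):
            prev, d[j] = d[j], min(d[j] + 1,                       # deletion
                                   d[j - 1] + 1,                   # insertion
                                   prev + (u[i - 1] != v[j - 1]))  # substitution
    return d[-1]

def s(u: str, v: str) -> float:
    """Complemented edit distance, eq. (3): 1 - d(u, v) / (|u| + |v|)."""
    return 1.0 - edit_distance(u, v) / (len(u) + len(v))

def sigma(class_i: str, class_j: str, x: str, y: str, alpha: float = 0.5) -> float:
    """Property similarity, eq. (2): class-name and property matching mixed by alpha."""
    return alpha * s(class_i, class_j) + (1 - alpha) * s(x, y)

def class_similarity(name_i, props_i, name_j, props_j, alpha=0.5):
    """Average optimum weight over a maximum match of the two property sets."""
    w = np.array([[sigma(name_i, name_j, x, y, alpha) for y in props_j]
                  for x in props_i])
    rows, cols = linear_sum_assignment(w, maximize=True)
    return float(w[rows, cols].mean())
```

With identical property lists and closely related class names, the similarity approaches 1; a pruning threshold, as introduced above, would then decide whether the class pair survives into the final mapping.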

3. Tool Support

The process for comparing software versions shown in Figure 1 has been completely automated. To extract the AOL representation from code, a Code2AOL Extractor module has been developed, working on the C++ language. Extracting information about class relationships from code may have some degree of imprecision. In fact, given two or more classes and a relation among them, there are intrinsic ambiguities due to the choices left to programmers implementing OO designs. Associations can be instantiated in C++ by means of pointer data members or by inheritance [12]. Furthermore, aggregation relations could result either from templates (e.g., list), arrays (e.g., Heap a[MAX]) or pointer data members (e.g., Edges = new GraphEdge[MAX]). In the present work, an aggregation is recognized from code if and only if a template, an object array or an object instance is declared as a data member. All the remaining cases, i.e., object pointers and references both as data members and as formal parameters of methods, give origin to associations. The Code2AOL Extractor also extracts class and function level metrics, among which are the classic class level metrics (number of public, private and protected attributes and methods, number of direct subclasses, number of direct superclasses, etc.). Function level metrics include cyclomatic complexity, number of statements, number of passed parameters, number of operators, of function calls and of return points. The matching phase relies on an AOL parser. Once the abstract syntax trees of the compared versions are available, they are traversed and similarities computed. The computed weights are used to build a bipartite graph, which is passed to the maximum match algorithm. The AOL parser and the edit distance computation have been implemented in C, while the maximum match algorithm was written in C++. Finally, visualization is based on the WEB paradigm: html pages are generated for each pair of releases and the grappa(1) JAVA applet is exploited to draw directed graphs.

(1) http://www.research.att.com/sw/tools/graphviz/packages/grappa.html

4. Case Study

The version comparison approach described in the previous sections has been experimented with on a freely available C++ class library. There are some advantages in analyzing public domain or freely available software: source code can be easily obtained, and results can be compared with those of related works. Unfortunately, to the authors' knowledge, no public domain OO system has both code and design available. Therefore, in the present case study we are not able to trace design into code, thus ensuring not only traceability among software releases but also between design and code. We concentrated our effort on a well known C++ library of foundation classes, called LEDA (Library of Efficient Data types and Algorithms), developed and distributed by Max-Planck-Institut für Informatik, Saarbrücken, Germany (freely available for academic research and teaching from http://www.mpi-sb.mpg.de/LEDA/). The LEDA project started in 1988, and a first version (1.0) was available in 1990. Since then several other versions have been released; the latest version (3.7.1) was released in 1998. In our case study, we analyzed and compared 9 different versions of LEDA, starting from version 2.1.1. Table 1 shows the main characteristics of the different versions, obtained by static analysis of the source code. The table shows that LEDA evolved considerably from the first to the last version (from 35 to 153 KLOC and from 69 to 410 classes).

4.1 Setting preliminary weights

The first step of our case study consisted of devising a heuristic for the identification of a pruning threshold on the similarity measures, to remove edges introduced by our tool that are unlikely to represent a real mapping between entities. To this aim, we decided to compare consecutive versions of the LEDA library by assigning different weights to class name matching and property matching. We first tried assigning the same weight (50%) to both types of matching. The result of this first attempt was, in the case of the analyzed software, quite interesting and useful to assess our parameters: in most cases, less than 6% of pairs had a similarity value lower than 90% (only in the case of comparing releases 2.1.1 and 3.0 did we get 18% of pairs with similarity value lower than 90%).

4.2 Assessing the similarity value threshold

Due to the above result, we decided to limit our deeper analysis to the small subset of pairs with similarity value lower than 90%. Indeed, due to the use of the 50% weight in the string matching, all pairs with similarity measure greater than 90% showed a high degree of similarity both on the class name side and on the property side, and were therefore to be considered as cases of evolution (or, in some cases, as unchanged classes). The results of the analysis showed that only pairs with similarity value higher than 75% were always to be considered similar pairs, and only pairs with similarity value lower than 42% were never to be classified as similar pairs. This means that there was a quite large range of values including cases of uncertainty, and most of the analyzed pairs were in this range. However, the results also showed that a 50% threshold discriminating similar pairs from pairs that are unlikely to represent a real mapping can be used with an acceptable error rate: in the worst case (comparing releases 3.0 and 3.1.2) the error rate was lower than 4%, but generally it was lower than 1%.

4.3 Assessing weights and thresholds

Although the obtained results were quite encouraging, we decided to compare the releases of LEDA using different weights for class name and property matching. We tried both giving class name matching a greater weight and giving it a lower one. Table 2 shows the different pairs of weights used. The aim of the analysis was to assess the thresholds on the similarity values with respect to the adopted weights. The results showed that, in the case of the software releases analyzed, the pairs produced by the matching tool are almost the same, whatever weights are used. In the worst case (matching between releases 2.1.1 and 3.0) the pairs obtained by using different weights differed in less than 6% of cases (generally in less than 3% of cases). Furthermore, the similarity values of these pairs were generally low. However, the similarity values associated with pairs in the range (42%, 75%) were in most cases significantly different. We achieved the best results by giving a lower weight to class name matching (case 3 in Table 2). In this case, the range of uncertainty was (54%, 78%); moreover, we could use the threshold 70% to discriminate between similar and different pairs, with a lower error rate (less than 2% in the worst case). Conversely, giving a higher weight to class name matching (case 2 in Table 2) did not produce good results: the range of uncertainty was larger, (30%, 79%), and we could not identify a threshold with a good precision as

Version  KLOC  Classes  Methods  Attributes  Associations  Aggregations  Generalizations
2.1.1      35       69     1649         201            96             5               10
3.0        34      109     2388         245           116            16               43
3.1.2      61      176     3519         346           180            47               87
3.2.3      69      178     3695         371           186            52               87
3.4        95      208     4967         510           272           114               98
3.4.1     100      211     5104         543           275           118               99
3.4.2     111      210     5197         589           278           142               98
3.5.2     123      235     6124         740           336           181              114
3.7.1     153      410    10260        1177           537           410              206

Table 1. LEDA main features.

Weights            case 1  case 2  case 3
Class matching       50%     70%     30%
Property matching    50%     30%     70%

Table 2. Weights used to compare the different versions of LEDA.

in the previous cases. The main reason for this behavior was a significant number of classes in the different releases that changed their name without significantly changing their properties. Table 3 shows the results of comparing the different releases of LEDA using the weight 30% for class name matching; the first column contains the comparison of versions 2.1.1 and 3.0, while the remaining columns contain the comparison of the release in the current column (new release) with the previous one (old release). In particular, for each release the table outlines the number of classes, and their total size (in terms of LOC), that result added, deleted, modified, and unchanged (in the interface) with respect to the previous release. Classes in the new (old) release are considered added (deleted) if they do not match any class in the old (new) release, or if they match some class with similarity value lower than 70%. Classes are considered unchanged in the interface if they match some class in the previous release with similarity value 100%.

5. Result Visualization

The large amount of data to be analyzed when tracking the evolution of a software system, and the different aims, cultures, backgrounds, and skills of the people involved in this operation, pose several challenges to the way versioning information is organized and visualized. It is a common opinion that pictures make the data easier to understand for both managers and technical people. However, traditional approaches to software visualization are inadequate for version tracking, as they mainly focus either on the structure of software (e.g., the use of several forms of directed graphs to depict the relationships between the components of a system) or on runtime behavior (e.g., the abstract visualization of algorithms to educate novice programmers). Even when this is not the case, many traditional approaches fail to scale to very large amounts of data and are therefore inadequate to represent the evolution of a real-life software system over a number of releases. Ball and Eick [3] have developed a suite of scalable visualization approaches and have applied them to several software engineering problems. The approaches include a line representation, which encodes the information of interest using a line's color; a pixel representation, where information is associated with a small number of color-coded pixels; and a file summary representation, to show source-file level statistics, where each file is represented by a rectangle. We have combined these approaches with software structure visualization techniques and the idea of browsing to develop a visualization utility that shows the computed matches through a hierarchy of views of increasing detail. Three fundamental criteria have driven the design of the visualization utility:

- Abstraction, that is, the capacity to abstract away from the low-level raw data and present users with more useful, higher level views;

- Navigation, which is related to the need for discovering useful information in a large amount of available data without becoming lost or disoriented;

- Automation, which refers to the fact that the visualizations are generated automatically from the extracted information.

These criteria have been pursued by combining pictorial representations of data and graphical representations of software structures at different levels of granularity, and by linking them to the textual representation of source code. The philosophy is similar to that of hypermedia documents, as we provide users with several related and linked representations of an information space. In the current implementation, we organize versioning information into three layers of visualization, built by aggregating data at the level of an entire

release, the classes it includes, and the attributes and methods of a single class, respectively.

Version  Added #  Added LOC  Deleted #  Deleted LOC  Modified #  Mod. old LOC  Mod. new LOC  Unchanged #  Unch. old LOC  Unch. new LOC  Similarity
3.0           47       1894          7          334          61         18551         19668            2            185            184         89%
3.1.2         70       4946          3          113          91         21019         22699           16           4029           4071         94%
3.2.3         10        363          3           95          52         16874         17772          122          25262          25460         96%
3.4           40       4193         10          372          73         20367         23716           96          21386          22136         95%
3.4.1          4        200          0            0          20         10605         10951          189          53238          53916         99%
3.4.2          4        148          4          173          56         34853         36283          152          30164          30309         98%
3.5.2         35       2941          9          307          62         37734         41669          140          28149          28072         96%
3.7.1        178      20679          1           63          27         12859         12456          208          69025          60923         98%

Table 3. Results of comparing LEDA releases.

5.1 Release level view

In this representation, shown in Figure 2, each release is depicted by a rectangle. The width and height represent the total size of the release in terms of number of classes and source statements, respectively. The rectangle includes an area, with its origin in the rectangle's lower-left corner, which represents the number of classes (width), and related statements (height), that the release being represented has in common with the previous one. The color of this area encodes the average similarity of these common classes on a six-value scale of grays; the lightest gray indicates a similarity between 0.7 and 0.75, while the darkest gray indicates a similarity higher than 0.95. The remaining area of the rectangle is colored red(2) and is proportional to the number of classes (width) and statements (height) which are new in the release. Finally, external to the rectangle is a blue shadow that represents the classes and statements that were present in the previous release, but are not present in the release being visualized. Of course, the width (respectively, height) of the gray rectangle plus the width (height) of the red one gives the number of classes (statements) in the release. Similarly, adding the width (height) of the blue shadow to the width (height) of the gray rectangle gives the number of classes (statements) in the previous release.

Figure 2. LEDA version comparison.

(2) In this black and white printing, red and blue colors appear as levels of gray.

5.2 Class level view

Each rectangle in the release level view points to a new representation, a class level view, that gives a number of additional details on the classes and relations added, deleted or modified. A class level view consists of two parts, a summary sub-view and a structural sub-view. The summary sub-view is very similar to the release level view described above. Each class is represented by means of a rectangle whose width and height represent the total number of public members (either attributes or methods) and the related statements, respectively. The inner area represents the number of common members (width), their size in terms of statements (height) and the average similarity (gray level). The remaining red area and the external blue shadow represent additions and deletions of members, respectively. The structural sub-view, shown in Figure 3, consists of a directed graph that depicts an AOL class diagram. We apply the pair-difference coloring technique described by Holt and Pak [16] to contrast pairs of versions. In particular, red (respectively, blue) nodes and edges represent added (deleted) classes and relationships. Classes and relationships in common between the two releases are colored gray. Rectangles in the summary sub-view and nodes in the structural sub-view point to a more detailed representation, a member level view, which furnishes details on the attributes and methods added, deleted or modified.

Figure 3. LEDA 2.1.1 - LEDA 3.0 matching excerpt.

5.3 Member level view

This representation, shown in Figure 4, is a combination of a line representation, to describe additions/deletions/modifications of attributes, and a file summary representation, which depicts additions/deletions/modifications of methods. The higher part of the figure shows the list of class attributes, colored blue (deletions), red (additions) or gray (modifications). In the latter case, the attribute name is accompanied by the corresponding attribute in the previous release, and the similarity is coded on a six-level scale of grays. The lower part of the figure shows the methods by means of rectangles. The height and width of each rectangle represent the number of statements and the cyclomatic complexity, respectively. The rectangle's color encodes the fact that the method is added (red), deleted (blue) or modified (one out of six levels of gray); modified methods are accompanied by the corresponding method in the previous release. Attribute lines and method rectangles are pointers to the textual representation of the corresponding piece of code.

Figure 4. LEDA comparison detailed view.
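The six-level gray encoding used in the views above can be coded as a small lookup. The endpoints (lightest gray for similarities between 0.70 and 0.75, darkest above 0.95) come from the text; the uniform 0.05-wide intermediate buckets are an assumption of this sketch.

```python
# Sketch of the six-level gray scale described in the visualization views.
# The 0.70-0.75 and >0.95 endpoints are from the text; uniform 0.05-wide
# intermediate buckets are an assumption.
def gray_level(similarity: float) -> int:
    """Map a similarity in [0.7, 1.0] to a gray level 0 (lightest) .. 5 (darkest)."""
    if not 0.7 <= similarity <= 1.0:
        raise ValueError("the views only encode similarities of at least 0.7, "
                         "the pruning threshold")
    # Bucket width 0.05; clamp so similarities near 1.0 stay in bucket 5.
    return min(5, int((similarity - 0.70) / 0.05))
```

Similarities below the pruning threshold never reach this encoding, since the corresponding edges are removed before visualization.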

6. Related Work

Few approaches and systems have been presented to deal with the problem of building and maintaining traceability links either between design and code or among software releases [16, 11, 21, 23, 26]. The work by Meyers, Duby and Reiss [21] differs from ours both in the objective and in the implementation. The objective of CCEL is to check the compliance of a program against a set of design guidelines expressed as constraints that affect single or group of classes, our objective is to check the compliance of a set of OO software releases expressed as design models. Unlike CCEL, we have an explicit representation in terms of a design model of the code which states the existence of a set of specific entities with specific properties and relations among them and this must be traced into subsequent software versions. The work by Murphy, Notkin and Sullivan [23] is much closer to ours. Software reflexion models can be applied in the OO domain: Murphy et al. refer to an experiment on an industrial subsystem where a reflexion model was computed

T

steps are carried out with the approach presented here. Furthermore, we also advocate different and more flexible similarity measures, and a new matching algorithm that also adopts a pruning threshold to avoid false matching. Software evolution, in the large, has been previously studied by R. Holt and Y. Pak [16]. GASE the developed Graphical Analyzer for software Evolution allows highlighting of changes between software releases at architectural level. There are some commonalities between GASE visulaization approach and our work, indeed we borrowed the pair difference coloring technique from GASE. However, we focused out research on OO software and we developed a set of hierarchical views from the entire system to the finest grain level in which we compare classes property by property.

7. Conclusions

Traceability links among subsequent software releases during maintenance or iterative development process are often inconsistent or absent because maintaining consistency is a costly and tedious activity and reducing time to market is often vital to face competition. Automatic tools to support design-code and code-code compliance checks, showing potential discrepancies and lack of traceability between the two artifacts are thus useful both in development and maintenance. Reports about compliance, and especially graphical layouts such as the pair-difference diagrams are very useful, both for code inspections and for assessing software evolution under maintenance tasks. In the paper we have presented an approach based on string edit distance to assess the compliance of OO software systems across releases. Future work will be devoted to further experiment with the proposed approach to a wider set of case studies in order to better assess matching weights and pruning thresholds and to define a heuristic for the identification of cases of class splitting/fusion in two consecutive versions. In the present form, our approach does not deal with these cases; however, in the case study presented in this paper some of these cases were discovered when comparing the same versions using different weights. For example, the split class in the old version was coupled to two different classes of the new version, in both cases with a high similarity value. Another direction for future work will be to devise a formal framework and taxonomy to analyze inconsistencies and prioritize the restructuring interventions. Finally, in our opinion it can be helpful to integrate software metrics within the comparison process: this would extend our approach from a mapping at the interface level to a mapping at the implementation level.

DR

AF

to match a design expressed in the Booch notation against its C++ implementation. Their process and ours are similar and many analogies can be drawn. We both use an extraction tool to derive abstract information from source code. The reflexion model tool is analogous to our code-code matcher, in that they both provide an output in terms of where the high-level models agree or disagree. Where their and our approach mainly differ is in the use of the mapping between the two models. They use such mapping to trace the source code model entities onto the high-level model entities. But the nature and granularity of the two models is quite different: this is why such a mapping is needed. For example they have modules in the highlevel model and functions in the source code model: the mapping information is used to cluster the source code model entities in order to assign them to the high-level model entities. In this way they make use of regular expressions to exploit naming conventions of source code entities. In our case, the entities of the two models are exactly the same: classes and relations among them and matching is based on the interface of classes. Moreover, we allow a partial matching between entities in the two models based on similarity, computed using edit distance and then we extract on optimum set of traceability links. Finally, by means of the pruning threshold we can define different criteria, of different strength to check subsequent software version matching. Sefika et al. [26] have developed a hybrid approach that integrates logic-based static and dynamic visualization and helps determining design-implementation congruence at various levels of abstraction, from coding guidelines to architectural models such as design patterns [12] and connectors [13], to design principles like low coupling and high cohesion. 
Pattern-Lint, the system they developed, differs from our work and Murphy's, which are based only on static analysis, in that it also integrates dynamic visualization, whose results are compared against those of the static analyses. Although it is a very general and powerful framework, Pattern-Lint attempts to solve a different problem: it checks the compliance of source code with respect to three types of design models, namely coding guidelines, architectural models such as design patterns or styles, and heuristic models such as coupling and cohesion. Moreover, Pattern-Lint does not handle approximate matches as our system does using edit distance.

Our work is actually much closer to the works in [11, 2, 16]. We share with [11, 2] the general idea and the approach: we both rely on Bunge's ontology and adopt an intermediate representation to build traceability links. This work can be regarded as a natural evolution of [11]; indeed, [11, 2] can be considered the first step of the process, in which a design-to-code mapping is recovered, subsequent
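The combination of partial matching, a pruning threshold, and a one-to-one match between releases can be sketched as follows. This is a simplified stand-in: the paper computes an optimum assignment with a maximum match algorithm [8], whereas this sketch matches greedily, and the class names and the difflib-based similarity are illustrative assumptions, not the paper's AOL interface measure.

```python
import difflib

def match_classes(old, new, similarity, threshold=0.6):
    """Greedy one-to-one matching of old-release classes to new-release
    classes by descending similarity; pairs below the pruning threshold
    are discarded, and leftovers are reported as deleted/added."""
    pairs = sorted(((similarity(o, n), o, n) for o in old for n in new),
                   key=lambda t: t[0], reverse=True)
    matched, used_old, used_new = [], set(), set()
    for score, o, n in pairs:
        if score < threshold:
            break                      # pruning threshold cuts weak pairs
        if o not in used_old and n not in used_new:
            matched.append((o, n, score))
            used_old.add(o)
            used_new.add(n)
    deleted = [o for o in old if o not in used_old]  # only in old release
    added = [n for n in new if n not in used_new]    # only in new release
    return matched, deleted, added

# Hypothetical class names; difflib's ratio stands in for the paper's
# edit-distance-based interface similarity.
sim = lambda a, b: difflib.SequenceMatcher(None, a, b).ratio()
old_release = ["list", "stack", "dictionary"]
new_release = ["list", "stack", "dict_array", "graph"]
matched, deleted, added = match_classes(old_release, new_release, sim)
print(matched)   # "dictionary" pairs with the renamed "dict_array"
print(added)     # ["graph"] appears only in the new release
```

Raising the threshold yields a stricter compliance criterion, in the spirit of the pruning thresholds discussed above: fewer pairs survive, and more classes are flagged as added or deleted.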


References

[1] G. Antoniol, R. Fiutem, and L. Cristoforetti. Using metrics to identify design patterns in object-oriented software. In Proc. of the Fifth International Symposium on Software Metrics - METRICS98, pages 23–34, Nov 2-5 1998.
[2] G. Antoniol, A. Potrich, P. Tonella, and R. Fiutem. Evolving object oriented design to improve code traceability. To appear in Proceedings of the 7th Workshop on Program Comprehension, May 1999.
[3] T. Ball and S. G. Eick. Software visualization in the large. IEEE Computer, 29(4):33–43, Apr 1996.
[4] V. R. Basili, L. C. Briand, and W. L. Melo. A validation of object-oriented design metrics as quality indicators. IEEE Transactions on Software Engineering, 22(10):751–761, Oct 1996.
[5] M. Bunge. Treatise on Basic Philosophy: Vol. 3: Ontology I: The Furniture of the World. Reidel, Boston, MA, 1977.
[6] M. Bunge. Treatise on Basic Philosophy: Vol. 4: Ontology II: A World of Systems. Reidel, Boston, MA, 1979.
[7] S. R. Chidamber and C. F. Kemerer. A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6):476–493, June 1994.
[8] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, 1990.
[9] J. Daly, A. Brooks, J. Miller, M. Roper, and M. Wood. The effect of inheritance on the maintainability of object-oriented software: An empirical study. In Proceedings of the International Conference on Software Maintenance, pages 20–29, Opio-Nice, Oct 1995.
[10] N. Fenton. Software measurement: A necessary scientific basis. IEEE Transactions on Software Engineering, 20(3):199–206, Mar 1994.
[11] R. Fiutem and G. Antoniol. Identifying design-code inconsistencies in object-oriented software: A case study. In Proceedings of the International Conference on Software Maintenance, pages 94–102, Bethesda, Maryland, Nov 1998.
[12] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object Oriented Software. Addison-Wesley, Reading, MA, 1995.
[13] D. Garlan and M. Shaw. Software Architecture: Perspectives on an Emerging Discipline. Prentice-Hall, Englewood Cliffs, NJ, 1996.
[14] D. Gusfield. Algorithms on Strings, Trees, and Sequences. Cambridge University Press, New York, 1997.
[15] S. Henry and D. Kafura. The evaluation of systems' structure using quantitative metrics. Software Practice and Experience, 14(6), June 1984.
[16] R. Holt and J. Y. Pak. GASE: Visualizing software evolution-in-the-large. In Proceedings of the Working Conference on Reverse Engineering, pages 163–166, Monterey, 1996.
[17] D. A. Lamb. IDL: Sharing intermediate representations. ACM Transactions on Programming Languages and Systems, 9(3):297–318, July 1987.
[18] D. Lea and C. K. Shank. ODL: Language report. Technical Report Draft 5, Rochester Institute of Technology, Nov 1994.
[19] M. Lorenz and J. Kidd. Object-Oriented Software Metrics. Prentice-Hall, Englewood Cliffs, NJ, 1994.
[20] J. Mayrand and F. Coallier. System acquisition based on software product assessment. In Proceedings of the International Conference on Software Engineering, pages 210–219, Berlin, 1996.
[21] S. Meyers, C. K. Duby, and S. P. Reiss. Constraining the structure and style of object-oriented programs. Technical Report CS-93-12, Brown University, 1993.
[22] K. H. Moller and D. J. Paulish. Software Metrics: A Practitioner's Guide to Improved Product Development. Chapman & Hall, London, 1993.
[23] G. C. Murphy, D. Notkin, and K. Sullivan. Software reflexion models: Bridging the gap between source and high-level models. In Proceedings of the Third ACM Symposium on the Foundations of Software Engineering, 1995.
[24] OMG. The Common Object Request Broker: Architecture and Specification. OMG Document 91.12.1, OMG, Dec 1991.
[25] T. Pearse and P. Oman. Maintainability measurements on industrial source code maintenance activities. In Proceedings of the International Conference on Software Maintenance, pages 295–303, Opio-Nice, Oct 1995.
[26] M. Sefika, A. Sane, and R. H. Campbell. Monitoring compliance of a software system with its high-level design models. In Proceedings of the International Conference on Software Engineering, pages 387–396, 1996.


