Goal-Oriented and Similarity-Based Retrieval of Software Engineering Experienceware
Christiane Gresse von Wangenheim 1, Klaus-Dieter Althoff 2, Ricardo M. Barcia 1
1 Federal University of Santa Catarina, Production Engineering, 88049-00 Florianópolis, Brazil, {gresse,rbarcia}@eps.ufsc.br
2 Fraunhofer Institute for Experimental Software Engineering (IESE), Sauerwiesen 6, D-67661 Kaiserslautern, Germany, althoff@iese.fhg.de
Abstract. For the successful reuse of software engineering know-how in practice, useful and appropriate experienceware has to be retrieved from a corporate memory. As support is required for different processes, purposes, and environments, the usefulness of retrieved experiences depends mainly on the particular reuse situation. Thus, a flexible retrieval method and similarity measure are required, which can continuously be tailored to specific situations based on feedback from their application in practice. This paper proposes a case-based approach for the retrieval of software engineering experienceware that takes into account the specific characteristics of the software engineering domain, such as the lack of explicit domain models in practice, the diversity of environments and software processes to be supported, the incompleteness of data, and the consideration of [...] experiences is complicated due to the specific requirements on an EF in the software domain. Those requirements, derived from our experiences with experience-based support for the planning of measurement programs [18,23,24,26,29] and from other SE areas [3,17,10,28], include:
Goal-oriented retrieval. The possibilities to support software process tasks through the reuse of experiences from the EB are manifold. Some examples are:
• A project manager can reuse a quality model on effort distribution from a similar past project as a basis for the effort estimation of a new software project at the company IntelliCar, which produces embedded software for automobiles.
• A developer can reuse lessons learned regarding the development of Java applications to solve a specific problem that occurred during the development of an e-commerce application at the company IntelliCommerce.
• A quality assurance team can reuse measures and data collection instruments from past measurement programs in network management projects while planning a new measurement program at the company IntelliPhone.
• A process engineer can reuse a process model on code inspections, which has been developed at his company IntelliMed, when introducing code inspections in a new project.
• A tester can reuse descriptions of problems that occurred in the past during testing to prevent the repetition of those failures during the current testing process.
• An experience engineer can reuse individual experiences gathered from several past projects to identify patterns and develop general domain knowledge during the maintenance of the EB.
To enable comprehensive support, the EF has to support various reuse scenarios. Therefore, various types of experiences, denoted as experienceware [24], have to be retrieved from the EB for several software process tasks, viewpoints, and environments, addressing various purposes. This requires the explicit definition of retrieval goals based on the reuse scenarios, and a parametric retrieval mechanism that can be tailored to the particular retrieval goal and supplies experienceware that is useful with respect to that goal.
Similarity-based retrieval.
As the usefulness of experiences can only be determined after an attempt has been made to reuse them in the current situation, the "a posteriori" criterion of usefulness is predicted through the criterion of similarity between the present situation and the one described in the experience, assuming that similar situations (or problems) require similar solutions (see Figure 1). For example, we assume that measurement
Fig. 1. Similarity
programs with similar goals use similar quality models and measures, or that similar problems occurring during code inspections have corresponding solutions. In this context, an important issue is the notion of similarity. In the SE domain, it is very unlikely that an artifact in the EB completely fulfills the needs of the current situation, because each software product, project, or organization is different. Rather, we have to search for experiences that have been gathered in past situations and are similar to the current one. For example, assume a project manager wants to reuse a quality model on effort distribution for the planning of a new software project, which is characterized as shown in Table 1 (Present Project).
Table 1. Example of project characterizations
Characteristics
[Table 1 rows: application sector, programming language, project team size, experience level, complexity, (estimated) product size; cell values not recoverable]
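As a minimal sketch of the idea that past projects are ranked by their similarity to the present one, the following Python fragment compares two characterizations attribute by attribute. The attribute names follow Table 1; the project names, the attribute values, and the functions `similarity` and `retrieve` are hypothetical. The measure shown (the share of matching attributes among those known on both sides) is a deliberately simple stand-in for illustration and is not the similarity measure proposed in this paper.

```python
from typing import Dict, List, Optional, Tuple

# A characterization maps an attribute name to a value; None marks a
# value that is unknown (incomplete data).
Characterization = Dict[str, Optional[str]]


def similarity(query: Characterization, case: Characterization) -> float:
    """Share of attributes, known in both query and case, whose values match."""
    compared = 0
    matches = 0
    for attribute, value in query.items():
        case_value = case.get(attribute)
        if value is None or case_value is None:
            continue  # assumption of this sketch: unknown values are skipped
        compared += 1
        if value == case_value:
            matches += 1
    return matches / compared if compared else 0.0


def retrieve(query: Characterization,
             case_base: Dict[str, Characterization],
             k: int = 2) -> List[Tuple[float, str]]:
    """Return the k most similar cases as (similarity, case name) pairs."""
    scored = [(similarity(query, case), name) for name, case in case_base.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Invented example data; only the attribute names are taken from Table 1.
present: Characterization = {
    "application sector": "automotive",
    "programming language": "C",
    "project team size": "10",
    "experience level": "high",
}
past: Dict[str, Characterization] = {
    "Project A": {
        "application sector": "automotive",
        "programming language": "C",
        "project team size": "50",
        "experience level": "high",
    },
    "Project B": {
        "application sector": "telecommunication",
        "programming language": "Ada",
        "project team size": None,  # not recorded for this project
        "experience level": "high",
    },
}

if __name__ == "__main__":
    for score, name in retrieve(present, past):
        print(f"{name}: {score:.2f}")
```

No past project matches the present one exactly, yet the ranking still identifies the closest candidate for reuse. A tailorable measure would additionally weight the attributes according to the retrieval goal.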
Based on a set of indexes¹ relevant for the retrieval of useful experiences, the current situation description is compared to the experiences stored in the EB. It is very unlikely that a project with the same characteristics has already been carried out in the company; yet it is quite probable that in the past [...], weighting corresponding features more strongly than non-corresponding ones. Unknown features are considered less important for the identification of relevant cases (γ = 0). Redundant features are believed to have an impact on the determination of relevant cases. However, as the specific values are not available in the respective case in the case base, they are associated with a very small weight (δ [...] 3, expert→4), assuming an equidistant order of the values. Another possibility is the association of user-defined numerical values, which allows a non-equidistant order of the values to be expressed, e.g., (none→1, medium→3, high→5, expert→6). Here, a smaller difference is assumed between high and expert experience than, for example, between none and medium experience. These transformations have to be made with careful consideration, preserving the semantic meaningfulness of the manipulated data. Based on the associated numerical values, the similarity functions described for numbers can be used to determine the similarity value between ordered symbolic values.
Unordered Symbol. Unordered symbols represent symbolic values without any order, for instance, the values (C, C++, Smalltalk, Ada) of the feature