Position paper for Workshop on Empirical Studies in Reverse Engineering within STEP 2005
Characterization of Reverse Engineering Experiment Families

Marco Torchiano
Dipartimento di Automatica e Informatica
Politecnico di Torino, Italy
[email protected]

Abstract

Within the large software engineering community there have been several attempts to provide a standard template to characterize empirical studies. The objectives are twofold: first, to improve communication among researchers, and second, to help practitioners discover potentially useful information. In the emerging area of empirical studies in reverse engineering we could aim at a specialized template.

1. Introduction

The empirical branches of both “hard” and “soft” sciences have long had widely accepted standards for presenting and characterizing the results of their studies. The advantages of such guidelines are:
• it is easier to understand a research paper if you know where to find the information you are looking for;
• the search for relevant studies is much easier, since papers can be classified more easily;
• the general “validity” or “goodness” of a paper can be assessed with a quick browse;
• it is easy to identify blind spots of the studies and to design families of experiments to cover them.

In the software engineering community, some preliminary guidelines have recently been published [5]. In addition, a recent EU research project (ESERNET [3]) has proposed guidelines on how to present and summarize empirical studies. The problem of characterizing families of empirical studies in software engineering has been addressed in [2].

My intention in this position paper is to reason about how to adapt the proposals made for the wider area of empirical software engineering to the narrower field of empirical studies in reverse engineering.
2. Characterization of RE empirical studies

In [2] we find a proposed approach to characterize families of experiments. In short, the idea is to use the GQM [1] goal template to identify experiments. There are five parameters in a GQM goal template:
1. Object of study: a process, product or any other experience model.
2. Purpose: to characterize (what is it?), evaluate (is it good?), predict (can I estimate something in the future?), control (can I manipulate events?), improve (can I improve events?).
3. Focus: model aimed at viewing the aspect of the object of study that is of interest, e.g., reliability of the product, defect detection/prevention capability of the process, accuracy of the cost model.
4. Point of view: the perspective of the person needing the information; e.g., in theory testing the point of view is usually that of the researcher trying to gain some knowledge.
5. Context: models aimed at describing the environment in which the measurement is taken.

The goal is to adapt the above generic approach to characterize empirical studies in reverse engineering (RE). The objectives we wish to achieve are:
• make it easy to distinguish similar studies,
• clearly identify the essential features of each study,
• facilitate the identification of possible extensions in order to build experiment families.

Examining the GQM template items, we find that some are more important than others, and that we need some additional information.

The object of the study is the fundamental classification feature; it represents the specific RE technique or method under study.

Perhaps with a somewhat restricted mindset, we focus only on comparative studies. Therefore we can assume that the purpose is always evaluation.

The focus of the study lets us understand which aspect of the RE technique we are interested in.

The point of view is usually that of the maintainer.

The context of the study must include the cofactors. This is very important both to assess the external validity of the study and to enable the design of further experiments that could complement the original one.

It is also interesting to consider the population of the study. Since RE often employs automatic techniques, in those cases it can make more sense to work with a population of programs than with a population of maintainers (programmers or developers).

Finally, the main independent variable (or factor) of the study should be mentioned explicitly, together with its values.

2.1. Example
Let us consider a couple of examples.
The first study is an experiment presented in [7] whose goal is to compare code annotations with a drawing editor for generating documentation of object-oriented programs. The second study is a (planned) experiment to compare the completeness and accuracy of static versus dynamic analysis. The characterization of the two empirical studies is presented in Table 1.

Table 1: Characterization of two sample studies.
          Study 1                                   Study 2
Object    Class diagram extraction method           Class model automatic extraction techniques
Focus     Ease of use                               Model accuracy and completeness
Factor    METHOD = {annotations, drawing editor}    ANALYSIS TYPE = {static, dynamic}
Subjects  Maintainer                                Programs
Context   Program characteristics: size,            Program characteristics: size,
          application domain, etc.                  application domain, etc.
          Maintainer characteristics:               Tool
          experience, skill, etc.
          Tool
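
The characterization above can also be written down as a small data structure, which makes it straightforward to compare two studies item by item. The following is only a minimal sketch in Python: the class name StudyCharacterization, its field names, and the helpers differing_fields and shared_context are hypothetical names introduced here for illustration and are not part of the proposed template. The default values for purpose and point of view encode the assumptions stated in Section 2 (comparative studies, maintainer's perspective).

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class StudyCharacterization:
    """One column of Table 1 (field names are illustrative assumptions)."""
    object_of_study: str
    focus: str
    factor: str
    factor_levels: tuple
    subjects: str
    context: tuple
    # Assumed defaults, following Section 2: only comparative studies are
    # considered, and the point of view is usually that of the maintainer.
    purpose: str = "evaluate"
    point_of_view: str = "maintainer"


def differing_fields(a, b):
    """Return the names of the template items on which two studies differ."""
    return [f.name for f in fields(a)
            if getattr(a, f.name) != getattr(b, f.name)]


def shared_context(a, b):
    """Return the cofactors that appear in the context of both studies."""
    return [c for c in a.context if c in b.context]


study1 = StudyCharacterization(
    object_of_study="Class diagram extraction method",
    focus="Ease of use",
    factor="METHOD",
    factor_levels=("annotations", "drawing editor"),
    subjects="Maintainer",
    context=("program characteristics", "maintainer characteristics", "tool"),
)

study2 = StudyCharacterization(
    object_of_study="Class model automatic extraction techniques",
    focus="Model accuracy and completeness",
    factor="ANALYSIS TYPE",
    factor_levels=("static", "dynamic"),
    subjects="Programs",
    context=("program characteristics", "tool"),
)

if __name__ == "__main__":
    print(differing_fields(study1, study2))
    print(shared_context(study1, study2))
```

Listing the items on which two studies differ serves the first objective of the template (distinguishing similar studies), while the shared cofactors point at context variables that a replication could vary in order to extend a family of experiments.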
The context for both studies includes “tool”. For instance, in the first study we picked a drawing editor tool for reasons of convenience. We can speculate that selecting another one would not change the result significantly, but we cannot be sure.

In the first study we selected students as subjects. There is a never-ending discussion about the difference between students and professionals [6]. The only meaningful way of addressing this concern is to replicate the experiment with professional developers.

Similar considerations hold for the program characteristics; for instance, the size of the program can have a significant impact on the results [4].

3. Conclusions

The proposed template is derived from a sound empirical study characterization template. It addresses the issues of clear identification of studies and definition of extension points for empirical studies. Since it has been derived through speculation on the basis of past experience with empirical studies, it would benefit both from an assessment and from a discussion.

4. References

[1] V. Basili, G. Caldiera, and D. Rombach, "Goal question metric paradigm" in Encyclopedia of Software Engineering, vol. 1, J. J. Marciniak, Ed.: John Wiley & Sons, 1994.
[2] V. R. Basili, F. Shull, and F. Lanubile, "Building Knowledge through Families of Experiments" IEEE Transactions on Software Engineering, 25 (4): 456-473, 1999.
[3] R. Conradi and A. I. Wang Eds., "Empirical Methods and Studies in Software Engineering, Experiences from ESERNET": Springer, 2003.
[4] K. El Emam, S. Benlarbi, N. Goel, and S. N. Rai, "The confounding effect of class size on the validity of object-oriented metrics" IEEE Transactions on Software Engineering, 27 (7): 630-650, July 2001.
[5] B. A. Kitchenham, S. L. Pfleeger, L. M. Pickard, P. W. Jones, D. C. Hoaglin, K. El Emam, and J. Rosenberg, "Preliminary Guidelines for Empirical Research in Software Engineering" IEEE Transactions on Software Engineering, 28 (8): 721-734, 2002.
[6] P. Runeson, "Using Students as Experiment Subjects – An Analysis on Graduate and Freshmen Student Data" Proc. of 7th International Conference on Empirical Assessment & Evaluation in Software Engineering (EASE'03), April 8-10, 2003.
[7] M. Torchiano, F. Ricca, and P. Tonella, "A comparative study on the redocumentation of existing software: Code annotations vs. drawing editors" Proc. of IEEE International Symposium on Empirical Software Engineering (ISESE), Noosa Heads, Australia, November 17-18, 2005, pp.??