International Journal of Computers and Applications, Vol. 31, No. 3, 2009
AUTOMATED SEGMENTATION OF DEVELOPMENT SESSIONS INTO TASK-RELATED SUBSECTIONS

I.D. Coman∗ and A. Sillitti∗
Abstract

Automated data collection solves the main problems associated with manual data collection, ensuring accuracy of data, non-intrusiveness, and low cost of collection. However, automated tools mainly collect low-level data. Higher-level information, such as the start of work on a specific task, is still collected intrusively. To address this issue, we propose an approach to segment development sessions into task-related subsections. We first describe the concept of the approach, then the results of an initial validation study, and finally the potential benefits.

Key Words

Non-invasive measurement, automated software measurement, task concerns, task-aware environments, program navigation analysis

1. Introduction

The automation of data collection is recognized as a key requirement for the success of software measurement programs [1–3]. Traditional, manual data collection is time consuming, tedious, error prone, and possibly biased or delayed [4, 5]. Partially automated data collection tools (e.g., LEAP [6] or PSP Studio [7]) are still affected by the context-switching problem [8] due to constant switching between working and recording activities. Fully automated, non-intrusive data collection addresses these problems. However, it introduces new problems of its own. The cost of non-intrusiveness is a lack of the filtering and aggregation that humans can easily do but that is rather complex to automate. Thus, automatically collected data are usually low-level data, such as the time spent interacting with a certain piece of code or the time spent compiling or running an application. Automatically collected data lack higher-level information, such as the task on which the developer was working at a given time or the actual process that s/he was following.

Existing usages of automatically collected data include automated suggestions for program investigation [9] and task concern identification [10]. However, these usages rely on the developer to manually indicate the start of a task. Without knowing when a switch to another task occurs, the techniques might mix the contexts of several tasks, with negative impact on their performance. Moreover, Robillard and Murphy [11] acknowledge the identification of the start point of a task (after the initial code exploration) as a technical challenge for attempts to support task-aware software development environments. We propose to enhance existing tools for automated data collection with an approach to automatically split streams of low-level, automatically collected data into task-related subsections. This paper adds to previous work [12] an updated, more detailed presentation of the approach, insights from a previous unsuccessful approach, and a detailed discussion of potential benefits. The paper is organized as follows: Section 2 explains our approach, Section 3 presents the results of the validation study, Section 4 discusses the identified potential benefits, Section 5 relates our approach to existing works, and finally, Section 6 provides the conclusions and the directions for future work.
∗ Center for Applied Software Engineering, Free University of Bozen-Bolzano, 4, Via della Mostra, Bolzano, 39100, Italy; e-mail: [email protected], [email protected] (paper no. 202-2963)

2. Our Approach

We define a development session as a period of time during which the developer interacts continuously with the computer. During this interaction, the developer might switch several times from one task to another, as s/he finishes the current task, gets blocked with it, or receives a higher-priority task. We aim at segmenting a development session into subsections corresponding to work on single tasks. A subsection is a continuous time interval during which the developer works on a single task. As soon as the developer switches to another (possibly previously attempted) task or an interruption occurs, another subsection begins.

We focus on maintenance tasks. The work of developers on maintenance tasks consists of finding, understanding, and editing (modifying or adding) task-relevant code [13]. Our initial attempt, based on the observations from an industry study [14], was to first identify such activities of developers (exploration and implementation) and then use them for finding the moments of start, resume, or end of tasks in the data stream. However, an initial laboratory study revealed that the activities of developers are often interleaved, up to the point that they occur simultaneously. Moreover, the method was overly sensitive to tasks that share a large part of the code of interest. To reduce this sensitivity, we need to take into consideration the context of the accesses to various locations. Given the high interleaving of exploration and implementation, a successful approach should consider a more detailed level of the interaction history. At such a level, the activities should be separated well enough to allow automated detection. However, this would require the collection of more specific data (such as invocations of commands from the IDE), which would restrict the applicability of the method. In an attempt to maintain general applicability, we decided instead to use these insights and to approach the problem from another point of view, with the following degree-of-access (DOA) method.

The DOA method considers only the events of the interaction between the developer and the IDE, ignoring the interaction with other software tools. Thus, the interaction history consists of a stream of events reflecting the sequence in which the developer accesses various code locations (methods, classes, or files) and the time spent at each location before switching to another one. We choose this type of event as it is essential for solving the types of tasks that we consider. However, the method can be easily extended to also make use of the interaction history outside the IDE, thus preserving its generality.

The main idea behind this method is that solving a task requires intensive access to a set of locations (e.g., methods, classes, files) that are essential for that specific task (e.g., those that have to be modified). We name this set of locations the core of a task.
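To make the notion of an interaction history concrete, the stream of IDE events described above can be sketched as a chronological list of focus events. This is a minimal sketch of our own; the `FocusEvent` structure and all names are hypothetical illustrations, not the format of any actual collection tool.

```python
from dataclasses import dataclass

# Hypothetical representation of the interaction history: a chronological
# stream of focus events, one per continuous access to a code location
# (method, class, or file).
@dataclass
class FocusEvent:
    location: str   # e.g., "Paint.draw()": name of the focused location
    start: int      # focus start time, in seconds from session start
    duration: int   # seconds the location stayed in focus

# A toy development session: the developer alternates between two methods.
session = [
    FocusEvent("A.draw()", start=0, duration=30),
    FocusEvent("B.undo()", start=30, duration=10),
    FocusEvent("A.draw()", start=40, duration=20),
]

# Total access time (AT) per location over the whole stream.
def access_time(stream, location):
    return sum(e.duration for e in stream if e.location == location)
```

Summing durations per location in this way gives the AT quantity used by the DOA measure defined later in this section.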
This notion is related to that of concerns [10], but it is more restrictive: there can be just one core for each task, whereas there can be several concerns for each task. We expect the core of real-world maintenance tasks to usually contain several locations.

We define a time interval of intensive access (TIIA) of a location as a continuous time interval during which the location has been in focus for at least a minimum percentage of the total time. When the developer makes the modifications required to solve the task, s/he intensively accesses the core locations of that task. Thus, we expect some of the TIIAs of the core locations to overlap to a large degree.

The DOA method has two steps: core identification and subsection inference. In the first step, we identify all the cores in a development session, based on the intensity of the developer's activity. Each core represents one subsection, and its position in the event stream represents an initial rough approximation of the position of the corresponding subsection. Starting from these approximated positions, in the second step we compute the boundaries (the start and end) of each subsection by grouping together methods with overlapping or close TIIAs, until all the TIIAs are part of one subsection.

The hypotheses underlying this method are:

Hypothesis 1: The core locations are accessed throughout the solving of the task more than other locations of less or no relevance to the task.

Developers start the work on a maintenance task by exploring the code to find the locations that are relevant to the task (the core). Many locations prove to be irrelevant to the task and are not accessed a second time; only the relevant locations are repeatedly accessed. As soon as one location has been found to be relevant to the task, it will be accessed again later to understand its relationships to other locations that are potentially relevant to the task. Moreover, developers intensively access the core locations when performing the modifications needed to complete the task.

Hypothesis 2: The core locations are intensively accessed together during the actual implementation of the solution.

Therefore, we expect that the TIIAs of the core locations overlap to a large degree around the moment when the developer implements the solution.

2.1 Core Identification

We identify cores as groups of methods with greatly overlapping TIIAs. We measure the accesses to a method based on the amount of time that the method is in focus. To assess how intensively a developer accesses a method, we define a measure called DOA. The DOA for a method m at a given time t, considering a first access to m at time t0, is the ratio between the access time (AT) and the total interval of time considered (1), where AT is the amount of time that the method m is in focus in the time interval between t0 and t. All time values are in seconds. By definition, DOA always takes values between 0 and 1.

DOA(m, t) = AT(m, t) / (t − t0)    (1)
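Equation (1) and the threshold-based TIIA definition that follows can be sketched in a few lines. This is our own illustrative sketch, under the assumption that the interaction history is available as a per-second trace of which location is in focus; the function names and the exact interval semantics are ours, not the authors'.

```python
# Sketch of the DOA measure of (1), assuming the interaction history is a
# per-second trace (a list) of the location in focus at each second.
# All names (trace, doa, tiias) are ours, for illustration only.

def doa(trace, method, t0, t):
    """DOA(m, t) = AT(m, t) / (t - t0): fraction of [t0, t) spent on `method`."""
    at = sum(1 for s in range(t0, t) if trace[s] == method)
    return at / (t - t0)

def tiias(trace, method, th):
    """Time intervals of intensive access: maximal intervals with DOA > th.
    A new TIIA opens at the first access after the previous one closed,
    and it ends when DOA decays below the threshold th."""
    intervals, t0 = [], None
    for t, focused in enumerate(trace):
        if t0 is None:
            if focused == method:          # first access opens a TIIA
                t0 = t
        elif doa(trace, method, t0, t + 1) < th:
            intervals.append((t0, t))      # DOA decayed below th: close it
            t0 = None
    if t0 is not None:                     # session ended mid-TIIA
        intervals.append((t0, len(trace)))
    return intervals
```

With th = 0.5, for example, ten seconds of focus on a method followed by a long stretch elsewhere closes the method's TIIA as soon as the focused share of the interval drops below one half.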
We consider the TIIAs for a method m to be the time intervals where DOA(m) > th, with th a threshold. Each TIIA(m) has a different t0, namely the moment of the first access to the method m during that specific TIIA. The first TIIA(m) starts when m is accessed for the first time and ends when DOA(m) decays below th. After that, the next access to m triggers a new TIIA(m), which ends when its DOA(m) < th.

Using the DOA defined above, we compute for each method all its TIIAs. Then, for each moment in time, we compute the number of methods that are intensively accessed (as the number of distinct TIIAs at that moment) and identify the peaks of this measure over the entire development session. For this, we first compute a weighted central moving average (WCMA) of the number of methods intensively accessed at each moment in time. The WCMA is computed at every data point as an average of the current point (the central point) and 250 data points on either side, with weights (decreasing arithmetically on either side) associated to each data point. Then, we use the first derivative to identify the peaks of the resulting curve. For each peak, we compute its height as the difference between the highest point of the peak and the higher of the two valleys. To ignore very small variations in the number of TIIAs, we consider only the peaks whose height is above a certain threshold. We discuss the choice of thresholds in Section 2.3.

The peaks represent moments in time when the number of TIIAs greatly increases, meaning that many methods are intensively accessed during approximately the same time. Following Hypothesis 2, we therefore consider the methods having TIIAs during a peak as a core. We take the time position of each core as an initial, rough approximation of a subsection.

2.2 Subsection Inference

We define the time span of a subsection as the minimal time interval that covers all the TIIAs that are part of the subsection. Considering the set of TIIAs in a subsection, the start of the subsection is the start of its first TIIA and the end of the subsection is the end of its last TIIA. First, we create for each core a subsection that contains all its TIIAs. Then, we expand each subsection by adding at each step the closest TIIA that is not already part of a subsection. We ignore the TIIAs that would cause two or more subsections to overlap. The algorithm ends when all the TIIAs (except those ignored because they would cause subsections to overlap) have been assigned to a subsection. Table 1 defines the measures used and Table 2 gives a pseudo-code description of the algorithm.

Table 1
Measures Used for Core Identification

Measure | Formula
Span of a TIIA (for any TIIA α) | span(α) = end(α) − start(α)
Span of a core/subsection (for any core or subsection c and all TIIAs α that belong to c) | span(c) = max_{α∈c}(end(α)) − min_{α∈c}(start(α))
Distance (c is a core or a subsection, α is a TIIA that is not part of c, and c + α is the core or subsection c after adding α to it) | d(α, c) = span(c + α) − span(c)

Table 2
Algorithm for Subsection Inference

Step 1: Create one subsection for each core location
Step 2: Compute for each subsection s the span(s)
Step 3: For all TIIAs α not yet assigned to a subsection: for all subsections s, compute d(α, s)
Step 4: Add to each subsection s the TIIA α with the lowest d(α, s), discarding the TIIAs that would cause two subsections to overlap
Step 5: Repeat from Step 2 until all TIIAs have been assigned

2.3 Parameters

The method has three parameters: the threshold for DOA (th), the threshold for the peaks used for core identification (tp), and the time considered on each side of the weighted average used for smoothing (ta). The th controls the intervals of time that are allowed between two distinct accesses to the same method during a TIIA: the lower the th, the longer the TIIAs and the longer the periods of lack of focus that are accepted during a TIIA. Both tp and ta are related to core detection and to the number of methods whose TIIAs need to overlap, and by how much, to be considered a core. The tp mainly controls the minimum variation in the number of overlapping methods during a time interval for its maximum of overlapping methods to be considered a core. The ta mainly controls the size of the time interval over which the average number of overlapping TIIAs is computed.

In our validation experiment, we use a ta of 250 and a tp of 0.2. The value of ta means that the weighted average for smoothing is computed at every point taking into consideration approximately 4 min of time on each side. The value of tp means that a peak should represent an increase of at least 0.2 in the number of overlapping methods to be considered a core.

The value of th should take into account the characteristics of a development session in terms of the length of distinct accesses. For a session that contains mainly long accesses, single accesses potentially mean lack of focus for a long time but should not end the TIIA of a previously accessed method. Therefore, in such cases, the th should be small. In the validation experiment, we compute th according to (2). We consider the median length of the accesses rather than the average, as the length of the accesses is not normally distributed but strongly skewed, with many more short accesses than long ones. The formula sets the th at 2/3 of the median; the division by 10 is needed because the DOA takes values only between 0 and 1.

th = (2/3) × median(accesses) × (1/10)    (2)

3. Validation

To obtain an initial validation of our approach, we perform a laboratory study with a duration of 70 min. The participants are three M.Sc. students from the Free University of Bozen-Bolzano, with good knowledge of Java. During the 70 min, they have to solve as many tasks as possible out of five maintenance tasks (3 bugs and 2 enhancements). The code is a Java Swing application, called Paint, with 9 classes across 9 source files and a total of 503 non-comment, non-whitespace lines of code. The tasks and the Paint code are those developed by Ko et al. [13]. The participants can use the Eclipse 3.1 development environment and any other software tool that they need. Prior to the experiment, the participants do not have any knowledge of the code or the tasks. They are also unaware of the purpose of the experiment and the methods that we want to test. They are paid for their time.

During the experiment, PROM [15] collected data in a fully automated, non-intrusive way. The data consist of a stream of events reflecting how long and in which sequence the developer accessed various locations (methods, classes, or files). The data have a granularity of 1 s, to ensure that no events are missed (as the median length of events from developers working on their daily tasks [14] is 4 s). We also collected screen-capture videos from each participant. We use these to identify, for each participant, the exact task on which they were working at any given time. To identify the actual boundaries of each task, we used the same procedure as Ko et al. [13].

The initial validation of the proposed approach is twofold: first, we evaluate whether the approach correctly identifies the number and approximate positions of the subsections; second, we evaluate how well the detected boundaries of the subsections map to the actual boundaries of the solving of various tasks. For this, we compute the difference in seconds between the actual start of the work on a task and the inferred start of a subsection. Table 3 presents an evaluation of the method with respect to detecting the correct number of tasks and their approximate location. Table 4 focuses on the performance of the method at detecting the actual boundaries of tasks. The 3 participants are identified as P1, P2, and P3.

Using the data from this study, our approach is able to correctly detect the number of tasks and their approximate location in 81% of the cases (Table 3). All identified subsections correspond to a task. The error of 19% comes from not detecting two subsections out of the overall 11. However, these two subsections both correspond to the solving of one very simple task with a real duration of less than 3 min. Moreover, in the case of participant 3, where the solving of this task took much longer (20 min), the approach correctly identified the corresponding subsection.
Table 3
Quantitative Evaluation of Our Approach with Respect to Its Task Subsection Detection Capabilities

Measure (task detection) | Formula | P1 | P2 | P3 | All
Real tasks | Number of tasks attempted | 5 | 3 | 3 | 11
Detected tasks | Number of subsections that correspond to a task | 4 | 3 | 2 | 9
Undetected tasks | Number of tasks that were not identified as subsections | 1 | 0 | 1 | 2
False tasks | Number of subsections that do NOT correspond to a task | 0 | 0 | 0 | 0
Success rate | Detected tasks/Real tasks | 80% | 100% | 66% | 81%

Table 4
Quantitative Evaluation of Our Approach with Respect to Its Task Span Detection Capabilities
(Absolute error in seconds: abs(detected_begin − real_begin); Error: absolute error/task_span)

Measure | P1 | P2 | P3
Minimum absolute error | 54 | 0 | 5
Maximum absolute error | 285 | 296 | 1158
Average absolute error | 187.5 | 187 | 581.5
Minimum error | 23% | 0% | 0.15%
Maximum error | 41% | 23% | 166%
Average error | 33% | 14% | 83%
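The two error measures used in the evaluation above can be expressed in a couple of lines. This is a minimal sketch with invented example numbers; the function names are ours and the inputs are not taken from the study's data.

```python
# Sketch of the two boundary-error measures: absolute error in seconds,
# and relative error with respect to the span of the task.
# Names and example values are ours, for illustration only.

def absolute_error(detected_begin, real_begin):
    """Difference, in seconds, between detected and actual task start."""
    return abs(detected_begin - real_begin)

def relative_error(detected_begin, real_begin, task_span):
    """Absolute error normalized by the duration of the task."""
    return absolute_error(detected_begin, real_begin) / task_span

# Toy example: a task really starting at t = 600 s, spanning 1200 s,
# whose subsection was detected as starting at t = 654 s.
err = absolute_error(654, 600)        # absolute error of 54 s
rel = relative_error(654, 600, 1200)  # 0.045, i.e., 4.5% of the task span
```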
We consider it an acceptable error of the method not to detect such small subsections, of less than 5 min. Trying to detect even such small subsections might actually make the method overly sensitive to various subtasks and lead it to segment the development session into much smaller subsections than needed.

With respect to the detection of the actual beginning of each task, the biggest error is a 20-min difference between the real start of the work on a task and the inferred start of a subsection. This large error occurred in only one case. The main reason for this error is that the current task shared one method with the previous task. The developer ended the previous task with a large amount of time spent on two locations: the method shared by the two tasks and the running application. This caused a large TIIA for the shared method, accounting for approximately all of the 20 min. This large TIIA was wrongly assigned by our method to the second task. Assigning this TIIA correctly to the first task would reduce the error approximately 10 times. One way to improve the behaviour of this approach in such cases is to also consider locations other than the code (in this case, the running application). The running of the application was part of the final testing of the implemented solution for the previous task. Thus, taking this event into consideration would have correctly linked the TIIA to the first task.

Our study has several limitations. The most important ones are the following:

• Size and characteristics of the sample (three M.Sc. students). The ways in which students work to solve a task might differ from those of experienced programmers. The reduced size of our sample does not allow us to generalize the results. We plan to validate the method also on larger samples, including developers working on their daily tasks.

• Size and complexity of code and tasks. Given the limited duration of the study and the need for several tasks, we could not use a large amount of code or very complicated tasks. The participants might have addressed more complex tasks on larger projects differently.

• Characteristics of programming language and IDE. The characteristics of Java and of Eclipse as the IDE might also influence the ways in which developers work on their tasks.

Although we were aware of these limitations from the beginning, we considered a laboratory study a better choice for an initial validation. We needed extensive validation data to help us not only assess the results, but also understand the causes of potential problems so as to improve our approach. Such data are very hard to obtain in industry settings. Moreover, we also wanted to understand to what extent our method is sensitive to the personal ways of working of various developers; thus, we needed participants to work on the same tasks. However, given the encouraging results of this study, we plan as future work a validation of the approach in industry settings as well.

4. Potential Benefits

During the initial validation of our approach, several potential benefits emerged. This section presents them in detail.

4.1 Suggestions for Code Investigation

The identified time span of a task (one subsection) can be further divided into sections corresponding to the initial exploration of the code and to the implementation of the solution. The implementation corresponds to the time interval of access to the core; the initial exploration corresponds to the time from the start of the task to the first core access. The core locations represent the locations essential to the task, while the locations accessed during the initial exploration are those that the developer considered as starting points. Thus, for each task, we can identify both the essential locations and those that seem good starting points to developers. When another developer accesses, at the start of a task, one of the previously identified starting-point locations, s/he could get suggestions for code investigation based on the core locations associated with that starting point. This approach to suggestions for code investigation takes into account the possibly misleading clues that developers use when choosing starting points for program investigation. Thus, the approach should be capable of offering valuable suggestions to developers even when they make a (frequent) mistake in choosing a starting point for a task.

4.2 Concern Identification

The initial purpose of core detection in our method is to automatically detect the number and approximate position of task-related subsections inside a development session. For this purpose, we consider just the maximum point of the peak, although, due to the central average, the maximum number of accessed methods can be in its vicinity. However, to identify the actual locations that are intensively accessed together, all the locations that have TIIAs in the time frame of the peak should be considered. The method for detecting sets of methods that are accessed together could be applied inside each task-related subsection, so that the accesses from other tasks do not influence the accesses for the current task. The locations intensively accessed during the time frame of each detected peak represent groups of methods intensively accessed together. These groups could represent concerns that the developer was investigating. Further investigation is needed to assess whether such groups indeed represent concerns, and to compare the groups of methods identified in this manner with the results of other concern identification methods, such as that in Ref. [10]. Our initial validation study did not collect information from the participants with respect to what they consider the concerns of each task, as this potential usage of our method emerged only later.
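The grouping step behind the concern-identification idea of Section 4.2 (collecting all locations whose TIIAs fall in the time frame of a detected peak) can be sketched as follows. This is our own illustration: the data structure, names, and example intervals are hypothetical.

```python
# Sketch of collecting the methods intensively accessed together during a
# detected peak: keep every method with at least one TIIA that intersects
# the peak's time window. Names and data are ours, for illustration only.

def overlaps(interval, window):
    """True if the two half-open (start, end) intervals intersect."""
    (s1, e1), (s2, e2) = interval, window
    return s1 < e2 and s2 < e1

def concern_candidates(tiias_by_method, peak_window):
    """Methods with a TIIA inside `peak_window`: a candidate concern group."""
    return sorted(
        m for m, intervals in tiias_by_method.items()
        if any(overlaps(i, peak_window) for i in intervals)
    )

# Toy data: each method maps to its list of (start, end) TIIAs, in seconds.
tiias_by_method = {
    "A.draw()": [(0, 20), (40, 45)],
    "B.undo()": [(38, 50)],
    "C.save()": [(100, 120)],
}
# A peak detected around t in [35, 50): A and B are accessed together there.
group = concern_candidates(tiias_by_method, (35, 50))
```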
Table 5
Summary of Approaches to Retrieval of Task Information in Existing Automated Data Collection Tools

Tool | Purpose of Tool | Approach to Retrieving Task Information | Type of Approach
MetriQ(1) | Time tracking | User-defined grouping of files and applications | Semi-automated
RescueTime(2) | Time tracking | User-provided tags | Semi-automated
SLife(3) | Time tracking | User selection of task | Semi-automated
TimeSnapper(4) | Time tracking | User-provided tags | Semi-automated
TrackTime(5) | Time tracking | User selection of task | Semi-automated
Hackystat [18] | In-process data collection and analysis | User-defined grouping of files and applications | Semi-automated
6th Sense Analytics [19] | In-process data collection and analysis | User-defined grouping of files and applications | Semi-automated
PROM [15] | In-process data collection and analysis | None | N/A
EPM [20] | In-process data collection and analysis | None | N/A
ECG [21] | In-process data collection and analysis | None | N/A
SUMS [22] | In-process data collection and analysis | None | N/A

(1) http://www.metriq.biz
(2) http://rescuetime.com
(3) http://www.slifelabs.com
(4) http://www.timesnapper.com
(5) http://www.mamooba.com/TrackTime
4.3 Preliminary Step for Other Techniques

Our method could be used as a preliminary step by techniques that offer developers suggestions on possibly relevant code [16, 17] or propose organizations of code elements based on their perceived relevance [9]. Currently, such techniques rely on the developer to manually specify the start of working on a task. Without this information, the contexts of two tasks are mixed and the performance of the techniques is negatively affected. Our method could complement these techniques by automatically identifying the switch to a new task.

5. Related Work

Some existing tools for automated data collection assign collected data to projects, activities, or tasks based on manually provided tags (RescueTime, TimeSnapper), user selection of activity (SLife, TrackTime), or user-defined grouping of files and applications (MetriQ, Hackystat). Table 5 gives an overview of existing automated tools for data collection with respect to their approach to retrieving task information. To our knowledge, our approach is the first aiming at automatically splitting a development session into task-related subsections.

Directly related works investigate the interactions of the developer with the IDE for inferring concerns [10, 11, 23], suggesting potentially relevant code [16], or organizing the visualizations of code elements in the IDE based on their perceived degree of interest to the developer [9]. Table 6 summarizes the main characteristics of these works and relates them to our DOA approach.

Kersten and Murphy [9] describe a degree-of-interest (DOI) model for building the current context of code elements based on developer activity. The model relies on the developer to reset it when a new task starts. DOI is quite similar to our DOA, as both increase when a developer interacts with an element and decrease when the developer no longer interacts with the given element. However, there are two main differences between DOI and DOA. First, DOI ranks elements at any given time based on all their previous history, whereas DOA computes for each element the intervals of intense access. Second, DOI does not explicitly take time into consideration, but only the number of key strokes and selections of elements, whereas DOA considers the time that an element is in focus.

Parnin and Görg [16] propose to enhance recommendations for developers with a technique for building a context based on the interaction intensity and momentum of each access of the developer to a method. Both measures are defined based on three types of interactions between developer and IDE: navigation, clicks, and key strokes. The intensity represents the number of such interactions that occur during one single method access. The momentum is a continuous measure representing an exponential decay of the intensity. They do not consider time explicitly, but use the number of interactions as a proxy. The contexts are computed as groups of methods that frequently have high momentum at the same time.

Robillard and Murphy [10] propose to infer concerns (sets of code elements that are important for a task) based on the interaction between developer and IDE during a development session and on the structural dependencies between the elements. As elements of interaction between developer and IDE, they consider the visibility of elements and the ways in which the developer accessed them. The developer has to specify the number of elements to include in a concern [23].

Table 6
Comparison of Our DOA Approach to Related Approaches

Approach | Purpose | Main Elements from Developer–Computer Interaction | Task Information
Mylar [9] | To assist developers on their programming tasks by automatically building contexts (code elements of interest) | Considers number of selections of elements; considers key strokes | Manual: user has to reset the context when switching to a new task
Parnin and Görg [16] | To enhance code recommendations by building contexts from the interaction history | Considers navigation; considers clicks; considers key strokes | Manual: user has to reset the context when switching to a new task
NaCIN [10, 23] | Automated inference of concerns (sets of code elements important for a task) | Considers visibility of elements; considers ways in which the developer accesses elements | Manual: the developer has to specify the number of elements to include in a concern
DOA (our approach) | Enhancing automatically collected data with higher-level information on task-switching | Considers navigation; considers explicitly the time | Automated detection of start of tasks

6. Conclusions and Future Work

The work presented here represents a first step toward enhancing tools for automated data collection with automated segmentation of development sessions into task-related subsections. Our aim is to provide a method for automated segmentation of development sessions into task-related subsections, based only on generic data that can be collected from any software tool that the developers might use. The automatic identification of the starting times of various tasks is useful for developers and managers to identify the time spent on each task. It can also serve as a preliminary step for other algorithms that detect task concerns. Currently, such algorithms require the user to manually mark the start of a task.

Based on higher-level studies of developers' behaviour and our own experience with a first unsuccessful approach, we proposed the DOA method, which relies on the intensity of the developers' activity. Tested in an initial validation study, the method performed well. We are currently working on improving our method based on the insights obtained from the validation study, and we plan to perform more experiments (in the laboratory and in industry) to validate it. Furthermore, we also consider as future work an automated tuning of the parameters. We plan a larger experiment that would allow us to test the automated tuning of the parameters on various types of tasks and developers.

During the initial validation of the approach, two possible additional usages of the segmentation of development sessions emerged: automated suggestions for code investigation and task concern detection. We consider investigating these directions in more detail as future work. Finally, we also plan as future work the integration of our method with an existing tool for unobtrusive automated data collection. This would enhance the activity reports produced by the tool, by offering information on the time spent on each task in interaction with the computer.
References

[1] S.L. Pfleeger, Lessons learned in building a corporate metrics program, IEEE Software, May 1993, 67–74.
[2] M. Daskalantonakis, A practical view of software measurement and implementation experiences within Motorola, IEEE Transactions on Software Engineering, 18, November 1992, 998–1010.
[3] A. Gopal, M.S. Krishnan, T. Mukhopadhyay, & D.R. Goldenson, Measurement programs in software development: Determinants of success, IEEE Transactions on Software Engineering, 28(9), 2002.
[4] A.M. Disney & P.M. Johnson, Investigating data quality problems in the PSP, Proc. of the 6th Intl. Symposium on the Foundations of Software Engineering, Orlando, USA, 1998.
[5] P.M. Johnson & A.M. Disney, A critical analysis of PSP data quality: Results from a case study, Journal of Empirical Software Engineering, 1999.
[6] C.A. Moore, Project LEAP: Personal process improvement for the differently disciplined, Proc. of ICSE, 1999, 726–727.
[7] J. Henry, http://csciwww.etsu.edu/psp/.
[8] P.M. Johnson, H. Kou, J. Agustin, C. Chan, C. Moore, J. Miglani, S. Zhen, & W.E.J. Doane, Beyond the personal software process: Metrics collection and analysis for the differently disciplined, Technical report, July 2002, http://csdl.ics.hawaii.edu/techreports/02-07/02-07.pdf.
[9] M. Kersten & G. Murphy, Mylar: A degree-of-interest model for IDEs, Proc. of the 4th Conference on Aspect-Oriented Software Development, 2005.
[10] M.P. Robillard & G. Murphy, Automatically inferring concern code from program investigation activities, Proc. of the 18th International Conference on Automated Software Engineering, 2003, 225–234.
[11] M.P. Robillard & G.C. Murphy, Program navigation analysis to support task-aware software development environments, Proc. of the ICSE Workshop on Directions in Software Engineering Environments, 2004, 83–88.
[12] I.D. Coman & A. Sillitti, Automated identification of tasks in development sessions, Proc. of the Intl. Conference on Program Comprehension, 2008.
[13] A.J. Ko, B.A. Myers, M.J. Coblenz, & H.H. Aung, An exploratory study of how developers seek, relate and collect relevant information during software maintenance tasks, IEEE Transactions on Software Engineering, December 2006.
[14] I.D. Coman & A. Sillitti, An empirical exploratory study on inferring developers' activities from low-level data, Proc. of Software Engineering and Knowledge Engineering, 2007.
[15] A. Sillitti, A. Janes, G. Succi, & T. Vernazza, Collecting, integrating and analyzing software metrics and personal software process data, Proc. of EUROMICRO, 2003.
[16] C. Parnin & C. Görg, Building usage contexts during program comprehension, Proc. of the IEEE Conference on Program Comprehension, 2006.
[17] M.P. Robillard, Automatic generation of suggestions for program investigation, Proc. of ESEC/FSE, 2005.
[18] P.M. Johnson, H. Kou, J.M. Agustin, Q. Zhang, A. Kagawa, & T. Yamashita, Practical automated process and product metric collection and analysis in a classroom setting: Lessons learned from Hackystat-UH, Proc. of the 2004 Intl. Symposium on Empirical Software Engineering, August 2004.
[19] G. Burnell, 6th Sense Analytics, homepage, http://www.6thsenseanalytics.com.
[20] M. Ohira, R. Yokomori, M. Sakai, K. Matsumoto, K. Inoue, & K. Torii, Empirical project monitor: A tool for mining multiple project data, Proc. of the Workshop on Mining Software Repositories, 2004.
[21] F. Schlesinger & S. Jekutsch, ElectroCodeoGram: An environment for studying programming, Proc. of the Workshop on Ethnographies of Code, March 2006.
[22] N.A. Nystrom, J. Urbanic, & C. Savinell, Understanding productivity through non-intrusive instrumentation and statistical learning, Proc. of the 2nd Workshop on Productivity and Performance in High-End Computing (P-PHEC), 2005.
[23] I. Majid & M.P. Robillard, NaCIN – An Eclipse plug-in for program navigation-based concern inference, OOPSLA 2005 Eclipse Technology Exchange, 2005.

Biographies
Irina Diana Coman received her B.Sc. degree in Electrical and Computer Engineering from the "Politehnica" University of Bucharest and her M.Sc. degree in computer science, with a specialization in data mining, from the École Polytechnique of Nantes in 2005. After graduation, she worked briefly as a software engineer in industry. She is currently a Ph.D. student at the Center for Applied Software Engineering at the Free University of Bozen-Bolzano. Her research interests include empirical software engineering, non-invasive measurement, and agile methodologies.

Alberto Sillitti is an Assistant Professor at the Free University of Bolzano. He received his Ph.D. in Electrical and Computer Engineering from the University of Genoa in 2005. He is involved in several EU-funded projects related to Agile Methods and Open Source Software, in which he applies non-invasive measurement approaches. He has served as a member of the programme committees of several international conferences (e.g., XP 200x, OSS 200x), as Program Chair of OSS 2007 in Limerick (Ireland), as a reviewer for international journals and magazines (e.g., IEEE TSE, IEEE Software), and as a reviewer of international projects. His research areas include empirical software engineering, non-invasive measurement, web services, agile methods, and open source development.