DEFECT ESTIMATION USING CAPTURE-RECAPTURE IN IBM JAZZ

John Doran and Kevin Gary
Department of Engineering
Arizona State University at the Polytechnic Campus
Mesa, AZ 85212
{john.doran, kgary}@asu.edu

ABSTRACT

A single defect can have a crippling effect in today’s complex software systems. Defects may result in the loss of millions of dollars or in some cases even death. However it is not possible to prevent or find all defects when applications may have millions of lines of code. Project managers and software engineers are employing statistical approaches such as the Capture-Recapture method to estimate the total number of defects in a software project. The application of Capture-Recapture to defect estimation is fairly new. Currently there are not any solutions that provide a seamless structured way to collect defect data and automatically produce defect estimates using Capture-Recapture. The Defect Estimation Component is a solution that provides a process for performing code reviews and automatically applies the Capture-Recapture Method to the defect data collected. It was built on IBM’s Jazz platform as a collaborative development tool that combines the aspects of project management and software development into an integrated work environment. The goal of the Defect Estimation Component is to make it easier for developers to apply defect estimation to real world software projects. This paper presents the Defect Estimation Component and the results of a validation exercise conducted in the context of a software engineering capstone experience.

KEY WORDS

Defect estimation, capture-recapture method, software tools, software engineering education, inspections, quality
1. Introduction

Software quality is essential in many scientific, medical, and business areas. Defects can cost companies profits, time, and credibility. Last year NASA lost a satellite designed to orbit Mars because of a unit conversion error [1], costing approximately 125 million dollars. More recently, Toyota has had to recall approximately 400,000 vehicles due to an error in software related to the anti-lock braking system [2]. A particularly devastating error occurred in software for an instrument designed to treat cancer [3]. The resulting radiation overdoses caused the deaths of five people and adversely affected the health of 28 other patients. A measure of software quality is needed to help managers evaluate the health of their projects and to instill confidence in their customers. One such measure is the number of defects in a project.

Software inspection has become an accepted practice. It has been shown that even the most experienced engineers on average introduce a defect every ten lines of code [4]. However, it is not feasible to inspect every single line of code. Therefore a statistical approach has been introduced, known as the Capture-Recapture Method (CRM). CRM was originally used in biology to estimate the populations of different animals [5]. The total number of animals is estimated by capturing samples of the population and determining the number of animals captured in more than one sample [6]. CRM is becoming common in software inspections. Instead of tagging animals, developers review software documents for defects.

For defect estimation, there are tools available for either performing code reviews or for calculating estimates. There is a lack of solutions that implement a process that integrates both in a distributed team environment, and that present the results in the context of the software project. The Defect Estimation Component (DEC) was designed to implement a process that combines code reviews and defect estimation with CRM. The DEC is integrated into Rational Team Concert and takes advantage of the services it provides. IBM’s Rational Team Concert is a tool that provides a work environment that supports collaboration between project managers and developers [7]. Since Rational Team Concert is built on the Jazz platform, those terms are used interchangeably in this paper. The DEC is intended to help developers easily apply the process of defect estimation using the CRM on a software project.
2. Background

2.1 The Capture-Recapture Method

Laplace was the first person recognized for applying the Capture-Recapture method (CRM), using it in 1786 to estimate the population of citizens in France [8]. It is now common to use it to estimate the population of a species of wild animals in a particular region. As indicated by its name, a biologist captures a few wild animals in the area under study. The animals are tagged, recorded, and released. The biologist waits a period of time to allow the tagged animals to mix back into the general population. A
second biologist works independently to capture another sample of animals without knowledge of the details of the first sampling. The second biologist then records how many tagged animals are found in the second catch.

There are a few assumptions made in the experiment [9]. First and most important is that the two biologists work independently so that the second biologist does not gain any advantages from the details of the first sampling. Usually it is assumed to be a closed model [6], meaning that no animals leave or enter the area under study through birth, death, or migration. Optional assumptions are that biologists have the same skill at capturing animals, and all animals are equally difficult to catch. The total population can then be estimated by the Lincoln-Peterson equation [9]. N is the total population, n1 is the number of animals in the first sample, n2 is the number of animals in the second sample, and m2 is the number of animals found in both samples.

N = (n1 * n2) / m2        (1)

Mills [6] is credited as the first person to apply CRM to the discipline of Software Engineering in 1972. He took pseudocode for a software program and injected known defects (seeding) into it. This step is similar to the animals tagged in the first capture. He then had a reviewer examine the pseudocode to find both the seeded defects as well as any previously undiscovered defects. However it was found that seeded defects did not always duplicate the actual defects in the software. This is similar to having the tagged animals act differently than how the rest of the population acted naturally in their environment. Basin [9] changed the experiment by using two different reviewers to examine the software document instead of trying to seed defects. Eick [8] is credited as the first person to apply CRM to software inspection. Since that time several models have been developed to vary the experimental assumptions made to emulate real world conditions.
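As a minimal sketch of Equation (1), the following Java snippet computes the Lincoln-Peterson estimate. The class name, method name, and sample counts are illustrative assumptions and do not come from the DEC or from any cited study.

```java
/** Sketch of Equation (1); names and numbers are hypothetical, not taken from the DEC. */
final class LincolnPetersonExample {

    /** Returns N = (n1 * n2) / m2. The estimate is undefined when the samples share nothing. */
    static double estimate(int n1, int n2, int m2) {
        if (m2 <= 0) {
            throw new IllegalArgumentException("m2 must be positive: no overlap means no estimate");
        }
        return (double) n1 * n2 / m2;
    }

    public static void main(String[] args) {
        // Hypothetical counts: 20 animals tagged in the first sample,
        // 15 caught in the second sample, 5 of which carried a tag.
        System.out.println(estimate(20, 15, 5)); // 20 * 15 / 5 = 60.0
    }
}
```

The guard on m2 reflects a practical limitation of the equation: if the two samples have nothing in common, the ratio cannot be formed and no estimate is available.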
The application of CRM in software reliability remains controversial; we are (clearly) in agreement with researchers claiming its validity in this area. Readers are referred to [8][9] for further discussion on the topic.

A model is a group of estimators that adhere to the same set of assumptions. Table 1 summarizes the models and their assumptions. An estimator is the actual function used to calculate the estimated total number of defects. There are a variety of estimators as well, and some perform better than others in different conditions.

The simplest model operates under the following assumptions.
• Reviewers work independently [6].
• Defects have an equal probability of being detected.
• Reviewers all have equal ability to find the defects.

The third assumption means that reviewers have the same experience and do not specialize in finding a particular type of defect. This model is known as M0. Examples of estimators used are the Lincoln-Peterson equation and the Maximum Likelihood equation. Schofield [10] provides a modified version of the Lincoln-Peterson estimator. N is
the total number of defects, A is the number of defects found by the reviewer who found the most unique defects, B is the number of unique defects found by all other reviewers, and C is the number of defects found in common between A and B.

N = (A * B) / C        (2)

The second model, Mt, is known as the time-response model. It recognizes that different reviewers have different abilities, though it still treats all defects as having an equal probability of being detected. Examples of estimators for this model include the Moment estimator [12] and the Maximum Likelihood estimator [6]. The third model, Mh, known as the heterogeneity model [9], recognizes that not all defects are equally easy to detect, though it does assume all reviewers have equal detection ability. The fourth model, Mth, is the most flexible as it does not make assumptions about reviewer ability or defect difficulty. It typically produces the most accurate results but requires more data to calculate the estimated total.

Table 1. Models and Estimators in CRM [8]

Model   Assumptions                                              Estimators
M0      All defects have equal probability of being found        Maximum Likelihood
        All reviewers have equal ability to find defects         Lincoln-Peterson
Mt      All defects have equal probability of being found        Maximum Likelihood
        Some reviewers better than others at finding defects     Chao (1989)
Mh      Some defects more difficult to discover than others      Jackknife
        All reviewers have equal ability to find defects         Chao (1987)
Mth     Some defects more difficult to discover than others      Chao (1992)
        Some reviewers better than others at finding defects
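To make Equation (2) concrete, the sketch below derives A, B, and C from each reviewer's set of defect ids and then applies the formula. It is one possible reading of the estimator rather than the DEC's implementation: the class name, the reviewer names, the defect ids, and the interpretation of "the reviewer who found the most unique defects" as the reviewer with the largest set are all assumptions. It uses `Map.of` and `Set.of`, so it needs Java 9 or later.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch of Equation (2); all names and data are hypothetical. */
final class SchofieldEstimateExample {

    /** Applies N = (A * B) / C to each reviewer's set of defect ids. */
    static double estimate(Map<String, Set<Integer>> defectsByReviewer) {
        // Pick the reviewer with the largest set of defects (ties broken arbitrarily).
        String top = null;
        for (Map.Entry<String, Set<Integer>> entry : defectsByReviewer.entrySet()) {
            if (top == null || entry.getValue().size() > defectsByReviewer.get(top).size()) {
                top = entry.getKey();
            }
        }
        if (top == null) {
            throw new IllegalArgumentException("at least one reviewer is required");
        }
        Set<Integer> a = defectsByReviewer.get(top);

        // B: distinct defects found by all other reviewers combined.
        Set<Integer> b = new HashSet<>();
        for (Map.Entry<String, Set<Integer>> entry : defectsByReviewer.entrySet()) {
            if (!entry.getKey().equals(top)) {
                b.addAll(entry.getValue());
            }
        }

        // C: defects that appear in both A and B.
        Set<Integer> c = new HashSet<>(a);
        c.retainAll(b);
        if (c.isEmpty()) {
            throw new IllegalStateException("no overlap between reviewers, so no estimate can be computed");
        }
        return (double) a.size() * b.size() / c.size();
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> session = Map.of(
                "alice", Set.of(1, 2, 3, 4),
                "bob", Set.of(2, 3, 5),
                "carol", Set.of(3, 6));
        // A = 4, B = |{2, 3, 5, 6}| = 4, C = |{2, 3}| = 2, so N = 4 * 4 / 2 = 8.0
        System.out.println(estimate(session));
    }
}
```

In this made-up session the reviewers found six distinct defects between them, so an estimate of eight would suggest roughly two defects remain undiscovered.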
2.2 Existing Tools

2.2.1 IBM/Rational’s Jazz/TeamConcert Platform

The DEC extends the IBM/Rational Jazz platform by using the services provided by IBM’s Rational Team Concert (RTC). RTC supports a collaborative work environment [7], providing transparency in the development process (we use the terms RTC and Jazz interchangeably). RTC provides iteration planning, version control, and configurable lifecycle process models. It also provides history and traceability for a software project. RTC does support defect tracking, but it does not yet have any built-in processes for conducting code reviews, nor does it have any mechanisms for estimating the total number of defects in a project.

2.2.2 Defect Estimation Using CRM

There are many software packages available for calculating estimates using CRM. Although the CRM is applicable to defect estimation, these packages were designed specifically for wildlife population estimation. 2CAPTURE was designed at Colorado State University [13]. One of its strengths is that it has a choice
of ten different estimators. Estimators are the equations used to provide an estimate for CRM. Each estimator works best under different assumptions and conditions. A nice feature of 2CAPTURE is that it uses tests to determine the best estimator to use. Another software package is EstimatorS [14]. It also has a variety of estimators. It accepts a tab-delimited file for input and can export the results to an output file.

The current software packages were designed for wildlife population estimations and do not provide a process for collecting defect data. In addition, the results of a calculation must be exported to a file. The DEC saves the calculation results to a custom work item. It uses work item relationships to keep the results in the context of the code review, leveraging Jazz work item query builders to retrieve results at any time. 2CAPTURE and EstimatorS each have the advantage of being more flexible in adjusting to different conditions. They both provide several different CRM estimators. The DEC in its current state only provides a single estimator. However, it was designed so that future implementations of estimators can easily be substituted for the default configuration.

2.2.3 Code Reviews

Jupiter, developed at the University of Hawaii, is an open-source plug-in for the Eclipse IDE (integrated development environment) for performing code reviews [11]. The tool starts a new code review with the user setting up a review id and choosing a list of files. The user also chooses a set of reviewers and an author. Jupiter has three phases of review. During the individual phase, the reviewer will examine the files and create a review issue entry for each defect found. The review issue requires a severity and an issue type. It also allows the user to specify a line number of the file under review. The review issue also has a summary and description section. At the end of each individual review, the reviewer saves their .review XML file.
During the team phase, each individual’s .review files are imported so that all review issues can be analyzed by the team. The review issues can be resolved as duplicate, invalid, or another resolution. The issues can also be assigned to people for fixing later. After the team phase is over, the .review files are saved again. The final phase of Jupiter is called rework. This is where users fix the review issues that were verified to be accurate. The method for distributing the code review across multiple computers is to commit the review XML files to a separate configuration management repository. In this sense, Jupiter keeps track of the history of code reviews.

The DEC was designed similarly to Jupiter but has many additional benefits. Jupiter provides a three-step process for performing code reviews. The DEC adds an additional step for defect estimation calculations using Capture-Recapture. With the DEC, reviewers can work independently and can even be separated geographically yet still collaborate synchronously thanks to Jazz’s client-server architecture. The facilitator can easily know the status of a particular defect estimation session at any time.
In Jupiter, distributed users must use an external means of communication and must also constantly commit their review files to a repository for the same real-time insight into the process. The DEC has the added benefit that users can use a built-in chat client to record the participants’ discussions to a work item so the context of the code review is not lost when reviewed later. By using customized work items, the DEC makes it easy to see the review issues that were discovered. Jupiter, on the other hand, requires that the review XML files be loaded each time a user wishes to see the review issues. Both Jupiter and the DEC provide an opportunity to customize the level of detail of a review issue.
3. The DEC Solution

The DEC implements a process where users interact with three types of custom work items. The four-step process would be familiar to anyone who has previously participated in a code review and a defect estimation exercise. The four steps are initialization, individual review, team discussion, and estimation. The individual review and team discussion steps are similar to the individual and team phases in Jupiter.

The process requires a team of at least two people. In the case where there are only two users, one user must act as a facilitator and a reviewer. The reviewer role is responsible for examining documents for defects and recording those defects. The reviewer also participates in a team discussion where the recorded defects are inspected for validity and duplication. The facilitator role chooses the documents to be reviewed. The facilitator also chooses the reviewers who will be involved in the process. The facilitator also leads the team discussion.

Custom work items were created in Jazz using the built-in work item customization [15]. Jazz provides an interface that leverages existing code to modify editor presentations and add custom attributes. It is also possible to define custom workflows (states and transitions). There is no need to define a storage model, as it already exists. The three custom work items used in the DEC were Review Parent, Review Child, and Review Issue. Figure 1 shows the relationships between the custom work items.
Figure 1. Relationships of custom work items for the DEC

The facilitator uses a Review Parent work item to manage transitions in each process step. Review Parent defines a series of states that indicate the workflow process. Review Parent attaches documents to review, records the reviewers involved, and presents a summary of the defect estimation. The summary presents a table that lists the defects, shows which reviewers found each defect, and shows which estimator was used in the calculations. The summary gives an estimate of the total probable defects in the documents being reviewed. If the documents are source code, defect density is found by taking the total probable defects and dividing by the lines of code. If the documents are requirements, defect density is found by taking the total probable defects and dividing by the number of requirements. Figure 2 is a screenshot of the details view of the Review Parent work item.
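The density calculation described above can be written as a single ratio. In the sketch below, the symbols (N-hat for the estimated total defects, S for the size of the reviewed documents) and the numbers are illustrative assumptions, not values reported in this paper.

```latex
\[
  \text{defect density} = \frac{\hat{N}}{S},
  \qquad \text{for example} \qquad
  \frac{8 \ \text{estimated defects}}{200 \ \text{lines of code}} = 0.04 \ \text{defects per line of code}
\]
```

For requirements documents, S would instead be the number of requirements, so the same ratio yields defects per requirement.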
Figure 2. Review Parent work item details view

Each reviewer involved in the defect estimation uses a Review Child work item. Review Child is a child of Review Parent that asks a reviewer to examine each document for defects. Review Child starts in an open state and the reviewer changes it to the closed state when s/he is done with individual inspections. Review Child is shown in Figure 3.

Figure 3. Review Child work item details view

Found defects are recorded using a Review Issue work item. Review Issue has attributes such as severity and issue type which are used to help classify types of defects. Review Issues are linked as children of a Review Child. Review Issue states and associated resolutions are the same as those found in Jupiter, so that Jupiter users can import existing review data into Jazz. The team may link a Review Issue as a duplicate if it was found by two different reviewers during the same session. The decision to use a new, distinct work item instead of the Jazz built-in defect work item was made to avoid adding extra attributes to the built-in work item that may only be useful for defect estimation. Instead, the unique Review Issue work items can be converted to Jazz defect work items at the end of defect estimation. Figure 4 shows the Review Issue work item.

Figure 4. Review Issue work item details view

The process starts with the facilitator preparing for the defect estimation session. The facilitator chooses a unique review id and the documents to review. The review id is used to identify the defect estimation session and to group work items together. The documents under review could be any artifacts in the software project but are often source code. The facilitator chooses a minimum of two people to act as reviewers. The facilitator may then customize instructions that will be presented to the reviewers on the associated work items. At the end of this step, Jazz creates a Review Parent work item for the facilitator, and a Review Child work item for each reviewer. Figure 5 shows the interactions of the first step.
Figure 5. Interactions during the start of defect estimation
The next step of the process is where each reviewer independently examines the documents. It is important that the reviewers act independently as this was an assumption in all models of the CRM. When a reviewer finds a defect, s/he creates a Review Issue work item and links it to the Review Child. When finished inspecting, the reviewer moves the Review Child to the closed state. When all Review Child work items for a defect estimation session have been closed, Jazz will indicate to the facilitator to start the next step of the process. Figure 6 shows the interactions during the individual review step.
Figure 6. Sequence of events during the individual review

During the next step of the process, discussion, the team examines all the Review Issues of the defect estimation session as a group. If two or more Review Issues indicate the same defect in a document, the team will link the other Review Issues as “duplicate of” another Review Issue. There are a few ways that Jazz can assist in finding duplicates. Jazz has a “find duplicates” button on each work item’s view. The button will run a query that examines the values of its attributes and can suggest the possibility of another work item being a duplicate. Jazz also allows the creation of custom work item queries. The team can set up a query to look at specific attributes of a work item such as review id, issue type, and summary. In either case, the team should still closely compare the Review Issues before linking them as duplicates. If the team decides that a Review Issue is not valid, the Review Issue is moved to either a Closed or Resolved state and its resolution is marked as “Invalid - Won’t Fix.” The combination of state and resolution indicates to Jazz that the work item should be disregarded when performing defect estimation calculations. When the team is finished with the discussion, the facilitator can move to the next step by changing the state of the Review Parent from Discussion to Estimation. Figure 7 shows the interactions during the discussion step.

Figure 7. Sequence of events during the discussion step
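The outcome of the discussion step can be pictured as a mapping from each distinct defect to the reviewers who found it, with duplicates collapsed and invalid issues dropped. The sketch below illustrates that idea only. The `Issue` class, its fields, and the sample data are hypothetical stand-ins for Review Issue work items and do not use the Jazz API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Sketch of collapsing Review Issues into distinct defects; not the DEC's actual code. */
final class ReviewIssueCollapseExample {

    /** Hypothetical stand-in for a Review Issue work item. */
    static final class Issue {
        final int id;
        final String reviewer;
        final Integer duplicateOf; // id of the issue this one duplicates, or null
        final boolean invalid;     // true when resolved as "Invalid - Won't Fix"

        Issue(int id, String reviewer, Integer duplicateOf, boolean invalid) {
            this.id = id;
            this.reviewer = reviewer;
            this.duplicateOf = duplicateOf;
            this.invalid = invalid;
        }
    }

    /** Maps each distinct defect (the id of its root issue) to the reviewers who found it. */
    static Map<Integer, Set<String>> reviewersByDefect(List<Issue> issues) {
        Map<Integer, Issue> byId = new HashMap<>();
        for (Issue issue : issues) {
            byId.put(issue.id, issue);
        }
        Map<Integer, Set<String>> result = new TreeMap<>();
        for (Issue issue : issues) {
            if (issue.invalid) {
                continue; // invalid issues are disregarded in the calculation
            }
            int root = rootOf(issue, byId);
            result.computeIfAbsent(root, key -> new TreeSet<>()).add(issue.reviewer);
        }
        return result;
    }

    /** Follows "duplicate of" links until reaching an issue that duplicates nothing. */
    static int rootOf(Issue issue, Map<Integer, Issue> byId) {
        Issue current = issue;
        int hops = 0;
        while (current.duplicateOf != null) {
            Issue next = byId.get(current.duplicateOf);
            if (next == null || ++hops > byId.size()) {
                throw new IllegalStateException("broken or cyclic duplicate link at issue " + current.id);
            }
            current = next;
        }
        return current.id;
    }

    public static void main(String[] args) {
        List<Issue> issues = new ArrayList<>();
        issues.add(new Issue(1, "alice", null, false));
        issues.add(new Issue(2, "bob", 1, false));     // same defect as issue 1
        issues.add(new Issue(3, "bob", null, false));
        issues.add(new Issue(4, "carol", null, true)); // judged invalid by the team
        issues.add(new Issue(5, "carol", 2, false));   // duplicate of a duplicate
        System.out.println(reviewersByDefect(issues)); // {1=[alice, bob, carol], 3=[bob]}
    }
}
```

Following the links to a root issue means it does not matter which issue in a group of duplicates the others point to, as long as the links do not form a cycle.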
The last step in the defect estimation process is to calculate estimates for the total probable defects. The Jazz server will collect the information from all the Review Issues to automatically perform a calculation. The results are then presented and saved on the Review Parent work item. The choice of Capture-Recapture estimator to use in the calculation will have been set up previously by an administrator. It is possible for an error to occur in the calculations if the reviewers did not follow instructions or if the feature was incorrectly configured. Instead of providing estimates, Jazz will provide a helpful error message that will allow the users to fix the mistake and recalculate the results. Figure 8 shows the interactions during the estimation step. Figure 9 presents the results on the Review Parent.

Development for the Jazz platform is done by creating plug-ins that group together Java classes that have related functionality. Plug-ins can extend other plug-ins or provide extension points to those plug-ins. Those with extension points expose interfaces. Those that provide extensions implement the interfaces [16]. The plug-ins allow a very modular design and the extensions make it easy to add or change functionality.

A total of five plug-ins were created for the defect estimation process. All the plug-ins have the same prefix edu.asu.poly.defect.estimation.recapture, which indicates to developers that they are related and also hints at their functionality. The plug-ins also have a component id edu.asu.poly.defect.estimation.recapture in their plugin.xml configuration files which Jazz uses to recognize that the plug-ins together form a component. The service plug-in contains services that are run on the Jazz server side. Related to the service plug-in is service.tests which contains unit tests for the classes in the service plug-in. The common plug-in mostly contains interfaces deployed on both the server and any clients.
The client plug-in contains client libraries. The rcp.ui plug-in contains classes related to the user interface in the RTC Eclipse client. The UI uses the SWT library to create perspectives, views, editors, and wizards.
Figure 8. Sequence diagram for the estimation step
Figure 9. Estimation results are saved on Review Parent

The core functionality that supports the defect estimation process is found in the common and service plug-ins. The service plug-in was implemented in three layers. The goal of the layered approach was to make it possible to perform the calculations without being tied to the Jazz server. One result was that it became easier to write unit tests. Another result was that the code could be reused outside of the Jazz environment.

The most basic layer is the calculation layer. At this level, estimators are represented by Java classes that take numbers as inputs and return numbers as outputs. One class in this layer is the SoftwareEnterpriseEstimator.

The next layer, called the review session layer, represents a code review via model objects that relate defects to reviewers. SoftwareEnterpriseEstimator is wrapped by ReviewSessionSoftwareEnterpriseEstimator to accept ReviewSession objects. The ReviewSession is composed of ReviewerRecords and DefectRecords. A ReviewerRecord contains the ids of all defects found by a single user. A DefectRecord contains a list of user ids for reviewers who found the same defect.

The last layer, called the Jazz integration layer, uses Jazz services and objects like work items to implement the code review in Jazz. This is where the heavy lifting is done to implement the process. The BasicWorkItemService applies to work
items in general. It can create new work items and links, find related work items, and look up attribute values, states, and resolutions for work items. The DefectEstimationService interfaces with the BasicWorkItemService to handle the DEC custom work items. It will create all the work items needed for a defect estimation session. The DefectEstimationListenerTask is an asynchronous task that runs in the background on the server. It periodically checks the states of the work items to assist in the defect estimation process. If all Review Child items are closed, it will move the Review Parent from In Progress to Discussion. When a Review Parent is moved to the Estimation state, this listener task will perform the calculation and update the Review Parent with a summary or else an error message indicating what a user needs to change to perform a calculation. The listener task can be configured through the Jazz web interface, under Advanced Properties. It has two properties. The first property known as the "taskDelay" indicates how often the task should be run. The second property is the "estimatorClassname". This is where the defect estimation process can be updated to use a different CRM estimator. There is also a class called DefectEstimationReviewIdSynchAdvisor. This advisor runs as a precondition before a Review Child is closed. It will update the Review Issue children to have the same review id as the Review Child. This eliminates the need for reviewers to manually fill in the review id for each Review Issue that they create.
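The listener behaviour described above can be sketched as two small decisions: whether to advance the Review Parent, and what to write back once it reaches the Estimation state. The code below is a simplified illustration under stated assumptions. The `Estimator` interface, `RatioEstimator`, and every method name are hypothetical, no Jazz API is used, and only the state names and the idea of selecting an estimator by class name come from the text.

```java
import java.util.List;

/** Sketch of the listener task's decisions; all types here are hypothetical, not the DEC's. */
final class ListenerTaskSketch {

    /** Review Parent states named in the text. */
    enum ParentState { IN_PROGRESS, DISCUSSION, ESTIMATION }

    /** Assumed estimator contract; the DEC's real interface is not shown in the paper. */
    interface Estimator {
        double estimate(int a, int b, int c);
    }

    /** Example estimator applying Equation (2): N = (A * B) / C. */
    static final class RatioEstimator implements Estimator {
        @Override
        public double estimate(int a, int b, int c) {
            if (c <= 0) {
                throw new IllegalArgumentException("reviewers found no defects in common");
            }
            return (double) a * b / c;
        }
    }

    /** One polling pass: move to Discussion once every Review Child is closed. */
    static ParentState advance(ParentState current, List<Boolean> childClosed) {
        boolean allClosed = !childClosed.isEmpty() && !childClosed.contains(Boolean.FALSE);
        return (current == ParentState.IN_PROGRESS && allClosed) ? ParentState.DISCUSSION : current;
    }

    /** Instantiates the estimator named by a setting such as "estimatorClassname". */
    static Estimator load(String className) throws ReflectiveOperationException {
        return (Estimator) Class.forName(className).getDeclaredConstructor().newInstance();
    }

    /** Produces the text saved on the Review Parent: a summary, or an error message. */
    static String summarize(ParentState state, String estimatorClassname, int a, int b, int c) {
        if (state != ParentState.ESTIMATION) {
            return "No calculation: Review Parent is not in the Estimation state";
        }
        try {
            return "Estimated total defects: " + load(estimatorClassname).estimate(a, b, c);
        } catch (ReflectiveOperationException | RuntimeException e) {
            return "Error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(advance(ParentState.IN_PROGRESS, List.of(true, true)));  // DISCUSSION
        System.out.println(advance(ParentState.IN_PROGRESS, List.of(true, false))); // IN_PROGRESS
        String name = RatioEstimator.class.getName();
        System.out.println(summarize(ParentState.ESTIMATION, name, 4, 4, 2)); // Estimated total defects: 8.0
        System.out.println(summarize(ParentState.ESTIMATION, name, 4, 4, 0)); // Error: reviewers found no defects in common
    }
}
```

Loading the estimator by class name keeps the calculation swappable through configuration alone, which matches the stated intent of the "estimatorClassname" property. Turning failures into a message rather than an exception mirrors the described behaviour of updating the Review Parent with an error the user can act on.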
4. Validation

The DEC tool is a general purpose tool intended for use by software development professionals adopting the Jazz platform. In the present context, however, the tool was implemented and deployed in an academic setting in the context of a capstone project experience known as the Software Enterprise. In the Software Enterprise, upperclassmen and graduate students work in a cyclical pedagogical model where domain concepts are reinforced through problem-centered learning and project-based practice. A full presentation of the Software Enterprise may be found in [17] or [18]; for the purpose of evaluating the DEC tool, it is merely important to recognize that juniors and seniors were asked to learn the tool in this pedagogical context. Students were introduced to CRM by reading Schofield [10], and performed a practice exercise (approximately 3 hours) where they applied CRM in the small via both pencil-and-paper and the DEC tool. They were then asked to perform defect estimation on their capstone projects using the DEC tool, and reflect on the experience. At the end of the semester, we conducted an ethnographic validation exercise.

The validation task was to evaluate the ease of use of DEC for the intended audience by conducting a structured exercise to solicit feedback. The structured exercise consisted of groups of three students. The students had a working knowledge of
Jazz and a basic understanding of code reviews and CRM. Each individual was given a code review check sheet and an instruction guide on using the DEC. They were also given a sample Java file containing several defects. The reviewers each had a computer with the standard Rational Team Concert Eclipse client installed. The Eclipse client had been set up with the client feature of the DEC. In addition, the computers were equipped with microphones and Cam Studio. Cam Studio is freely available software that records user interaction through screen shots and audio input. Cam Studio saves the recordings as AVI files [19]. The server feature of the DEC had been installed on a Jazz Team Server and a test project was set up with the defect estimation process configuration.

The students were asked to perform a code review on the Java source file using the new Jazz component. Their interactions and any feedback or questions were recorded for later analysis. Usability would be considered a success if the students were able to navigate the DEC and arrive at an estimate of the total defects at the end of the experiment by use of the user guide. If the users ended with an error condition or required significant intervention by an observer, then the DEC would need to be improved.

For the structured exercise, a total of six students participated. The first group was not given a specific time limit to analyze the source code provided. As a result, the exercise spent more time on the students' ability to review code than on their interaction with the Defect Estimation Component. For the second group, a limit of 15 minutes was given to students to find defects in the source code. The second group was able to complete the exercise more quickly. Both groups were able to complete the defect estimation successfully by obtaining calculations at the end instead of an error condition. During the experiment, the students asked several questions or provided comments.
The questions and comments were grouped into three areas (see Table 2). Some feedback was related to experiment setup. Other feedback was related to the clarity of the user guide. There was also feedback associated with how the component itself worked. The feedback from the students was very helpful. Some suggestions have already been applied to the Defect Estimation Component. For example, one student suggested that it was tedious to have to copy the review id to each Review Issue work item. It could also lead to errors if a typo was made. The component was revised to use a Precondition in Jazz that automatically synchronized the review id of the Review Issue work items with their parent Review Child work item. The feedback also resulted in a revision of the user guide. Suggestions that could not yet be implemented are presented as future work. The result of validation was that the Defect Estimation Component was verified to meet necessary requirements. Other “nice to have” features could be added as enhancements in future releases. The structured exercise showed that students familiar with code review and CRM understood how to use the Defect Estimation Component. The students provided feedback that helped
to improve the clarity of the user guide. The feedback also resulted in enhancements to make the DEC easier to use. The improvements will make the DEC a better candidate for classroom use.

Although the initial application was to provide a tool for students, the DEC tool was designed to be applicable to a larger audience. Additional validation is planned by providing the DEC to the Jazz community to gain additional feedback. The Jazz community consists of users and developers, many of whom use Jazz regularly to manage real world projects.

Table 2. Feedback from the student validation experiment

Category                      Question/Comment
Experimental Setup            Will we have work items created for us?
                              Are we only reviewing a single file?
                              Do we get the file from the team stream?
                              When finding duplicates during the team discussion, do we go through work items or through the checksheet?
User Guide                    How do we set the resolution for an invalid Review issue?
                              What value should we assign for a Review id?
                              Does the “owned by” attribute need to be entered for Review issues for the calculation?
                              Should one person or multiple people mark the review issues as duplicates?
                              We can only make a review issue a duplicate of a single review issue. Should each duplicate issue point to each other?
                              I don’t see the Review issue work item type
                              How do we link multiple Review issues when we have more than 2?
                              Do we close the Review Child after we are finished reviewing the code?
                              Do the review issues need to be closed during the individual review?
Defect Estimation Component   Is it possible to automatically populate the review id on the Review Issues work item? It is tedious and a user could have a typo entering it manually.
                              What should a user do if an issue spans multiple lines?
                              Finding duplicates of the Review Issues is a terribly tedious process. Is there an easier way?
5. Conclusions

The Defect Estimation Component leverages the services of Jazz to provide a solution that assists users in document reviews and defect estimation calculations using Capture-Recapture. There are solutions that assist in code reviews (like Jupiter) and solutions that provide an automated calculation of Capture-Recapture (like 2CAPTURE), but there is no solution that does both. The primary goal of the Defect Estimation Component was to assist students in learning about defect estimation and CRM by making it easy to apply to their software projects. To support validation of the DEC, a structured exercise was conducted to see how easily a defect estimation session could be completed by students with some knowledge of CRM and Rational Team Concert. The initial results seem positive.

The Defect Estimation Component is not limited to academic use. It can be implemented by any organization already using the Jazz platform. It was designed to support multiple CRM estimators beyond the one provided in the Schofield
paper [10]. Also, because Jazz provides the ability to customize processes and work items, the DEC can be modified to suit a development team’s specific code review process. Additional validation will be performed by providing the DEC to the Jazz community for more comprehensive evaluation. This is expected to provide more feedback, which can then be applied to future versions of the DEC.

There are still plenty of opportunities to build upon the DEC. Some enhancements were suggested initially and others came as a result of user feedback. Feedback from the structured exercise included two questions that have not yet been addressed. The first question was “Is there an easier way to find duplicate Review Issues?” During the exercise, the students were given the suggestion to use the “find potential duplicates” button that is located on the Links tab of a work item. One possible remedy would be to create a custom query using Jazz’s query builder that examines the attributes of different Review Issue work items. The other question was “How do I represent a defect that spans multiple lines?” The current configuration of a Review Issue only has a single line number attribute. It would be fairly easy to add another line number attribute. More structured exercises should be conducted to see if that is an acceptable solution.

Another way to add more value to the DEC would be to implement a visual representation of the estimated defects vs. the actual defects over time. A history of the estimates and actual defects found would provide a simple view of project health. It would also be beneficial to project managers, as they could refine the defect estimation process to be more accurate.

The DEC was designed to be flexible so that it could accommodate different CRM estimators. Due to time constraints, only one estimator was developed. By implementing other estimators, the DEC would allow users to relax some assumptions and adjust the defect estimation to provide better estimates when conditions change, such as the number of reviewers, or when source code may contain a mix of simple and complex defects.

Another beneficial enhancement would be to automate the process of converting a Review Issue work item into a Jazz Defect work item. Currently this process has to be done manually. If there were a way for a user to define and save a mapping between the states, resolutions, and attributes, then a service on Jazz could use the mapping to convert the work items.

Defect estimation, though around for some time, is only now starting to gain mainstream adoption in software development houses. The availability of tools is always a primary factor driving adoption, particularly given the popularity of modern IDEs. The DEC tool implements a simple but powerful online, asynchronous, and collaborative environment for integrated code reviews and CRM-based defect estimation.
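The idea of supporting several interchangeable CRM estimators can be illustrated with a short sketch. It is not the DEC's implementation and does not use the Jazz API. It assumes two reviewers, assumes duplicates have already been merged so that each defect has a single key, and uses two textbook closed-population estimators (Lincoln-Petersen and Chapman's bias-corrected variant) purely as examples; they are not claimed to be the estimator the DEC ships with. All function and variable names are hypothetical.

```python
from typing import Callable, Dict, Set

# An estimator maps (n1, n2, m) to an estimated total number of defects:
# n1 and n2 are the defects found by each reviewer, m is the number found by both.
Estimator = Callable[[int, int, int], float]


def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Two-sample estimate N = n1 * n2 / m; undefined when there is no overlap."""
    if m == 0:
        raise ValueError("no overlap between reviewers; estimate is undefined")
    return n1 * n2 / m


def chapman(n1: int, n2: int, m: int) -> float:
    """Chapman's variant N = (n1 + 1)(n2 + 1)/(m + 1) - 1; defined even when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Registry that lets a caller choose an estimator by name.
ESTIMATORS: Dict[str, Estimator] = {
    "lincoln-petersen": lincoln_petersen,
    "chapman": chapman,
}


def estimate_defects(found_a: Set[str], found_b: Set[str], name: str) -> dict:
    """Estimate total and remaining defects from two reviewers' findings."""
    overlap = len(found_a & found_b)
    total = ESTIMATORS[name](len(found_a), len(found_b), overlap)
    found = len(found_a | found_b)
    return {
        "estimated_total": total,
        "found": found,
        "estimated_remaining": max(total - found, 0.0),
    }


if __name__ == "__main__":
    reviewer_a = {"d1", "d2", "d3", "d4"}
    reviewer_b = {"d3", "d4", "d5"}
    for estimator_name in ESTIMATORS:
        print(estimator_name, estimate_defects(reviewer_a, reviewer_b, estimator_name))
```

With four defects from one reviewer, three from the other, and two in common, the Lincoln-Petersen estimate is 4 × 3 / 2 = 6 total defects, one more than the five distinct defects found. Keeping estimators behind a common signature means a new one can be registered without changing the code that collects review data.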
Acknowledgement

This work was supported by an IBM Eclipse Innovation Foundation Award.
References

[1] R. Lloyd. “Metric Mishap Caused Loss of NASA Orbiter.” http://www.cnn.com/TECH/space/9909/30/mars.metric.02. September 2009.
[2] M. Williams. “Toyota Prius Software Glitch Forces Global Recall.” http://www.computerworlduk.com/toolbox/softwarequalitytesting/quality-assurance/news/index.cfm?newsid=18732. Feb 2010.
[3] D. Gage, J. McCormick, and B. R. Thayer, “‘We Did Nothing Wrong’ Why Software Quality Matters.” http://www.iaea.org/NewsCenter/Features/Radiotherapy/dissection109.pdf. March 2004.
[4] W. S. Humphrey, “Why Quality Pays,” Computerworld, vol. 36, pp. 48–50, May 2002.
[5] A. Chao, “An Overview of Closed Capture-Recapture Models,” Journal of Agricultural, Biological, and Environmental Statistics, vol. 6, no. 2, pp. 158–175, 2001.
[6] T. Thelin and P. Runeson, “Confidence Intervals for Capture-Recapture Estimations in Software Inspections,” Information and Software Technology, vol. 44, no. 12, pp. 683–702, Sept 2002.
[7] IBM. “Rational Team Concert Features.” http://jazz.net/projects/rational-team-concert/features. April 2010.
[8] H. Petersson, T. Thelin, P. Runeson, and C. Wohlin, “Capture-Recapture in Software Inspections after 10 Years Research, Theory, Evaluation and Application,” Journal of Systems and Software, vol. 72, no. 2, pp. 249–264, Jul 2004.
[9] L. C. Briand, K. E. Emam, B. G. Freimut, and O. Laitenberger, “A Comprehensive Evaluation of Capture-Recapture Models for Estimating Software Defect Content,” IEEE Transactions on Software Engineering, vol. 26, pp. 518–540, 2000.
[10] J. Schofield, “Beyond Defect Removal: Latent Defect Estimation With Capture-Recapture Method,” CrossTalk: The Journal of Defense Software Engineering. August 2007.
[11] T. Yamashita, H. Kou, and J. A. Sakuda. “Jupiter User Guide.” http://code.google.com/p/jupitereclipseplugin/wiki/UserGuide. June 2009.
[12] A. Chao, “Estimating population size for sparse data in capture–recapture experiments,” Biometrics, vol. 45, pp. 427–438, 1989.
[13] E. Rexstad and K. Burnham. “User’s Guide for Interactive Program CAPTURE.” http://www.mbrpwrc.usgs.gov/software/doc/capture/capture.htm
[14] R. K. Colwell. “EstimateS: Statistical Estimation of Species Richness and Shared Species from Samples. Version 8.2. User’s Guide.” http://purl.oclc.org/estimates. 2009.
[15] IBM. “Work Item Customization.” http://jazz.net/library/article/129. June 2009.
[16] E. Clayberg and D. Rubel. Eclipse: Building Commercial-Quality Plug-ins. Boston, MA: Addison-Wesley, 2004.
[17] K. Gary. “The Software Enterprise: Practicing Best Practices in Software Engineering Education,” The International Journal of Engineering Education, Special Issue on Trends in Software Engineering Education, July 2008.
[18] K. Gary. “The Software Enterprise: Preparing Industry-ready Software Engineers,” Software Engineering: Effective Teaching and Learning Approaches, Ellis, H., Demurjian, S., and Naveda, J.F. (eds.), Idea Group Publishing. October 2008.