1999 International Conference on Software Quality, Cambridge, MA

Toward Quantitative Process Management With Exploratory Data Analysis

Mark C. Paulk
Software Engineering Institute
Carnegie Mellon University
Pittsburgh, PA 15213

(Capability Maturity Model and CMM are registered with the U.S. Patent and Trademark Office. Personal Software Process and PSP are service marks of Carnegie Mellon University. The Software Engineering Institute is a federally funded research and development center sponsored by the U.S. Department of Defense.)

Abstract

The Capability Maturity Model for Software is a model for building organizational capability that has been widely adopted in the software community and beyond. The Software CMM is a five-level model that prescribes process improvement priorities for software organizations. Level 4 in the CMM focuses on using quantitative techniques, particularly statistical techniques, for controlling the software process. In statistical process control terms, this means eliminating assignable (or special) causes of variation. Organizations adopting quantitative management typically begin by "informally stabilizing" their process. This paper describes typical questions and issues associated with the exploratory data analysis involved in initiating quantitative process management.

Introduction

The Capability Maturity Model (CMM) for Software [Paulk95], developed by the Software Engineering Institute (SEI) at Carnegie Mellon University, is a model for building organizational capability that has been widely adopted in the software community and beyond. The Software CMM is a five-level model that describes good engineering and management practices and prescribes improvement priorities for software organizations. The five maturity levels are summarized in Figure 1.

The higher maturity levels in the CMM are based on applying quantitative techniques, particularly statistical techniques [Florac99], to controlling and improving the software process. In statistical process control (SPC) terms, level 4 focuses on removing assignable causes of variation, and level 5 focuses on systematically addressing common causes of variation. This gives the organization the ability to understand the past, control the present, and predict the future – quantitatively. Regardless of the specific tools used (and control charts are implied by SPC), the foundation of levels 4 and 5 is statistical thinking [Hare95], which is based on three fundamental axioms:

• all work is a series of interconnected processes
• all processes are variable
• understanding variation is the basis for management by fact and systematic improvement

The statistical thinking characteristic of a high maturity organization depends on two fundamental principles. First, process data is collected at the "process step" level for real-time process control. This is perhaps the most important single attribute of a level 4 organization – engineers use data to drive technical decision making in real time, thereby maximizing efficiency. Second, and as a direct consequence of statistical thinking, decision making at the process level incorporates an understanding of variation. A wide range of analytic techniques can be used for systematically understanding variation, ranging from simple graphs, such as histograms and bar charts, and statistical formulas, such as the standard deviation, to statistical process control tools, such as XmR charts, u-charts, and beyond. The simplicity of a histogram does not lessen its power – a simple picture that imparts insight is more powerful than a sophisticated formula whose implications are not understood.
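
To make the XmR mechanics concrete, the following sketch (in Python; the paper itself contains no code, so the language choice is an editorial assumption) computes the individuals and moving range limits using the standard constants 2.66 and 3.267 for subgroups of size two. The measurements are the inspection preparation times from Table 1 later in this paper, in whatever units they were recorded.

def xmr_limits(values):
    """Compute XmR (individuals and moving range) centerlines and control
    limits for a series of individual measurements, using E2 = 2.66 and
    D4 = 3.267, the standard constants for moving ranges of size two."""
    mean_x = sum(values) / len(values)
    moving_ranges = [abs(values[i] - values[i - 1]) for i in range(1, len(values))]
    mean_mr = sum(moving_ranges) / len(moving_ranges)
    return {
        "X centerline": mean_x,
        "X UCL": mean_x + 2.66 * mean_mr,   # upper natural process limit
        "X LCL": mean_x - 2.66 * mean_mr,   # lower natural process limit
        "mR centerline": mean_mr,
        "mR UCL": 3.267 * mean_mr,          # upper range limit (the lower limit is 0)
    }

if __name__ == "__main__":
    # Inspection preparation times from Table 1 (units as recorded there).
    prep_times = [7.8, 8.1, 4.4, 2.8, 18.0, 4.7, 3.0, 2.6]
    for name, value in xmr_limits(prep_times).items():
        print(f"{name}: {value:.2f}")
    # Points outside the limits are candidate assignable causes to investigate.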

Level            Focus                                Key Process Areas
5 Optimizing     Continual process improvement        Defect Prevention
                                                      Technology Change Management
                                                      Process Change Management
4 Managed        Product and process quality          Quantitative Process Management
                                                      Software Quality Management
3 Defined        Engineering processes and            Organization Process Focus
                 organizational support               Organization Process Definition
                                                      Training Program
                                                      Integrated Software Management
                                                      Software Product Engineering
                                                      Intergroup Coordination
                                                      Peer Reviews
2 Repeatable     Project management processes         Requirements Management
                                                      Software Project Planning
                                                      Software Project Tracking & Oversight
                                                      Software Subcontract Management
                                                      Software Quality Assurance
                                                      Software Configuration Management
1 Initial        Competent people and heroics         (none)

Figure 1. An overview of the Software CMM.

Although the Software CMM has been extensively used to guide software process improvement, the majority of software organizations are at the lower maturity levels; as of March 1999, of the 807 organizations active in the SEI's assessment database, only 35 were at levels 4 and 5. While the number of high maturity organizations is growing rapidly, it takes time to institutionalize a measurement program and the quantitative management practices that take good advantage of its capabilities. The typical software organization takes over two years to move from level 1 to level 2 and from level 2 to level 3 [Herbsleb97].

One to two years is a reasonable expectation for building, deploying, and refining quantitatively managed processes.

One of the challenges in moving to level 4 is the discovery organizations typically make when looking at their process data: the defined processes used by the projects are not as consistently implemented or measured as believed. When a process is being placed under statistical process control in a rigorous sense, it is "stabilized" by removing assignable causes of variation. "Informal stabilization" occurs simply by examining the data graphically, before it is even placed on a control chart, as patterns suggestive of mixing and stratification become visible.

If there is a great deal of variability in the data – a common complaint when arguing that SPC cannot be applied to the software process [Ould96] – the control limits on a control chart will be wide. High variability has consequences: if the limits are wide, predictability is poor, and highly variable performance should be expected in the future as well. If highly variable performance is unacceptable, then the process will have to be changed; ignoring reality will not change it. Since some studies suggest a 20:1 difference in the performance of programmers, variability is a fact of life in a design-intensive, human-centric process. The impact of a disciplined process can be significant in minimizing variation while improving both quality and productivity, as demonstrated by the Personal Software Process (PSP) [Humphrey95, Hayes97]. Some software organizations are already using control charts appropriately and to good effect [Paulk99a, Paulk99b], so there are examples of SPC providing business value for software.

Informally stabilizing the process can be characterized as an exercise in exploratory data analysis, which is a precursor to the true quantitative management of level 4. The processes that are stabilized first tend to be design, code, and test, since there is usually an adequate amount of inspection and test data to apply statistical techniques in a fairly straightforward manner. The fairly typical subset of code inspection data in Table 1 illustrates what an organization might start with. The organization that provided this data was piloting the use of control charts on a maintenance project.

Table 1. Representative Code Inspection Data From an Organization Beginning Quantitative Management.

Number of    Inspection          Code Inspection Time              Number of    Lines of
Inspectors   Preparation Time    (inspectors x inspection hours)   Defects      Code
    7             7.8                    13.5                          2           54.3
    6             8.1                     9.5                          2           87.4
    5             4.4                     1.3                          0           60.1
    5             2.8                     2.5                          1         1320.6
    6            18.0                     0.9                          2          116.2
    6             4.7                     1.5                          2           46.6
    6             3.0                     3.0                          0          301.6
    5             2.6                     2.5                          3           62.0

If you were asked to analyze the data in Table 1, what questions might you ask? They will probably fall into four broad categories: operational definitions, process consistency, aggregation, and organizational implications.
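
Before asking those questions, a first exploratory pass might simply look at the spread of each measure. The short sketch below is editorial, not from the original paper; it prints the minimum, mean, and maximum of each Table 1 column, which already surfaces the non-integer LOC values and the very wide range of module sizes.

# Exploratory first pass over the Table 1 data: per-column spread.
rows = [
    # (inspectors, prep_time, inspection_time, defects, loc)
    (7, 7.8, 13.5, 2, 54.3),
    (6, 8.1, 9.5, 2, 87.4),
    (5, 4.4, 1.3, 0, 60.1),
    (5, 2.8, 2.5, 1, 1320.6),
    (6, 18.0, 0.9, 2, 116.2),
    (6, 4.7, 1.5, 2, 46.6),
    (6, 3.0, 3.0, 0, 301.6),
    (5, 2.6, 2.5, 3, 62.0),
]

columns = ["inspectors", "prep_time", "inspection_time", "defects", "loc"]
for i, name in enumerate(columns):
    values = [row[i] for row in rows]
    mean = sum(values) / len(values)
    print(f"{name:16s} min={min(values):7.1f}  mean={mean:7.1f}  max={max(values):7.1f}")
# The LOC column spans 46.6 to 1320.6 -- a first hint of mixed module sizes.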

Operational Definitions

Good operational definitions must satisfy two important criteria [Florac99]:

• communication: if someone uses the definition as a basis for measuring or describing a measurement result, will others know precisely what has been measured, how it was measured, and what has been included and excluded?
• repeatability: could others, armed with the definition, repeat the measurements and get essentially the same results?

In looking at the data in Table 1, the first question is likely to be, "How is a line of code defined?" The fact that the LOC values are not integers should raise a flag; indeed, on first hearing that this is a maintenance project, the question should already have arisen, "How do they deal with modified, deleted, and unchanged lines?" In this case, a formula was used to create an aggregate size measure that is a weighted function of new, modified, deleted, and unchanged lines. It is more important to know that the formula exists and is being used consistently than to know what the specific formula is (a hypothetical sketch of such a weighting appears at the end of this section).

The second question might be, "What are the time units?" One panelist at the 1999 European SEPG Conference reported a case where the unit was not clearly communicated, and the team discovered that their time data included both hours and minutes (their pragmatic solution was to assume that every value of 5 or less was hours).

The third question is obviously, "How is a defect defined?" While the first two metrics can be collected in a fairly objective fashion, getting a good operational definition of "defect" can be challenging. Are multiple severity levels included, from life-critical to major to trivial? Are trivial defects even recorded? How does the inspection team determine what category a defect belongs in? Again, the crucial question is whether the data can be collected consistently and repeatably.

A fourth question should be, "Is the data collected at the same point in the process each time?" For example, are code inspections performed before or after a clean compile is obtained? If this is not specified, there may be a mix of compiled/not-compiled inspection data, which will increase variability significantly.
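
The organization's actual size formula is not given in the paper. The sketch below uses a purely hypothetical weighting of new, modified, deleted, and unchanged lines, included only to show how such an aggregate measure naturally produces non-integer "LOC" values; the weights are invented for illustration.

def weighted_loc(new, modified, deleted, unchanged,
                 w_new=1.0, w_mod=0.6, w_del=0.25, w_unch=0.05):
    """Aggregate size measure as a weighted sum of line categories.
    The weights are hypothetical; what matters in practice is that one
    agreed formula is applied consistently across all inspections."""
    return (w_new * new + w_mod * modified +
            w_del * deleted + w_unch * unchanged)

# Example: a maintenance change touching mostly existing code.
print(weighted_loc(new=20, modified=40, deleted=12, unchanged=150))  # 54.5, a non-integer aggregate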

Process Consistency

Even if the data is collected consistently and repeatably, the process itself may vary from one execution to the next. For example, some teams may engage in "pre-reviews" before inspections (to ensure that the inspected code is of acceptable quality – not an unreasonable practice if the number of defects reported in inspections has ever been used in a performance appraisal). This, too, can lead to a mix of pre-reviewed/not pre-reviewed data that will increase variability.

Another panelist at the 1999 European SEPG Conference identified a case where examination of the data revealed two operationally distinct inspection processes. The distinguishing attribute was the size of the work product: if the code module being inspected was larger than about 50 lines of code, the inspection rates were significantly different – even though the same inspection process was supposedly being performed. The important insight is not whether having two operationally different inspections is appropriate, but that the decision should be a conscious one.

In the case of the data in Table 1, two questions immediately arise: "Is it a good idea to have inspections covering this wide a range of code sizes?" and the corollary, "Are the inspection rates for these different sizes of module reasonable?" Different organizations may establish somewhat different guidelines, but 150 LOC per hour is a reasonable target [Fagan86]. A casual examination of the data in Table 1 suggests that some inspection rates are running at greater than 2,000 LOC per hour, which points to a significant process consistency issue.
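
That casual check can be sketched as follows. The code assumes, based on the Table 1 column heading, that the tabulated code inspection time is inspectors multiplied by meeting hours, so the meeting duration is that value divided by the number of inspectors; this interpretation is an assumption rather than something the paper states outright. The 150 LOC per hour guideline is from [Fagan86].

# Sketch: estimate an inspection rate (LOC per meeting hour) for each Table 1 row.
ROWS = [  # (inspectors, code_inspection_time, lines_of_code) from Table 1
    (7, 13.5, 54.3), (6, 9.5, 87.4), (5, 1.3, 60.1), (5, 2.5, 1320.6),
    (6, 0.9, 116.2), (6, 1.5, 46.6), (6, 3.0, 301.6), (5, 2.5, 62.0),
]

GUIDELINE_LOC_PER_HOUR = 150  # reasonable target per [Fagan86]

for inspectors, insp_time, loc in ROWS:
    meeting_hours = insp_time / inspectors      # assumes insp_time = inspectors x hours
    rate = loc / meeting_hours
    note = "  <-- exceeds guideline" if rate > GUIDELINE_LOC_PER_HOUR else ""
    print(f"{loc:7.1f} LOC inspected in {meeting_hours:4.2f} h -> {rate:6.0f} LOC/hour{note}")

Under that reading, the 1320.6 LOC module comes out above 2,000 LOC per hour, consistent with the concern raised above.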

Aggregation

When analyzing process data, there are many potential sources of variation in the process, and it is easy to overlook them when data are aggregated. Common causes of overly aggregated data include [Florac99]:

• poor operational definitions
• inadequate contextual information
• lack of traceability from the data back to its original context
• working with data whose elements are combinations (mixtures) of values from different sources

The predominant source of aggregated data is simply that different work products are produced by different members of the project team. Collecting data on an individual basis would address this, but could have severe consequences, both in terms of motivational use of the data, e.g., during performance appraisals, which can lead to dysfunctional behavior [Austin96], and in terms of the amount of data available for statistical analyses. There are no easy answers to this question.

It is, however, sometimes possible to disaggregate data. For example, defect data could be separated into different categories, and control charts on each category may provide significantly better insight into separate common cause systems [Florac99].
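
As a sketch of that disaggregation, the fragment below tags hypothetical per-inspection defect counts with a category (the categories and counts are invented, not drawn from the paper) and computes separate c-chart limits for each category, so that each common cause system is charted on its own rather than mixed on a single chart.

import math
from collections import defaultdict

# Hypothetical per-inspection defect counts, tagged by category (invented data).
defect_records = [
    ("logic", 3), ("logic", 2), ("logic", 5), ("logic", 4),
    ("interface", 1), ("interface", 0), ("interface", 2), ("interface", 1),
    ("documentation", 7), ("documentation", 9), ("documentation", 6),
]

by_category = defaultdict(list)
for category, count in defect_records:
    by_category[category].append(count)

for category, counts in by_category.items():
    c_bar = sum(counts) / len(counts)               # centerline: mean defects per inspection
    ucl = c_bar + 3 * math.sqrt(c_bar)              # c-chart upper control limit
    lcl = max(0.0, c_bar - 3 * math.sqrt(c_bar))    # lower limit cannot fall below zero
    print(f"{category:14s} c-bar={c_bar:.2f}  LCL={lcl:.2f}  UCL={ucl:.2f}")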

Organizational Implications

In the particular example worked through above, the data was used within a single project. When dealing with organizational data, these problems are exacerbated. In moving between projects, application domains, and customers, operational definitions may be "adjusted" to suit the unique needs of the new environment, so it is crucial to understand the context of the data when making cross-project comparisons. It can be particularly challenging when government regulations or customers demand that data be reported in ways different from how the organization would normally collect it.

Conclusion

This paper provides a simple road map through some of the issues that an analyst must deal with in implementing quantitative process management. As we frequently say about the CMM, this is not rocket science, but it is easy to miss an important point, and it can be quite frustrating at times to work through these issues. These are, however, typical problems that most organizations have to work through on the journey of continual process improvement; "informal stabilization" seems to be a necessary precursor to the useful application of rigorous SPC techniques.

References

Austin96     Robert D. Austin, Measuring and Managing Performance in Organizations, Dorset House Publishing, ISBN 0-932633-36-6, New York, NY, 1996.

Fagan86      M.E. Fagan, "Advances in Software Inspections," IEEE Transactions on Software Engineering, Vol. 12, No. 7, July 1986, pp. 744-751; reprinted in Software Engineering Project Management, R.H. Thayer (ed.), IEEE Computer Society Press, IEEE Catalog No. EH0263-4, 1988, pp. 416-423.

Florac99     William A. Florac and Anita D. Carleton, Measuring the Software Process: Statistical Process Control for Software Process Improvement, ISBN 0-201-60444-2, Addison-Wesley, Reading, MA, 1999.

Hare95       Lynne B. Hare, Roger W. Hoerl, John D. Hromi, and Ronald D. Snee, "The Role of Statistical Thinking in Management," ASQC Quality Progress, Vol. 28, No. 2, February 1995, pp. 53-60.

Hayes97      Will Hayes and James W. Over, "The Personal Software Process (PSP): An Empirical Study of the Impact of PSP on Individual Engineers," Software Engineering Institute, Carnegie Mellon University, CMU/SEI-97-TR-001, December 1997.

Herbsleb97   James Herbsleb, David Zubrow, Dennis Goldenson, Will Hayes, and Mark Paulk, "Software Quality and the Capability Maturity Model," Communications of the ACM, Vol. 40, No. 6, June 1997, pp. 30-40.

Humphrey95   Watts S. Humphrey, A Discipline for Software Engineering, ISBN 0-201-54610-8, Addison-Wesley Publishing Company, Reading, MA, 1995.

Ould96       Martyn A. Ould, "CMM and ISO 9001," Software Process: Improvement and Practice, Vol. 2, Issue 4, December 1996, pp. 281-289.

Paulk95      Carnegie Mellon University, Software Engineering Institute (Principal Contributors and Editors: Mark C. Paulk, Charles V. Weber, Bill Curtis, and Mary Beth Chrissis), The Capability Maturity Model: Guidelines for Improving the Software Process, ISBN 0-201-54664-7, Addison-Wesley Publishing Company, Reading, MA, 1995.

Paulk99a     Mark C. Paulk, "Practices of High Maturity Organizations," The 11th Software Engineering Process Group (SEPG) Conference, Atlanta, Georgia, 8-11 March 1999.

Paulk99b     Mark C. Paulk, "Using the Software CMM With Good Judgment," ASQ Software Quality Professional, Vol. 1, No. 3, June 1999, pp. 19-29.
