The Journal of Systems and Software 57 (2001) 1–7
www.elsevier.com/locate/jss
A note on the evolution of software engineering practices

David E. Drehmer, Sasa M. Dekleva *

Kellstadt Graduate School of Business, DePaul University, 1 E. Jackson Blvd., Chicago, IL 60604-2287, USA

Received 7 October 1999; received in revised form 5 May 2000; accepted 7 August 2000

* Corresponding author. Tel.: +1-312-362-6789; fax: +1-312-362-6208. E-mail address: [email protected] (S.M. Dekleva).
Abstract

This paper describes how the software development process naturally evolves. Items taken from an early version of the SEI's CMM questionnaire are shown to make up a linear scale that describes how software development and management practices are introduced in a software production organization. Analysis of this scale shows that when software practices are inadequately or incompletely implemented, they are refined in several iterations. This observation implies that such repetitive refinements of software practices can be anticipated, that it may be possible to correct inadequate or incomplete initial implementations, and that the cycle between refinement steps can be shortened, thus accelerating software process evolution. © 2001 Elsevier Science Inc. All rights reserved.

Keywords: Software process evolution; Software engineering practices; Software process maturity model; Software development
1. Introduction

Concerns about the inability of software developers to consistently and efficiently deliver quality software have stimulated investigations of software development processes. Researchers have recognized that the succession in which software development practices are implemented is not optimal. Several prescriptive models describing ways to improve the software development process have been proposed. One such model is the software engineering process maturity model, developed by the Software Engineering Institute (SEI) at Carnegie Mellon University (Humphrey, 1988). The model suggests which practices should be used and in what sequence they should be implemented to progress gradually on a software process maturity scale. The SEI model identifies five maturity levels, labeled initial, repeatable, defined, managed, and optimized, and is an adaptation of the quality management maturity grid (Humphrey, 1988). SEI has asserted that a higher maturity level is associated with lower risk, higher productivity, and higher quality of the software development and maintenance process (Humphrey et al., 1989). Interestingly enough, the evolution of the software development process as it naturally occurs in practice had not been defined as a measurable variable until it
was calibrated (Dekleva and Drehmer, 1997). If the SEI model recognizes software process maturity as a categorical variable, crude as it is, can software process evolution also be defined as a continuous linear variable? This paper first presents a method for converting observations into a linear variable describing software process evolution. It then describes what can be learned from the analysis of software process evolution. The paper finally suggests that software process evolution can and should be improved and that efficient interventions can be designed to help practicing software developers lower the risk, increase the productivity, and improve the quality of their software development and maintenance process.
2. From observation to measurement

Wright and Linacre (1989) assert that all observations begin as nominal or, at best, ordinal data. They remind us that "quantitative science begins with identifying conditions and events which, when observed, are deemed worth counting. This counting is the beginning of quantification. Measurement is deduced from well-defined sets of counts. The most elementary level is to count the presence, '1,' or absence, '0,' of the defined condition or event." All classifications, such as the SEI maturity model, are qualitative and, in this case, ordered. As Merbitz and colleagues emphasize, such ordered classes say nothing
about distances between them and thus do not represent a measure (Merbitz et al., 1989). A "measure" is a value that can be meaningfully added, subtracted, multiplied, and divided. We therefore need to progress from counting observations to measurement. Georg Rasch developed a complete solution (Rasch, 1960/1980). It was later shown that Rasch's solution is not only sufficient but also necessary for the construction of measures in any science. Detailed, elementary explanations of why, when, and how to apply Rasch's idea to dichotomous (right/wrong, yes/no, present/absent) data are provided by Wright and Stone (1979). The extension of Rasch's solution to rating scales and other observations embedded in ordered categories is developed and explained in Wright and Masters (1982) and extended to multifaceted conjoint measurement by Linacre (1989).

Rasch analysis includes the examination of the initial data for the possibility of a single latent variable along which software development practices can be calibrated and software development organizations can be measured. When we conceptualize mapping both software practices and organizations onto the same continuum, it becomes immediately apparent which software engineering practices an organization at any particular point of evolution would be expected to use and which not. Fig. 1 shows three hypothetical software practices (p_1, p_2, and p_3) located on the evolution continuum from early to late. Similarly, organizations can be located in terms of their evolutionary development on this same continuum, progressing from simple to sophisticated. Fig. 1 shows two hypothetical organizations, one primitive and one advanced. Conceptually, the primitive organization depicted in Fig. 1 would be expected to be using software practice p_1 and not practices p_2 and p_3. Similarly, the advanced organization in Fig. 1 would be expected to have implemented practices p_1 and p_2 but not yet practice p_3.

Fig. 1. A hypothetical example of organizations and software engineering practices.

When practices can be calibrated from early to late, any organization can be located along that continuum in such a way that its position determines which practices are expected to have been implemented and which are not. The Rasch model specifies, within stochastic limits, that the probability of any particular organization using a software engineering practice is a simple function of
the sophistication of the organization and the developmental level of the software engineering practice. Specifically, a highly developed organization would have a high probability of using simple software engineering practices. Similarly, organizations stalled in the use of early software engineering practices would not be using highly advanced practices. The Rasch model (Wright and Stone, 1979) tests whether this conceptualization describes, in a probabilistic sense, the actual simultaneous positioning of both organizations and software engineering practices. The expected probability of an organization using a practice is given by Eq. (1):

P(X_ij = 1 | p_i, o_j) = e^(o_j - p_i) / (1 + e^(o_j - p_i)),    (1)

where p_i is the calibration of the ith software engineering practice, o_j is the measure of the jth organization, and X_ij is the response of organization j to practice i, with 1 representing used and 0 representing not used.

When the responses of an organization to the use of software engineering practices are ordered according to the best fit to the Rasch model given in Eq. (1), a pattern similar to that displayed in Fig. 2 may be seen. Three distinct zones may be conceptualized. The zone of endorsement is composed of the practices generally used by the organization; the zone of rejection is identified by practices not used. An intermediate zone of transition extends over an area where practices are being introduced. A narrow zone of transition indicates a precise assessment of the location of the organization. Fit can be assessed by the consistency of an organization's zones of endorsement and rejection.

Fig. 2. Response pattern zones.

A concise presentation of the Rasch model can be found in Wright and Stone (1979); the model is implemented in BIGSTEPS, available from the MESA statistical laboratory at the University of Chicago.
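As a concrete illustration of Eq. (1), the short sketch below computes the expected probability of use for hypothetical practices and organizations like those in Fig. 1 and assigns rough zone labels in the spirit of Fig. 2. It is a minimal sketch under our own assumptions: the logit values, the 0.25/0.75 zone cut-offs, and all names are illustrative, not calibrations from this study.

```python
import math

def p_use(org_measure: float, practice_calibration: float) -> float:
    """Eq. (1): probability that an organization uses a practice.

    Both arguments live on the same logit scale: a higher org_measure
    means a more evolved organization, and a higher practice_calibration
    means a later-evolving practice.
    """
    logit = org_measure - practice_calibration
    return math.exp(logit) / (1.0 + math.exp(logit))

# Hypothetical calibrations (in logits) for the practices p1 < p2 < p3
# of Fig. 1; all numbers are illustrative assumptions.
practices = {"p1": -2.5, "p2": 0.0, "p3": 3.0}
organizations = {"primitive": -1.0, "advanced": 1.5}

for org, measure in organizations.items():
    for name, calibration in practices.items():
        prob = p_use(measure, calibration)
        # Crude cut-offs standing in for the zones of Fig. 2.
        zone = ("endorsement" if prob > 0.75 else
                "rejection" if prob < 0.25 else
                "transition")
        print(f"{org:9s} {name}: P(use) = {prob:.2f}  ({zone})")
```

Note that the model is probabilistic: the primitive organization is not forbidden from using p_2; it is merely more likely not to, which is exactly what the zones of endorsement, transition, and rejection express.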
3. Subjects
Two subject groups were asked to participate in this investigation. The first group consisted of 44 software engineering practitioners who attended a meeting and conference of the Software Maintenance Association. The second sample was composed of 39 software
engineering practitioners who attended the same conference one year later. These practitioners reported an average of 15 years of experience in software development and maintenance.

4. Procedure

Each subject was presented with questions asking whether a particular software engineering practice was used in his or her organization. The questions were part of the 1987 SEI maturity questionnaire (Humphrey and Sweet, 1987) and are shown in Fig. 3. The first sample answered the 38 questions described as "the key questions" or "asterisked questions" used by SEI to determine software engineering maturity levels two through four (Humphrey and Sweet, 1987). A Rasch calibration yielded 22 software engineering practices that fit the model according to standard statistical criteria; these are reported in Dekleva and Drehmer (1997). The second sample of respondents was presented with the same 38 questions and an additional 23 questions taken from the same source and covering the same levels of SEI maturity. Calibrations for these 23 new practices were assessed by anchoring the 33 previously scaled software engineering practices to the values determined by Dekleva and Drehmer (1997). Four of the 23 new practices were discarded for lack of fit to the Rasch model.

5. Results

Fig. 3 displays the software engineering practices in calibration order, with software process evolution progressing from the earliest stage at the bottom toward the latest stage at the top. The 33 practices used in the Dekleva and Drehmer (1997) scaling are shown on the right side of the scale and the 19 newly scaled practices on the left. Not only do these practices fit the Rasch model, but their positioning also makes sense conceptually and confirms the framework hypothesized by Dekleva and Drehmer (1997).

Fig. 3. Evolution map of new and originally calibrated software engineering practices.

The focus of this paper is on the calibration of practices, not on the measurement of organizations. However, if an organization is positioned at any point on the scale, it would be expected to be more likely than not to use the practices positioned below that point. Conversely, it would be more likely than not that the organization does not use the practices located above that point on the scale.
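Returning to the anchoring step of Section 4, the sketch below shows one way such an anchored calibration could be carried out: organization measures and the calibrations of un-anchored practices are estimated by gradient ascent on the likelihood implied by Eq. (1). This is a didactic sketch under our own assumptions; the study itself used BIGSTEPS, which also supplies the fit statistics used to discard misfitting practices, and every name and number below is illustrative.

```python
import numpy as np

def rasch_prob(o: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Eq. (1) evaluated for all organization/practice pairs at once."""
    logit = o[:, None] - p[None, :]           # organizations x practices
    return 1.0 / (1.0 + np.exp(-logit))

def calibrate_anchored(X, p_anchor, n_iter=500, lr=0.1):
    """Estimate organization measures and free practice calibrations.

    X        : 0/1 response matrix, organizations x practices.
    p_anchor : anchor values in logits; np.nan marks practices whose
               calibration is to be estimated. Anchoring fixes the
               origin of the scale, which is otherwise indeterminate.
    """
    free = np.isnan(p_anchor)                 # practices left free
    p = np.where(free, 0.0, p_anchor)         # start free practices at 0
    o = np.zeros(X.shape[0])                  # start organizations at 0
    for _ in range(n_iter):
        resid = X - rasch_prob(o, p)          # observed minus expected
        o += lr * resid.sum(axis=1)           # ascend org-measure gradient
        p -= lr * np.where(free, resid.sum(axis=0), 0.0)  # free items only
    return o, p

# Hypothetical demo: 6 organizations x 4 practices, first two anchored.
# (Organizations answering all-yes or all-no have no finite estimate
# and would normally be set aside.)
X = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [1, 0, 0, 0],
              [1, 1, 0, 1],
              [0, 1, 0, 0],
              [1, 1, 1, 0]])
p_anchor = np.array([-1.0, 0.0, np.nan, np.nan])
o_hat, p_hat = calibrate_anchored(X, p_anchor)
print("organization measures: ", np.round(o_hat, 2))
print("practice calibrations: ", np.round(p_hat, 2))
```

A production calibration would add convergence checks and, crucially, the item-fit statistics by which the four misfitting practices were identified; none of that is shown here.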
If one examines the continuity of the practices along the scale, one might speculate that they represent a continuous progression. These practices represent a meaningful growth sequence in the evolution of the software development or maintenance process; in other words, practices located in the same region of the scale share a similar function. Even though the transition is continuous, the authors delineated arbitrary thresholds to better label the stages of evolution. Clusters of related software practices form the evolutionary stages marked A to G below and in Fig. 3.

Stage A – Reviews and change control. The establishment of configuration management and change control, of code and design reviews, and of the tracking of trouble reports from testing characterizes this stage of software process evolution. Configuration management, one of the two newly added practices, confirms this stage. Tracking test trouble reports represents an addition to this category.

Stage B – Standard process and project management. The second evolutionary stage focuses on standardization of the software development process (methodology) and on project management. Soon after the standardized software development process is implemented and managers start to sign off on and review each project before making contractual commitments, formal scheduling and estimating procedures are implemented. Four new practices have been scaled into this stage. One regards the comparison of technologies used by the organization with those available externally, and pairs with the practice related to the managed introduction of new technologies. Two new practices require the use of mechanisms for ensuring the traceability of software requirements and design; one would expect a standard development process to contain such provisions. The fourth new practice calls for regular technical interchanges with the customer.

Stage C – Review management and configuration control. This stage brings a higher level of formalism to management and documentation. Review data are analyzed, action items are tracked to closure, code review standards are applied, configuration control is used for each project, and a formal software size estimation procedure is used. The new practices involve the use of computerized tools for establishing a software development library, mechanisms for making sure that the design teams understand the requirements, standard content for deliverables, and formal test case reviews.

Stage D – Software process improvement. This stage represents the endpoint of the middle evolution stages. The process is systematically being improved, the analysis of errors is extended to identify related process inadequacies, and compliance with the software process standards is ensured. The two new items that have been scaled into this stage are again related to the others in this category: one requires the application of standards to the preparation of unit test cases, and the other requires that the standard process documentation describe the use of tools and techniques. The standard
development and maintenance process is not only constantly improved, but its application is also ensured. In addition, the scope of standardization is extended to include the preparation of test cases and the documented use of tools and techniques.

Stage E – Management of review and test coverage. Stage E is the first of the stages in which advanced engineering and management practices are gradually introduced. The two new practices require assessments of existing designs and code for reuse, and the continuous maintenance of data on planned and actual software units completing testing. The measurement and recording of design and code review coverage and of test coverage for each phase of functional testing are also introduced at this stage.

Stage F – Analysis of measurements. Data accumulated at the lower stages enable the analysis of review efficiency, while an errors database enables determination of the likely distribution of remaining defects. The adequacy of regression testing is assured, and the scope of measurements is further expanded to include the maintenance of software size profiles for each configuration. The two new practices add the requirement for independent audits of each step of the development process and the maintenance of data on planned and actual software units designed.

Stage G – Advanced practices. Some of the practices at this level are beyond the evolution of most organizations in our sample. Required training for various roles appears no fewer than four times. It comes as a surprise that an organization needs to progress this far in its software process evolution before recognizing the need for training; perhaps the analysis at the preceding level helps managers appreciate this need. Quality assurance practices are further extended. Code, test, and design errors are projected and compared with actual observations. A process measurements database is established at this level, and data are gathered for all projects. One of the new practices calls for independent audits of the software process for each project, and the remaining two new practices are related to training programs. It seems that this stage prepares an organization to climb to the process optimization stage of evolution and possibly to higher stages not investigated in this study.

6. Discussion

The software process evolution scale described here reflects the way software development and management practices are introduced in the industry and how their use evolves "naturally". What can be observed, stepping back from this model, is a meta-model in which individual practices seem to have their own evolutionary pattern. After a practice is introduced, it is refined and extended, standardized, and enforced; its results are measured, the measures are analyzed, and finally its users are trained. For example,
consider the practice of reviewing code. It is first introduced in Stage A, at the lowest stage of software process evolution. The outcomes of code reviews are action items. Creating action items alone does not solve the problem; finding a way to track the action items resulting from code reviews to their closure represents an evolutionary step forward. Simultaneously, code review standards are introduced. After the reviews are established as effective, their coverage is measured and recorded to ensure that code reviews are universally applied. The next step in the progression is to analyze review efficiency for each project and to analyze error data from code reviews to determine the likely distribution and characteristics of the remaining errors. Paradoxically, organizations recognize the need for formal training of code review leaders only at this late stage.

Other similar stories can be extracted from the software process evolution scale regarding size and cost estimation, testing, process standardization, project data recording and analysis, and so on. It appears that, in general, evolution is reactive: it occurs to correct the inadequacies of a prior implementation of a software engineering practice. These examples lead us to believe that efficient interventions can be designed to compress software process evolution. Rather than practices being abandoned, new layers of practice seem to be added to support the original practice. This repetitive process of adding new methods to support older methods seems to account for a substantial number of software engineering process improvement strategies. Iterative improvement, or kaizen, in production processes is not new, and its recognition has become a fundamental part of the quality movement in recent decades.

There are two important implications of this observation. First, the regularity with which this repetitive refinement process occurs may make it possible to estimate when the next refinement will be found and implemented. Such an analysis is beyond the scope of this paper, but calibrations of evolutionary development such as those provided here, with invariant measurement properties (except for origin and scale), are essential if such models are to be constructed. The second implication of defining the micro-evolution of software engineering practices is the possibility of anticipating the kinds of change and evolution that any new practice will undergo. It appears that successive practices often correct an inadequate or incomplete initial implementation. Within this framework of thinking, it might be possible to shorten the cycle between the refinement steps that transform the basic process. The questions to be asked are simple and obvious: "What changes will be necessary (useful) to support this process improvement? When those changes are made and those anticipated needs are satisfied, then what new changes will be necessary (useful) to support this process improvement?" While
forecasting the processes that will be needed in the future is beyond the scope of this paper, it appears that the necessary support structure may already be defined. This meta-support structure appears to be: practice introduction; refinement and extension; standardization; enforcement; measurement of results; analysis of measurements; and training of users. When implemented as a whole package, it may eliminate the need for iterative improvement altogether, thus shortening the time to process improvement.
References

Dekleva, S.M., Drehmer, D.E., 1997. Measuring software engineering evolution: A Rasch calibration. Information Systems Research 8 (1), 95–104.
Humphrey, W.S., 1988. Characterizing the software process: A maturity framework. IEEE Software 5 (3), 73–79.
Humphrey, W.S., Kitson, D.H., Kasse, T.C., 1989. The state of software engineering practice: A preliminary report. Report CMU/SEI-89-TR-1, ESD-TR-89-01, Software Engineering Institute, Carnegie Mellon University, Pittsburgh.
Humphrey, W.S., Sweet, W., 1987. A method for assessing the software engineering capability of contractors. Report CMU/SEI-87-TR-23, ADA187230, Software Engineering Institute, Carnegie Mellon University, Pittsburgh.
Linacre, J.M., 1989. Many-Facet Rasch Measurement. MESA Press, Chicago.
Merbitz, C., Morris, J., Grip, J.C., 1989. Ordinal scales and foundations of misinference. Archives of Physical Medicine and Rehabilitation 70 (4), 308–312.
Rasch, G., 1960/1980. Probabilistic Models for Some Intelligence and Attainment Tests. Danish Institute for Educational Research, Copenhagen, and University of Chicago Press, Chicago.
Wright, B.D., Linacre, J.M., 1989. Observations are always ordinal; measurements, however, must be interval. Archives of Physical Medicine and Rehabilitation 70 (12), 857–860 (also available online: http://mesa.spc.uchicago.edu/memo44.htm).
Wright, B.D., Masters, G.N., 1982. Rating Scale Analysis. MESA Press, Chicago.
Wright, B.D., Stone, M.H., 1979. Best Test Design. MESA Press, Chicago.

David E. Drehmer is Associate Professor of Management at the Kellstadt Graduate School of Business at DePaul University in Chicago, Illinois. He received his Ph.D. in psychology from the Illinois Institute of Technology. Currently, Dr. Drehmer is the director of the Performance Enhancement Institute. His primary research interests are applications of Rasch measurement and scaling, and creativity and innovation in the workplace. He has authored over 100 papers in refereed journals, technical reports, and conference presentations.

Sasa M. Dekleva is Associate Professor of Information Systems at the Kellstadt Graduate School of Business at DePaul University in Chicago, Illinois. He earned his Ph.D. in information systems from the University of Belgrade. Before coming to DePaul University, he spent 10 years in industry in various systems engineering and management positions and taught at the universities of Ljubljana, Maribor, and Iowa. His articles have appeared in Communications of the ACM, MIS Quarterly, Information Systems Research, and many other journals. His current research interests include electronic business and data management.