SOFTWARE PROCESS NEWSLETTER

Committee on Software Process

Technical Council on Software Engineering IEEE Computer Society

No. 12, Spring 1998.

© 1998, IEEE Computer Society TCSE

TABLE OF CONTENTS

Empirical Software Engineering
  Victor Basili ................................................. 1
Benefits and Prerequisites of ISO 9000 based Software Quality Management
  Dirk Stelzer, Mark Reibnitz, and Werner Mellis ................ 3
The Personal Software Process as a Context for Empirical Studies
  Claes Wohlin .................................................. 7
SPICE Trials Assessment Profile
  Robin Hunter ................................................. 12
Software Process Improvement in Central and Eastern Europe
  Miklós Biró, J. Gorski, Yu. G. Stoyan, A.F. Loyko,
  M.V. Novozhilova, I. Socol, D. Bichir, R. Vajde Horvat,
  I. Rozman, and J. Györkös .................................... 19
European SPI Glass
  Colin Tully .................................................. 21
SPICE Spotlight
  Alec Dorling ................................................. 23
Announcements .................................................. 25

EMPIRICAL SOFTWARE ENGINEERING Victor Basili University of Maryland

In most disciplines, the evolution of knowledge involves learning by observing, formulating theories, and experimenting. Theory formulation represents the encapsulation of knowledge and experience. It is used to create and communicate our basic understanding of the discipline. Checking that our understanding is correct involves testing our theories, i.e., experimentation in some form. Analysing the results of the experimental study promotes learning and the ability to change and refine our theories. These steps take time, which is why the understanding of a discipline, and its research methods, evolves over time. The paradigm of encapsulation of knowledge into theories and the validation and verification of those theories based upon experimentation, empirical evidence, and experience is used in many fields, e.g., physics, medicine, manufacturing.

Editor: Khaled El Emam

What do these fields have in common? They evolved as disciplines when they began learning by applying the cycle of observation, theory formulation, and experimentation. In most cases, they began with observation and the recording of what was observed in theories or specific models. They then evolved to manipulating the variables and studying the effects of change in the variables.

How does the paradigm differ for these fields? The differences lie in the objects they study, the properties of those objects, the properties of the systems that contain them, and the relationship of the objects to the systems. So differences exist in how the theories are formulated, how models are built, and how studies are performed, often affecting the details of the research methods. Software engineering has things in common with each of these other disciplines and several differences.

In physics, there are theorists and experimentalists. The discipline has progressed because of the interplay between both groups. Theorists build models (to explain the universe). These models predict the results of events that can be measured. The models may be based upon theory from understanding the essential variables and their interaction, or data from prior experiments, or better yet, from both. Experimentalists observe and measure, i.e., carry out studies to test or disprove a theory or to explore a new domain. But at whatever point the cycle is entered, there is a pattern of modelling, experimenting, learning, and remodelling. The early Greek model of science was that observation, followed by logical thought, was sufficient for understanding. It took Galileo, and his dropping of balls off the tower at Pisa, to demonstrate the value of experimentation. Eddington's study of the 1919 eclipse differentiated the domain of applicability of Einstein's theories vs. Newton's.

In medicine, we have researchers and practitioners.
The researcher aims at understanding the workings of the human body and the effects of various variables, e.g., procedures and drugs. The practitioner aims at applying that knowledge by manipulating those variables for some purpose, e.g., curing an illness. There is a clear relationship between the two; knowledge is often built by feedback from the practitioner to the researcher. Medicine began as an art form. It evolved as a field when it began observation and theory formulation. For example, Harvey's controversial theory about the circulation of blood through the body was the result of many careful experiments performed while he practiced medicine in London. Experimentation varies from

The Software Process Newsletter is targeted at software process professionals in both industry and academe, internationally. Its mission is to rapidly provide up-to-date information about current practice, research, and experiences related to the area of software process.

Software Process Newsletter: SPN - 1

controlled experiments to qualitative analysis. Depending on the area of interest, data may be hard to acquire. Human variance causes problems in interpreting results. However, our knowledge of the human body has evolved over time.

The focus in manufacturing is to better understand and control the relationship between process and product for quality control. The nature of the discipline is that the same product is generated, over and over, based upon a set of processes, allowing the building of models with small tolerances. Manufacturing made tremendous strides in improving productivity and quality when it began to focus on observing, model building, and experimenting with variations in the process, measuring their effect on the revised product, and building models of what was learned.

The Empirical Software Engineering journal [see the announcements section of this issue of SPN] is dedicated to the position that, like other disciplines, software engineering requires the cycle of model building, experimentation, and learning; the belief that software engineering requires empirical study as one of its components. There are researchers and practitioners. Research has analytic and experimental components. The role of the researcher is to build models of, and understand the nature of, processes, products, and the relationship between the two in the context of the system in which they live. The practitioner's role is to build "improved" systems, using the knowledge available, and to provide feedback. But like medicine (e.g., Harvey), the distinction between researcher and practitioner is not absolute; some people do both at the same time or at different times in their careers. This mix is especially important in planning empirical studies and when formulating models and theories. Like manufacturing, these roles are symbiotic. The researcher needs laboratories, and they only exist where practitioners build software systems. The practitioner needs to better understand how to build systems more productively and profitably; the researcher can provide the models to help this happen.

Just as the early model of science evolved from learning based purely on logical thought to learning via experimentation, so must software engineering evolve. It has a similar need to move from simple assertions about the effects of a technique to a scientific discipline based upon observation, theory formulation, and experimentation. To understand how model building and empirical studies need to be tailored to the discipline, we first need to understand the nature of the discipline.

What characterises the software engineering discipline? Software is development, not production; here it is unlike manufacturing. The technologies of the discipline are human based. It is hard to build models and verify them via experiments, as with medicine. As with the other disciplines, there are a large number of variables that cause differences, and their effects need to be studied and understood. Currently, there is a lack of models that allow us to reason about the discipline, a lack of recognition of the limits of technologies for certain contexts, and a lack of analysis and experimentation.

There has been empirical analysis and model building in software engineering, but the studies are often isolated events. For example, in one of the earliest empirical studies, Belady & Lehman [3][4] observed the behaviour of OS 360 with respect to releases. They posed several theories, based upon their observations, concerning the entropy of systems. The idea of entropy - that you might redesign a system rather than continue to change it - was a revelation. On the other hand, Basili & Turner [2] observed that a compiler system being developed, using an incremental development approach, gained structure over time. This appears contradictory. But under what conditions is each phenomenon true? What were the variables that caused the different effects? Were the differences due to such variables as size, methods, or the nature of the changes? We can hypothesise, but what evidence do we have to support those hypotheses?

In another area, Walston and Felix [6] identified 29 variables that had an effect on software productivity in the IBM FSD environment. Boehm [5] observed that 15 variables seemed sufficient to explain/predict the cost of a project across several environments. Bailey and Basili [1] identified 2 composite variables that, when combined with size, were a good predictor of effort in the SEL environment. There were many other cost models at the time. Why were the variables different? What did the data tell us about the relationship of variables? Clearly the answers to these questions require more empirical studies that will allow us to evolve our knowledge of the variables of the discipline and the effects of their interaction.

In our discipline, there is little consensus on terminology, often depending upon whether the ancestry of the researcher is the physical sciences, social sciences, medicine, etc. One of the roles of the Empirical Software Engineering journal is to begin to focus on a standard set of definitions. We tend to use the word experiment broadly, i.e., as a research strategy in which the researcher has control over some of the conditions in which the study takes place and control over the independent variables being studied; an operation carried out under controlled conditions in order to discover an unknown effect or law, to test or establish a hypothesis, or to illustrate a known law. This term thus includes quasi-experiments and pre-experimental designs.
We use the term study to mean an act or operation for the purpose of discovering something unknown or of testing a hypothesis. This covers various forms of research strategies, including all forms of experiments, qualitative studies, surveys, and archival analyses. We reserve the term controlled experiment to mean an experiment in which the subjects are randomly assigned to experimental conditions, the researcher manipulates an independent variable, and the subjects in different experimental conditions are treated similarly with regard to all variables except the independent variable.

As a discipline, software engineering, and more particularly its empirical component, is at a very primitive stage in its development. We are learning how to build models, how to design experiments, how to extract useful knowledge from experiments, and how to extrapolate that knowledge. We believe there is a need for all kinds of studies: descriptive, correlational, and cause-effect studies; studies on novices and experts; studies performed in a laboratory environment or in real projects; quantitative and qualitative studies; and replicated studies.

We would expect that, over time, we will see a maturing of the empirical component of software engineering. The level of sophistication of the goals of an experiment and our ability to understand interesting things about the discipline will evolve over time. We would like to see a pattern of knowledge building from series of experiments: researchers building on each others' work, combining experimental results; studies replicated under similar and differing conditions. The Empirical Software Engineering journal is a forum for that learning process. In some cases our experiments, like those in the early stages of other disciplines,


will be primitive. They will have both internal and external validity problems. Some of these problems will be based upon the nature of the discipline, affecting our ability to generate effective models or effective laboratory environments. These problems will always be with us, as they are with any discipline as it evolves and learns about itself. Some problems will be based on our immaturity in understanding experimentation as a discipline, e.g., not choosing the best possible experimental design or not choosing the best way to analyse the data. But we can learn from weakly designed experiments how to design them better. We can learn how to better analyse the data.

The ESE journal encourages people to discuss the weaknesses in their experiments. We encourage authors to provide their data to the journal so that other researchers may reanalyse them. The journal supports the publication of artifacts and laboratory manuals. For example, in issue 1(2), the paper "The Empirical Investigation of Perspective-based Reading" has associated with it a laboratory manual that will be furnished as part of the ftp site at Kluwer Academic Publishers. It contains everything needed to replicate the experiment, including both the artifacts used and the procedures for analysis.

It is hoped that the papers in this journal will reflect successes and failures in experimentation; they will display the problems and attempts at learning how to do things better. At this stage we hope to be open and support the evolution of the experimental discipline in software engineering. We ask researchers to critique their own experiments, and we ask reviewers to evaluate experiments in the context of the current state of the discipline. Remember that, because of the youth of the experimental side of our discipline, our expectations cannot yet be the same as those of the more mature disciplines, such as physics and medicine.
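The three-level terminology defined earlier (study, experiment, controlled experiment) can be sketched as a tiny classifier. This is an illustrative paraphrase of the editorial's definitions, not a standard; the field names are assumptions introduced here.

```python
# Sketch of the editorial's terminology: every controlled experiment is an
# experiment, and every experiment is a study. Field names are illustrative.
from dataclasses import dataclass

@dataclass
class StudyDesign:
    controls_conditions: bool    # researcher controls some study conditions
    manipulates_variable: bool   # researcher manipulates an independent variable
    random_assignment: bool      # subjects randomly assigned to conditions
    equal_treatment: bool        # conditions treated alike except for the IV

def classify(d: StudyDesign) -> str:
    """Map a research design onto the editorial's three-level terminology."""
    if d.random_assignment and d.manipulates_variable and d.equal_treatment:
        return "controlled experiment"
    if d.controls_conditions and d.manipulates_variable:
        return "experiment"  # includes quasi- and pre-experimental designs
    return "study"  # e.g., surveys, archival analyses, qualitative studies

# A quasi-experiment manipulates a variable but lacks random assignment:
quasi = StudyDesign(controls_conditions=True, manipulates_variable=True,
                    random_assignment=False, equal_treatment=True)
print(classify(quasi))  # experiment
```

The point of the sketch is simply that the terms are nested by increasing methodological control, which is why the journal can welcome "all kinds of studies" while reserving the strongest claims for controlled experiments.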
The goal of the journal is to contribute to a better scientific and engineering basis for software engineering.

References

[1] Bailey, J., and Basili, V. R. 1981. A meta-model for software development resource expenditures. Proceedings of the Fifth International Conference on Software Engineering. San Diego, USA, 107-116.
[2] Basili, V. R., and Turner, A. J. 1975. Iterative enhancement: A practical technique for software development. IEEE Transactions on Software Engineering SE-1(4).
[3] Belady, L. A., and Lehman, M. M. 1972. An introduction to growth dynamics. Statistical Computer Performance Evaluation. New York: Academic Press.
[4] Belady, L. A., and Lehman, M. M. 1976. A model of large program development. IBM Systems Journal 15(3): 225-252.
[5] Boehm, B. W. 1981. Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall.
[6] Walston, C., and Felix, C. 1977. A method of programming measurement and estimation. IBM Systems Journal 16(1): 54-73.

This article appeared as an editorial to issue 1(2) of the Empirical Software Engineering journal. It is reprinted here with permission of the publisher and author. Victor Basili can be reached at: Department of Computer Science, University of Maryland, College Park, MD 20742, USA; Email: [email protected]

Software Process Newsletter issues on the Internet. Past issues of the Software Process Newsletter are now available on the Internet through anonymous ftp and the world wide web. Two formats are available: compressed PostScript and PDF. The ftp site is “ftp-se.cs.mcgill.ca”; the directory is “pub/spn”; and the files are “spn_no1.xx”, “spn_no2.xx”, “spn_no3.xx”, and so on. The URLs are: “http://www-se.cs.mcgill.ca/process/spn.html” and “http://www.iese.fhg.de/SPN/process/spn.html” (mirror).

BENEFITS AND PREREQUISITES OF ISO 9000 BASED SOFTWARE QUALITY MANAGEMENT Dirk Stelzer, Mark Reibnitz, Werner Mellis University of Koeln

The ISO 9000 quality standards were released a decade ago. Since then, thousands of software companies have implemented ISO 9000 based quality systems. In Europe, the ISO 9000 standards are the prevalent model for implementing software quality management. However, only a few empirical studies on ISO 9000 based quality management in software companies have been published [1][8][16][23][29][33]. Little is known about the benefits software companies have achieved with the help of ISO 9000 based quality systems. Furthermore, a profound knowledge of the enabling and inhibiting factors, i.e., the prerequisites of successful software quality management, is still lacking. The objective of this paper is (1) to describe benefits that software companies have achieved by implementing ISO 9000 based quality systems, and (2) to identify prerequisites of conducting successful software quality management initiatives.

Research Method

Between October 1996 and December 1997 we analysed published experience reports of 25 software organisations that had implemented an ISO 9000 quality system and sought certification. By examining the experience reports we identified benefits and prerequisites of implementing ISO 9000 based quality management initiatives in software companies. The study covers experience reports of 12 organisations located in the UK, eight German organisations, two French organisations, and one organisation each in Austria, Greece, and the US. The study includes published reports of software organisations at ACT Financial Systems Ltd. [3], Alcatel Telecom [5], ALLDATA [20], Answers Software Service [37], AVX Ltd. [34], BR Business Systems [13], Bull AG [25, 26], Cap Gemini Sogeti [31], CMS (British Steel) [14], Danet-IS GmbH [2, 21], Dr. Materna GmbH [32], IBM Deutschland [6], IDC-UK [28], INTRASOFT [11], Logica [9, 10], Oracle [35], Praxis [15], PSI AG [38], SAP AG [7, 36], Siemens AG [39, 40], Sybase [24], Tembit Software GmbH [30], Triad Special Systems Ltd.
[12], Unisys Systems and Technology Operations [4], and an anonymous British software company [27]. The authors of the experience reports are quality managers or senior managers of the software companies. At the time the experience reports were written, the size of the companies ranged from 10 to 2700 employees (mean: 674 employees). The time needed to implement the quality systems ranged from 10 to 96 months (mean: 21 months). The time between certification of the quality system and the publication of the experience reports ranged from 0 to 60 months (mean: 36 months); 72 % of the companies had gathered experience with the quality system for more than two years.

Benefits

In the following section we will describe benefits that the authors of the experience reports have attributed to the implementation of ISO 9000 based quality systems.


Figure 1: Benefits of ISO 9000 based Software Quality Management
(percentage of companies addressing each benefit category, n=23)

improved project management ............ 74%
improved productivity and efficiency ... 48%
improved customer satisfaction ......... 43%
improved product quality ............... 43%
more on-time deliveries ................ 17%
positive return on investment .......... 26%
improved corporate profitability ....... 13%

Twenty three (of 25) reports explicitly describe benefits that were achieved by implementing the quality system. We have summarised these benefits in seven categories: improved project management, improved productivity and efficiency, improved customer satisfaction, improved product quality, more on-time deliveries, positive return on the investment in software quality management, and improved corporate profitability. Figure 1 shows the percentage of reports that mention benefits relating to each of the categories.

Improved project management is reported by 74 % of the companies. This usually results from better documentation of the software process and from improved communication among staff members and managers in different organisational units of the company. ISO 9000 based quality systems lead to better visibility of the software process, improved documents and checklists, clearer definition of responsibilities, and making use of experience and best practices of other projects. Improved project management leads to a variety of other benefits:
• 48 % of the companies report improved productivity and efficiency of software development.
• 43 % of the companies report improved customer satisfaction.
• 43 % of the companies report improved product quality (usually described as a reduction of defects delivered to customers).
• 17 % of the companies report more on-time deliveries.

26 % of the companies explicitly mention a positive return on the investment in software quality management. 13 % of the companies report improved corporate profitability that they attribute to the implementation of ISO 9000 based software quality management. However, only 6 out of 23 companies (26 %) support their statements on benefits with quantitative data. Examples are a reduction of budget overruns by 50 % in 4 years [31], a reduction of defects found in user

acceptance tests by a factor of 9 [13], a reduction of 13 % in post-installation support costs [13], a reduction of programmers' time spent on hotline support by a factor of 3 [27], and a reduction of overall software development cost by 20 % [38]. The other 17 companies that address benefits of ISO 9000 based quality management do not give any quantitative data. Presumably, the statements on benefits in these reports primarily reflect perceived advantages of implementing quality systems.

Prerequisites

The term "prerequisites" summarises factors that the authors of the experience reports covered in our study regard as essential when implementing an ISO 9000 based quality system. Implementation of the factors has facilitated the success of software quality management; a lack of compliance with the factors has delayed progress in quality management or made it difficult to achieve.

Figure 2: Prerequisites of successful software quality management
(percentage of experience reports addressing each factor, n=25)

Management commitment and support ............. 84%
Staff involvement ............................. 84%
Providing enhanced understanding .............. 72%
Tailoring improvement initiatives ............. 68%
Encouraging communication and collaboration ... 64%
Managing the improvement project .............. 56%
Change agents and opinion leaders ............. 52%
Stabilising changed processes ................. 52%
Setting relevant and realistic objectives ..... 44%
Unfreezing the organisation ................... 24%


We identified 10 prerequisites of successful software quality management efforts. Figure 2 shows the factors and the percentage of experience reports addressing these factors.

Management commitment and support is the degree to which management at all organisational levels sponsors the implementation of the quality system. The necessary investment of time, money, and effort and the need to overcome staff resistance are potential impediments to successful ISO 9000 based improvement initiatives. These obstacles cannot be overcome without management commitment. Active participation and visible support of senior management may give the necessary momentum to the initiative. This positively influences the success of the quality system. 84 % of the experience reports emphasise the importance of management commitment and support.

Staff involvement is the degree to which staff members participate in quality management activities. Staff involvement is essential to avoid a schism between software engineers in development projects and quality managers responsible for implementing the quality system. Staff members have detailed knowledge and first-hand experience of the strengths and weaknesses of the current processes. Using the skills and experience of employees ensures that the resulting quality system is a consensus that reflects the practical considerations of diverse projects. 84 % of the authors address this point.

Providing enhanced understanding to managers and staff members comprises acquiring and transferring knowledge of current practices. Managers usually have a general idea of the software process, but they do not have a complete understanding of essential details. Employees often do not understand how their work contributes to the corporate mission and vision. Successful quality management initiatives give managers a clearer picture of current practices, and they give staff members the opportunity to better understand the business of their organisation.
72 % of the authors emphasise the significance of this topic.

Tailoring improvement initiatives means adapting quality management efforts to the specific strengths and weaknesses of different teams and departments in the company. Standardised and centralised quality systems are usually not well accepted. Quality management must clearly and continually demonstrate benefits to projects. Tailoring increases the compatibility of improvement plans with the existing values, past experience, and needs of various projects within an organisation. Tailoring helps to implement a quality system that responds to the true needs of the organisation. 68 % of all reports stress this point.

Successful quality management initiatives have encouraged communication among staff members. This has helped to rectify rumours, to preclude misunderstandings, and to overcome resistance of staff members. Successful quality management efforts have also emphasised collaboration of different teams and divisions. Close cooperation of organisational units provides natural feedback loops, enhances staff members' understanding and knowledge, encourages people to exploit synergy, and consequently improves productivity and quality. Intensive communication and collaboration help to create a coherent organisational culture that is necessary for achieving substantial improvements. 64 % of the authors mention the importance of communication and collaboration.

Managing the improvement project means that the implementation of the ISO 9000 based quality system is treated like a professional project. At the beginning, in some organisations the quality management projects had

neither specified requirements nor had they elaborated a formal project plan, defined milestones, or outlined a schedule. Areas of responsibility were not accurately determined, and the initiative lacked effective interfaces between quality management and software development teams. Successful initiatives set up and ran the quality management project like a software development project. They used existing project management standards, analysed requirements, defined explicit objectives, established milestones, and monitored progress. 56 % of the reports address this factor.

Change agents are individuals or teams external to the system that is to be improved. Quality managers or consultants usually play the role of change agents. Often, they initiate the quality projects, request resources, and encourage local improvement efforts. They also provide technical support and feedback, publish successes, and keep staff members aware of the quality management efforts. Opinion leaders are members of a social system in which they exert their influence. Experienced project managers or proficient software engineers usually act as opinion leaders. They are indispensable for overcoming the potential schism between software development and quality management. They help to tailor the improvement suggestions to the needs of different teams and organisational units. 52 % of the authors mention this issue.

Stabilising changed processes means continually supporting maintenance and improvement of the quality system at a local level. Staff members adopting new activities need continuous feedback, motivation, recognition, and reinforcement to stay involved in the improvement effort. They also need guidance and support to overcome initial problems and difficulties. Stabilising changed processes prevents improved performance from sliding back to the old level. 52 % of the reports emphasise the need to stabilise changed processes.
Setting relevant objectives means that the quality management efforts attempt to contribute to the success of the organisation. Mere conformance to the standards or attaining certification usually is not a relevant goal for staff members. It is essential that staff members understand the relationship between quality management and the business objectives of the organisation. Setting realistic objectives means that the goals may be achieved in the foreseeable future and with a reasonable amount of resources. 44 % of the authors address this point.

Lewin [22] has pointed out the importance of "unfreezing" organisations before substantial improvements can be achieved. He emphasises that social processes usually have an "inner resistance" to change. To overcome this resistance an additional force is required, a force sufficient to break the habit and to unfreeze the custom. In software companies that have successfully implemented ISO 9000 based software quality systems, perceived deficiencies in software development, management commitment, and competitive pressure have contributed to unfreezing the organisation. 24 % of the reports mention this topic.

Discussion

The findings of our study are based on experience reports written by managers of software organisations. Of course, these sources primarily reflect the personal views of the authors of the reports. Nevertheless, the findings give interesting insights into the benefits software companies might achieve and the factors they should consider when implementing ISO 9000 based software quality management.


Benefits

One would expect that quality managers will tend to publish their experiences with software quality management if the improvement efforts have been successful. Experiences with less successful quality systems are less likely to be published. Therefore, our findings may be biased because we analysed published experience reports only. Presumably, representative studies covering successful and unsuccessful quality systems would reveal lower percentages of companies addressing benefits.

We have summarised the benefits described in the experience reports in seven categories. Improved project management is the prevailing benefit category. In most companies it leads to higher productivity and efficiency, improved customer satisfaction, or improved product quality. However, more on-time deliveries are reported by only 4 of 23 companies. In the majority of the companies, the implementation of ISO 9000 based quality systems obviously does not help to meet schedule commitments more often. This is astonishing, since the ability to meet schedules more often is usually one of the benefits expected from the implementation of quality systems.

Twenty three of 25 experience reports explicitly mention at least one benefit of the quality system. Surprisingly, only 6 of 23 companies report a positive return on the investment in software quality management, and only 3 of 23 companies mention improved corporate profitability. This might lead to the conclusion that the majority of the companies have not achieved a positive return on investment in software quality management. One might also conclude that ISO 9000 based quality systems will usually not improve corporate profitability. However, not a single experience report explicitly mentions that the quality efforts have not produced a positive return on investment or have not led to higher profitability.
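The percentages in Figure 1 are simple proportions of the 23 reports that describe benefits. A minimal sketch of the arithmetic follows; only the counts for on-time deliveries, return on investment, and profitability (4, 6, and 3 of 23) are stated in the text, so the remaining counts are back-calculated from the rounded percentages and should be read as assumptions.

```python
# Sketch: recovering the Figure 1 percentages from raw report counts.
# Counts marked "assumed" are back-calculated from the rounded percentages
# in Figure 1; the other three are stated explicitly in the Discussion.
N = 23  # reports that explicitly describe benefits

benefit_counts = {
    "improved project management": 17,          # assumed (74 % of 23)
    "improved productivity and efficiency": 11, # assumed (48 % of 23)
    "improved customer satisfaction": 10,       # assumed (43 % of 23)
    "improved product quality": 10,             # assumed (43 % of 23)
    "more on-time deliveries": 4,               # stated in the text
    "positive return on investment": 6,         # stated in the text
    "improved corporate profitability": 3,      # stated in the text
}

def percentages(counts, n):
    """Round each count/n to a whole percentage, as shown in Figure 1."""
    return {k: round(100 * v / n) for k, v in counts.items()}

print(percentages(benefit_counts, N))
```

Running the sketch reproduces the whole-percent figures reported above (74 %, 48 %, 43 %, 43 %, 17 %, 26 %, 13 %), which is consistent with the paper's statement that 6 of 23 companies (26 %) give quantitative data.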
It is more likely that the small number of companies reporting positive returns on investment and improved profitability can be put down to the fact that most companies do not conduct comprehensive measurements of the costs and benefits of software quality management. Only 26 % of the companies support their statements on the benefits of ISO 9000 based quality management with quantitative data. This is remarkable, because ISO 9001 [18] and ISO 9000-3 [17] suggest using measurement and statistical techniques to establish, control and verify process capability. Furthermore, ISO 9004-1 [19] states: "It is important that the effectiveness of a quality system be measured in financial terms". However, a previous empirical study [33] has already shown that many software companies ignore the suggestions of the ISO 9000 standards to conduct measurements.

Prerequisites

Surprisingly, only 3 of the 10 prerequisites identified in our study are explicitly mentioned in ISO 9001: management commitment and support in clause 4.1 (management responsibility), managing the improvement project in clause 4.2 (quality system), and setting relevant and realistic objectives in clause 4.1.1 (quality policy). This means that companies that stick strictly to the elements of ISO 9001 will probably ignore other essential prerequisites of successful software quality management. A more comprehensive approach is therefore necessary to achieve substantial improvements.

Most of the prerequisites identified in our study address the management of change, that is, transforming the assumptions, habits, and working routines of managers and staff members so that the quality system can become effective. Implementing ISO 9000 based software quality management requires various changes to an organisation. Our study has shown that many software companies have obviously underestimated the effort needed to accomplish the change process. This indicates that change management is not sufficiently accounted for in the ISO 9000 standards.

At first glance, the prerequisites discussed in this paper may be taken for granted; at least, they seem to be basics of software management. A second look at the experience reports, however, makes clear that these factors are regularly described as lessons learned. Some organisations have obviously not paid enough attention to them at the beginning of the initiative. Other organisations may not have fully understood the significance of the prerequisites until the improvement objectives had been accomplished. Obviously, most quality managers do not pay sufficient attention to the management of change when implementing ISO 9000 based software quality systems.

Concluding Remarks

Most software companies achieve benefits with the implementation of ISO 9000 based quality systems. Only a few companies, however, report a positive return on the investment in software quality management and improved corporate profitability. This might lead to the conclusion that the majority of the companies have not achieved a positive return on investment and improved corporate profitability. One might also conclude, however, that the small number of companies reporting economic success can be put down to the fact that most companies do not conduct comprehensive measurements of the costs and benefits of software quality management. Most of the prerequisites of successful software quality management identified in our study address the management of change.
The fact that the authors of the experience reports emphasise these prerequisites as lessons learned shows that the factors are obviously not sufficiently accounted for in the ISO 9000 standards. Change management should therefore be a central element of future versions of the ISO 9000 family.

References

[1] M. Beirne, A. Panteli, and H. Ramsay, "Going soft on quality?: Process management in the Scottish software industry". In Software Quality Journal, no. 3, pp. 195-209, 1997.
[2] G. Bulski and H. Martin-Engeln, "Erfahrungen und Erfolge in der SW-Projektabwicklung nach 4 Jahren DIN ISO 9001 Zertifizierung". In H. J. Scheibl, editor, Technische Akademie Esslingen - Software-Entwicklung - Methoden, Werkzeuge, Erfahrungen '97. 23.-25. September 1997, pp. 403-406, Ostfildern, 1997.
[3] H. Chambers, "The implementation and maintenance of a quality management system". In M. Ross et al., editors, Software Quality Management II, vol. 1: Managing Quality Systems, pp. 19-33, Southampton - Boston, 1994.
[4] A. Clarke, "Persuading the Staff or ISO 9001 without Tantrums". In SQM, no. 9, pp. 1-5.
[5] D. Courtel, "Continuous Quality Improvement in Telecommunications Software Development". In The First Annual European Software Engineering Process Group Conference 1996, Amsterdam, 24-27th June 1996, pp. (C309) 1-9, Amsterdam, 1996.
[6] W. Dette, "Einfuehrung eines QM-Systems nach DIN ISO 9001 in der Entwicklung". In SQS, editor, Software-Qualitaetsmanagement 'Made in Germany' - Realitaet oder Wunschdenken? SQM Kongress 1996. Koeln, 28.-29. March 1996, Cologne, 1996.
[7] A. Dillinger, "Erfahrungen eines Softwareherstellers mit der Zertifizierung eines Teilbereiches nach DIN ISO 9001". In BIFOA, editor, Fachseminar: Aufbau eines Qualitaetsmanagements nach DIN ISO 9000. Koeln, 26./27. April 1994, pp. 1-24, Cologne, 1994.
[8] K. El Emam and L. Briand, "Costs and Benefits of Software Process Improvement". International Software Engineering Research Network technical report ISERN-97-12, 1997.
[9] M. Forrester, "A TickIT for Logica". In SQM, no. 16, 1996.
[10] M. Forrester and A. Dransfield, "Logica's TickIT to ride extended for 3 years!". In TickIT International, no. 4, 1994.
[11] S. A. Frangos, "Implementing a quality management system using an incremental approach". In M. Ross et al., editors, Software Quality Management III, vol. 1: Quality Management, pp. 27-41, Southampton - Boston, 1995.
[12] A. M. Fulton and B. M. Myers, "TickIT awards - a winner's perspective". In Software Quality Journal, no. 2, 1996.
[13] R. Havenhand, "TickIT Case Study: British Rail Business Systems". In SQM, no. 18, pp. 1-6, 1996.
[14] B. Hepworth, "Making the best the standard. Users experiences of operating an ISO 9001 compliant quality management system and total quality management culture". In SAQ and EOQ-SC, editors, Software Quality Concern for People. Proceedings of the Fourth European Conference on Software Quality, October 17-20, 1994, Basel, Switzerland, pp. 208-223, Zuerich, 1994.
[15] M. Hewson, "TickIT Case Study: Praxis". In SQM, no. 22, 1996.
[16] A. Ingleby, J. F. Polhill, and A. Slater, "A survey of Quality Management in IT. Progress since the introduction of TickIT. Report from a survey of both certificated and non-certificated companies". London, 1994.
[17] International Organisation for Standardisation, "ISO 9000-3:1991. Quality management and quality assurance standards. Part 3: Guidelines for the application of ISO 9001 to the development, supply and maintenance of software". Geneva, 1991.
[18] International Organisation for Standardisation, "ISO 9001:1994. Quality systems. Model for quality assurance in design, development, production, installation and servicing". Geneva, 1994.
[19] International Organisation for Standardisation, "ISO 9004-1:1994. Quality management and quality system elements. Part 1: Guidelines". Geneva, 1994.
[20] K. Kilberth, "Einfuehrung eines prozess-orientierten QM-Systems bei der ALLDATA". In H. J. Scheibl, editor, Technische Akademie Esslingen - Software-Entwicklung - Methoden, Werkzeuge, Erfahrungen '97. 7. Kolloquium - 23.-25. September 1997, pp. 377-392, Ostfildern, 1997.
[21] H.-G. Klaus, "Zertifizierung eines Softwareherstellers nach DIN ISO 9001 - Voraussetzungen, Ablauf, Vorgehensweise". In BIFOA, editor, Fachseminar: Aufbau eines Qualitaetsmanagements nach DIN ISO 9000. Koeln, 26./27. April 1994, Cologne, 1994.
[22] K. Lewin, "Group decision and social change". In Holt, Rinehart, and Winston, editors, Readings in Social Psychology, 3rd ed., pp. 197-211, New York, 1958.
[23] C. B. Loken and T. Skramstad, "ISO 9000 Certification - Experiences from Europe". In American Society for Quality Control (ASQC) et al., editors, Proceedings of the First World Congress for Software Quality, June 20-22, 1995, San Francisco, CA, Session Y, pp. 1-11, San Francisco, 1995.
[24] M. L. Macfarlane, "Eating the elephant one bite at a time". In Quality Progress, no. 6, pp. 89-92, 1996.
[25] H. Mosel, "Erfahrungen mit einem zertifizierten QMS im Bull-Softwarehaus". In BIFOA, editor, Fachseminar: Von der ISO 9000 zum Total Quality Management? Koeln, 16./17. April 1996, pp. 125.
[26] H. Mosel, "Vier Jahre Zertifikat und was sonst noch notwendig ist". In SQS, editor, Software-Qualitaetsmanagement 'Made in Germany' - Realitaet oder Wunschdenken? SQM Kongress 1996. Koeln, 28.-29. March 1996, Cologne, 1996.
[27] B. Quinn, "Lessons Learned from the Implementation of a Quality Management System to meet the Requirements of ISO 9000/TickIT in two small Software Houses". In Fifth European Conference on Software Quality - Conference Proceedings, Dublin, Ireland, September 16-20, 1996, pp. 305-314, Dublin, 1996.
[28] C. Robb, "From quality system to organisational development". In M. Ross et al., editors, Software Quality Management II, vol. 1: Managing Quality Systems, pp. 99-113, Southampton - Boston, 1994.
[29] K. Robinson and P. Simmons, "The value of a certified quality management system: the perception of internal developers". In Software Quality Journal, no. 2, pp. 61-73, 1996.
[30] M. Schroeder and R. Wilhelm, "Flexibilitaet staerken. Erfahrungen beim Aufbau eines QM-Systems nach ISO 9000 in einem kleinen Softwareunternehmen". In QZ - Qualitaet und Zuverlaessigkeit, no. 5, pp. 530-536, 1996.
[31] J. Sidi and D. White, "Implementing Quality in an International Software House". In American Society for Quality Control (ASQC) et al., editors, Proceedings of the First World Congress for Software Quality, June 20-22, 1995, San Francisco, CA, Session W, pp. 1-13, San Francisco, 1995.
[32] S. Steinke, "Erfahrungen bei der Einfuehrung und Verbesserung eines QMS". In SQS, editor, Software-Qualitaetsmanagement 'Made in Germany' - Modeerscheinung oder Daueraufgabe. SQM Kongress 1997. Koeln, 17.-18. April 1997, Cologne, 1997.
[33] D. Stelzer, W. Mellis, and G. Herzwurm, "Software Process Improvement via ISO 9000? Results of Two Surveys Among European Software Houses". In Software Process - Improvement and Practice, no. 3, pp. 197-210, 1996.
[34] A. Sweeney and D. W. Bustard, "Software process improvement: making it happen in practice". In Software Quality Journal, no. 4, pp. 265-273, 1997.
[35] S. Verbe and P. W. Robinson, "Growing a quality culture: a case study - Oracle UK". In M. Ross et al., editors, Software Quality Management III, vol. 1: Quality Management, pp. 3-14, Southampton - Boston, 1995.
[36] M. Vering and V. Haentjes, "Ist ISO 9000 ein geeignetes Werkzeug fuer Process Engineering? Ein Erfahrungsbericht aus der SAP-Entwicklung". In m & c - Management & Computer, no. 2, pp. 85-90, 1995.
[37] S. D. Walker, "Maintaining your quality management system - what are the benefits?". In M. Ross et al., editors, Software Quality Management II, vol. 1: Managing Quality Systems, pp. 47-61, Southampton - Boston, 1994.
[38] A. Warner, "Der Weg von der Qualitaetssicherung nach ISO 9001 zum Qualitaetsmanagement in einem Systemhaus". In H. J. Scheibl, editor, Technische Akademie Esslingen - Software-Entwicklung - Methoden, Werkzeuge, Erfahrungen '97. 23.-25. September 1997, pp. 407-423, Ostfildern, 1997.
[39] S. Zopf, "Ein Erfahrungsbericht zur ISO 9001 Zertifizierung". In Softwaretechnik-Trends, pp. 15-16, August 1994.
[40] S. Zopf, "Improvement of software development through ISO 9001 certification and SEI assessment". In SAQ and EOQ-SC, editors, Software Quality Concern for People. Proceedings of the Fourth European Conference on Software Quality, October 17-20, 1994, Basel, Switzerland, pp. 224-231, Zuerich, 1994.

Author Address: University of Koeln, Lehrstuhl für Wirtschaftsinformatik, Systementwicklung, Albertus-Magnus-Platz, D-50932 Koeln, Germany. Email: [email protected]; URL: http://www.informatik.uni-koeln.de/winfo/prof.mellis/welcome.htm

THE PERSONAL SOFTWARE PROCESS AS A CONTEXT FOR EMPIRICAL STUDIES

Claes Wohlin
Lund University

This article discusses the use of the Personal Software Process (PSP) as a context for empirical studies. It is argued that the PSP provides an interesting environment for such studies; in particular, if we already teach or use the PSP, it could be wise to also conduct empirical studies as part of that effort. The objective of this paper is to present the idea and discuss the opportunities of combining the PSP with empirical studies. Two empirical studies, one experiment and one case study, are presented to illustrate the idea. It is concluded that the combination opens some new and interesting opportunities; in particular, it provides a well-defined context and hence eases replication of the empirical studies considerably.


Introduction

Different decades have different trends in software engineering. In the 90's, we have seen a strong focus on the process and also on the use of empirical methods in software engineering. The need for a scientific approach to software engineering has been stressed, see for example [5]. In this article, we would like to look at the opportunity to combine two of these trends, i.e., process focus and empirical studies. The objective of the article is to highlight and discuss the opportunities of performing empirical studies within the context of the Personal Software Process (PSP). The PSP is a well-defined process, and it is publicly available through the book by Humphrey [9]. This means that studies conducted within this context can be replicated rather easily, which is critical to the success of applying empirical methods in software engineering.

The article is organized as follows. In the next section, the PSP is briefly discussed, followed by a brief introduction to empirical studies. Subsequently, the use of the PSP as a context for empirical studies is discussed, and the advantages, challenges and opportunities are outlined. Then, two examples of studies conducted with the PSP as an empirical context are presented to illustrate the approach. Finally, the paper is concluded.

The Personal Software Process

The Personal Software Process (PSP) has gained a lot of attention since it became publicly available [9]. The objective of the PSP is basically to provide a structured and systematic way for individuals to control and improve their way of developing software. We have seen papers, for example [10], presenting the outcome of the PSP, primarily from students taking the PSP as a course. The PSP is currently used in a number of universities, and industry is also becoming interested in applying it.
At Lund University, we run the PSP as an optional course for students in the Computer Science and Technology program and the Electrical Engineering program. Most students take the course in their fourth year, and it is taken by 50-70 students. The course is being run for the second time during the autumn of 1997. The main objective of the course is to teach the students the use of planning, measurement, estimation, postmortem analysis and systematic reuse of experiences. From a course perspective, it is more important to teach the students the techniques packaged within the PSP than to teach them the PSP for future use.

Empirical Research

Another area which attracts attention is the use of empirical methods in software engineering. The need for experimentation was stressed as early as the 1980s [1], but it is during the last couple of years that we have seen a stronger focus on the use of empirical methods, as emphasized, for example, in [5]. Experiments and case studies allow us to gain a better understanding of relationships in software engineering, and they also allow us to evaluate different hypotheses. Numerous books on the design and analysis of experiments in general are available, for example [13], and case studies are discussed, for example, in [14].

One major difficulty in empirical studies is the validity of the results: how do we interpret the results and what conclusions can we draw? A particular problem is, of course, to find suitable subjects (participants in the experiment). It is desirable to use industrial software engineers, but many times this is unfeasible. A suitable starting point is therefore often to conduct experiments at universities using students as subjects, i.e., in an educational context. The use of students as subjects is, of course, a major threat to the validity. On the other hand, it can be good to start with an experiment in a university setting and, based on the outcome, replicate the experiment in industry as part of a technology transfer process, and then continue with a pilot project. Experiments are regularly conducted with students as subjects, see for example [2]. The experiment can be done as part of a course in software engineering or as a separate activity where the students are attracted to the experiment through some reward. Thus, we often have to resort to using students, and hence we would like to raise the question: is it possible to use the PSP as a context for empirical studies, including both experiments and case studies? If we use the PSP in the education, could we also use the PSP course as a means for empirical studies?

The PSP and Empirical Studies

The PSP can be viewed from two different perspectives when it comes to empirical studies. First, it is important to evaluate the effect of the PSP itself. This means that the PSP is the object of the study. Results have been published [10], but further studies are needed, including both reports on the outcome of taking the PSP as a course and on industrial use of the PSP. It is, however, not the intention here to discuss this matter. Second, assuming that we have started to teach and use the PSP, the PSP can be used as a context for empirical studies and hence as a vehicle for evaluating different methods and techniques, and for studying different relationships in software engineering. This second issue is the main objective of this article. In particular, the objective is to highlight the opportunities and limitations of using the PSP as a context for empirical studies.
Two different types of empirical studies can be identified:

• Experiments aimed at testing a hypothesis, for example, comparing two different methods for inspections. For this type of study we apply methods for statistical inference, and we would like to show with statistical significance that one method is better than the other [13].

• Case studies used to build models relating attributes to each other, for example, prediction models. One example of a prediction model may be to predict the number of faults in testing based on the number of defects found in compilation and the time spent in inspections. In this type of study we mainly apply multivariate statistical analysis; the analysis methods include linear regression and principal components analysis [12].

Both of these types of studies are briefly illustrated within the context of the PSP below. It should be noted that the main objective is to illustrate empirical studies using the PSP rather than to present the studies in detail.

A key question to consider is, of course: why should we use the PSP as a context for empirical studies? Or, negating the question: why should we not? We would like to argue, since we often have to resort to using students in experiments anyway, that it is suitable to use the PSP (at least if we teach the PSP independently of our interest in experimentation). The PSP provides a context which includes the collection of several measures that are valuable when experimenting. Moreover, we also believe that the PSP can be the starting point for case studies, i.e., before going out in industry to perform a major study, we can gain a first (and valuable) insight by studying the PSP and its outcome. Thus, the advantages, challenges and opportunities of using the PSP are as follows.

Advantages

Context. The context is given by the definition of the PSP as described by Humphrey [9]. We may want to change the proposed PSP slightly, but basically the context is provided, and hence we do not have to define and describe the context in great detail for others to understand our study from the context perspective.

Replication. The context also forms the basis for replication. A major problem in empirical studies is that, in order to arrive at generally valid observations, we must be able to perform a study several times to build up general experience. The PSP may be one way to ease replication: experiments and case studies can be conducted at several places using the PSP simultaneously. The PSP provides a stable process, and the process description is generally available.

Measures. This is also closely related to the PSP. Measures are collected as an integrated part of the PSP, and it is fairly easy to add measures of specific interest for an empirical study. Thus, the PSP provides a good starting point for collecting measures to use for hypothesis testing and model building.

Challenges

Scaling. The PSP scales activities performed in large-scale projects, for example planning and estimation, down to the individual level. A major challenge in empirical studies is to generalize the observations (see also validity below). The major challenge in using the PSP as a context is the ability to scale the observations to other environments, and in particular to large-scale software development. On the one hand, it is difficult to scale individual results to large projects; on the other hand, the PSP is supposed to act as a down-scaled project.

Validity.
The validity of the observations and findings is crucial. We would like to be able to generalize the observations, and in order to do this we must consider different types of validity, for example internal and external validity [3]. The actual validity must be addressed separately for each study, as the ability to generalize is highly dependent on the study and on what we intend to generalize.

Opportunities

The PSP provides some opportunities for empirical studies. We may study the use of different techniques and methods, or investigate the relationships between different attributes. The main limitation of using the PSP as a basis for empirical studies is that we cannot use it to study group activities. It is possible to experiment with different reading techniques on an individual basis, but we are unable to study the use of inspections and group meetings. Thus, we cannot expect to use the PSP as a context for all types of empirical studies, but it is our firm belief that it opens some new opportunities, and that the major inhibiting factor is our imagination. It is clear that the PSP provides opportunities for empirical studies. In the following section, we show two examples of studies conducted with the PSP as a context. The main objective of the examples is to illustrate the use of the PSP as a context, not to provide deep insight into the actual empirical studies.

Illustration of Using the PSP for Empirical Studies

Introduction

The main objective of this section is to illustrate the use of the PSP as a context for empirical studies. The actual results of the studies are presented here, but not in full detail. The objectives of the empirical studies are:

• to evaluate the difference in fault density based on prior experience of the programming language, and

• to investigate the relationships between different performance measures.

For the first case, we have collected background information on experience with the programming language used. In our particular case, the students had to use C as the programming language independent of their prior experience. For the second case, we decided to formulate seven performance measures, which we derived for all students. The objective is to investigate what dimensions we are able to measure. On a general level, we are normally interested in the following attributes: quality, productivity, cycle time and predictability. Quality is sometimes, for reasons of simplicity, measured in terms of fault content.

Context

The empirical study is run within the context of the PSP. Moreover, the study is conducted within a PSP course given at the Department of Communication Systems, Lund University, Sweden. The course was given in 1996-97, and the main difference from the PSP as presented in [9] is that we provided a coding standard and a line counting standard. Moreover, the course was run with C as a mandatory programming language independent of the background of the students. The study is focused on the outcome of the PSP. The PSP course is taken by a large number of individuals; this particular year, 65 students finished the course. Thus, we have 65 participants (subjects) in the study. The experiment can be regarded as a quasi-experiment, since students signed up for the course and hence we lack randomization [4]. In a case study, we do not expect to have randomization.
In general, we have much less control in a case study than in an experiment.

Planning

As part of the first lecture, the students were asked to fill out a survey regarding their background in terms of experience with issues related to the course, for example knowledge of C. The students were required to use C in the course independently of their prior experience with the language. Thus, we did not require that the students had taken a C course prior to entering the PSP course, which meant that some students learnt C within the PSP course. This is not according to the recommendation by Humphrey [9]. The hypothesis of the experiment based on C experience was that students with more experience in C would make fewer faults per line of code.

WWSPIN Electronic Mailing List. The WorldWide SPIN is concerned with software process assessment and improvement, including: existing models and methods such as the CMM, Trillium and Bootstrap, ISO/IEC 15504, ISO 9000 and international news and experiences. To subscribe, send the message SUB WWSPIN to [email protected]. To post a message to people on the WWSPIN list, send it to [email protected]. This list is moderated.
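Returning to the Faults/KLOC hypothesis stated above: the article later evaluates it with a one-way ANOVA across the four C experience classes. As an illustration only — the samples below are invented, not the study's data — a single-factor ANOVA can be computed from first principles as follows:

```python
# Illustrative one-way (single-factor) ANOVA, computed from first
# principles. The samples below are invented Faults/KLOC values,
# not data from the study.

def one_way_anova(groups):
    """Return (df_between, df_within, ss_between, ss_within, F)."""
    k = len(groups)                              # number of treatments
    n = sum(len(g) for g in groups)              # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-groups sum of squares: variation among treatment means.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-groups sum of squares: variation inside each treatment.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return df_between, df_within, ss_between, ss_within, f_value

# Hypothetical Faults/KLOC samples for four experience classes:
samples = [[72.0, 68.0, 75.0], [65.0, 70.0, 66.0],
           [60.0, 64.0, 62.0], [58.0, 61.0, 63.0]]
print(one_way_anova(samples))
```

The quantities returned are exactly those reported in the study's ANOVA table: degrees of freedom, sums of squares, and the F-value; a p-value would additionally require the F-distribution.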


The case study is based on investigating several performance measures and evaluating after the course whether they measure several different dimensions, and in particular whether we are able to capture quality, productivity, cycle time and predictability from the performance measures. It should be noted that we use all 10 programming tasks in the PSP course as the basis for determining performance. The following measures were defined as performance measures: total number of faults, fault density, program size, development time, productivity, predictability of size, and predictability of time. The objective of the second study does not require anything in particular during the course, since it is primarily an analysis at the end.

For the first case, we have the following hypotheses concerning the number of faults per KLOC (1000 lines of code):

• H0: The number of faults per KLOC is independent of C experience.

• H1: The number of faults per KLOC changes with C experience.

Measures needed: C experience and Faults/KLOC. The C experience is measured by introducing a classification into four classes based on prior experience of C (ordinal scale). The classes are:

1. No prior experience.
2. Read a book or followed a course.
3. Some industrial experience (less than 6 months).
4. Industrial experience.

The second investigation requires that the following data are collected: program size (estimated and actual), development time (estimated and actual), and number of faults. From these measures, we are able to derive the performance measures. The fault density, Faults/KLOC, is also used in the first case. It should be noted that we are unable to evaluate cycle time, as we have no measures which capture it; this is difficult to achieve within the PSP. The best we can do is probably to measure delivery precision, primarily in terms of the number of late deliveries.
We have, however, not kept track of this information, or at least not to the degree that we trust the data.

The experimental design for language experience is: one factor with more than two treatments. The factor is the experience in C, and we have four treatments, see the experience grading above. The dependent variable is measured on a ratio scale, and we can use a parametric test for this hypothesis. The ANOVA test is hence suitable for the evaluation. For the case study, we would like to use principal components analysis to study what dimensions we are capturing with our seven performance measures. It is quite common that we collect a large number of measures but basically capture only a few dimensions, due to multicollinearity between the different measures.

Validity Evaluation

This is a difficult area. In our particular case, we have several levels of validity to consider. Internal validity can be divided into: within the course this year, and between years. External validity can be divided into: students at Lund University (or, more realistically, students from programs taking the PSP course), the PSP in general, and software development in general.

The internal validity within the course is probably not a problem. The large number of tests (equal to the number of students) ensures good internal validity, probably both within the course this year and, if the course is run in a similar way, in future years. Concerning the threats to the external validity, it is difficult to generalize the results to other students, i.e., students not taking the course; they are probably not as interested in software development and hence come from a different population. The results from the analysis can probably be generalized to other PSP courses, where it is feasible to compare participants based on their background in terms of computer science or electrical engineering or experience with a particular programming language. The results are found for the PSP, but they are likely to hold for software development in general. This is motivated by the following observations for the two studies:

• There is no reason that people with different background experience of a particular programming language should perform differently between the PSP and software development in general. Thus, for the language experience, we expect similar results in other environments.

• The performance measures can be collected in other environments than the PSP, and there is no reason that we should not get a similar grouping of the different measures.

Thus, we believe that the results can be generalized to other contexts.

Operation

The subjects (students) were not aware of what we intended to study. They were informed that we wanted to study the outcome of the PSP course in comparison with the background of the participants. They were, however, not aware of the actual hypotheses stated. The students, from their point of view, do not primarily participate in an empirical study; they are taking a course. All students are guaranteed anonymity. The survey material was prepared in advance.
Most of the other material is, however, provided through the PSP book [9]. The empirical study was executed over 14 weeks, with the 10 programming assignments handed in regularly. The data were primarily collected through forms. Interviews were used at the end of the course, primarily to evaluate the course and the PSP as such.

Data Validation

Data were collected for 65 students. After the course, the achievements of the students were discussed among the people involved in the course. Data from six students were removed because the data were regarded as invalid, or at least questionable. Students have not been removed from the evaluation based on the actual figures, but based on our trust in the delivered data. The six students were removed for the following reasons:
• Data from two students were not filled in properly.
• One student finished the course much later than the rest, and he had a long period when he did not work with the PSP. This may have affected the data.
• Data from two students were removed because they delivered their assignments late and required considerably more support than the other students; it was judged that the extra advice may have affected their data.

Software Process Newsletter: SPN - 10

Class                                  1       2       3       4
Number of Students                    31      19       6       2
Median value of Faults/KLOC         66.0    69.7    63.6    63.0
Mean value of Faults/KLOC           72.7    68.0    67.6    63.0
Standard deviation of Faults/KLOC   29.0    22.9    20.6    17.3

Table 1: Faults/KLOC for the different C experience classes.

C experience        Degrees of   Sum of    Mean     F-value   p-value
vs. Faults/KLOC     freedom      squares   square
Between groups           3         3483     1161     0.442     0.724
Within groups           55       144304     2624

Table 2: Results from the ANOVA test.

Performance            Factor 1    Factor 2         Factor 3
measure                (Faults)    (Productivity)   (Predictability)
Faults                   0.869       0.395            0.019
Faults/KLOC              0.886       0.183            0.177
Development time         0.815      -0.282           -0.083
Program size             0.156       0.824            0.064
Productivity            -0.570       0.778            0.134
Size predictability     -0.218      -0.141            0.740
Time predictability     -0.036      -0.247            0.740

Table 3: Results from the principal components analysis.
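The kind of computation behind a loading table such as Table 3 can be sketched as follows. This is a sketch only: the measurement matrix is random stand-in data (not the study's), and the variable names are ours.

```python
# Sketch of the principal components step: build the correlation matrix
# of the seven performance measures and examine how much variance each
# component explains.  The data below are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
measures = rng.normal(size=(59, 7))          # 59 subjects x 7 measures
# Introduce correlation between two pairs of columns to mimic the
# multicollinearity discussed in the text.
measures[:, 1] = measures[:, 0] + 0.3 * rng.normal(size=59)
measures[:, 6] = measures[:, 5] + 0.3 * rng.normal(size=59)

corr = np.corrcoef(measures, rowvar=False)   # 7 x 7 correlation matrix
eigenvalues = np.linalg.eigvalsh(corr)       # ascending order
explained = sorted(eigenvalues / eigenvalues.sum(), reverse=True)
# Loadings such as those in Table 3 are (possibly rotated) eigenvectors
# scaled by the square roots of the corresponding eigenvalues.
```

With correlated columns, the leading components explain clearly more than their 1/7 share of the variance, which is exactly the situation where seven measures collapse to a few dimensions.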

• Finally, one student was removed because his background was completely different from the others.
This means removing six students out of the 65, leaving 59 students for the statistical analysis and interpretation of the results.

Analysis and Interpretation

Analysis of experiment

For the experiment, we use descriptive statistics to visualize the collected data. From plotting the data, it is obvious that we have one outlier. If we include the outlier in the analysis, there seems to be a weak tendency (when looking at the mean value) towards more experience meaning lower fault density. If we remove the outlier, the tendency is still there, although very weak. The data are summarized in Table 1. Thus, we do not expect to find any support for the hypothesis that language experience affects the fault density. The next step is to apply an ANOVA test to evaluate the hypothesis that more experience in C means fewer faults/KLOC. The results of the analysis are shown in Table 2. As expected, the results are not significant. Thus, we are unable to show that there is a significant difference in faults/KLOC based on C experience. Since the number of students in classes 3 and 4 is very limited, classes 2, 3 and 4 were grouped together to study the difference between class 1 and the grouping of classes 2-4. A t-test was performed to evaluate whether it was possible to differentiate between class 1 and the rest. No significant results were obtained.
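The grouping step can be sketched as follows. The data are invented, and the exact t-test variant used in the study is not stated; a pooled-variance two-sample t statistic is shown as one common choice.

```python
# Sketch of the follow-up comparison: classes 2-4 pooled together and
# compared against class 1 with a two-sample t statistic.  The data are
# invented; the study's actual t-test variant is not stated.
import math

def pooled_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic (df = na + nb - 2)."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = sum(sample_a) / na, sum(sample_b) / nb
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (nb - 1)
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    return (mean_a - mean_b) / math.sqrt(pooled * (1 / na + 1 / nb))

class_1 = [95.0, 72.0, 61.5, 80.2, 66.0]          # hypothetical class 1
classes_2_to_4 = [70.1, 69.7, 55.3, 74.8,          # classes 2-4 pooled
                  63.6, 58.9, 71.2, 63.0, 60.1]
t_value = pooled_t(class_1, classes_2_to_4)
```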

Analysis of case study data

For the case study, we are interested in investigating whether we are able to capture different dimensions through our seven performance measures. In particular, we would like to see how many dimensions our seven measures actually capture. To do this, we apply a principal components analysis. The results are presented in Table 3. From Table 3, we note that the seven measures can be grouped into three factors. The first factor seems mainly to capture faults. The development time is included in this factor, which may be regarded as a surprise, but it can be explained by the argument that a driving factor of development time is the number of faults: people who have many faults take longer to develop their programs, supporting the hypothesis that fault prevention and early fault detection are important for the development time. The second factor includes program size and productivity. This result indicates the difficulty we have in capturing productivity, i.e. people who write large programs seem to have a higher productivity. In this case, where the students develop the same programs, productivity should be measured in terms of time to implement the functionality rather than as defined here. The third factor clearly captures the ability to estimate accurately, i.e. predictability. From the case study, we can see that we manage to differentiate between the three factors: quality (in terms of faults), productivity (although a questionable measure) and predictability. The fourth major factor (see the section on Planning), i.e. cycle time, is not measured and hence, of course, not visible among the factors found in the principal components analysis.

Discussion

We would like to see the Personal Software Process as an opportunity for empirical studies, in addition to its


original objective. Empirical studies are an important means to further understand, evaluate and improve software development. Empirical studies are often conducted in a student setting, and if we teach the PSP it may be combined with our need for empirical studies. Furthermore, it may be an important step in technology transfer. Experiments and case studies can be conducted in a controlled environment before transferring the results to an industrial environment for further studies and implementation in the industrial development processes. It should also be noted that empirical studies can be replicated easily based on the well-defined context provided by the PSP, hence providing a good basis for technology transfer decisions. The results from our two studies, briefly outlined in this paper, are interesting in themselves. The experiment shows that experience of C programming does not influence the fault density. The case study illustrates that although we tried to define seven different performance measures, we have actually only captured three main factors. It is also interesting to note that the factors are possible to separate, and that they are in accordance with expectations. We have already conducted several studies using the PSP as a context for empirical studies, and based on our experience and the results obtained we will continue our research in this direction. The studies so far include: two effort estimation studies (prestudy and main study) [6][7], a programming language comparison study [8] and a study of the performance in the PSP based on the individual background [15]. Finally, we would like to encourage others to perform empirical studies within the PSP, both to replicate our studies and to perform other studies which could enlighten and improve our understanding of the underlying phenomena in software engineering.

Acknowledgment

I would like to thank Per Runeson, Dept. of Communication Systems, for valuable comments on a draft of this paper.

References

[1] V.R. Basili, R.W. Selby and D.H. Hutchens, "Experimentation in Software Engineering", IEEE Transactions on Software Engineering, Vol. 12, No. 7, pp. 733-743, 1986.
[2] L. Briand, C. Bunse, J. Daly and C. Differding, "An Experimental Comparison of the Maintainability of Object-Oriented and Structured Design Documents", Empirical Software Engineering: An International Journal, Vol. 2, No. 3, pp. 291-312, 1997.
[3] T.D. Cook and D.T. Campbell, "Quasi-Experimentation - Design and Analysis Issues for Field Settings", Houghton Mifflin Company, 1979.
[4] J. Daly, K. El Emam and J. Miller, "Multi-Method Research in Software Engineering", Proceedings 2nd International Workshop on Empirical Studies of Software Maintenance, WESS'97, Bari, Italy, pp. 3-10, 1997.
[5] N. Fenton, S.L. Pfleeger and R. Glass, "Science and Substance: A Challenge to Software Engineers", IEEE Software, pp. 86-95, July 1994.
[6] M. Höst and C. Wohlin, "A Subjective Effort Estimation Experiment", Journal of Information and Software Technology, Vol. 39, No. 11, pp. 755-762, 1997.
[7] M. Höst and C. Wohlin, "An Experimental Study of Individual Subjective Effort Estimation and Combinations of the Estimates", Proceedings 20th International Conference on Software Engineering, Kyoto, Japan, April 1998 (to appear).
[8] M. Höst and C. Wohlin, "A Comparison of Programming Languages within the Personal Software Process", Submission to Empirical Assessment & Evaluation in Software Engineering, EASE'98, Keele University, Keele, UK, March 1998.
[9] W.S. Humphrey, "A Discipline for Software Engineering", Addison-Wesley, 1995.

[10] W.S. Humphrey, "Using a Defined and Measured Personal Software Process", IEEE Software, pp. 77-88, May 1996.
[11] W.S. Humphrey, "Introduction to the Personal Software Process", Addison-Wesley, 1997.
[12] B.F.J. Manly, "Multivariate Statistical Methods: A Primer", Chapman & Hall, 1994.
[13] D.C. Montgomery, "Design and Analysis of Experiments", 4th edition, John Wiley & Sons, 1997.
[14] R.E. Stake, "The Art of Case Study Research", SAGE Publications, 1995.
[15] C. Wohlin, A. Wesslén, M.C. Ohlsson, M. Höst, B. Regnell and P. Runeson, "A Quantitative Evaluation of the Differences in Individual Performance within the Personal Software Process", Technical report, Dept. of Communication Systems, Lund University, 1998 (in preparation).

Claes Wohlin can be reached at: Dept. of Communication Systems, Lund University, PO Box 118, SE-221 00 Lund, Sweden; E-mail: [email protected]

SPICE TRIALS ASSESSMENT PROFILE Robin Hunter University of Strathclyde

This paper summarises the demographic information concerning the data collected in conjunction with Phase 2 of the SPICE Trials¹ up until 15th December 1997. Further information about the SPICE Trials and the version of the emerging ISO/IEC 15504 international standard that was evaluated during these trials can be obtained from [1]. We first describe the main demographic factors for which a significant amount of data was collected. We then summarise the trials in terms of process coverage, summarise the ratings and capability levels observed, present some initial analyses of the impact of criticality on process capability, and close with a summary and conclusions.

Summary of Assessments and Projects

A large amount of demographic information concerning the trials was collected (much more than for Phase 1). Some of this data concerned the Organisational Units (OUs) that were assessed and some concerned the projects that were assessed within the OUs. In this section we summarise this information.

OU Data

The Organisational Unit (OU) data included the SPICE region in which the OU was situated, the industrial sector in which the OU operated, the target sector for which the OU produced software, the total number of staff in the OU, and the number of IT staff in the OU. From Figure 1 it is seen that the assessments were split roughly equally between two of the five SPICE regions, with 16 in Europe and 14 in the Southern Asia Pacific region, giving a total of 30 assessments for which we have data. The distribution shown in Figure 2 shows that, out of the 30 assessments, 90% (27/30) used the Part 5 assessment model. The remaining 10% used the Process Professional assessment model. Figure 3 shows the distribution of tools used. Most of the assessments (67%) did not use an assessment tool. Of

¹ The interim Trials Report is available publicly and can be obtained from (go to the Trials page) or

[Figure 1: Region where the assessments took place (y-axis is the number of assessments): 16 in Europe, 14 in South Asia Pacific.]

[Figure 2: Distribution of assessment models used (Part 5 vs. Process Professional).]

[Figure 3: Distribution of assessment tools used (No Tool, SEAL, Process Professional).]

[Figure 4: Region where participating OUs are located (y-axis is the number of OUs).]

[Figure 5: Primary business sector of OUs participating in the trials (y-axis is the number of OUs).]

[Figure 6: Target business sector of the OUs participating in the trials (y-axis is the number of OUs).]

[Figure 7: Approximate number of OU staff in participating OUs.]

those that used a tool, 23% (7/30) used the SEAL tool from South Africa (available in [1]), and the remaining 10% (3/30) used the Process Professional assessment tool. Since more than one assessment may have occurred in a particular OU (for example, multiple assessments, each one looking at a different set of processes), we can

[Figure 8: Approximate number of IT staff in participating OUs.]


[Figure 9: Number of projects covered per trial (y-axis is the number of assessments).]

[Figure 11: Projects by size (small, medium, and large by code size).]

[Figure 12: Coverage by process category (y-axis is number of process instances): CUS 122, ENG 90, SUP 65, MAN 35, ORG 29.]

Project data

[Figure 10: Product category for the assessed projects (y-axis is the number of projects); categories include IS, control systems & engineering, operating systems, scientific, communication systems, DB management, and other.]

see in Figure 4 that the organisations involved in the assessments were split with 15 in Europe and eight in the Southern Asia Pacific region, giving a total of 23 different organisations. Figure 5 shows that 11 of the OUs were concerned with the production of software or other IT products or services. Figure 6 shows the target sectors (one or more) in which each of the OUs were involved. The data for the approximate number of staff and the approximate number of IT staff in the OUs are shown in Figure 7 and Figure 8 for 21 of the 23 OUs. The questions corresponding to these data both asked for approximate numbers of staff, rounded to a suitable number such as those shown. It would have been perfectly possible for a number greater than 1000 (in the case of Figure 7) or greater than 500 (in the case of Figure 8) to have been returned, and the database allowed for this. However, no such numbers were returned from the trials. As can be seen from this data, there was good variation in the sizes (both small and large) of the OUs that participated in the trials thus far. However, the same cannot be said for the business sectors. No organisations in the following primary business sectors participated in the trials (see Figure 5): business services, petroleum, automotive, aerospace, public administration, consumer goods, retail, health and pharmaceuticals, leisure and tourism, manufacturing, construction, and travel.

More than one project may be assessed in a single assessment. The project-specific data we collected included the criticality of the product produced, the perceived importance of the product quality characteristics defined by ISO/IEC 9126, and the category to which the product belonged. We had data from the 76 projects involved in the trials. Approximately 80% of these were software development projects, approximately 4% were non-software development projects, and approximately 13% were continuous processes within the organisation not associated with a single project. The number of projects per trial is shown in Figure 9. It is evident that most assessments involved only one project. However, some covered up to 10 projects in a single assessment. The product categories for these projects are shown in Figure 10. We had data from only 56 of these projects. As can be seen, almost half of the projects involved the development of information systems of one sort or another. Of the information system category, two projects were non-software development. Of the operating system category, three were continuous organisational processes. Of the database management category, one was a continuous organisational process. The distribution of the projects according to code size is shown in Figure 11, where small means less than 10 KLOC, medium 10-100 KLOC, and large more than 100 KLOC, for a software system implemented in a 3GL. These data were available for 26 of the 76 projects. Although it is highly dubious to collect lines-of-code data across organisations internationally in such a manner, it still gives a rough indication of project sizes. Perhaps most interesting is the extent to which organisations were unable to provide size data on their projects (note that size in Function Points was also requested, but even less data was collected there).
Process Coverage

The process instances assessed during the trials (341 in all) were distributed over the five process categories defined by the ISO/IEC 15504 model (CUS, ENG, SUP, MAN, ORG) as shown in Figure 12. As can be seen, all the


[Figure 13: Process instances per trial (y-axis is the number of assessments).]

[Figure 14: Box and whisker plot showing the variation in the number of process instances rated per trial (non-outlier max = 30, non-outlier min = 1, 75% = 18, 25% = 6, median = 7).]

process categories were covered by a significant number of assessments, although not to the same extent. The number of process instances per trial is shown in Figure 13. As can be seen, there is a peak at six process instances per trial and the maximum number is 30. The box and whisker plot in Figure 14 shows the variation, with a median of seven process instances per trial. Another interesting statistic is the number of process instances assessed per project, which ranged from one to 29, with an average of 4.5.

Rating and Profile Analysis

For each of the 341 individual process instances assessed, ratings were recorded for each of the attributes. The attributes corresponding to the various capability levels are summarised in Table 1. The total numbers of process instances over all the trial assessments that were rated at each capability level are shown in Figure 15. For clarity, Figure 15 only shows the fully (F), largely (L), and partially (P) values. Process instances rated not achieved (N) or not assessed (X) are not shown in this figure. Notice that, as expected, the attributes corresponding to the higher capability levels receive the higher ratings less often than those corresponding to the lower levels. Less obvious, but worth noting, is that of the two attributes at level 2 (pm and wpm), pm is more often highly rated than wpm, and of the two attributes at level 3 (pd and pr), pr is more often highly rated than pd. At levels 4 and 5 the difference between the ratings for the two attributes seems less significant. The pie charts shown in Figure 16 provide an alternative view of some of the same data and distinguish between attributes which were not achieved (N) and those which were not assessed (X).

The ratings of the attributes associated with a process instance may be used to compute the capability of a process. The capability of a process is defined to be the highest capability level for which the process attributes for that level are either rated largely or fully and the attributes for all lower levels are rated fully. A summary of this scheme is provided in Table 2. When the data in the database is analysed the number of process instances found to be at each of the capability levels is as shown in Figure 17. A comment on the definition of process capability may be appropriate here. Clearly there are two ways in which a process instance may fail to be rated at a particular capability level: • The attributes at that level may not be rated fully or largely. • The attributes at the next lower level may not be rated fully. The 65 process instances at level 2 were analysed to see which would have been rated at level 3 if this did not require the level 2 attributes to be rated fully rather than largely. The result was that 24 process instances would have been rated at level 3 (37%). Thus in a significant number of cases, process instances fail to achieve a particular capability level because of inadequacies at the previous level, rather than at the level in question. When performing the above analysis, one anomaly was noticed, namely a process which did not fully satisfy the level 2 attributes and yet fully satisfied the level 3 attributes. The rest of the data suggested that this was an isolated case! The numbers of process instances at each capability level may also be shown for each process category as in Figure 18, which shows the percentage of process instances in each category achieving at least a particular capability level. Notice that the process instances at level 0 tend to be in the SUP and MAN categories, while the level 4 process instances tend to be in the SUP category. 
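The capability rule just described can be captured in a few lines. This is a sketch of the rule as stated in the text, not trials tooling; the attribute acronyms follow Table 1, and the example ratings are invented.

```python
# A sketch of the capability rule: a process instance is at the highest
# level whose own attributes are rated largely (L) or fully (F) while
# every attribute of all lower levels is rated fully (F).
# Attribute acronyms follow Table 1; the example ratings are invented.

LEVEL_ATTRIBUTES = {
    1: ("pp",),
    2: ("pm", "wpm"),
    3: ("pd", "pr"),
    4: ("pme", "pco"),
    5: ("pch", "ci"),
}

def capability_level(ratings):
    """ratings maps an attribute acronym to 'F', 'L', 'P', 'N' or 'X'."""
    level = 0
    for lvl in sorted(LEVEL_ATTRIBUTES):
        own_ok = all(ratings.get(a) in ("F", "L") for a in LEVEL_ATTRIBUTES[lvl])
        lower_ok = all(
            ratings.get(a) == "F"
            for lower in range(1, lvl)
            for a in LEVEL_ATTRIBUTES[lower]
        )
        if own_ok and lower_ok:
            level = lvl
        else:
            break
    return level

# Level 3 attributes are fully achieved here, but level 2 is only
# largely achieved, so the instance is rated at level 2; this is the
# kind of case counted in the analysis of the 65 level 2 instances.
example = {"pp": "F", "pm": "L", "wpm": "L", "pd": "F", "pr": "F",
           "pme": "N", "pco": "N", "pch": "N", "ci": "N"}
```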
Criticality

Clearly, as can be seen from the previous sections, there is considerable scope for correlating demographic variables with process ratings. As an example, an analysis was performed of how the criticality factors concerning safety, economic loss, security, and environmental impact (as defined in ISO/IEC 14598) affected the process ratings. The capability levels of those process instances associated with projects that the OU considered to be critical with respect to one of the factors (a subset containing 88 of the 341 process instances considered above) are summarised in Figure 19. Most notable are the smaller percentage of level 0 process instances and the larger percentage of level 4 process instances in this set compared with the data relating to all the assessments shown in Figure 17. Clearly many other such analyses are possible, and further analyses of this type are planned.

Conclusions About Assessments and Ratings

The major findings from the analysis presented in this paper are:
1. Only two regions have participated in the trials by providing data thus far (i.e., December 1997): Europe and South Asia Pacific².

² Since we performed this analysis, data has been collected from Canada and Latin America, USA, and the Northern Asia Pacific.


Level 1: process performance (pp)
Level 2: performance management (pm), work product management (wpm)
Level 3: process definition (pd), process resource (pr)
Level 4: process measurement (pme), process control (pco)
Level 5: process change (pch), continuous improvement (ci)

Table 1: The attributes at each capability level (and their acronyms).

Capability level   Required ratings of process attributes
Level 1            Process Performance: Largely or Fully
Level 2            Process Performance: Fully;
                   Performance Management, Work Product Management:
                   Largely or Fully
Level 3            Process Performance, Performance Management,
                   Work Product Management: Fully;
                   Process Definition, Process Resource: Largely or Fully
Level 4            All level 1-3 attributes: Fully;
                   Process Measurement, Process Control: Largely or Fully
Level 5            All level 1-4 attributes: Fully;
                   Process Change, Continuous Improvement: Largely or Fully

Table 2: Scheme for determining the capability level rating.

[Figure 15: Attribute ratings profile (number of process instances rated F, L, and P for each of the attributes pp, pm, wpm, pd, pr, pme, pco, pch, and ci).]

2. We have data from 30 assessments conducted in 23 different organisations in these two regions.
3. There was a good distribution in terms of Organisational Unit size (both large and small). However, there was no participation from OUs whose primary business sector was: business services, petroleum, automotive, aerospace, public administration, consumer goods, retail, health and pharmaceuticals, leisure and tourism, manufacturing, construction, and travel.
4. Most assessments involved only one project in the OU.
5. All processes in the version of the ISO/IEC 15504 Reference Model that was evaluated were covered.
6. The median number of process instances per assessment is seven.
7. In general, we found that the attributes corresponding to the higher capability levels receive the higher ratings less often than those corresponding to the lower capability levels.
8. In a significant number of cases, process instances fail to achieve a particular capability level because of inadequacies at the previous level, rather than at the level in question.
9. Approximately 19% of the process instances were at level 0, 50% at level 1, and 19% at level 2.

These findings pertain to the data that has been collected thus far, and of course may be affected as more data is collected before the end of the Phase 2 Trials.

Acknowledgements

The earlier work of Ian Woodman [2] is acknowledged, as is the assistance of Khaled El Emam in providing the box and whisker plot in Figure 14 and of John Wilson in discussions on the database issues involved in the above analyses.

References

[1] K. El Emam, J-N Drouin, and W. Melo (eds.), "SPICE: The Theory and Practice of Software Process Improvement and Capability Determination", IEEE CS Press, 1998.
[2] I. Woodman and R. Hunter, "Analysis of Assessment Data from Phase One of the SPICE Trials", IEEE TCSE Software Process Newsletter, No. 6, Spring 1996.

Robin Hunter can be reached at: Department of Computer Science, University of Strathclyde, Richmond Street, Glasgow G1 1XH, UK; E-mail: [email protected]


[Figure 16: Distribution of attribute ratings by level. The percentages of process instances rated fully (F), largely (L), partially (P), not achieved (N), and not assessed (X) for each attribute were:

Attribute                         F     L     P     N     X
process performance (pp)         50%   31%   14%    4%    1%
performance management (pm)      26%   36%   21%   16%    1%
work product management (wpm)    22%   26%   28%   23%    1%
process definition (pd)          11%   18%   28%   36%    7%
process resource (pr)            17%   28%   16%   32%    7%
process measurement (pme)         1%    7%    7%   52%   33%
process control (pco)             0%    7%    6%   55%   32%
process change (pch)              0%    5%    9%   54%   32%
continuous improvement (ci)       0%    3%    9%   56%   32%]

[Figure 17: Distribution of capability levels across all process instances: level 0, 19%; level 1, 50%; level 2, 19%; level 3, 9%; level 4, 3%; level 5, 0%.]

[Figure 18: Profile of process capability across all process instances per process category (percentage of process instances achieving at least each level, for CUS, ENG, SUP, MAN, and ORG).]

[Figure 19: Process capability levels for high criticality products: level 0, 10%; level 1, 57%; level 2, 18%; level 3, 8%; level 4, 7%; level 5, 0%.]


SOFTWARE PROCESS IMPROVEMENT IN CENTRAL AND EASTERN EUROPE Miklós Biró, MTA SZTAKI, Hungary (moderator) J. Gorski, Technical University of Gdansk, Poland Yu. G. Stoyan, A.F. Loyko, M.V. Novozhilova, National Academy of Sciences of Ukraine I. Socol, D. Bichir, SIVECO, Romania (INSPIRE INCO-Copernicus Project) R. Vajde Horvat, I. Rozman, J. Györkös, University of Maribor, Slovenia

The worldwide information technology (IT) market has been growing at a rate of 8-10% since 1994. This growing market offers new opportunities for acquiring or increasing market share. A competitive software industry can be a cornerstone of economic growth in Central and Eastern European (CEE) countries if they can take advantage of this opportunity, exploit their traditional strengths, and overcome their weaknesses, all of which are briefly analysed below.

SWOT (Strengths, Weaknesses, Opportunities, and Threats) Analysis from the Perspectives of the Four Possible Levers of a Firm

Levers are the means used by a firm to multiply its resources. Fundamentally, it is the use of levers that accounts for the differences in profitability among firms. Four possible levers of a firm are the financial lever, the operating lever, the marketing lever, and the production lever. What leverage can be exploited by CEE companies, or by companies outsourcing their software development activity to CEE? CEE has a number of general strengths, including a highly-educated workforce that is able to assimilate new skills rapidly and produce high-quality goods for export at relatively low cost. For the same reasons, R&D capacity is high as well. Operating leverage is the relative change in profit induced by a relative change in volume. Because of its low operating costs, the CEE software industry has a high operating leverage; consequently it can generate more profit than its less-leveraged competitors as soon as its volume reaches a given level. The relative lack of local managerial skills and experience, and the former neglect of the development of a quality culture, are weaknesses that have an impact on both the production and marketing leverages. Production leverage is the rate of growth of profits resulting from cost declines as progress is made along the experience curve. Production leverage can be achieved only if management is able to organise production properly.
Quality management is an important part of this organisation. The two main ingredients of marketing leverage are higher prices and innovative distribution. The achievement of either of these goals requires highly-perceived quality and advanced market management skills. As far as production and marketing leverages are concerned, CEE is making efforts in training managers to

obtain the necessary skills that were previously unheard of in the former economic system. The possibility of making use of financial leverage (having and exploiting debt capacity) depends on the advent of general economic recovery and lower inflation, both of which require a rather long-term process. The notion of leverage used in this section is expounded in more detail in [14].

Quality Awareness in CEE

Hungary

The general Hungarian quality scene is best characterised by the increasing number of ISO 9000-certified companies, which has grown from very few at the beginning of the 1990s to over 500 today. However, up to now, few software development organisations have achieved ISO 9000 certification. Among the certified organisations are the Informatics and the Systems and Control departments of MTA SZTAKI. Regarding the capability maturity of software development firms, we assessed some software companies with the help of the BOOTSTRAP software process assessment methodology. According to our assessments, the maturity levels of the assessed software-producing units were between 1.25 and 2.75. To obtain a broader picture of quality awareness within the Hungarian software industry, we created a short questionnaire to which companies were asked to reply voluntarily and anonymously. Eighty-eight percent of respondents knew about the ISO 9000 standards and 38% knew the BOOTSTRAP methodology. A few had heard about the CMM, SPICE, and TickIT methodologies and standards, while other methodologies were not well known. The demand or requirement for formal certification has not yet become obvious. The majority of respondents (88%) do not, or only rarely, require formal certification to ISO 9000 from their subcontractors. Neither are they usually required to have formal certification as subcontractors themselves. At the same time, the majority of respondents feel the need for formal certification of their quality management system. Some are planning for it or are currently undergoing certification.
Initiatives to establish quality management systems are present almost everywhere. The second half of the questionnaire was directed toward specific areas of quality management. Questions asked about the level to which the processes of a specific area had been accomplished, or about the existence and level of detail of certain documents; answers could be chosen from a range of four levels. The results were, of course, not precise enough to draw conclusions about general maturity levels, but were satisfactory enough to compare awareness across the various quality areas.

Poland
During March-April 1997, a market study was conducted to identify the need for improvement of software processes in Polish institutions. The study concentrated on large institutions, based on the assumption that small and mid-sized enterprises do not have enough capital to invest in technology improvements (in Poland there are no well-defined schemes to support small and mid-sized businesses in technology advances). Sixty mostly large institutions were contacted. They were divided into three categories:
1. suppliers of IT infrastructure (hardware and software) and system integrators for end-users (for example, HP, Oracle, Unisys, and Computerland)

Software Process Newsletter: SPN - 19

2. software developers (for example, CSBI, PROKOM, POLSOFT)
3. software clients (sometimes with a large software department), for example, banks, insurance companies, and public administration.
The results show that there is interest in software technology transfer and in process improvement. An interesting observation is that client organisations seem to be more aware of this need than, for example, software development companies. It can be concluded that there is a very strong feeling that something must be changed. However, this feeling has not been directly translated into a deeper understanding of what should be changed and where the investments should be directed. Ultimately, the market needs more awareness-building activities, more success stories, and more demonstrations of positive examples. Almost all institutions that responded positively declared an interest in courses and training. Again this finding shows the need for awareness-building activities. It also provides a chance that such training seminars can lead to more concrete cooperation in the future.

Romania
More than 90% of Romanian software companies are private. State-owned companies still exist (no more than 10 in the whole country), such as the Institute of Research in Informatics. To raise the quality of software, ISO 9000 certification has been strongly emphasised as important. To stimulate ISO 9000 certification initiatives, ISO 9000-certified software companies will be provided with many incentives, such as reduced or no taxes. Software Technological Parks will be created and/or implemented, and new standards will be developed for software engineering (such as mandatory and optional standards, recommendations, a Romanian keyboard, and Romanian IT terminology). Public administration activities will be fully informatised to improve their services and to simplify their procedural and administrative practices. Training is another important problem.
Provisions are being made to change the structure of the specialised faculties and to add new domains, such as project management, marketing, and quality assurance for the IT industry. Strong cooperative efforts will be established between education and the software industry by giving students the opportunity to gain practical experience by working in the software industry. Students' scholarships will be tax-free. Many of the unemployed will be absorbed by a collateral industry, data production. A large number of software companies will be involved in this activity, and it is estimated that about US$20 million will be earned from exporting data production.

Slovenia
At the beginning of 1994, the Laboratory for Informatics at the University of Maribor, the Ministry of Science and Technology of Slovenia, and Slovenian local industry (11 organisations) initiated a project called PROCESSUS—Assessment and introduction of quality systems. Two main goals were defined for the project:
1. research: development of a methodology that could be applied for SPI within a wide range of software companies and that would comply with ISO 9001 and CMM [6, 7, 8, 9]
2. implementation: use of the methodology to introduce and maintain quality systems in participating organisations.

The first iteration of the project has finished, and the results of the PROCESSUS project show that the developed methodology is directed toward the right goals [10, 11]. Statistics of the results achieved within cooperating organisations are presented below:
• Large information organisations (1). The primary goal of this organisation was to use the methodology for consultation in activities with other companies. Some projects are already being launched in different software companies using the PROCESSUS methodology.
• IT departments within large enterprises (3). One of these IT departments has already arranged all of its procedures in accordance with software quality system requirements, and the enterprise has achieved ISO 9001 certification. In the two other cases, the SPI activities within the IT departments initiated quality improvement activities in other departments, and the majority of procedures performed in the IT departments have already been established in accordance with software quality system requirements.
• Independent software companies (7). Two independent companies have already achieved certification. Another two companies have cooperated, with the intention of applying for ISO 9000 certification; within these two organisations some procedures have improved. The last three organisations stopped their SPI projects half way through. The reasons cited were lack of money and other resources, low motivation of management, and personnel issues.
The next iteration of the project is being initiated and will be based on the developed model, on experiences with partners from the first iteration (activities for personnel motivation will be emphasised), and on market interests. Market research has shown that 9% of independent software companies have already achieved ISO 9000 certification, 25% of organisations have already started an SPI project, and 36% of organisations intend to start an SPI project.
Ukraine
In Ukraine, the state system of certification of production (named UkrSEPRO) has been operating since 1993. Ukrainian state standards of quality (SSU) started in 1996 with an immediate application of ISO 9000. In 1996, within the framework of cooperation with European organisations [12, 13], Ukrainian experts took part in the project Qualification of Ukrainian Software Specialists (QUALUS), maintained by the German government's TRANSFORM programme. An international seminar entitled "Quality of software: The information market" was organised, at which the report by Jurgen Heene, director of WIDIS GmbH, entitled "German experience in certification on ISO 9000", was discussed. In Kiev, the expertise of WIDIS GmbH was used in a number of consultations on the installation of quality systems.
The analysis of activities within Ukrainian firms engaged in the IT field shows that the introduction of quality systems depends on the following major factors.
• The requirements of the customer. There are essential differences between the IT market within Ukraine and outside of it. The lack of quality systems, and of information about their features, prevents the progress of Ukrainian firms' production in external markets. Moreover, many Ukrainian software firms working on projects for foreign customers have lost contracts, with resulting financial losses, as a result of


the absence of the necessary technology to maintain the quality of production. Within the Ukrainian market, only the government has required certification and the introduction of a quality system as part of the bidding process for tender participants. Sometimes enterprises that need IT within the manufacturing process for products exported to foreign customers require its certification. The authors of this paper observed this at the Kharkov turbine factory when selling their own CAE system: certification of the factory's production was needed because the customer was foreign.
• Scale of manufacturing. When small enterprises are engaged in simple, non-labour-intensive production, the problem of introducing quality systems and certifying their production processes is, as a rule, as yet unrealised. With an increase in the scale of manufacturing, the problem of how effectively the organisation can take measures to maintain the demanded levels of quality becomes more urgent.
• Qualification of the employees of the firm. An organisation's effectiveness in its manufacturing processes, and its ability to meet the level of quality that its customers demand, depend on the knowledge and qualification of both the managers and the employees of the firm. In Ukrainian software firms working on large projects, the emphasis is on the use of various organisational technologies and on group work: drawing up technical projects, organising working meetings, independent testing, and so on. Thus, questions of quality are not allocated to a separate problem, but are considered to be necessary results.

Conclusions
We claim that it is possible to increase competitiveness in CEE, and simultaneously in countries outsourcing their software development activity to CEE, by way of mutually fruitful cooperation, the precondition of which is the assessment and improvement of the capability of the CEE software industry.
This joint interest manifests itself in several European Commission-supported initiatives, including the ColorPIE ESBNET ESSI proposal, the PASS ESSI PIE project, the INSPIRE INCO-Copernicus project, and so on.
We would like to draw attention to a fact that must be taken into account if we want to achieve real results. In CEE countries, the IT business is conducted mainly by SMEs that are too small to invest in SPI. Usually they do not have enough capital and are very much opportunity driven. It is apparent that there is a need to further develop SPI models that would be applicable to SMEs, not only in CEE but in other parts of the world as well. Most of the present models target large companies and are too "heavy" to be applicable to SMEs. The main difference should be that the feedback loop from the investment to the actual benefit should be much shorter and the investment should be split into small slices. Without such a model it is rather unlikely that SMEs will be able to enter the improvement path in a planned and systematic way.

References
[1] M. Biró, É. Feuer, and T. Remzsõ, "The Hungarian Quality Scene Potential for Co-operation," in Proceedings of the ISCN'96 Conference on Practical Improvement of Software Processes and Products (ed. R. Messnarz), International Software Collaborative Network, Brighton, 1996, pp. 56-63.
[2] M. Biró, "IT Market and Software Industry in Hungary," Documentation of the ESI 1997 Members' Forum.

[3] J. Gorski, "Assessment of the market needs concerning improvements of software technologies," Technical Report, Centre of Software Engineering/ITTI, Poznan, May 1997 (in Polish).
[4] V. Tepelea, "Informational Society - here and now," ComputerWorld Romania, No. 14 (84), August 1997, p. 5.
[5] M. Sarbu and M. Cioata, "Software: National Priority," PC Report Romania, No. 60, September 1997, pp. 15-16.
[6] International Organisation for Standardisation, ISO 9001, Quality Systems - Model for quality assurance in design/development, production, installation, and servicing, ISO 9001:1994 (E), Geneva, Switzerland, 1994.
[7] International Organisation for Standardisation, ISO 9000-3, Guidelines for the application of ISO 9001 to the development, supply and maintenance of software, ISO 9000-3:1991 (E), Geneva, Switzerland, 1991.
[8] M.C. Paulk, C.V. Weber, S. Garcia, M.B. Chrissis, and M. Bush, Key Practices of the Capability Maturity Model, Version 1.1, Software Engineering Institute, CMU/SEI-93-TR-25, February 1993.
[9] M.C. Paulk, B. Curtis, M.B. Chrissis, and C.V. Weber, Capability Maturity Model for Software, Version 1.1, Software Engineering Institute, CMU/SEI-93-TR-24, February 1993.
[10] Rozman, R. Vajde Horvat, J. Györkös, and M. Hericko, "PROCESSUS - Integration of SEI CMM and ISO Quality Models," Software Quality Journal, March 1997.
[11] R. Vajde Horvat and I. Rozman, "Challenges and Solutions for SPI in a Small Company," Proceedings of the European Software Engineering Process Group '97 Conference, Amsterdam, 16-19 June 1997, paper C307c.
[12] V. Ribalchenko, "ISO 9000 and quality ON," Computer Review, No. 7 (80), 26.02.1997, pp. 29-30.
[13] V. Ribalchenko and S. Ryabopolov, "Certification," Computer Review, No. 17 (90), 23.05.1997, pp. 26-29.
[14] M. Biró and T. Remzsõ, "Business Motivations for Software Process Improvement,"
ERCIM News (European Research Consortium for Informatics and Mathematics), No. 32 (1998), pp. 40-41. (http://www-ercim.inria.fr/publication/Ercim_News/enw32/biro.html)
Miklos Biro can be reached at: MTA SZTAKI, Kende u. 13-17, Budapest, H-1111 Hungary; E-mail: [email protected]

EUROPEAN SPI-GLASS Colin Tully Colin Tully Associates

SPI - different paths to salvation: continued
In the last issue, we embarked on a comparison between ESSI (the European Systems and Software Initiative, part of the European Commission's ESPRIT programme) and the SEI's software process programme, as contrasting mechanisms for driving the adoption of SPI (software process improvement). Two contrasts were drawn. The first contrast, with respect to theoretical basis, was between the software management theory embodied in the CMM, on which the SEI's process programme is based, and the economic theory underpinning ESSI. The second contrast, with respect to model variety, was between the USA's monotheistic faith in the CMM and the combination of polytheism and atheism found in Europe. Perhaps crudely, the difference between the two approaches, as characterised so far, may be summed up as "give them the tool" (USA) and "give them the money and let them decide on a tool" (Europe). Some readers may find themselves recalling arguments over third-world aid, in which a similar dichotomy occurs. It should be


stated at once that this column takes no position on the relative merit of the two approaches. We now continue our comparison of the European and American approaches by considering three more contrasts.

Third contrast: funding channels
Saying that the two programmes are dominated by different "theories", one of which is economic, should not disguise the fact that of course both programmes have an economic dimension. In both cases, substantial sums of public funding have been deployed, reflecting the concerns felt by European and American administrations about the strategic importance of software capability. It is not our purpose, even if the data were available, to compare the amounts of public money invested in these initiatives. It is more interesting to compare the ways in which funds are channelled. The contrast is between indirect funding for the promotion of SPI in the States and direct funding of specific SPI projects in Europe.

Indirect funding of improvement in the USA
Funding from the US Government (specifically the Department of Defense), to encourage SPI, has been indirect. It has been channelled in two main ways. First, by funding the SEI in general, and in particular through continued budgetary approval for the SEI's software process programme over many years, the DoD subsidised the development of the CMM. Without that investment, the main engine for the spread of SPI in America would not have existed. Second, by exploiting its massive procurement power, and by mandating specific maturity levels for procurement contracts, the DoD fostered the first (critical) phase of industrial take-up. Without that commercial incentive, the CMM might have remained of mainly academic interest. The investment and the commercial incentive were, of course, aimed just at the defence sector of US industry, not explicitly at American software producers as a whole.
The DoD has demonstrated some uncertainty over whether its sponsorship of the SEI should extend to supporting widespread roll-out of the CMM across all industrial sectors. Nevertheless, the SEI has taken what chances it could to promote such roll-out; and that push, combined with the pull of US industry's cultural propensity to accept new ideas, has ensured the take-up of CMM-driven SPI in thousands of software-producing organisations.

It is interesting to note that there has been an attempt, conscious or otherwise, to replicate some elements of this approach in Europe. The attempt was quite unrelated to the ESSI programme which is the subject of this column's current attention. The Commission funded the development of a model called Euromethod, to improve the customer-contractor process in the public-sector procurement of large information systems. That investment was then to be followed by the commercial incentive of mandated use of Euromethod in bidding for such contracts. It is not clear what degree of success Euromethod has had in meeting its initial goals; but it seems so far not to have achieved the second-phase roll-out that characterised the CMM's success in America.

This funding can be described as indirect because in no case did the US Government directly fund specific projects for SPI. In the first stage it established the SEI, and allowed it to exercise its own judgment about how to deploy the funding it was given. In the second stage it simply said to contractors, "If you want to be allowed to

bid, show us you've improved to a specified level" - a requirement that struck immediately at a substantial proportion of contractors' business. There was no money devoted directly and specifically to develop a model or to meet the costs of improvement.

Direct funding of improvement in Europe
Funding from the European Commission has, by contrast, always been directly channelled to specific projects, undertaken by industrial consortia with or without academic input. Bids for project funding have been required in all cases to comply with Commission work programmes in force at the time, so that the Commission has exercised fairly tight control over the nature of project proposals. With respect to SPI, projects have been of two kinds.

First, as part of mainstream R&D within the Commission's long-running ESPRIT programme, there have been a number of projects to develop models and methods. Leading examples include Bootstrap (a process maturity appraisal model), ami (a method for introducing metrics into the software process), and REBOOT (a method and appraisal model for introducing systematic reuse into the software process). These projects are generally recognised as having produced models and methods of good quality. Effort has been put into publishing and disseminating their results, and into establishing various mechanisms to try to promote take-up after the projects have ended: the Bootstrap Institute, the ami User Group and a number of REBOOT follow-on projects. Inevitably, however, such individual take-up mechanisms cannot match the impact of the SEI, which represents a massive statement of long-term government commitment. The problem with a project is that by definition it is of limited duration. However excellent its results, and however strong the commitment of its participants, momentum is almost bound to fall away at the end of the project - at the very point where, if take-up is to be achieved, momentum needs to increase.
Further, the project-oriented approach by its nature produces a multiplicity of models and methods, with no provision for their integration. It thus directly leads to the polytheistic fragmentation described in the last issue.

Projects of the second kind have been best practice projects - constituting the ESSI component of the ESPRIT programme. ESSI funding supports the direct marginal costs of SPI projects within individual software-producing organisations. As we have already observed, this is a radical shift from the technological innovation sought by normal ESPRIT projects (the development of new models and methods) to organisational innovation (the development of new practice). Nevertheless, support is still for a single limited-life project. Proposals for support are required to demonstrate the organisation's commitment to longer-term SPI. But, even if that commitment is honestly made at proposal time, it may prove hard to sustain two or three years later, after the end of the project. In the end, funding is offered and accepted for a tightly bounded package of work, not for a long-term programme. The contrast with the sustained effort needed to achieve level 3, to qualify for the DoD approved list, is stark.

Fourth contrast: exercise of central authority
This comparison follows directly from the nature of the funding channels, as just discussed. It can be presented briefly.


Controlling the model, and controlling capability, in the USA
The exercise of central authority in America has been shared between the SEI and the DoD, each with a clear role. The SEI exercises central authority to control the CMM, its development and deployment. The major parts of its role are:
• to be the development authority for the CMM and CMM-based methods;
• to promote, and to exercise quality control over, their application;
• to maintain a data repository of appraisal results; and
• to publish and disseminate information.
The development authority role is partly exercised through collaborative mechanisms such as working groups, reviewer groups and correspondence groups; that degree of collaboration lessens the extent of centralisation, although the SEI remains the ultimate authority. Quality control is exercised through the licensing of training courses and the registration of lead assessors. Dissemination includes organising an annual conference of SEPGs (software engineering process groups) and supporting SPIN (software process improvement network) groups nationwide. The DoD exercises central authority by setting required capability levels for its approved contractors, and by conducting capability evaluations in selecting for specific contracts.

Controlling projects in Europe
Central authority is exercised by the European Commission over the projects it funds, in three ways:
• defining the broad parameters of projects from time to time, in terms of subject matter, consortium structure, eligible costs, funding limits, etc.;
• evaluating project proposals, to select the specific projects to which funding is to be awarded; and
• exercising quality control over projects selected for funding, chiefly through detailed contract negotiations before the project, and through review of key deliverables during and after the project.
Some readers may reflect that there is some similarity between the DoD's role and the Commission's role, insofar as they both enter into projects as one side in a customer-contractor relationship. The differences are substantial, however. In the DoD's case, it funds projects with the intention of acquiring real delivered systems, of which it will be the user. In the Commission's case, it funds projects with the intention of enhancing industrial capability, of which the "users" will be not the Commission itself but the project participants.

Fifth contrast: improvement drivers
An empirically based comparison can be drawn between the predominant drivers for SPI in American and European organisations. Again, this contrast can be presented briefly.

Model-driven improvement in the USA
By far the predominant driver for American organisations that have embarked on SPI is to climb the scale of maturity levels, as a result of undertaking CMM-based appraisals. This is a natural effect of the dominant position of the CMM in the States. Such companies may be described as model-driven.
Being model-driven determines the priorities for improvement, depending on an organisation's current level. At level 1, the prescribed focus is on the set of level 2 key process areas; and similarly for level 2 and above. There are exceptions to the model-driven norm. A small proportion of companies, including acknowledged SPI leaders such as Boeing, Hughes or Motorola, have carried out in-depth analyses of the business importance of the software process, from which they have developed company-specific improvement priorities and programmes. Within those strategic programmes, the CMM always plays a role, providing a key performance indicator and a default set of improvement priorities. But it is a part in a much larger whole: the driver is process improvement for its own sake, and such companies can be said to be process-driven.

Single-issue-driven improvement in Europe
The predominant approach in Europe is to identify a single process issue on which to launch an SPI initiative. This is a natural effect of the diversity of models and methods, and of the short time-scale of ESSI projects. Such companies may be described as single-issue-driven. Known examples of such single issues include the introduction of object-oriented technology, client-server architecture, metrics, project management, requirements capture, reuse, testing, and defect management. These are all process changes, but many of them are focused on the methods and tools that support various key process areas (to that extent they have a strong technological flavour) rather than on the process seen as a whole or on process assessment. Europe also has SPI leaders, such as Ericsson, Philips and Siemens, who have graduated to being process-driven in the same way as the American leaders discussed above.

To be concluded
In the next issue, European SPI-Glass will conclude its comparison of features of American and European SPI.

Acknowledgement
Some material in this article is adapted from a forthcoming book to be published by John Wiley & Sons Ltd.
Colin Tully can be reached at: Colin Tully Associates, 97 Par Meadow, Hatfield, Hertfordshire, AL9 5HE, UK. E-mail: [email protected].

SPICE SPOTLIGHT
Alec Dorling
SPICE Project Manager
IVF, Centre for Software Engineering
This edition of SPICE Spotlight brings news of two European-funded projects that are currently providing a major boost to the SPICE project. These are the SPIRE (Software Process Improvement in Regions of Europe) project, under the EC ESPRIT ESSI (European Systems and Software Initiative) programme, and the PULSE project, under the EC department of industry's new SPRITE-S2 (Support and guidance to the PRocurement of Information and


TElecommunications Systems and Services) pilot programme.

SPIRE
The SPIRE project aims to lower the barriers to successful software process improvement by Small Software Developers (SSDs), defined as organisations employing up to 50 software staff, including small software companies and small software units in larger organisations. The SPIRE project is assisting over 60 SSDs in four European regions (Sweden, Italy, Ireland and Austria) to carry out short mentor-assisted software process improvement projects. Experienced mentors guide the SSDs through an assessment of needs, the preparation of a sound plan for a cost-effective small software process improvement project, implementation of the project, and evaluation of results.

The assessment of needs entails the mentor working with the SSD to define the organisation's business needs and assisting the SSD in carrying out a SPICE self-assessment using one of two software tools (Bootcheck or Synquest), which embody a SPICE version 2 compatible assessment model and which provide assessment results as SPICE-conformant profiles. These tools are ideal for use in mentor-assisted self-assessments, which are completed within 3 to 5 hours. Additionally, a separate confidential staff attitudes survey is undertaken. The whole process is completed in a one-day on-site visit.

Following analysis of the results, priority areas for process improvement matched to business needs are defined, which provide the basis for discussion of potential improvements. The SSD then proposes a focussed improvement project which must be completed within 6 months and must demonstrate quantifiable business benefits. The improvement project plans are then put forward to an independent regional panel which reviews and approves the individual projects. The SSD can obtain support funding of up to 110K ECUs as a contribution to its own costs, and it also has the assistance of an experienced mentor for up to 10 days free of charge.
The mentor will ensure that the project maintains momentum to completion. At the end of the improvement project, a second mentor-assisted self-assessment is performed to compare before and after process capability results. It is intended to submit some of the SPIRE assessment results to the SPICE trials phase 2. This will provide valuable data for comparison of assessment approaches and also quantitative data following completed process improvement actions.

Based on the experience gained in these projects, SPIRE will generate case studies and other deliverables of value to all SSDs, and disseminate them widely throughout Europe. The experiences gained are expected to have a major impact on company awareness of the benefits of software process improvement. The SPIRE project started in March 1997 and runs until September 1998. All the initial assessments have been completed, with improvement projects being performed between March and August 1998. The SPIRE consortium consists of the Centre for Software Engineering (Ireland), Etnoteam (Italy), IVF Centre for Software Engineering (Sweden), ARC Seibersdorf (Austria), and SIF (Northern Ireland). SPIRE maintains web sites at all partner sites. The home page can be found at .

PULSE
The PULSE project is one of 9 projects funded under the EC's new SPRITE pilot programme. The SPRITE programme aims at the application, validation and/or demonstration of existing and new instruments of support and guidance for software and systems procurement. All projects are linked to standardisation initiatives. All projects commenced in January 1998 and will run for 12 months.

The PULSE project aims to combine two approaches for assisting organisations to improve their procurement processes: defining and verifying a formal methodology for identifying and assessing the processes used by an organisation for IT procurement, and identifying a set of organisational actions that improve the way in which procurements are managed and the success of IT procurement teams. The PULSE project will achieve its aims by developing a methodology with associated tools to allow organisations to assess and benchmark their procurement capabilities and to determine those areas where improvement actions should be taken in order to meet their specific business objectives, and by identifying new organisational and communication techniques that allow better integration and teamwork between the three key areas (purchasing, technology development, and strategic planning and standards) for any IT procurement. These parts of the project are known as the PULSE methodology and the TEAM working aspects, respectively.
As part of the PULSE methodology the project will:
• develop an acquisition process reference model
• develop a detailed acquisition assessment model
• define an appropriate assessment method
• develop a software-based assessment tool
• trial the assessment method with user partners across Europe
• define a training syllabus and certification scheme for assessors
• develop a methodology licensing scheme
• present the PULSE reference model to ISO as a plug-in extension to an existing standard
The original scope of SPICE was intended to include assessment of the customer acquisition processes. It was recognised that project success depended on the capability of the acquisition partner as well as the supplier. The development of SPICE was predominantly undertaken by the world's experts from the software engineering community and under the influence of major purchasers wishing to assess the capability of their software suppliers. The original intent of assessing customer acquisition processes was somehow sidestepped, and customer-supplier processes provided the main focus in the model. The PULSE project will ensure that the intended focus is put back on track. The project has already researched a representative set of procurement practices and existing models around the world. By the end of March it will have developed an acquisition reference model as a plug-in extension to the ISO/IEC 15504 reference model. A detailed assessment model and software assessment tool will then be developed, and assessments will be undertaken in major European organisations to validate the model. The PULSE project has already created significant interest from major players outside the project in

Software Process Newsletter: SPN - 24

Australia, the UK and Hungary. A major milestone in the project will be the presentation of the PULSE reference model at the ISO meeting in South Africa in May this year. Based on the experience gained thus far in developing the PULSE reference model, input is also being provided to the revision requirements for ISO/IEC 12207, Software Life Cycle Processes, which are being finalised in Venice in the spring of 1998. The expectation is that the ISO/IEC 12207 processes will be extended to include systems engineering and system acquisition processes. The PULSE consortium consists of the IVF Centre for Software Engineering (Sweden), ATB (Germany), CR2ADI (France) and the Open Group (UK). The project also has 12 major associate user partner organisations drawn from, amongst others, the defence, aerospace, pharmaceuticals, industrial and public administration sectors. The PULSE project manager can be contacted at [email protected]
Alec Dorling can be reached at: SPICE Spotlight, IEEE Software Process Newsletter, IVF, Centre for Software Engineering, Argongatan 30, S-431 53 Mölndal, Sweden. Email: [email protected]

ANNOUNCEMENTS
Call for Participation: Sixth European Workshop on Software Process Technology (EWSPT-6), 16-18 September 1998, near London, UK. For updated information: http://www-dse.doc.ic.ac.uk/~ban/misc/ewspt98.html
General Chair: Bashar Nuseibeh, Imperial College, London, UK
Programme Chair: Volker Gruhn, University of Dortmund, Germany
Programme Committee:
• Nacer Boudjlida, CRIN, Nancy, France
• Jean-Claude Derniame, CRIN, Nancy, France
• Gregor Engels, University of Paderborn, Germany


• Alfonso Fuggetta, CEFRIEL and Politecnico di Milano, Italy
• Bertil Haack, WBRZ, Berlin, Germany
• Carlo Montangero, University of Pisa, Italy
• Bashar Nuseibeh, Imperial College, London, UK
• Lee Osterweil, University of Massachusetts, Amherst, USA
• Brian Warboys, University of Manchester, UK
• Vincent Wiegel, COSA Solutions, The Netherlands
• Alexander Wolf, University of Colorado, Boulder, USA
Sponsored by: ESPRIT BRWG PROMOTER (Process Modelling Techniques: Basic Research)
The software process community has developed a wide range of process modelling languages, process modelling tools, and mechanisms for supporting the enactment of software processes. This workshop extends the focus of that research to the application of software process technology in practice. To emphasise the broadened focus, the workshop will incorporate a variety of new kinds of sessions, including:
• Academics on trial: academics will attempt to "sell" their research to practitioners, who, in turn, will demand economically usable technology.
• Industrial presentations: practitioners will explain their requirements and experiences of process technology.
Software Process - Improvement and Practice. Articles scheduled to appear in the next issue of the Software Process - Improvement and Practice journal, published by Wiley (http://www.wiley.co.uk), include:
• Evan Aby Larson and Karlheinz Kautz: “Quality Assurance and Software Process Improvement in Norway”
• Ashok Dandekar, Dewayne E. Perry, and Lawrence G. Votta: “Studies in Process Simplification”
• Jim Arlow, Sergio Bandinelli, Wolfgang Emmerich and Luigi Lavazza: “A Fine-grained Process Modelling Experiment at British Airways”
• Martin Verlage: “Experience With Software Process Modelling”
9th International Symposium on Software Reliability Engineering (ISSRE'98) CFP: This will be held on 4-7 November 1998 in Paderborn,

EMPIRICAL SOFTWARE ENGINEERING An International Journal

EMPIRICAL SOFTWARE ENGINEERING, An International Journal provides a forum for researchers and practitioners to report both original and replicated studies. These studies can vary from controlled experiments to field studies, and from data-intensive to qualitative. Preference will be given to studies that can be replicated or expanded upon. The aim of the studies should be to expose, in an experimental setting, the primitives of software engineering. Papers on the supporting infrastructure for experimentation are also appropriate. The focus of the journal is on the collection and analysis of data and experience that can be used to characterize, evaluate and show relationships among software engineering artifacts. To this end, a repository will be made available for access and dissemination of the data and artifacts used in studies. Upon acceptance of a paper for publication, authors will be asked to provide, when appropriate, an electronic appendix (containing data sets, experimental materials, etc.), which will be made available on the Internet on a Kluwer-owned server. Detailed instructions for submitting the electronic appendix will be made available to authors of accepted papers.
Given an appropriate emphasis on the collection and analysis of supporting data, the following topics would all be within the journal's purview:
• A comparison of cost estimation techniques
• An analysis of the effects of design methods on product characteristics
• An evaluation of the readability of coding styles
• The development, derivation and/or comparison of organizational models of software development
• Evaluation of testing methodologies
• Reports on the benefits derived from using graphical windowing-based software development environments
• The development of predictive models of defect rates and reliability from real data
• Infrastructure issues such as measurement theory, experimental design, qualitative modeling and analysis approaches
Visit the Empirical Software Engineering Home Page at: http://www.cs.pdx.edu/emp-se/


EMPIRICAL SOFTWARE ENGINEERING An International Journal
Editorial Board List: January 15, 1998
Editors-in-Chief:

Victor R. Basili, University of Maryland, USA. [email protected]
Warren Harrison, Portland State University, USA. [email protected]

Associate Editors

H. Dieter Rombach, University of Kaiserslautern, Germany. [email protected]
Ross Jeffery, University of New South Wales, Australia. [email protected]
Koji Torii, Nara Institute of Science and Technology, Japan. [email protected]

Editorial Board

William Agresti, MITRE Corporation, USA. [email protected]
Motoei Azuma, Waseda University, Japan. [email protected]
Lionel C. Briand, Fraunhofer Inst. for Experimental Software Eng., Germany. [email protected]
Bill Curtis, TeraQuest Metrics, Inc., USA. [email protected]
Michael K. Daskalantonakis, Motorola, Inc., USA. [email protected]
Michael Deutsch, Hughes Network Systems, USA. [email protected]
Norman Fenton, City University, London, UK. [email protected]
Robert Grady, Hewlett-Packard, USA.
Watts S. Humphrey, Software Engineering Institute, USA. [email protected]
Chris Kemerer, University of Pittsburgh, USA. [email protected]
Frank McGarry, Computer Sciences Corp., USA. [email protected]
Stan Rifkin, Master Systems, Inc., USA. [email protected]
Norman F. Schneidewind, Naval Postgraduate School, USA. [email protected]
Walter F. Tichy, University of Karlsruhe, Germany. [email protected]
June Verner, Drexel University, USA. [email protected]
Anneliese von Mayrhauser, Colorado State University, USA. [email protected]
Larry G. Votta, Bell Labs Innovations, Lucent Technologies, USA. [email protected]
Elaine Weyuker, AT&T Bell Laboratories - Research, USA. [email protected]
Marvin Zelkowitz, University of Maryland, USA. [email protected]
Stuart H. Zweben, Ohio State University, USA. [email protected]


EMPIRICAL SOFTWARE ENGINEERING An International Journal Table of Contents - Volumes 1 & 2

Volume 1, No. 1, 1996
Editorial - Warren Harrison and Victor R. Basili
Peer Reviewed Articles:
• Function Point Sizing: Structure, Validity and Applicability - Ross Jeffery and John Stathis
• The Impact of Software Evolution and Reuse on Software Quality - Taghi M. Khoshgoftaar, Edward B. Allen, Kalai S. Kalaichelvan and Nishith Goel
• Comparing Ada and FORTRAN Lines of Code: Some Experimental Results - Thomas P. Frazier, John W. Bailey, and Melissa L. Corso
Viewpoint:
• On the Application of Measurement Theory in Software Engineering - Lionel Briand, Khaled El Emam, Sandro Morasca
Volume 1, No. 2, 1996
In this Issue - Warren Harrison
Editorial - Victor R. Basili
Peer Reviewed Articles:
• Evaluating Inheritance Depth on the Maintainability of Object-Oriented Software - John Daly, Andrew Brooks, James Miller, Marc Roper, and Murray Wood
• The Empirical Investigation of Perspective-Based Reading - Victor R. Basili, Scott Green, Oliver Laitenberger, Filippo Lanubile, Forrest Shull, Sivert Sorumgord, and Marvin V. Zelkowitz
• Increasing Testing Productivity and Software Quality: A Comparison of Software Testing Methodologies Within NASA - Donald W. Sova and Carol Smidts
Volume 1, No. 3, 1996
In This Issue - Warren Harrison and Victor R. Basili
Peer Reviewed Articles:
• An Instrument for Measuring the Success of the Requirements Engineering Process in Information Systems Development - Khaled El Emam and Nazim H. Madhavji
• Repeatable Software Engineering Experiments for Comparing Defect-Detection Techniques - Christopher M. Lott and H. Dieter Rombach
• Estimating Test Effectiveness with Dynamic Complexity Measurement - John C. Munson and Gregory A. Hall
Volume 2, No. 1, 1997
In this Issue - Warren Harrison and Victor R. Basili
Editorial - An Alternative for Empirical Software Engineering Research? - Warren Harrison
Peer Reviewed Articles:
• Computer-Aided Systems Engineering Methodology Support and Its Effect on the Output of Structured Analysis - David Jankowski
• A Replicated Experiment to Assess Requirements Inspection Techniques - Pierfrancesco Fusaro, Filippo Lanubile, and Giuseppe Visaggio
• Monitoring Smoothly Degrading Systems for Increased Dependability - Alberto Avritzer and Elaine J. Weyuker
Volume 2, No. 2, 1997
In This Issue - Warren Harrison and Victor R. Basili
Guest Editor's Introduction - Lionel Briand
• Empirical Evaluation of Software Maintenance Technologies - Filippo Lanubile
• Methodologies for Performing Empirical Studies: Report from the International Workshop on Empirical Studies of Software Maintenance - Chris F. Kemerer, Sandra Slaughter

• Fundamental Laws and Assumptions of Software Maintenance - Adam A. Porter
• The Practical Use of Empirical Studies for Maintenance Process Improvement - Jon D. Valett
• Qualitative Analysis of a Requirements Change Process - Khaled El Emam and Dirk Hoeltje
• Evaluating Impact Analysis - A Case Study - Mikael Lindvall
• On Increasing Our Knowledge of Large-Scale Software Comprehension - Anneliese von Mayrhauser and A. Marie Vans
• Applying QIP/GQM in a Maintenance Project - Sandro Morasca
• Early Risk-Management by Identification of Fault-Prone Modules - Niclas Ohlsson, Ann Christin Eriksson and Mary Helander
• Problems and Prospects in Quantifying Software Maintainability - Jarrett Rosenberg
• Experience With Regression Test Selection - Gregg Rothermel and Mary Jean Harrold
• Lessons Learned from a Regression Testing Case Study - David Rosenblum and Elaine J. Weyuker
• NASA Shuttle Software Maintenance Evolution - Norman Schneidewind
• The Study of Software Maintenance Organizations and Processes - Carolyn B. Seaman and Victor R. Basili
• Report from an Experiment: Impact of Documentation on Maintenance - Eirik Tryggeseth
Volume 2, No. 3, 1997
In this Issue - Warren Harrison and Victor Basili
Peer Reviewed Articles:
• How Software Engineering Tools Organize Programmer Behavior During the Task of Data Encapsulation - Robert W. Bowdidge and William G. Griswold
• A Controlled Experiment to Evaluate On-Line Process Guidance - Christopher M. Lott
• An Experimental Comparison of the Maintainability of Object-Oriented and Structured Design Documents - Lionel C. Briand, Christian Bunse, John W. Daly and Christiane Differding
Correspondence:
• Comments on the Paper: Briand, El Emam and Morasca, "On the Application of Measurement Theory in Software Engineering" - Horst Zuse
• Reply to Comments on the Paper: Briand, El Emam, Morasca, "On the Application of Measurement Theory in Software Engineering" - Lionel Briand, Khaled El Emam, and Sandro Morasca
Volume 2, No. 4, 1997
In this Issue - Warren Harrison and Victor R. Basili
Peer Reviewed Articles:
• A Study of Strategies for Computerized Critiquing of Programmers - Barry G. Silverman and Toufic Mehzer
• Visual Depiction of Decision Statements: What is Best for Programmers and Non-programmers? - James D. Kiper, Brent Auernheimer and Charles K. Ames
Viewpoint:
• Meta-Analysis - A Silver Bullet - for Meta-Analysts - Andy Brooks
Workshop Report:
• Process Modelling and Empirical Studies of Software Evolution - PMESSE `97 - R. Harrison, L. Briand, J. Daly, M. Kellner, D.M. Raffo and M.J. Shepperd


Germany. ISSRE'98 is sponsored by the IEEE Computer Society. For further information contact the publicity chair: Lionel Briand, Fraunhofer IESE, Sauerwiesen 6, D-67661 Kaiserslautern, Germany. E-mail: [email protected].
13th IEEE International Conference on Automated Software Engineering (ASE'98): Call for Papers. October 13-16, 1998, Honolulu, Hawaii, USA. The IEEE International Conference on Automated Software Engineering brings together researchers and practitioners to share ideas on the foundations, techniques, tools and applications of automated software engineering technology. Both automatic systems and systems that support and cooperate with people are within the scope of the conference, as are computational models of human software engineering activities. ASE'98 encourages contributions describing basic research, novel applications, and experience reports. The solicited topics include, but are not limited to: architecture, automating software design and synthesis, automated software specification and analysis, computer-supported cooperative work, groupware, domain modeling, education, knowledge acquisition, maintenance and evolution, process and workflow management, program understanding, re-engineering, requirements engineering, reuse, testing, user interfaces and human-computer interaction, and verification and validation. Paper submission deadline: May 8, 1998 (email abstracts by May 1, 1998). Send six copies to David Redmiles, Information and Computer Science, University of California, Irvine, CA 92697-3425, USA; Tel: +1 714 824-3823; Fax: +1 714 824-1715; Email: [email protected]. The latest information can be obtained from http://www.ics.uci.edu/~ase98
International Software Engineering Research Network (ISERN) Technical Reports for 1998 available. ISERN is a community that believes software engineering research needs to be performed in an experimental context.
By doing this we will be able to observe and experiment with the technologies in use, understand their weaknesses and strengths, tailor the technologies for the goals and characteristics of particular projects, and package them together with empirically gained experience to enhance their reuse potential in future projects. ISERN consists of a group of organizations around the world conducting, sharing, and promoting empirical research in software engineering. The Technical Reports of ISERN for 1998 are now available on the Web at: http://www.iese.fhg.de/ISERN/pub/isern_biblio_tech.html. The available titles are:
• Quality Modeling based on Coupling Measures in a Commercial Object-Oriented System
• Benchmarking Kappa for Software Process Assessment Reliability Studies
• SPICE: An Empiricist's Perspective
• Defining and Validating Measures for Object-Based High-Level Design
• Implementing concepts from the Personal Software Process in an Industrial Setting
• The Internal Consistency of the ISO/IEC PDTR 15504 Software Process Capability Scale
• A Comprehensive Empirical Validation of Product Measures for Object-Oriented Systems
• A Case Study in Productivity Benchmarking: Methods and Lessons Learned
• The Repeatability of Code Defect Classifications
• Studying the Effects of Code Inspection and Structural Testing on Software Quality
• A Comparison and Integration of Capture-Recapture Models and the Detection Profile Method
• Automated Software Engineering Data Collection Activities via the World Wide Web: A Tool Development Strategy applied in the Area of Software Inspection
• Evaluating the Usefulness and the Ease of Use of a Web-based Inspection Data Collection Tool
• Cost Implications of Interrater Agreement for Software Process Assessments
• Success or Failure? Modeling the Likelihood of Software Process Improvement
• Investigating Reading Techniques for Framework Learning
• A Comparison of Tool-Based and Paper-Based Software Inspection

• Automatic Collation of Software Inspection Defect Lists
• Explaining Cost for European Space and Military Projects
• Communication and Organization: An Empirical Study of Discussion in Inspection Meetings
• Communication and Organization in Software Development: An Empirical Study
• Applying Meta-Analytical Procedures to Software Engineering Experiments
• Statistical Analysis of Two Experimental Studies
• Estimating the number of remaining defects after inspection
• Applications of Measurement in Product-Focused Process Improvement: A Comparative Industrial Case Study
• Business Impact, Benefit, and Cost of Applying GQM in Industry: An In-Depth, Long-Term Investigation at Schlumberger RPS
• An Assessment and Comparison of Common Software Cost Estimation Modeling Techniques
• COMPARE: A Comprehensive Framework for Architecture Evaluation
• A Comprehensive Investigation of Quality Factors in Object-Oriented Designs: An Industrial Case Study

Production Team
The Software Process Newsletter production team are:
Victoria Hailey (VHG Corp.): Copy Editor
Dirk Hoeltje (Positron Inc.): SPN webmaster

Send Articles for SPN to:
Khaled El Emam
Fraunhofer Institute for Experimental Software Engineering
Sauerwiesen 6, D-67661 Kaiserslautern, Germany
[email protected]
All articles that appear in the newsletter are reviewed.

STEERING COMMITTEE
The members of the Steering Committee of the Committee on Software Process are:
Jean-Normand Drouin (Canada)
Alfonso Fuggetta (Italy)
Katsuro Inoue (Japan)
Marc I. Kellner (U.S.A.)
Nazim H. Madhavji (Canada)
H. Dieter Rombach (Germany)
Terry Rout (Australia)
Wilhelm Schaefer (Germany)
Lawrence G. Votta Jr. (U.S.A.)
