tive survey of the verification and validation processes at. 11 Swedish ..... Automation has mentioned that the process of changing project documents is slow, with several mandatory steps to go through; to perform ..... are not always entirely positive, for example when it comes to requests of a .... When is the testing finished?
Verification and Validation in Industry - A Qualitative Survey on the State of Practice Carina Andersson and Per Runeson Software Engineering Research Group Department of Communication Systems Lund University, Box 118, SE-221 00 Lund, Sweden {carina.andersson,per.runeson}@telecom.lth.se
Abstract Verification and validation activities take a substantial share of project budgets and need improvements. This is an accepted truth, but the current practices are seldom assessed and analyzed. In this paper, we present a qualitative survey of the verification and validation processes at 11 Swedish companies. The purpose was to exchange experience between the companies and to lay out a foundation for further research on the topic. The survey is conducted through workshop and interview sessions, loosely guided by a questionnaire scheme. It is concluded from the survey that there are substantial differences between small and large companies. In large companies, the documented process is emphasized while in small companies, single key persons have a dominating impact on the procedures. Large companies use commercial tools while small companies make in-house tools or use shareware. Common to all the surveyed companies is that verification and validation is considered important, and thus having rather high status. An experience from the survey as such, is that the information exchange between the companies during the survey was considered very valuable to the involved subjects.
1. Introduction Verification and validation (V&V) are the activities performed during a software development project to ensure that the right system is developed and meets the expectations of its customers (validation) and that the developed system is correct and conforms to its specifications (verification) [Boehm79]. The verification and validation of software systems take a substantial share of project budgets. Classical rules of thumbs indicating how much time is spent on V&V still seem to be valid. Brooks, for example, devotes 50% of the budget for testing in his classical book “The Mythical Man-Month” [Brooks75]. However very little effort is spent actually surveying the current status of V&V activities today. Other parts of the
development process are surveyed, for example requirements engineering [Lubars92, Groves00], and particular attention has been paid to the application of scenarios [Weidenhaupt98]. A recently conducted quantitative survey, with focus on lead time consumption, identifies the test process as taking very much of the lead-time when developing distributed real-time systems [Bratthall00]. In order to get a better picture of the current practice in industry, a qualitative survey has been conducted and is reported in this paper. The purpose is to learn how V&V is conducted in a set of companies, investigate variations between the companies, and trying to identify patterns in the observations. The survey serves two different purposes for the stakeholders in the survey. For the researchers, the survey acts as a starting point for further research on improvement of V&V processes. They want to identify issues relevant to industry for further research. For the industry participants, the survey acts as an experience interchange between peers in different non-competing companies. The paper is structured as follows. In Section 2, the methodology used in the survey is presented. Section 3 discusses the trustworthiness of the study. Section 4 presents the surveyed companies and their basic characteristics. In Section 5 we report the observations made, the analyses conducted and the findings made during the survey, and finally, a summary is presented in Section 6.
2. Methodology The empirical research conducted within software engineering has so far primarily been of quantitative nature. However, software engineering empiricists should, like Robson in the social science research, play down the differences and regard the differences between qualitative and quantitative research as primarily technical [Robson93]. A quantitative study can provide exact answers to research questions, including statistical analysis with certain confidence levels [Wohlin00]. However, the studied questions are for practical reasons of detailed nature, for
example, “is inspection technique A more efficient than technique B?” [Thelin01], or “are inspection meetings worth spending effort on?” [Porter97]. A qualitative study is less exact of its nature [Robson93]. No statistical significance is achieved, and no exact figures are reported. On the other hand, broader questions can be researched, like “why do/ do not people use inspections?” or the question of this study, “how are companies conducting V&V?”. Both quantitative and qualitative research should be conducted in order to explore, explain and improve software engineering practice. This study is a qualitative study, based on data collection in workshops and interview session, using the focused interview technique [Robson93]. The procedures followed are presented in the next sub-section.
2.1. Procedures The survey was conducted in two cycles. First, five companies were surveyed using the workshop format. The companies attended a workshop series, organized within a local software process improvement network (SPIN)1. The attendants assigned themselves to the workshop based on their interests. The workshop cycle was conducted as follows: 1. The workshop host presented the company V&V activities. The hosts were free to present any topic, as long as they included a list of strong and weak issues regarding their V&V process. 2. The researchers checked that a list of questions was covered (see Appendix A) and raised questions if not. 3. The researchers summarized the meeting in a report, which was proofread by the company representative. 4. The findings were analyzed, compiled into a joint report and fed back to the companies. In the second cycle another six companies were surveyed using a more direct interview format. Those companies were approached by the researchers, and selected to achieve diversity with respect to size, age of company, and application domain. This cycle was conducted in a similar manner, except that the only participants at the meeting were the interviewees and the researchers. Finally, within the second cycle, one of the first five organizations was interviewed again, since they were at a point of major process change at the time of the first interview. The second interview was conducted six months later, when some of the changes had taken place.
2.2. Researcher View A qualitative survey is not an objective study, but a view of the world seen from a researcher’s viewpoints. In order to enable critical review of the observations and conclusions of the study, the viewpoints of the researcher are reported here, leading to the research questions investigated. Below, a few statements summarize the values of the researchers, which performed this investigation. • Process focus. The documented development process is 1. http://www.spin-syd.org
a means for communicating within an organization and for capturing experience. A process focus, i.e. to have a defined process, may contribute to manage and control the organization’s activities. • Balance between process and people. All knowledge and experience cannot be elicited in a documented process. Software engineering is hence dependent on individuals. The process is assumed to be more important for large companies, while individuals may be more important for smaller companies. • Inspections. It is assumed, and to some extent empirically shown [Basili87], that inspections are efficient means for early defect removal. Inspections are also assumed to contribute to information spreading within a project or an organization. • Structured V&V. It is assumed that a structured approach to V&V, for example statistical testing and model-based development, would help many companies, and would improve their efficiency and effectiveness. The view is further defined in terms of the set of questions raised, and might affect unconsciously in the observations and the interpretation of the observations. Hence this open presentation of the view provides the readers with means to make their own interpretation. The research questions derived from the researcher viewpoints are more specifically: RQ1. How much is the documented process emphasized by the organizations? RQ2. Are there any relations between the process emphasis and characteristics of the organizations or their products? RQ3. Which criteria are ruling the selection and improvement of the process?
3. Trustworthiness In a quantitative study, the concept of validity is central when judging the result of an empirical study [Wohlin00]. In qualitative studies, the concept of trustworthiness is in focus instead [Lincoln85]. Different aspects of trustworthiness are credibility, transferability, dependability and confirmability. They are presented below together with comparisons to their validity counterparts, and an analysis of the trustworthiness of the current study. Credibility concerns the identification and description of the subjects in the study. The parallel to quantitative studies is internal validity, in particular instrumentation, while for example, the validity threat of statistical regression is not applicable. The primary techniques used for improving credibility are triangulation and peer debriefing. Triangulation involves the use of multiple sources to enhance the rigour of the research and peer debriefing contribute to guarding against researcher bias. One measure taken towards triangulation is to have two kinds of questions: direct questions where the interviewees are asked how they perform their V&V, and indirect questions where they are asked what is good and bad in their V&V process. Peer debriefing includes that the researchers in pair performed the analysis of the current study, and the results are
reported back to the subjects of the study in writing and in seminars. Transferability is the counterpart to external validity. The purpose of qualitative studies is not to make generalizations, but to explore the variation between and within the different cases. We have tried to vary parameters, such as age of subject organizations, size and application domain, to get a sufficiently varying foundation for the study. Dependability issues are highly related to credibility issues. The dependability is related to how the study is conducted, i.e. investigation procedures are systematic, well documented and clear, with the purpose to provide safeguards against researcher bias. The validity counterpart in quantitative studies is reliability [Robson93] or conclusion validity [Wohlin00]. In this study, the investigation procedures are openly reported (see Section 2.1), the researcher view is presented (see Section 2.2), data collection is documented, although not publicly available for confidentiality reasons. The presented observations reflect the survey participants’ answers, i.e. it is the company representatives’ own pictures which is presented, with a risk that it is polished. On the other hand, the participants had nothing to gain by giving false information. However, in any improvement program, the starting point is to elucidate the existing situation by means of the involved personnel. Confirmability has to some extent its counterpart in construct validity. Is the study designed to show what it is intended to show? In qualitative studies, the means for evaluating confirmability are to assess the process and to assess the collected data. One option is to assign an auditor to assure the quality of the study. This is not conducted within the current study. In summary, the study is performed according to a structured method and reported as openly as confidentiality issues allow. Additional measures could have been taken, such as triangulation using different methods or involving external survey process auditors.
4. The Surveyed Companies In the survey eleven companies operating in the southern part of Sweden have taken part. The companies are listed in Table 1, and referred to as pseudonyms for confidentiality reasons.
4.1. Products The participating organizations are characterized in terms of their developed products, and the products are categorized regarding application domain or type. Radar develops large real-time systems, including image analysis. Phone develops consumer products in the area of communication. Network develops software, primarily embedded in their own products for communication and storage. The core product of Automation is a stand-alone software product, used to develop applications in the domain of control and communication. Read, Security and Monitor do all develop products with embedded software for image processing, but do also co-operate with industrial partners who use the products as components in their systems. The surveyed parts of Product, Sales and SubSales develop support systems for internal or semi-internal use. The core products of Product and Sales are not in the software development domain, but still they have additional software development to manage the internal needs of sale support systems and products information systems. SubSales is a subcontractor to Sales, and is developing the software framework, to which Sales add the domain specific parts.
4.2. Size The variation in size among the participating companies is noticeable, ranging from small companies like SubSales, with 6 developers, to Radar with approximately two hundred developers, see Table 1. The size of the organizations relate to the development staff on one site. If other departments exist elsewhere, these are not included in the presented figures.
Table 1. Participating organizations. Company
Products
Customer
Business model
# Developers
Age (years)
Radar
Radar image processing
Defence
Contract
200
>20
Phone
Communication
Private
Market
150
~15
Network
Communication
Business
Market
120
18
Automation
CASE tools and control
Business
Market
80
~25
CASE
CASE tools
Business
Market
50
11
Read
Image processing
Private
Contract/Market
30
6
Security
Image processing
Business
Market
20
5
Monitor
Image processing
Business
Market
20
3
Product
Support systems
Internal
Contract
7
4
Sales
Support systems
Internal
Contract
7
10
SubSales
Support systems
Subcontractor
Contract
6
11
The history of the companies is also different; some of them are established many years ago, whereas others are recently started. The ages in Table 1 reflect the number of years the surveyed activities in each company have been active.
4.3. Product Values The developed products are also categorized due to whether the products are focused on features or characteristics. Different products may have their richness in the features or in the characteristics of the functionality. For a scanner for example, may the characteristics be of most importance; the quality of the image or the speed of the image transfer. Of minor interest to the user may the features be; the possibilities to manipulate, like rotating, the image. Radar, Read, Security and Monitor are companies that develop products that are dependent on the characteristics of the functionality but comprise a smaller number of features, which requires another testing focus, see Figure 1. Often the quality of the output is of most concern in a product that has more characteristics. For example in image processing, algorithms may be changed, and the testing has to verify that it has improved the characteristics for a given set of situations. Phone Network Automation CASE Product Sales/SubSales
Primarily features
Radar Read Security Monitor
Primarily characteristics
Figure 1. Features vs. characteristics
5. Observations This section describes the observations that were made during the workshops and interviews. It is, as mentioned before, the researchers’ viewpoints that are reported here. The observations are grouped into different categories. In Section 5.1, observations related to the development process are reported. Section 5.2 reports on the development environment. The more subjective values within the companies are reported upon in Section 5.3. Hence, the first two subsections contain issues, which are more objectivity observable, while the latter depend on the interviewees’ view of their companies.
5.1. Process The observations made regarding the V&V process are focused on the documented process models, the documentation used in the project, the specific test methods used, the organization and finally issues on communication about the V&V activities.
5.1.1. Process Models. The discussions concerning the V&V process proceeded from a general test process, according to Figure 2, since all of the companies had some kind of model to follow, though not always documented. Everybody was able to map his or her activities into this general model, though naturally with different degrees of formalism. The survey displayed a spectrum of process definition, ranging from a very well defined process at Radar to a very informal and unemphasized at SubSales. Descriptions of each company process are available in Appendix B. Table 2 shows the relationships between company age and the degree of emphasis of the process. Table 3 instead focus on company size. A common opinion among the companies is that it is important to increase the visibility of the verification and validation process, and allow the members of the organizations to understand the importance of the verification and validation. Communicating the process is of high priority, and often a part of the ongoing improvement work. Worth noticing is the overall awareness of the need for continuous improvement, though the approach taken to improvements varies a lot. Some of the companies have a history without any formal process and existing documentation, but today they realize the need of a defined, visible process. Training and motivating the organization is of importance, especially including the developers who need to have an understanding of the verification and validation process, as they perform the early tests. 5.1.2. Documentation. The type of produced artifacts and the consistencies between them often depend on how well the companies’ processes are defined. Among the recently started companies, the quality of the produced documents often varies, without company standards and at varying levels of detail. However, as the organizations are small, this is not considered very harmful. Radar is working in the defence domain, which implies strict documentation as a natural thing, with high requirements of detailed documentation, which still is a necessity. The process defines very clearly which documents should be produced in each phase. For example, already during the module tests, the developers should document formal test results. The developers at Phone also have to document their results from the module tests before delivering to the next phase. Extensive documentation is also required in the other process phases at Phone. Automation has mentioned that the process of changing project documents is slow, with several mandatory steps to go through; to perform the change, to convene a group of Module testing
Integration testing
System testing
Acceptance testing
Figure 2. The general test process
Table 2. Process emphasis versus company age. 0-6
7-12
13-19
Emphasized
20Radar Automation
Phone Network CASE Security Monitor Read Not Emphasized
Product
Sales SubSales
reviewers, to review, to update, and to sanction the proposal. This is considered inefficient, partly depending on the size of the company. Some of the other surveyed companies have the advantage of being able to introduce changes more quickly. One example is Security, which emphasized their existing positive attitudes to changes, the ease of introducing new routines, such as new templates. They do not have the need of going through several instances when making decisions. Read has not had any prescribed documentation, but this is changing. This is a part of their new process, which also points out that the test strategy will be decided early in the projects. Monitor recently started organizing its documentation and has noticed quality variations, depending on who has written the documents. This is however an area that is recognized as a future improvement area in this company, and the goal is to make the documentation more uniform. Sales is another company with informal documentation, including few rudimentary test specifications. 5.1.3. Methods. None of the companies in the survey use any statistical methods, like reliability growth models, to decide when to stop the testing. A few examples of used V&V methods were presented during the survey. Radar and Automation are experimenting with the use of factorial design methods to decide on which test cases to run [Berling00, Cohen96]. Monitor has evaluated factorial design methods, but concluded them to be too extensive to apply in their context. Security and Monitor are developing test matrices, which visualize the selection of combination that will be covered by the test cases and document that the chosen test cases cover all requirements. This has improved the traceability between requirements and test cases, which in the survey often is mentioned as a difficulty. Automation has informally tried pair specification in a smaller team, with good results. Pair specification is an analogy to pair programming, which means that developers work operatively in pairs in front of the computer [Beck99]. How-
ever, it is not written down as a method and there are no instructions for this. Among the more recently started companies the spirits and ideas from the first employees are still clearly visible. These ideas have affected the companies’ present strategies. One example is Security, which from the start has had an employee that considered code inspections as important and valuable to the company. His interest in this area has affected the other employees and the importance of code inspections has been emphasized from the beginning, and thereby all of the code is thoroughly reviewed. Both Automation and Security have mentioned their inspections as actions they are considering as strengths in their companies. At Automation it is part of the well-defined process to review produced documents, however, they also mention that the inspection of code is an action that needs improvement to be more efficient. 5.1.4. Organization. The larger companies have separate test teams, while the smaller companies usually have one responsible person for verification and validation. However, at the small companies the rest of the employees are often involved part-time in the testing. The developers generally perform the unit and module testing. The various phases of a life-cycle model, see Figure 2, may overlap to a considerable extent, and many test tasks may actually span multiple phases [Kit95]. This gives an indistinct delivery to the testers, that might be hard to define. Radar has a separate organizational unit for system verification. Depending on the customer requirements, software modules may be tested by the developer or by the system verification unit as an independent V&V procedure. As the system is growing larger and larger, the development unit now performs some integration tests that were previously conducted by the system verification unit. This implies that subsystems are delivered to the system verification unit instead of modules. Developers at Phone have to sign a test report when they have performed the module tests, consisting of a specified
Table 3. Process emphasis versus # developers. -9
10-49
Emphasized
50-100
101-
Automation
Radar Phone Network
CASE Security Monitor Read Not Emphasized
Product Sales SubSales
number of test cases. Thereafter the code is delivered at an acceptance meeting, to the integration test group. After the integration tests another acceptance meeting takes place before delivering to the system tests. Network has a quality assurance (QA) group, which is responsible for the system testing and is also engaged in reviews. In the opinion of the QA-group, it is better to do all the testing after the development is finished, otherwise there is a risk that the developers reintroduce defects in the product and the need of regression tests increases. The QA-group receives the product after the integration tests are performed with the criterion that all functionality should have been implemented. At Automation the separation between module and integration tests of the components is difficult, since its development is based on daily build, followed by the testing of the whole system. The testers are responsible of the system tests, which are performed after the integration of the product. The acceptance tests are carried out as separate projects, without influence by former test teams. The test group at CASE has recently grown noticeably as a result of reorganizations. From having a test group that was small and primarily consisted of students, having a job on the side, the company now has a test group consisting of professional testers. The project staff is almost equally divided into developers and testers. The subcontractor SubSales performs most of the programming of Sales’ product, while the major part of the testing is performed by Sales themselves. This influences Sales’ test organization; the majority of the project members are involved in the test activities. At Product, a similar organization is used, but in this case there are two sub-departments, one responsible for development and one for test. 5.1.5. Communication and Experience. The small companies in the survey emphasize the possibilities for informal communication. In companies with only a few employees, it is easy to start communicating, for example around the water
tank, about things that are important to know in order to do the job properly, and to spread understanding. Automation openly discuss the problems regarding communicating and establishing the processes and instructions with the employees. They would also like to see improvements in the area of using the experience available in the company, but not accessible in a useful way. For example, it is desirable to use existing test experiences to make better test specifications, however they have no implemented method for this. Even though it is desirable to pass test experiences on to junior employees, the available experiences, which often anyway result in good quality products, are mentioned. At several companies, a major part of the personnel has worked for many years with the systems. This has made the organizations flexible when it comes to testing, and the problem solving ability is noticeable. Most of the training is performed internally at each company, comprising of, for example, test tools and environments, processes and methodology. Other skills are obtained through experience. 5.1.6. Interpretations. Our interpretation of the observations concerning the documented V&V process is the following: • The process is more emphasized in large companies. Smaller companies have started to identify inconsistencies in the documentation standard, but have not considered these being too harmful. This difference is not related to age of the companies, but the age and the size is correlated on most of the cases observed. • The approach taken to improvement is very much dependent on the people in the company. Read, Security and Monitor have a joint background and some core technologies in common, but Monitor is focusing on hands-on activities, Read is defining process models and Security is working with an assessment model. The choice of improvement model is to a large degree influenced by key persons making the improvement decisions.
•
The use of structured approaches to V&V is sparse. Instead the selection of test cases is very much based on the experience of the staff. The unit and module tests are mostly performed informally by the developers. This is an area often mentioned as a candidate for improvement. On the other hand, no one has reported particular problems that can be traced back to the lack of structured methods specifically.
5.2. Tools and Environment The observations within tools and environments for V&V are made concerning defect reporting systems used, test automation and configuration management tools and procedures. 5.2.1. Defect Reporting Systems. The defect reporting systems at the companies are generally used when someone else than the developers are testing the product. These systems are sometimes built at the companies, sometimes bought externally. The larger companies use a commercial tool, while the smaller use in-house tools, freeware or simply e-mail communication. In one small company, Monitor, an alarming e-mail is sent as a broadcast to all the developers, and the responsible developer acts on this. At Product and Sales, with their small project sizes, the problems are often handled informally, reported over the telephone. At larger companies a change control board, more or less formally, appoints a developer to correct a found defect. Mentioned at Phone is the lack of information whether the fault is reported during function or system testing, which could be desirable for fault monitoring. 5.2.2. Test Automation. Practically everyone has mentioned a wish of increasing the use of automatic tests. However, the lack of time for introducing and implementing automation conflicts with this wish. Read, who has currently restructured its V&V process, has also thought of test automation, but is considering this to be of less importance. There are other areas that have higher priorities at the moment. All the testing is however not done manually at the companies. Radar uses simulated environments in its verification, to reduce the costs of testing the product in the operational environment. Phone has tools to ensure that memory leaks are found during the module tests, and to measure code coverage. During the function and system test phases more automation is desirable. CASE has changed their software architecture to support of using test scripts. Read has a robot to support some testing of their products. Security and Monitor have recorded material in their databases, series of images and movies, to test their products with, in order to be able to verify adjustments in the image analysis algorithms. Sales is using automatic test programs combined with manual tests when performing regression tests. Their intention is to make SubSales, their subcontractor, to also perform automatic tests before delivery. The problem with distributed systems is mentioned during the survey. Different configurations of third part products
imply that a lot of test time is required. At Sales the lack of routines for this at the installation tests is noticeable, and considered to be an area in need of improvements. 5.2.3. Configuration Management. The documents and other artifacts produced are handled with different degrees of formalism in the companies. The larger companies have well defined processes with routines and rules that imply a documentation that is formally reviewed and version managed. Commercial tools are used for the configuration management. Automation and Security both consider their configuration management as working well. Monitor has recently taken the step towards a more organized configuration management, while at Product the configuration management is handled very informally, without any possibility to fall back to document versions between baselines. 5.2.4. Interpretation. Our interpretation of the observations regarding tools and environment falls back on the differences between small and large companies. • Small companies have in-house or freeware tools, large companies have externally developed tools. This is, of course, a matter of the ability to invest in powerful tools, related to the less visible cost of spending working hours to develop and maintain internal solutions. When it comes to configuration management, again the large companies are more organized, but also smaller companies may have a well organized configuration management. • Regarding automation, the companies seem to have a balanced view between the vision of saving test execution time and the awareness that test automation requires extensive investment in test scripts and environments. The result so far is that little test automation is conducted.
5.3. Corporate Values A third area of observations is softer issues related to attitudes towards the V&V activities, key priorities for the company and their relations to business partners. Observations related to these issues are here gathered under the headline of corporate values. 5.3.1. Attitudes. Among the small companies all the employees usually take joint responsibility for the developed product. Everyone is doing one’s share to get a complete product according to the specifications. Hence the developers usually give positive response when the testers find faults in the product. This attitude seems harder to achieve in a larger company, where it is harder to grasp the final product due to its size and complexity, and where processes, roles and responsibilities are defined to ensure that the right person takes action. The general positive will is present in these companies as well, but there is a need for more structure. Although there is a strong will, the attitude against the testers are not always entirely positive, for example when it comes to requests of a
more thorough documentation. The attitudes may also change during the project, varying from positive to more negative when the pressure increases in the final phases of the project. 5.3.2. Key Priorities. All participants were asked about their priorities between costs, quality, and time; which factor they considered to be of most importance in their projects. The most common answer was quality, while time had second priority, but these answers are not completely unambiguous. Radar, CASE, Monitor and Sales consider the time schedule to have highest priority, and if it comes to time pressure, instead they compromise and reprioritize the functionality. Phone is currently restructuring their development process to be more incremental. Considerable delays in former projects have forced this change, since release dates are of high priority. The new process facilitates the possibility to release a well tested product without implementing a subset of the planned features if the project of any reason is delayed. At Network they consider it more acceptable with a greater number of faults in a new product, which implies that the priorities are different in different projects. Security also mentions that the process conformance is dependent on which kind of project is running. If it, for example, is a pilot project with the purpose of showing proof of a concept, more short cuts may be taken. Product instead has the choice of delaying its releases until they consider themselves to be ready or postponing some implementations to later releases. When considering the priorities of time, the issue of planning was mentioned as a problem in most of the surveyed companies. In Monitor, it is the goal of their new process that the tests should be planned at an early stage, as early as the start of the project planning activities. All of the surveyed companies agree with Monitor that this is of importance and their ambition as well. Read has also mentioned in their proposed process that the plans of having activities throughout the product’s life-cycle should be established at project start, instead of that the test work starts at the end of the projects. Especially the planning for changes, e.g. change request, is mentioned as an area that needs improvements. Time plans are often changed due to many participants involved in the projects, and thereby unexpected occurrences. 5.3.3. Relations to Business Partners. Half of the interviewed companies have their business relations to a market, rather than to specific customers. Hence some of these issues do not apply to their situation. Some of Radar’s verification activities are performed in cooperation with their customers, though it is important to have communicated the agreements of the acceptance activities, which vary from customer to customer, before these are supposed to be performed. Sales mentioned their attempt to be proactive by distributing known problems in the software to their customers. However, this gave negative feedback. The customers instead became disappointed in the amount of existing bugs, even though the amount was not higher than in earlier releases. Like Sales, Product has only internal customers, which allows them to be able to be conservative in the growth of
functionality. Implementations can easily be postponed to later releases, and the developers can be fastidious when it comes to “flashy” features. Sales is an evident example of organizations having very close relationships with some of their subcontractors. It is a part of the agreement that SubSales delivers semi-manufactured products, very much like an internal development group, while other subcontractors to Sales deliver a final product. The relations to SubSales are very informal and personal; contacts are handled by telephone, even when it comes to changes in the specifications. This relation, which occasionally is considered not to be businesslike enough, and its difference from relations with the other subcontractors, has been pointed out as a culture problem. 5.3.4. Interpretation. Our interpretation of the observations regarding corporate values and attitudes is the following: • V&V is not a discipline where inexperienced staff is considered to be suitable. Instead most of the companies agree on the need for skilled software testing professionals. In the surveyed companies the testers achieve a fair share of recognition and rewards for contribution made to product quality and the status of being part of a test team has improved. • Regarding key priorities, the answers are ambiguous, though the choice is often between time and quality, which results in compromises in functionality. Differences can be seen by observing the companies in view of their customers. The companies with internal customers are not exposed to competition, and hence can compromise more freely about functionality and occasionally release dates.
6. Summary All of the companies have problems and difficulties in their V&V processes, which might be improved by new approaches and ideas. Several similar experiences have been recognized, even though it is gained under different circumstances in different environments. The attempts of finding some patterns in making comparisons between the surveyed companies resulted in an almost linear dependency between the level of emphasized process and the size of the companies. A question asked at the moment of the analysis was when a company grows over the threshold, and when the need of more defined procedures gets significant. No conclusions on this matter were however drawn. Noticeable was how one key person in the small companies could have considerable impact, while this was a little bit harder to discover at the larger companies. The general opinion among the participants of the survey is that this has been valuable to all involved parties. The comprehensive discussions, between the participants during the workshop, of general V&V process issues have shown new possible approaches, both for the recently started companies and the more established ones. During the meetings the discussions have mostly concerned generally applicable situations and ideas such as methods, artifacts, and tools, while
specific techniques, test selection procedures, and detailed levels of test environment have been left out. Thus, the discussions have been of interest to everybody and the exchange of information has been successful, despite the participants’ different basic conditions. Regarding research methodology, the survey is conducted using a structured procedure, reported as openly as possible. Qualitative methods are relatively new to the software engineering domain; hence a practice has to evolve on how to perform such studies. This study represents an initial attempt, aiming at both characterization of current practice in industry as a starting point for further research and for experience exchange between peers in the surveyed companies.
Groves00
L. Groves, R. Nickson, G. Reeve, S. Reeves and M. Utting, “A Survey of the Software Development Practices in the New Zealand Software Industry”, Proceedings 12th Australian Software Engineering Conference, Canberra, Australia, pp. 189-201, 2000.
Kit95
E. Kit, Software Testing, in the Real World, Addison-Wesley, 1995.
Koomen99
T. Koomen and M. Pol, Test Process Improvement, A practical step-by-step guide to structured testing, Addison-Wesley, 1999.
Lincoln85
Y. S. Lincoln and E.G.Guba, Naturalisitic Enquiry, Newbury Park and London, 1985
7. Acknowledgement
Lubars92
M. Lubars, C. Potts and C. Richter, “A Review of the State of the Practice in Requirements Modeling”, Proceedings 9th IEEE International Symposium of Requirements Engineering, San Diego, CA, USA, pp. 2-14, 1992.
Porter97
A. A. Porter, H. P. Siy, C. A. Toman and L. G. Votta, “An Experiment to Assess the Cost-Benefits of Code Inspections in Large-Scale Software Development”, IEEE Transactions on Software Engineering, 23(6):329-346, 1997.
The researchers are thankful to the participating companies within SPIN-syd for their contribution. Thanks also to Daniel Karlström, Thomas Olsson and Dr. Martin Höst at the Department of Communication Systems at Lund University for reviewing an earlier draft of this paper and being involved in discussions on the topic. This study was partially funded by The Swedish Agency for Innovation Systems (VINNOVA) under grant for The Center for Applied Software Research at Lund University (LUCAS).
8. References Basili87
V. R. Basili and R. W. Selby, “Comparing the Effectiveness of Software Testing Strategies”, IEEE Transactions on Software Engineering, Vol. 13, No. 12, pp. 1278–1298, 1987.
Beck99
K. Beck, “Embracing change with Extreme Programming”, IEEE Computer, Vol 42, No. 10, pp. 70-77, 1999.
Berling00
T. Berling and P. Runeson, “Application of Factorial Design to Validation of System Performance”, Proceedings 7th Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems (ECBS), Edinburgh, Scotland, UK, pp. 318-326, 2000.
Boehm79
B. Boehm, “Software Engineering: R & D Trends and Defence Needs”, Chapter 19 in Research Direction in Software Technology (P. Wegner ed.), Cambridge, MA, MIT Press, 1979.
Bratthall00
L. Bratthall, P. Runeson, K. Adelswärd-Bruck and W. Eriksson, “A Survey of Lead-Time Challenges in the Development and Evolution of Distributed Real-Time Systems”, Information and Software Technology, Vol. 42, No. 13, pp. 947-958, 2000.
Brooks75
F. P. Brooks, The Mythical Man-Month, AddisonWesley, 1975.
Cohen96
D. M.Cohen, S. R. Dalal, J. Parelius and G. C. Patton, “The Combinatorial Design Approach to Automatic Test Generation”, IEEE Software, pp. 83-88, September 1996.
Robson93
C. Robson, Real World Research, Blackwell, 1993.
Thelin01
T. Thelin, P. Runeson and B. Regnell, “UsageBased Reading - An Experiment to Guide Reviewers with Use Cases”, Information and Software Technology, 43(15):925-938, 2001.
Weidenhaupt98 K. Weidenhaupt, K. Pohl, M. Jarke, and P. Haumer, “Scenarios in System Development: Current Practice”, IEEE Software, Vol. 15, No. 2, pp. 34-45, 1998. Wohlin00
C. Wohlin, P. Runeson, M. Höst, M. C. Ohlsson, B. Regnell and A. Wesslén, Experimentation in Software Engineering: An Introduction, Kluwer Academic Publishers, 2000.
Appendix A Questionnaire
Appendix B Detailed Description of the Process models
Test phases Describe the different phases of the verification and validation process. Describe the artifacts belonging to each phase. Is the process iterative or like the waterfall model? Does the project concern a new product or evolutionary development? When is the testing finished? Who decides when the testing is finished?
Radar: The process is very formal and well emphasized, to ensure that the right thing is done at the right time. It is considered a necessity to be able to sort out the created complexity, derived from large projects, with many participants. The procedures are quite well followed since the customer requires it in many cases. Phone: Has recently formally introduced an incremental process model, which includes design and implementation, function test, and system test. The incremental process has forced more detailed planning, which turned out to reduce the amount of errors that were introduced earlier due to insufficient planning. Network: Claims its process to be quite mature, and one of the company’s strengths. The process is well defined in their development handbook, and is continuously improved by input from developers, testers and managers with assistance of the quality assurance engineer, whose main responsibility is project follow-up and process assessment. Automation: Is dependent on corporate level decisions, e.g. a joint high-level project management model. They have a well-defined process, developed at corporate level. At lower levels the activities are defined in terms of templates and guidelines. However, as the process documentation is very extensive, there is a tendency to find discrepancies between the documented process and the actual work. CASE: The process has been changed recently, as a result of a merge of two companies. Each product release consists of several increments, which are tested separately. The process is now more formal and the activities are defined in terms of templates. Read: Is currently changing its process model radically. They have recently made an assessment of the former V&V activities in order to draw conclusions from the findings and define a new V&V process. The assessment reflected an inconsistent life cycle model without a testing strategy, and therefore also several improvement possibilities. After the second interview with Read parts of the new process model had been put into practise. Security: Has recently evaluated its V&V process according to the Test Process Improvement (TPI) model [Koomen99]. The history shows the need of improvements in the areas of documenting the test process, templates for the test plan, test specification, and test reports. However, formal reviews of the documents and the code were made, and both a configuration management, as well as a defect tracking system existed. Through this evaluation a controlled improvement of the test process has been started, with assistance of the TPI model, that shows the company where and how to introduce improvements. Monitor: Is a young organization and is presently evolving a model for their projects to follow. At start-ups, with only a few employees, the need of a well-defined process model, with pronounced V&V activities, is less evident. Nevertheless, when a small company is growing, there is a risk that the people-focused way to run the development continues.
Scope of the verification and validation process, what is the goal? Rank the following parameters: Cost, Time, Quality (few faults, depending on the functionality, etc.) Consequences Do they find the faults they want to? Are they finished on time? Roles How is the company organized? Who is doing the testing? Developers, Testers, Customers, Externally,... Training What kind of training and experience do the testers have? Tools Which are used? For what purpose? Level of details How much of the product is tested? Characteristics. The functionality that is used the most, or the functionality that probably has the most faults, or all functionality. Available data Are there any data collected and recorded as faults are reported and controlled? Number of found faults, consumed time. Purpose/Usage? Attitudes From the test team’s perspective? Status What is a successful test, (creative destruction?). From the developers’ perspective? Unusual ideas Are the products tested in any specific way? Personal assessment of the verification and validation process Mention three well respectively three less well performed characteristics.
Though, Monitor is aware of this risk, and is currently improving their V&V process. Their approach to improvement is to introduce several hands-on activities, such as code inspections, configuration management, and the use of a database for defects reporting. Monitor is connected to Read, as subsidiary, but is not dependent on the parent company’s process models in any way. Product: The V&V steps are defined in terms of different test environments. It is required to go through all of them, although it is not defined to what extent. The developed system interfaces towards the corporate management information system, and hence the financial consequences of software failures can be very large. Documents, such as test plans, are produced but not considered to be of any major use as the V&V depends on very experienced people. Sales: Has a well-defined release process, with release dates set on a yearly basis, and gradually freezing points before these dates. Two weeks before release, the system is put in a “deep freeze” state, meaning that only minor adjustments, correcting the most critical parts are permitted. The company considers this as a condition for maintaining high quality and reducing the costs for continuous changes. SubSales: Has a very informal development process and, after unit testing, delivers the products to Sales, which handles the rest of the testing. The informality is also reflected in the contacts between the two business partners.