Using Quality Audits to Assess Software Course Projects

Wilson Pádua
Synergia Systems and Software Engineering Laboratory
Computer Science Dept. - Federal University of Minas Gerais
Av. Antônio Carlos, 6627 - Belo Horizonte - MG - Brazil
[email protected]

Abstract

The assessment of software engineering course projects should evaluate the achievement of proposed goals, as well as compliance with mandated standards and processes. This might require the examination of a significant volume of materials, following a consistent, repeatable and effective procedure, even for small projects. Quality assurance must be performed according to similar requirements. This paper discusses the use of quality audits, a comprehensive quality assurance procedure, as a tool for the assessment of course projects.
1. Introduction

Course projects are widely used to teach software engineering principles and practices. This author has used them in a sequence of industry-oriented graduate courses, using a model-driven software process, since 2002 ([1], [2]).

Adequate assessment methods provide course projects with fair grading and highlight incorrect application of techniques and processes. Assessments should quantify how successful a course project was, both in delivering a product that meets agreed requirements, and in following a prescribed process, using certain techniques and releasing specified artifacts by scheduled milestones. Their results should give the students useful feedback, pointing out what was wrong and guiding them to avoid similar mistakes in the future. Assessments should be objective, ensuring fairness and feedback, and providing visibility into the progress of the projects. Those goals reflect the standard IEEE and CMMI definitions of product quality and quality assurance. Feedback, visibility and objectivity are goals of the CMMI Process and Product Quality Assurance process area.

Assessment should also be effective. Even small projects deliver a significant amount of materials, such as code, models, plans and reports, requiring quick assessment and grading for timely feedback. Similarly, realistic quality assurance must keep appraisal efforts under control.

The remaining sections discuss how industrial-grade quality assurance procedures might also be employed for course project assessments. Section 2 considers quality and compliance issues in course projects, showing how final project audits may be employed for assessment. Section 3 discusses practical aspects and issues found in assessing through such audits. Section 4 shows how the audits contributed to the project assessment goals. Finally, Section 5 presents conclusions and ongoing work.
2. Quality audits in software course projects

Every software development project must achieve milestones, at which it must pass partially or fully automated system tests, after many development (unit and integration) tests. Organizations also require other kinds of appraisals, to ensure conformance with standards and intrinsic artifact quality attributes. Automated checks, such as model validation or static code analysis, detect violations of formal rules. Human reviewers, using inspections, reviews and use evaluations, uncover other anomalies; but human appraisals miss some defects, and inject defects into the appraisal reports themselves, which therefore need to be double-checked to a reasonable extent.

According to the IEEE Standard for Software Reviews [3], an audit is “an independent examination of a software product, software process, or set of software processes to assess compliance with specifications, standards, contractual agreements, or other criteria”. Final project audits must therefore appraise the results of those other appraisals, working as final quality audits.

Iterative development proceeds through iterations that implement increasing subsets of the requirements. Quality audits should work as iteration exit gates, to ensure that defects do not accumulate and compromise future iterations. Checklists derived from process standards help to ensure repeatability, objectivity and consistent defect disposition, and to classify and count defects for the assignment of quality grades.

Quality audits check the reports of model and code inspections, and verify proper execution and correct configuration of static checks and tests, inspecting their specifications, scripts, logs and reports, perhaps repeating some automated checks and tests. Managerial artifacts, such as project plans and reports, record the most important project data, used for tracking and control of current projects, estimation of future projects, and process improvement. Quality audits may check their compliance, intrinsic quality and traceability to design and requirements.
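As an illustration of the checklist-driven bookkeeping described above, a defect log in which every defect is traceable to a severity-rated checklist item might be structured as in the Java sketch below. The paper does not describe any supporting tooling, so all class, field and method names here are hypothetical.

// Hypothetical sketch of checklist-driven defect bookkeeping; the paper does not
// describe its tooling, so these names are illustrative only.
import java.util.ArrayList;
import java.util.List;

enum Severity { CRITICAL, MAJOR, MINOR }

// A checklist item derived from a process standard or guideline.
record ChecklistItem(String id, String text, Severity severity) { }

// A defect logged during an audit, traceable to the checklist item it violates.
record Defect(ChecklistItem item, String artifact, String rationale) { }

class AuditLog {
    private final List<Defect> defects = new ArrayList<>();

    void log(ChecklistItem item, String artifact, String rationale) {
        defects.add(new Defect(item, artifact, rationale));
    }

    // Counts defects of a given severity, for the assignment of quality grades.
    long count(Severity severity) {
        return defects.stream().filter(d -> d.item().severity() == severity).count();
    }
}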
3. Applying audit-based project assessments

Our case study covers course projects developed by twelve classes, in an industry-oriented graduate program, from 2002 to 2007. Each class had two to four project teams of five to eight students, using the process outlined in [1, 2] to develop small information systems. Function points were used to limit project sizes (about 100 FP) and to normalize collected data. Every project had to be implemented in Java code, traceable to UML design and analysis models. A reusable model and code framework (about 4,500 lines of code and 60 classes) provided frequently used resources, such as user interface skeletons and persistence services.

The typical project duration was about six months, divided into one-month iterations. Each of the four course modules spanned one or two iterations. Usually, the first two iterations focused on requirements specification, and the application was designed, implemented, tested and evaluated in the remaining iterations, following a quasi-spiral life cycle model. The students implemented a simple CRUD use case in the third iteration, later developing progressively more complex use cases. An average project was verified and validated by around 100 automated system test cases, 150 automated unit test cases and 20 inspections.

Table 1. Quality audit checklist composition

Checklist items                          Critical   Major   Minor
Conformance to standards                    22        103      74
Traceability between artifacts               0         17       9
Correctness of managerial artifacts          0         17       5
Correctness of review reports                0         42      54
Correctness of test reports                  4         15       0
Correctness of use evaluation reports        0          4       3
Correctness of configuration records         0          0      18
Total                                       26        198     163
Audit-based assessments were held at every iteration end. Each team performed a first audit, on its own material. A teaching assistant used the same checklist to assess the iteration package, providing, in a separate log, details and rationale for every defect found. Defects were classified by the checklist as critical, major or minor. If the students disagreed with their assessment, they
might ask for the instructor's arbitration. When defects were challenged, the instructor usually found that either the students or, less often, the teaching assistant had misunderstood some process guideline or checklist item wording. Those items were later rewritten to improve their clarity.

Table 1 summarizes a recent audit checklist. Critical defects invalidate significant portions of the material, such as the absence of mandatory artifacts or attachments, or failures of specified tests and static validations. Defect penalties, very low in the first iterations, were progressively raised in the last ones. For these, a penalty of 5% of the grade was levied on each critical defect, equivalent to five major or fifteen minor defects. Those who failed a course module had to submit amended materials, or to repeat the module if they failed again.

For thirteen projects in four classes, in 2005 and 2006, the development process, the course schedule and the project size remained very stable, and their collected data could be meaningfully averaged and compared. For those classes, Table 2 shows the defects found in the last project audit, together with functional size, productivity and assessment grade. Columns CR, MA and MI show, respectively, total critical, major and minor defects. Size (FP) is given in non-adjusted function points, and productivity in function points per person-month. The grade is defined as 100% minus the weighted penalties for defects. For two smaller classes in 2007, the projects were somewhat shorter; their results did not show any significant difference.

The passing grade was 60%. Students who did well in the theoretical exams might pass with somewhat smaller grades. As the data show, a single team had a very low grade in its last assessment. The grades were achieved without undue effort, as shown by the productivities, which ranged from a marginally acceptable level of 6.3 to the very good level of 29.9. There is no correlation between grade, which reflects the final quality of the materials, and productivity, which depends on the rework performed, since the projects have similar size and complexity.

Table 2. Quality and productivity data for a sequence of projects

Team   CR   MA    MI    FP   FP/PM   Grade
1       0   23    48   101    19.0     61%
2       0   12    35   106    14.9     76%
3       0   12    24   109    16.5     80%
4       0   12    21    98     8.7     81%
5       1   17    49   117     6.3     62%
6       0   22    88   102     9.8     49%
7       0   30   123   116    14.1     29%
8       1   37    13   108    29.9     54%
9       1   15    10   110    16.7     77%
10      0   10    15   126    28.3     85%
11      0    5     5   122    11.9     93%
12      4   19    11   113    14.6     57%
13      0    3     4   151    20.3     96%
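The grading rule described above (a 5% penalty per critical defect, with one critical equivalent to five major or fifteen minor defects, hence 1% per major and 1/3% per minor) reproduces the grades listed in Table 2: for example, team 1 (0 critical, 23 major, 48 minor) gets 100 - 23 - 16 = 61%. The short Java sketch below illustrates the computation; it is a reconstruction from the text, not the actual grading tool, and the rounding is an assumption.

// Sketch of the weighted-penalty grade described in the text: 5% per critical defect,
// 1% per major defect (five majors = one critical) and 1/3% per minor defect
// (fifteen minors = one critical). Reconstructed from the paper; not the original tool.
class GradeCalculator {

    static double grade(int critical, int major, int minor) {
        double penalty = 5.0 * critical + 1.0 * major + minor / 3.0;
        return Math.max(0.0, 100.0 - penalty);
    }

    public static void main(String[] args) {
        // Team 1 in Table 2: 0 critical, 23 major, 48 minor -> about 61%.
        System.out.printf("Team 1: %.0f%%%n", grade(0, 23, 48));
        // Team 12 in Table 2: 4 critical, 19 major, 11 minor -> about 57%.
        System.out.printf("Team 12: %.0f%%%n", grade(4, 19, 11));
    }
}

Applying the same rule to the other rows of Table 2 yields the listed grades, within rounding.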
4. Results discussion

Product quality requires meeting all functional requirements. For adequate coverage, the audit checks their traceability to enough well-designed test cases. Since each test case failure counts as one critical defect, the very small critical counts in Table 2 mean that every team had zero or very few test case failures. At least 95 function points had to be implemented, completely passing their tests. Audits checked test logs for mandatory audit trails of assertion messages and database snapshots, to confirm passed test cases and the respective coverage of requirements.

The quality audits verified some non-functional product quality attributes, checking whether performance requirements were tested, and whether user interfaces complied with usability guidelines. Manual tests performed during use evaluations checked remaining usability issues, as well as unusual input combinations not entirely covered by the automated test suites.

Checking process compliance is the main purpose of a quality audit, and our first audit section checks about two hundred process compliance items. Moreover, it exacts the heaviest penalties on the most severe non-compliance issues: missing artifacts or artifact sections, and formally defective artifacts, such as models containing non-valid UML. The small number of critical defects in Table 2 shows that such problems seldom appeared in final project deliveries.

Other data confirm a satisfactory degree of process compliance. Some measures reflect compliance with the process guidelines for design and coding through their low coefficients of variation across projects: lines of code per class (10%) and per function point (13% for application code and 18% for test code). The effort distribution per project iteration and per process discipline was also similar [2], for projects with quite different productivities. This was expected from application of the mandated process, as opposed to code-and-fix development.

During the preliminary audit, performed by the students, the audit checklist reminds them of the most important process guidelines; students report that it summarizes and clarifies them. Improvements in the wording of the checklists have enhanced process compliance. The set of all iteration data (not shown here, for brevity) indicates performance improvements after each audit; teams seldom repeat defects found in previous audits, especially the most severe ones.

The teaching assistant must justify every logged defect as the violation of a checklist item, derived from the process standards and guidelines. Strong traceability from defect logs to checklists and standards is recommended for industrial-grade quality appraisals [4]. Allowing students to challenge the assessment logs also enhances fairness and objectivity. Audits include repetitions of automated tests and static validations, and cursory reviews of inspection and test logs, themselves designed to ease quick reassessment. As a general rule, a proficient appraiser can check most audit items in about one or two minutes; since the checklist in Table 1 totals 387 items, this amounts to roughly one or two workdays per assessment, which has been confirmed in practice.
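As an illustration of the audit trails discussed above (assertion messages and database snapshot references recorded in test logs, so that auditors can confirm that test cases really ran and passed), a test step might log its checks as in the sketch below. The paper does not describe the actual test harness or log format, so this helper and its names are assumptions.

// Hypothetical helper that leaves an audit trail for quality audits: each assertion
// appends the covered use case, the assertion message, a database snapshot reference
// and the outcome to a log file. Names and log format are illustrative, not the
// course's actual harness.
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.time.LocalDateTime;

class AuditedAssert {
    private final PrintWriter log;

    AuditedAssert(String logFile) throws IOException {
        this.log = new PrintWriter(new FileWriter(logFile, true), true); // append, auto-flush
    }

    // Records the use case covered, the assertion message and the outcome, then fails
    // the test if the condition does not hold.
    void check(String useCaseId, String message, boolean condition, String dbSnapshotRef) {
        log.printf("%s | %s | %s | snapshot=%s | %s%n",
                LocalDateTime.now(), useCaseId, message, dbSnapshotRef,
                condition ? "PASSED" : "FAILED");
        if (!condition) {
            throw new AssertionError(useCaseId + ": " + message);
        }
    }
}

A system test case would call check(...) at each verification point, producing a log that a later audit can scan for FAILED entries and for coverage of the use cases the test claims to verify.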
5. Conclusions and ongoing work

In this case study, quality audits were used to objectively and effectively assess course projects, foster product quality and process compliance, and provide useful feedback. Collected data suggest that those goals were achieved. Audits also improved the quality of the data provided by the project reports, helping to detect faked or mistaken data and to ensure that reports were correctly and consistently filled in.

Recently, a new process version was released, reflecting changes in the process references (UML, CMMI, IEEE software standards) and experience from its application in real-life projects. It was ported to the Eclipse Process Framework, to use the facilities provided by that tool, especially for developing plug-ins that may further automate checking. Many guidelines and procedures were rewritten to take advantage of the quality audit findings. Thanks to the steady improvement driven by the audit feedback, the overall structure and contents of the appraisals were kept.

We thank IBM Rational for supporting this work, within the IBM Academic Initiative.
6. References

[1] W. Pádua, “A Software Process for Time-constrained Course Projects”, in Proceedings of the 28th International Conference on Software Engineering, Shanghai, China, May 2006, pp. 707-710.

[2] W. Pádua, “Using Model-Driven Development in Time-Constrained Course Projects”, in Proceedings of the 20th Conference on Software Engineering Education and Training, Dublin, Ireland, Jul. 2007, pp. 133-140.

[3] IEEE, “IEEE Std. 1028-1997 (R2002): IEEE Standard for Software Reviews”, in IEEE Standards Collection – Software Engineering, IEEE, New York, NY, 2003.

[4] T. Gilb and D. Graham, Software Inspection, Addison-Wesley, 1993.