Developing an evaluation module to assess software maintainability - description and development issues
Teade Punter
Eindhoven University of Technology / KEMA Nederland B.V.
Faculty of Technology Management, Pav D-11, PO box 513, 5600 MB Eindhoven, The Netherlands
[email protected]
Abstract

Third party software product evaluation is an independent assessment to determine whether the customer's needs and the supplier's promises about a software product are met. Third party evaluation faces three problems: universal pass/fail criteria to judge software products do not exist, measurement programs cannot be set up in a feasible way, and the results of the metric and checklist approaches -which are both relevant- are not interpreted together. This paper describes an evaluation module that should deal with these problems. The module consists of four steps and a supporting tool. First an appropriate set of metrics is selected for the product under evaluation. Then pass/fail criteria and weights for the metrics are set. To determine the metrics properly, techniques are assembled and tuned to the specific evaluation. These three steps result in an advice which is input for the assessment. Having conducted the assessment, its results are evaluated to learn for future evaluations.

Keywords: software product evaluation, product assessment, maintainability, evaluation module, software product quality, situated metrics
1. Introduction

Software has increasing importance throughout society. Telecom networks, stock exchanges, medical devices, banks: all of these organizations and products fully depend on software. Software product quality should therefore be a concern to everyone. Huge efforts have been made to deliver software quality and to improve the current processes, e.g. by using structured development methods or inspections of code. These efforts are important to build quality into software; however, they do not provide an answer to the question whether quality is high enough. To determine whether the customer's needs and the supplier's promises about the product are met, software product evaluation is needed. It is the systematic examination of the extent to which the software is capable of fulfilling the specified requirements (ISO 8402, 1994). Examples of evaluation are acceptance testing, inspection of code or security audits. Instead of evaluation by the customer (first party) or the supplier (second party), software product evaluation can also be conducted by an independent body: this is denoted as third party evaluation. The main benefit for customers as well as suppliers is the independence of the evaluation. This will result in attaining the right level of software product quality, which is in fact the fit between the customer's needs and the supplier's promises about the product.
2. Problems for third party evaluation

In evaluating software products we are confronted with three problems: universal pass/fail criteria to judge software products do not exist, measurement programs cannot be set up in a feasible way, and the results
of the metric and checklist approaches -which are both relevant- are not interpreted together. Each of these problems is elaborated below.

Universal pass/fail criteria to judge software products do not exist - Software product quality is the fit between customer's needs and supplier's promises. It is often expressed as a set of quality characteristics -like functionality and reliability. To determine the extent to which this quality is built into the software, measurement is needed. However, pass/fail criteria to interpret the measurements -'is this a good or a bad result?'- are missing. Third party evaluation in particular is confronted with many different software products that have different application areas and therefore entirely different criteria. The lack of universal criteria seems to conflict with the objectivity and reproducibility of evaluation -requirements stated by ISO CD 14598 (1996).

Measurement programs cannot be set up in a feasible way for third party evaluation - The usual start for software measurement is to identify the quality requirements of a software product and then express them as a set of quality characteristics. Each of these -sub-characteristics must be captured by one or more metrics. Normally metrics are selected by starting up a measurement program, e.g. using the Goal Question Metric paradigm (Basili et al, 1994). However, this is not a feasible approach for third party evaluation. The throughput time of an assessment is often too short to set up complete measurement programs. Budget is a constraint too.

Results of the metric and checklist approach are not interpreted together - The application of software quality metrics -like Lines of Code or Number of Statements- by using automated tools is a well-proven instrument in software engineering. Experiences at KEMA Connect show that it is also necessary to take checklists into account. Advantages of this technique are its flexibility -to customize an assessment to a software product- and the possibility to address other information sources of the product, like a manual or its service organisation (Punter, 1997). The relevance of checklists is confirmed by their application in EDP-auditing. To assess software product quality well, both techniques should be taken into account. However, the measurement results derived from both kinds of techniques are of different formats: metrics result in numbers, while checklist answers are qualitative notions -like yes or no.

Although these problems do not apply to third party evaluation only, their solution requires a specific approach when dealing with third party evaluation. To cope with these problems the idea came about to construct an evaluation module for a specific quality characteristic, namely maintainability. The Evaluation Module to assess Maintainability of software (EMM) is developed at KEMA Nederland B.V. -a Dutch certification body which conducts third party software product evaluation- and Eindhoven University of Technology. It is a procedure to design an evaluation -see box 2. The result is an advice on how to conduct an assessment of the product. This paper describes the procedure and the supporting tool which contains metric descriptions and techniques to perform the assessment.

An evaluation module is a package of evaluation technology for a specific software quality characteristic or sub-characteristic. The package includes evaluation techniques, inputs to be evaluated, data to be measured and collected, and supporting tools (ISO/IEC 14598-1, 1998).
It is a guideline describing the application of specific evaluation technique(s) to a given part of a software product with the objective to evaluate a specific quality characteristic of the software product. The concept of an evaluation module originates from the Scope project, which stated that self-consistent and modular evaluation procedures are needed to conduct evaluations (Bache and Bazzana, 1994). Evaluation technique - a tool and/or set of competences to collect data and to interpret the data during the evaluation. An example is a source code analyser to compute code metrics. Different types of evaluation techniques exist. Normally the following types are distinguished: inspection, execution analysis, static analysis and modelling (Bache and Bazzana, 1994).
Box 1
Evaluation module and evaluation technique
3. Guidelines for solution

Three problems for third party evaluation were distinguished -see section 2. This section focuses upon the guidelines which are the starting points for developing the Evaluation Module Maintainability. To deal with the problem 'universal pass/fail criteria to judge software products do not exist', we state that appropriate pass/fail criteria depend on the requirements of each individual software product. They are not constant for different products, but are determined by the application area of a product -e.g. a fail-safe versus an administrative environment. Understanding of the criteria per application area is needed. Therefore we should learn from previous experiences. For each metric its application area should be characterized, which will also reflect the appropriate pass/fail criterion for that situation. We propose to characterize metrics on the basis of the attributes of their application area and to store data about
this. By characterizing metrics upon their attributes, it might be possible to collect data about their appropriateness during evaluations that have been conducted -see sections 5 and 6. We think that these historical data will help to select appropriate metrics and pass/fail criteria for future assessments. This point also addresses the second problem -'measurement programs cannot be set up in a feasible way for third party evaluation'. Using the experience of existing assessments should result in an approach to set up the measurement program for the evaluation within the constraints of budget and time. The Evaluation Module Maintainability also uses existing knowledge by adapting existing evaluation techniques -see section 7. The third problem -'results of the metric and checklist approach are not interpreted together'- is addressed by translating checklist questions into a format which aims at a more objective and reproducible evaluation. The resulting questions address indicators for software quality by defining counting rules. Sets of associated questions are defined as metrics. By defining these -checklist or question based- metrics, they should be made comparable to software quality metrics -see section 7.
4. Overview

The Evaluation Module Maintainability consists of a procedure and a tool to support the evaluation design. The first three steps of the procedure -select metrics, set pass/fail criteria, assemble techniques- result in an assessment advice which consists of a selection of metrics, the techniques to apply them and their pass/fail criteria. The assessment advice -which is formally referred to as the evaluation plan- is the starting point for the assessment. The fourth step of the Evaluation Module Maintainability is to evaluate the assessment results, in order to learn from assessment experiences. An overview of the steps is presented in figure 1.
Figure 1
Procedure of the Evaluation Module Maintainability (inputs: subject of evaluation -quality requirements/quality factors, information sources-, moment of evaluation and perspective of evaluation; evaluation design steps: select metrics -supported by the metric base-, set pass/fail criteria -resulting in selected metrics, pass/fail criteria and weights- and assemble techniques -supported by the technique base-, which yield the assessment advice with selected metrics, techniques and pass/fail criteria; the assessment (measurement) belongs to the evaluation execution and produces the assessment result, whose experiences feed the step evaluate assessment).
Figure 1 also denotes the concepts of a metric base and a technique base. These are the basic parts of the supporting tool. The metric base contains descriptions of metrics, see section 5. The technique base contains existing checklist techniques, see section 7.

ISO CD 14598 (1996) -the current standard for software evaluation- distinguishes four activities: analysis, specification, design and execution. In this paper analysis is not regarded as a separate activity; instead it is conducted during specification. Then three activities remain:

Specification of evaluation - determination of the quality requirements. The result is a set of quality characteristics and appropriate evaluation levels: a quality profile. A description of how to determine a quality profile can be found in (Eisinga et al, 1995); this is also a subject of the Space-Ufo project (Esprit 22290).

Design of evaluation - determination of the techniques and evaluation criteria to establish the quality characteristics of the software product. The result is an evaluation plan that is agreed upon by both the customer and the evaluator. The Evaluation Module Maintainability is subject of the design of evaluation.

Execution of evaluation / assessment - the determination whether, and the extent to which, the quality requirements of the software product -specified in the quality profile- are satisfied. The evaluation execution is conducted according to the evaluation plan. The result is an evaluation report.
Box 2
Evaluation process
5. Select metrics (step 1)

In order to determine the metrics which are customized to handle the assessment of a specific software product, we first need to know: what is under evaluation? (subject), in which phase of its life cycle is the software product evaluated? (moment) and for whom is the product evaluated? (perspective). For each of the areas addressed by these questions (subject, moment and perspective) we need information to select the right metric for the right product under evaluation.

The subject considers the quality requirements of the product and the information sources necessary to conduct the evaluation. Normally the selected set of quality requirements determines the choice of metrics. They are expressed as a set of quality characteristics -see e.g. ISO FCD 9126 (1998). Each characteristic of this set must be captured by one or more metrics. Metric selection is also influenced by the scope of the information sources. For example, is the evaluation of the product restricted to the source code, or is documentation -about e.g. support- taken into account too? Besides this facet, the selection of metrics is determined by the availability of the information sources as well. For example, if the technical documentation of the product is not available, it is not possible to use the metric 'indicator for consistency between screen lay-out and program flow'.

The moment of evaluation which is addressed by a metric is the phase in the life cycle of the software product during which the product is assessed. To assess a product in its requirements phase, other metrics are needed than when assessing the same product in the development or use phase. To identify metrics for this, the distinction of life cycle phases according to ISO 12207 (1996) is used: a metric addresses e.g. a request for proposal, requirements analysis and definition, or architectural design.

The perspective of evaluation addresses the person for whom the evaluation is executed: developer, maintainer or operator (ISO FCD 9126, 1998). Perspective considers the people who are involved, but also the objective of the evaluation. Further elaboration of this subject is necessary.

Subject, moment and perspective of the evaluation are so-called external attributes of a metric: they determine the context in which it is allowed to apply the metric. This corresponds to the notion of the application area of the software product, which is normally used to state that measurement is executed differently for different products. We think that the external attributes are more adequate, because they characterize metrics. Besides the external attributes, a metric is also identified by internal attributes. These are about the way to apply the metric. The following internal attributes are addressed by the Evaluation Module Maintainability:
• the way of measurement: is measurement conducted directly or indirectly? A direct metric does not depend upon a measure of any other attribute, an indirect metric does,
• the dependence of measurement on system behavior: external or internal? An external metric is an indirect metric of a product derived from measures of the behavior of the system of which it is a part. An internal metric is derived from the product itself, either directly or indirectly,
• the type of metric: size (syntax: size, complexity; hybrid: e.g. Halstead; semantic: function points), effort (execution, user, person hours) and cost (Rudolph, 1997),
• the type of scale: nominal, ordinal, interval, ratio.
To select metrics we should know which external and internal attributes are required by the evaluation of a particular product. The supporting tool is able to collect these attributes per metric. At this moment an inventory of attributes of metrics has been made, based upon a literature study of ISO 9126 (1991) parts 2 and 3, Quint (1991), West (1994) and Afotec (1996). Experiments are necessary to elaborate on this further. Although the supporting tool is able to select metrics on the basis of the required attributes -see figure 2-, the current status of the Evaluation Module Maintainability does not allow fully automatic selection. To deal with this situation, expert opinion is used to propose a selected set of metrics per experiment. This is done by using the evaluator's knowledge and existing references for software maintainability, like Oman and Hagemeister (1992).
Figure 2
Example screen of the supporting tool: metrics selected for a required set of attributes (e.g. indicator for readability of code, indicator for modularity, readiness of parameterisation, change recordability ratio, effortless changeability, indicator for consistency of code).
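To illustrate how such attribute-based selection could work, the sketch below models a few metric descriptions and filters them against the attributes required by a particular evaluation. This is a minimal sketch: the field names, example entries and matching rule are illustrative assumptions, not the actual schema of the EMM metric base.

```python
from dataclasses import dataclass

@dataclass
class MetricDescription:
    """Hypothetical record in the metric base; field names are assumptions."""
    name: str
    quality_characteristic: str        # external attribute: subject (quality requirement)
    information_sources: frozenset     # external attribute: subject (required sources)
    life_cycle_phase: str              # external attribute: moment (ISO 12207 phase)
    perspective: str                   # external attribute: perspective
    measurement: str                   # internal attribute: "direct" or "indirect"
    scale: str                         # internal attribute: nominal/ordinal/interval/ratio

# Illustrative entries, loosely based on the metrics shown in figure 2.
METRIC_BASE = [
    MetricDescription("indicator for readability of code", "maintainability",
                      frozenset({"source code"}), "development", "maintainer",
                      "indirect", "ordinal"),
    MetricDescription("indicator for consistency of code", "maintainability",
                      frozenset({"source code", "technical documentation"}),
                      "development", "maintainer", "indirect", "ordinal"),
    MetricDescription("fault detection rate", "maintainability",
                      frozenset({"failure records"}), "use", "maintainer",
                      "direct", "ratio"),
]

def select_metrics(base, characteristic, available_sources, phase, perspective):
    """Keep only metrics whose external attributes fit the evaluation at hand."""
    available = set(available_sources)
    return [m for m in base
            if m.quality_characteristic == characteristic
            and m.information_sources <= available      # all required sources are available
            and m.life_cycle_phase == phase
            and m.perspective == perspective]

if __name__ == "__main__":
    for metric in select_metrics(METRIC_BASE, "maintainability",
                                 ["source code", "technical documentation"],
                                 "development", "maintainer"):
        print(metric.name)
```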
6. Set pass/fail criteria and weights (step 2)

Having selected the metrics, pass/fail criteria and weights should be set for each of them. This results in a sheet like the one presented in figure 3. A pass/fail criterion is a target value for a particular -component of a- product. Relative weights are used, e.g. the weight of the metric with ID M100 is half of the weight of the metric with ID M101. The criteria and weights are set by the evaluator, in accordance with the customer's needs and the supplier's promises.

Metric | Metric ID | Counting rule | Weight | Criterion
readiness diagnostic function | M100 | X = A/B, where A = number of failures for which the maintainer can make a diagnosis to understand the cause-effect relation, B = total number of registered and analyzed failures | 0,5*M101 | 1/1
indicator for availability of diagnostics to analyse faults | M101 | (n56*Item56 + n515*Item515 + n518*Item518 + n550*Item550)/4 | 1*M100, 2*M103 | 0,8
fault detection rate | M103 | faults detected per time unit (week) | 0,5*M101 | 10/week

Figure 3
Example of a sheet with selected metrics and associated pass/fail criteria and weights.
The main challenge during the second step is to interpret the metrics -their pass/fail criteria and weights- in a way that they represent the quality in use (ISO FCD 9126, 1998) for the particular product. This deals with the dilemma that metrics apply to properties of components of the software product or process actions, e.g. modules and statements, while quality in use is about the whole system. Interpretation of metrics is a job for the evaluator upon which the customer’s and supplier should agree. However, it might be supported by proposals made upon results of the metric base of the Evaluation Module Maintainability. The metric base contains the attributes of metrics. We expect that those attributes will determine the target value(s) and weight(s) of a metric for a particular product too. Note that this is a research hypothesis only and that assessment experiences are needed to elaborate on this further.
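As a minimal sketch of how a sheet like figure 3 could be applied during the assessment, the code below compares measured values against pass/fail criteria and combines the verdicts using relative weights. The absolute weight values and the aggregation rule are assumptions for illustration; the paper leaves their interpretation to the evaluator.

```python
# Sketch of applying a sheet like figure 3: each metric has a pass/fail criterion,
# a relative weight and a measured value. The weights and the aggregation rule
# below are illustrative assumptions, not prescribed by the evaluation module.

SHEET = {
    # metric id: (criterion as a predicate on the measured value, relative weight)
    "M100": (lambda x: x >= 1.0, 1.0),    # readiness diagnostic function, target 1/1
    "M101": (lambda x: x >= 0.8, 2.0),    # availability of diagnostics, target 0,8
    "M103": (lambda x: x >= 10.0, 1.0),   # fault detection rate, target 10/week
}

def assess(measured: dict) -> float:
    """Return the weighted fraction of metrics that meet their pass/fail criterion."""
    total = sum(weight for _, weight in SHEET.values())
    passed = sum(weight for mid, (criterion, weight) in SHEET.items()
                 if criterion(measured[mid]))
    return passed / total

if __name__ == "__main__":
    # hypothetical measurement results from an assessment
    results = {"M100": 0.9, "M101": 0.85, "M103": 12.0}
    print(f"weighted pass rate: {assess(results):.2f}")   # prints 0.75
```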
7. Assemble techniques (step 3)

After selecting the metrics and having determined their pass/fail criteria and weights, the way to measure the metrics should be determined. To assess the software product and determine the values of the selected metrics, techniques are necessary. Examples of evaluation techniques to assess maintainability are a structural analysis of the code with the Logiscope tool or an inspection via checklists. The third step of the Evaluation Module Maintainability assembles the techniques that are necessary for the assessment. The result is an assessment advice, which is the basis for the evaluation plan -see box 2.

The Evaluation Module Maintainability provides two ways to apply evaluation techniques. First, a complete technique -with its dedicated metrics and pass/fail criteria- can be applied. For example, the complete Afotec checklist -consisting of 101 questions- is used. The second possibility is to assemble only those parts of techniques which address the selected metrics. The advantage of this approach is that the techniques are tuned to the selected set of metrics and therefore to the particular product under evaluation. An example of a combination is: applying static analysis of the code for the metrics 'average size of statements' and 'number of nested levels', and inspecting the technical documentation by answering checklist questions for the metric 'indicator for consistency between code and documentation'. Different technique parts are connected on the basis of the selected metrics. During the technique assembly it is expressed how these abstract metrics should be determined. For example, the readiness diagnostic function is defined as X = A/B, where A is the number of failures for which the maintainer can make a diagnosis to understand the cause-effect relation and B is the total number of registered and analyzed failures. To determine this metric it should e.g. be made clear what is defined as a failure and what the possibilities for diagnosis are.

A particular problem during technique assembly is to integrate the results of the checklist and metric approaches. Checklist techniques do normally not define their metrics. To express checklists in terms of metrics it is necessary to define the combination of related questions -addressed further as items- as a counting rule. A general description of a counting rule for metrics based upon these items is:

Metric_item = (n1*Item1 + n2*Item2 + n3*Item3 + ... + nn*Itemn) / n

where 1, 2, 3 and n are the identifying keys of the items -to search for them in the supporting tool, n1, n2, n3 and nn are the weights of the items 1, 2, 3 and n, and the denominator n is the total number of items which are applied for the metric.

An item/question is also a metric by itself. An item is a question and the answering possibilities about a product property or a process action (Punter, 1997). An example of a question is: 'Is a description of the data structure of the software product present?' It is a question about the data structure, which tells something about the readability of the documentation. Its accompanying answering possibilities are: 0 - a data structure does not exist or is preliminary (only entities, no relationships explicated), 1 - an approved and reviewed data structure (by the development team) is present. Answering possibilities are the instructions by which the value of a question is obtained (counting rule). An item is the atomic part of a compound metric -see figure 5. A compound metric which is based upon a set of items is also denoted as an indicator (ISO 9126, 1991).
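A minimal sketch of this counting rule is given below: checklist items are scored through their answering possibilities and aggregated into a compound metric. The item identifiers echo the example of figure 3 (M101); the answers and weights are illustrative assumptions.

```python
# Sketch of the counting rule Metric_item = (n1*Item1 + ... + nn*Itemn) / n,
# assuming each item has been answered with the value of one of its answering
# possibilities (e.g. 0, 1 or 2). Item ids, answers and weights are illustrative.

def compound_metric(items: dict, weights: dict) -> float:
    """Aggregate weighted checklist item scores into one indicator value."""
    n = len(items)                      # denominator: number of items applied
    return sum(weights[key] * score for key, score in items.items()) / n

if __name__ == "__main__":
    # hypothetical answers to the items of indicator M101 of figure 3
    answers = {"Item56": 1, "Item515": 0, "Item518": 1, "Item550": 1}
    weights = {"Item56": 1, "Item515": 1, "Item518": 1, "Item550": 1}
    print(compound_metric(answers, weights))   # (1 + 0 + 1 + 1) / 4 = 0.75
```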
In order to support the technique assembly, the technique base contains checklist items. To achieve this, questions of existing questionnaires -like Afotec (1996), KEMA (1996) and West (1994)- were checked for this purpose and stored according to the specific format of items. For each item the technique base contains its question, the related answering possibilities and its associated metric. An additional description of the terms used in the question or answering possibilities is given as well. This provides a context of meaning for the evaluator
when answering the questions, which will benefit the objectivity and reproducibility of evaluations. Figure 4 provides some items for metrics which address changeability.
Metric | Item question | Answering possibilities
indicator for consistency between code and documentation | is a cross-reference present in which the relations between modules and logical data stores are described? | 0: no cross-reference is present; 1: a well-established cross-reference is available
indicator for consistency between code and documentation | is a description present in which the relations between data structures and their implementation in the software product are described? | 0: no such description is present; 1: a description is found; 2: a well-established description is available
indicator for modularity of code | are procedures/functions with many branches and a high degree of nesting generally smaller than continuously structured procedures/functions with a low degree of nesting? | 0: no, not even attempted to follow this principle; 1: the principle is partly followed; 2: yes
indicator for modularity of code | are the procedures divided into relevant and well-defined sub-procedures? | 0: yes; 1: to some extent; 2: no
indicator for readability of code | are all modules in the program description uniquely named? | 0: of the modules considered some have the same names; 2: most of the considered modules have different names
indicator for readability of code | are the names of the modules self-explanatory? | 0: no/rarely; 2: the majority of module names are self-explanatory (can only be answered when all modules are examined)

Figure 4
Example of items and their compound metrics.
Besides the construction of items, technique assembly also addresses other subjects. Because we deal with different kinds of metrics, attention should be paid to the way the metrics are applied. The internal attributes of the metrics are regarded for this, e.g. the scale type. A scale type is inherent to a metric. Regarding the items already stored, most of the answering possibilities belong to an ordinal scale: the answering possibilities can be ranked on a criterion. For example, the item consisting of the question 'is documentation available?' provides answering possibilities with the distinction yes or no. The distinction is also intended to be an arrangement: the yes-possibility is of greater value than the negative answer. We suppose that the associated metric is of an ordinal scale too. Most software quality metrics also belong to an ordinal scale (Fenton, 1991). Code length might be measured by counting the lines of code or the lexical tokens in a program listing. Distinctions as well as arrangements can be given, e.g. a code length of 20 lines is smaller than a code length of 30 lines. Sometimes it is possible to obtain equal distances between the ranks. In those cases the software quality metrics belong to an interval scale. Using different metrics for an assessment will result in scale transformations. However, these are formally invalid. By making the scale transformations explicit, it is possible to trace the choices for the transformations (Abran and Robillard, 1997) -see the sketch below. Dealing with scale transformations is an example of a subject which should be addressed during technique assembly.

The technique assembly results in an assessment advice. This consists of the selected metrics, their pass/fail criteria and weights, and the techniques -e.g. checklist parts and dedicated modules of tools- to measure them. The assessment advice is the basis of the evaluation plan. The latter also addresses the administrative issues of an evaluation. The advice is input for the assessment: the framework to assess the product.
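To make the point about explicit scale transformations concrete, the sketch below maps a ratio-scale code measurement onto the ordinal 0-2 scoring used for the checklist items, with the thresholds recorded so the (formally invalid) transformation stays traceable. The thresholds are arbitrary examples, not values proposed by the evaluation module.

```python
# Sketch: an explicit, traceable transformation from a ratio-scale measurement
# (average statement size in lines of code) to the ordinal 0-2 scale of the
# checklist items. The thresholds are illustrative assumptions only.

ORDINAL_THRESHOLDS = [
    (3.0, 2),   # average statement size <= 3 lines  -> score 2
    (7.0, 1),   # average statement size <= 7 lines  -> score 1
]               # otherwise                          -> score 0

def to_ordinal(avg_statement_size: float) -> int:
    """Map a ratio-scale value to an ordinal score, keeping the rule explicit."""
    for threshold, score in ORDINAL_THRESHOLDS:
        if avg_statement_size <= threshold:
            return score
    return 0

if __name__ == "__main__":
    for value in (2.5, 5.0, 12.0):
        print(value, "->", to_ordinal(value))   # prints 2, 1, 0
```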
8. Evaluation of assessment (step 4)

After the assessment has been conducted, its results should be evaluated to learn for future assessments. First the appropriateness of the assessment advice is regarded. This is the extent to which the combination of metrics, their criteria, weights and techniques reflects the expectations of the customer, supplier and evaluator about the assessment. After this overall judgement an inventory per metric is made: for each metric its appropriateness -high, middle, low- for the assessment is determined. In case of an appropriate metric, its external attributes -see section 5- and also its pass/fail criterion and weights are stored in the metric base of the supporting tool. This 'context of use' for a metric is stored according to the data model presented in figure 5.
Figure 5
Basic relationships to collect metric attributes (data model with the entities quality characteristic, compound quality characteristic, object attribute, measurement result, metric, compound metric, and the external and internal attributes of a metric, connected by consists_of and constructed_of relations).
The right side of figure 5 represents the assessment of a particular software product: the properties of the product objects are measured by using metrics. The measurement results are compared to specific pass/fail criteria, which are provided by the assessment advice. Note: measurement results are actual values, pass/fail criteria are target values. In case of an appropriate metric, its external attributes are added to the metric base. The structure of the metric base is reflected in the left part of figure 5. The central element is the metric, which can be a compound metric: e.g. in the case of a 'checklist metric' the items/questions are the metrics, while the indicator is the compound metric. The external and internal attributes of the metrics are represented as separate entities in the current version of the metric base. Quality characteristic is an external attribute of a metric, but it is also related to the product properties. This results in a separate entity to denote its special position. The presented data model is still under construction. Further elaboration is needed, for example to deal with the possibility that a metric can have multiple values for an attribute, e.g. a metric which is applied successfully for both the information sources documentation and source code. Despite this restriction the current model provides an overview of the related concepts of the Evaluation Module Maintainability.
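As an illustration of how the entities of figure 5 could be represented when storing the 'context of use' after an assessment, the sketch below uses simple record types. The field names and the appropriateness levels are assumptions derived from sections 5 and 8, not the actual schema of the supporting tool.

```python
# Sketch of the figure 5 entities as plain records. Field names are assumptions
# based on sections 5 and 8; the real metric base may be structured differently.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ExternalAttributes:          # subject, moment and perspective of evaluation
    quality_characteristic: str
    information_source: str
    life_cycle_phase: str
    perspective: str

@dataclass
class InternalAttributes:          # the way the metric is applied
    measurement: str               # "direct" or "indirect"
    dependence: str                # "internal" or "external"
    metric_type: str               # size / effort / cost
    scale: str                     # nominal / ordinal / interval / ratio

@dataclass
class Metric:
    name: str
    external: ExternalAttributes
    internal: InternalAttributes
    items: Optional[List["Metric"]] = None   # set for a compound metric (indicator)

@dataclass
class MeasurementResult:           # actual value obtained during an assessment
    metric: Metric
    value: float
    pass_fail_criterion: float     # target value from the assessment advice
    weight: float
    appropriateness: str           # "high", "middle" or "low" (step 4 judgement)
```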
9. Conclusions

The Evaluation Module Maintainability provides a mechanism to select metrics and assemble techniques for assessing a software product. This approach is an alternative way to set up a measurement program for third party evaluation. It ought to be used during the design of the evaluation. Its result is input for the assessment. The starting point is to characterize the metrics by their external attributes -like quality characteristic or information source. We think these attributes reflect the application area of a software product. Experience about these metric attributes should be stored in a metric base. This is used to select metrics -and their pass/fail criteria and weights- for future assessments. Having selected the metrics, the way to determine them should be elaborated. For this, techniques are applied, e.g. a checklist or a static analysis. Matching the selected metrics for a particular assessment necessitates combining different techniques. Integrating the results of these techniques is often difficult. Especially the integration of results from the checklist and metric approaches is a problem. This is addressed by expressing checklists in terms of metrics. Questions are denoted as metrics, each of which results in a value. The items are aggregated into a compound metric. The current technique base of the supporting tool contains checklist questions which are stored according to this format.
The Evaluation Module Maintainability and its supporting tool are a prototype. Experiments are needed to elaborate on it further and to validate the concepts. Examples of questions which should be addressed are: to which extent do the attributes of a metric represent the application area of a software product? Is it possible to derive the pass/fail criteria and weights of a metric from its attributes? The most important question concerns the feasibility of the approach for third party evaluation. Experiments are planned for 1998. The Evaluation Module Maintainability addresses the quality characteristic maintainability. We remark that the procedure -the steps- can be applied to other quality characteristics too. However, the added value of the approach depends on the contents and construction of the metric and technique base, which are as yet available for maintainability only.
Acknowledgements

This paper is part of research on software product evaluation. Experiment(s) to validate the Evaluation Module will be executed in the context of the Space-Ufo project (Esprit 22290). The author would like to thank the anonymous reviewers for their comments, and also Rob Kusters (Eindhoven University of Technology) for discussing the data model and Bruno Peeters (Gemeente Krediet, Brussels) for his suggestions about checklist approaches.
References

Abran, A. and P. Robillard, 'Function Point Analysis: an empirical study of its measurement processes', IEEE Transactions on Software Engineering, December 1996.
Afotec Pamphlet, Software Maintainability Evaluation Guide, Department of the Air Force USA, 1996.
Bache, R. and G. Bazzana, Software metrics for product assessment, London, McGraw-Hill, 1994.
Basili, V.R., G. Caldiera and H.D. Rombach, 'Goal Question Metric Paradigm', in: Marciniak, J.J. (ed), Encyclopaedia of Software Engineering, John Wiley and Sons, 1994.
Eisinga, P.J., J. Trienekens and M. van der Zwan, 'Determination of quality characteristics of software products: concepts and case study experiences', in: Proceedings of the 1st World Software Quality Conference, San Francisco, 1995.
Fenton, N., Software metrics - a rigorous approach, London, McGraw-Hill, 1991.
Hatton, L., 'Reexamining the Fault Density - Component Size Connection', IEEE Software, March/April 1997.
IEEE 1219, Standard for Software Maintenance, IEEE Standards Board, New York, 1993.
IEEE 1061, Standard for a Software Quality Metrics Methodology, IEEE Standards Board, New York, 1992.
ISO 9126, Software Quality Characteristics and Metrics, 1991.
ISO 8402, Quality management and quality assurance - Vocabulary, 1994.
ISO CD 9126, Software Quality Characteristics and Metrics, 1998.
ISO CD 12207, Software Life Cycle Processes, 1996.
ISO CD 14598, Software Product Evaluation, 1996.
KEMA, Checklist maintainability, internal report, 1996.
Kitchenham, B., S. Lawrence Pfleeger and N. Fenton, 'Towards a framework for Software Measurement Validation', IEEE Transactions on Software Engineering, December 1995.
Oman, P. and J. Hagemeister, 'A taxonomy of software maintainability', internal paper, 1992.
Punter, T., 'Requirements to evaluation checklists', in: Software Quality from a Business Perspective, Trienekens and van Veenendaal (eds), Deventer, Kluwer Bedrijfswetenschappen, 1997.
Quint, Specifying software quality, Deventer, Kluwer Bedrijfswetenschappen, 1991 (in Dutch).
Rudolph, E., 'Software metrics: theory and practice', in: Proceedings of the 8th European Software Control and Metrics Conference (ESCOM), 26-28 May, Berlin, 1997.
West, R., Improving Software Maintainability, CCTA, 1994.