An Investigation of the Approach to Specification-based Program Review through Case Studies∗ Fumiko Nagoya, Shaoying Liu, and Yuting Chen Department of Computer Science Faculty of Computer and Information Sciences Hosei University Email: {i02t9012, sliu, i03t0301}@k.hosei.ac.jp
Abstract
However, most documentation used in computer industry are written in natural languages and/or some semi-formal graphical notations, review for their consistency and validity can only be done at an abstract level and cannot be automated to support static analysis in depth. Furthermore, these traditional review methods emphasize the importance of the way to organize reviews and rely on the quality of the reviewers’ experience and personal skills, therefore they do not provide a systematic and rigorous way to support reviews due to the fact that informal specifications lack formal syntax and semantics. In this paper we propose a new approach to rigorously reviewing programs based on their formal specifications. Although the techniques suggested in this approach is still in its preliminary status, it presents an important idea and the potential for developing more mature and effective techniques for reviews. The fundamental idea of the approach is to use a formal specification as a standard to check whether its program correctly implements all the required functions and properties in the specification. This approach offers several advantages. Firstly, a precise check-list can be systematically derived from a formal specification due to formal syntax and semantics to enhance the rigor of reviewing the program. Secondly, in addition to the check-list, specific review tasks for understanding the structure of program in a review can also be provided to facilitate the review process and enhance the credibility of the review process. Finally, using the fact that the formal specification provides a high potential to support automation, the proposed review approach can be automated to enhance the efficiency and trustability of the review process. Our major contribution in this paper is to propose the idea of specification-based program review and to conduct two case studies to discover how it can be ap-
Software review is an effective means to enhance the quality of software systems. However, traditional review methods emphasize the importance of the way to organize reviews and rely on the quality of the reviewers’ experience and personal skills. In this paper we propose a new approach to rigorously reviewing programs based on their formal specifications. The fundamental idea of the approach is to use a formal specification as a standard to check whether all the required functions and properties in the specification are correctly implemented by its program. To help investigate the effectiveness and the weakness of the approach, we conduct two case studies of reviewing two program systems that implement the same formal specification of “A Research Management Policy” using different strategies, and present the evaluation of the case studies. The results show that the review approach is effective in detecting faults when the reviewer is different from the programmer, but less effective when the reviewer is the same as the programmer. Keywords rigorous review, formal specification, program analysis, verification
1 Introduction
Review has been used for static analysis of software under different names, such as static analysis [1], inspection [2], walkthrough [3], and peer review [4], since the review technique was developed by Michael E. Fagan at IBM in the 1970s [5]. It can be applied to both software documentation (e.g., specification, design) and code. ∗ This work is supported by the Ministry of Education, Culture, Sports, Science, and Technology of Japan under Grant-in-Aid for Scientific Research on Priority Areas (No. 15017280).
where a path in the program is a sequence of statements and conditions from the start-statement to an end-statement of the program (a program may have more than one end-statement). The start-statement is the first statement to be executed when a program (e.g., a method in Java) is run, and an end-statement is a statement that is executed last, immediately before the execution terminates. Apart from the functions required by the specification, the program may also implement functions that are necessary to support them (e.g., functions for reading input data from the GUI and for displaying the outputs on the GUI), so the program is likely to contain additional paths that are not directly related to the specified functions. Therefore, the relation between a specification and its program can be defined by a function Mi. Definition 1 Let S = {f1, f2, ..., fn} be a specification containing functions f1, f2, ..., fn and P = {p1, p2, ..., pm} be a program containing program paths p1, p2, ..., pm. P satisfies S if and only if there exists a function Mi from S to P that satisfies the following condition:
plied in practice, whether it is effective in detecting faults, and if so, under what conditions it is effective, the possible weakness for further improvement, as well as the supportability of the review approach by software tools. We choose SOFL (S tructured O bject-Oriented F ormal Language) as the specification language based on which our review approach is discussed. Since SOFL shares the important features with the commonly used formal specification languages, such as VDM-SL (Vienna Development Method - Specification Language) [6], Z [7], and classical Data Flow Diagrams [8], the result of this paper is also applicable to those formal and semi-formal specification languages. There are numerous publications on SOFL so far [9][10][11][12], we therefore assume that the reader is aware of the SOFL specification language. For the reader who is unfamiliar with SOFL, he or she can refer to the second author’s homepage at http://wwwcis.k.hosei.ac.jp for details of those publications. Omission of the introduction to SOFL is also made for the sake of space. The remainder of this paper is organized as follows. Section 2 describes the principle of specification-based review. Section 3 presents a review strategy supporting the principle of specification-based review. Section 4 discusses the issue of review management. Section 5 presents two case studies of applying the specificationbased review method to review the programs implementing the formal specification of “A Research Management Policy” and evaluate the result of the case studies. Finally, in Section 6 we conclude the paper and point out the future research. 2
$$M_i : S \to P, \qquad \forall f \in S \;\, \exists p \in P \cdot M_i(f) = p$$
where M_i(f) = p means that program path p correctly implements function f in the specification.
Definition 2 The approach to reviewing program P against its specification S based on the mapping function M_i defined in Definition 1 is called the function-path strategy.
To help the reader understand the essential idea of the function-path strategy, we choose a simple specification and its implementation as an example to explain how the strategy is applied. Consider the specification of process B:
The principle of specification-based review
The essential idea of the specification-based review is to ensure that every functional requirement defined in a specification is implemented correctly in the corresponding program. Let S : [pre_S , post_S ] denote a process in a SOFL specification, where pre_S and post_S represent the pre- and postconditions of S , respectively. To simplify the description, we assume that post_S is given as a disjunctive normal form : S1 or S2 or ... or Sn (any predicate expression can be transformed into an equivalent disjunctive normal form by applying the rules in the predicate calculus). Since the truth of each disjunctive clause Si (i = 1...n) leads to the truth of the entire post_S , it expresses a rather independent functional requirement under the precondition for the program to implement. To implement the required function specified in the specification, there must exist a path in the program to implement it,
process B(x: int | y: int) z: int
pre x > 0 or y > 0
post z = x + 10 or z = y - 10
end_process
It gives a formal definition of the functions of process B in Figure 1. The process takes either data flow x or y, but not both, and generates output data flow z. The precondition requires that either x or y is greater than zero, and the postcondition defines z based either on x in the expression z = x + 10 or on y in the expression z = y - 10. Taking both the pre- and postconditions into account, we form the following predicate expression:
from a text field on the GUI (x or y is available when the corresponding event of inputting a value to x or y occurs on the GUI) and provides an integer as output.
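    import java.awt.TextField;
    import java.awt.event.ActionEvent;

    TextField input3, input4;
    int number3, number4, z;
    ... // initialization of all the declared global variables

    public void actionPerformed(ActionEvent e) {
        z = 0; // reset the global output (a local declaration here would hide it)
        if (e.getSource() == input3) {
            // x is available: read it from the first text field
            number3 = Integer.parseInt(e.getActionCommand());
            if (number3 > 0) z = number3 + 10;
        } else if (e.getSource() == input4) {
            // y is available: read it from the second text field
            number4 = Integer.parseInt(e.getActionCommand());
            if (number4 > 0) z = number4 - 10;
        }
        // the output of this method is represented by the global z
    }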
Figure 1. A process in SOFL
pre_B and post_B
⇔ (x > 0 or y > 0) and (z = x + 10 or z = y - 10)
⇔ x > 0 and z = x + 10 or x > 0 and z = y - 10 or y > 0 and z = x + 10 or y > 0 and z = y - 10
When x is available (i.e., when x is bound to a concrete value in its type, much as when a token is available in a Petri net) while y is not, z must be defined using the expression z = x + 10 in order to ensure that the postcondition is met by output z. The desired function defined by the specification based on x is therefore z = x + 10 in the postcondition. Similarly, when y is available, the desired function defined by the specification is z = y - 10 in the postcondition. Thus, we extract the only desired functions, denoted by S1 and S2, respectively, from the above disjunctive normal form:
To identify all the paths in the program effectively and accurately, a comprehensible representation of the program is helpful for the human reviewer. We employ the Control-Flow Diagram as such a representation for programs. For example, we convert the program given above into the control flow diagram in Figure 3, omitting the statements that are not directly related to the representation of the desired functions (e.g., number3 = Integer.parseInt(e.getActionCommand())). Such a conversion can be done automatically with a software tool. We are currently working on the tool, but it is still too early to report on it in this paper. Once it is completed with sufficient functions, we will report on it in another article.
S1 : x > 0 and z = x + 10
S2 : y > 0 and z = y - 10
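The extraction of S1 and S2 can be illustrated with a minimal Java sketch. It is not part of the SOFL tooling; the class `DnfSketch`, the helper `Atom`, and the rule "keep a clause only when its guard and its definition refer to the same input" are assumptions introduced here to mirror the availability argument for process B.

```java
import java.util.ArrayList;
import java.util.List;

class DnfSketch {
    // An atomic predicate plus the single input variable it refers to (hypothetical helper).
    static final class Atom {
        final String text;
        final String input;
        Atom(String text, String input) { this.text = text; this.input = input; }
    }

    // Distributes "and" over "or": (p1 or p2) and (q1 or q2) yields every pair (pi, qj).
    static List<Atom[]> distribute(List<Atom> pre, List<Atom> post) {
        List<Atom[]> clauses = new ArrayList<>();
        for (Atom p : pre) {
            for (Atom q : post) {
                clauses.add(new Atom[] { p, q });
            }
        }
        return clauses;
    }

    // Assumption: a clause is a desired function only if guard and definition use the same input.
    static List<String> desiredFunctions(List<Atom[]> clauses) {
        List<String> kept = new ArrayList<>();
        for (Atom[] c : clauses) {
            if (c[0].input.equals(c[1].input)) {
                kept.add(c[0].text + " and " + c[1].text);
            }
        }
        return kept;
    }

    // Pre- and postcondition of process B, written as lists of disjuncts.
    static List<String> processB() {
        List<Atom> pre = new ArrayList<>();
        pre.add(new Atom("x > 0", "x"));
        pre.add(new Atom("y > 0", "y"));
        List<Atom> post = new ArrayList<>();
        post.add(new Atom("z = x + 10", "x"));
        post.add(new Atom("z = y - 10", "y"));
        return desiredFunctions(distribute(pre, post));
    }

    public static void main(String[] args) {
        for (String s : processB()) {
            System.out.println(s);
        }
    }
}
```

The distribution step produces four conjunctive clauses, and the filter keeps the two that correspond to S1 and S2. A real tool would have to handle arbitrary predicate expressions rather than two flat disjunctions.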
To help the reviewer identify the desired functions easily and to facilitate establishing the relation between specification functions and program paths during the review process, we adopt the graphical notation known as the predicate-tree to represent all the functions in a process specification. Thus, the above two functions S1 and S2 are represented in the predicate-tree as shown in Figure 2.
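[Figure 2. An example of predicate-tree: a root "or" with two "and" branches, one over x > 0 and z = x + 10, the other over y > 0 and z = y - 10.]
[Figure 3. The control-flow-diagram: e.getSource() == input3 (T) leads to number3 > 0 (T) and then z = number3 + 10; on F it leads to e.getSource() == input4 (T), then number4 > 0 (T), then z = number4 - 10.]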
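Deriving the program paths from a control flow diagram such as Figure 3 can be sketched in a few lines of Java. This is only an illustration and not the tool mentioned in the paper; the class `PathEnumerator`, the node labels, and the artificial `end` node are assumptions, and the sketch assumes an acyclic diagram (loops would have to be bounded or unrolled first).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PathEnumerator {
    // Successors of each node; a node without successors is treated as an end-statement.
    private final Map<String, List<String>> successors = new LinkedHashMap<>();

    void addEdge(String from, String to) {
        successors.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        successors.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // Depth-first enumeration of every path from the start node to an end-statement.
    List<List<String>> paths(String start) {
        List<List<String>> result = new ArrayList<>();
        walk(start, new ArrayList<>(), result);
        return result;
    }

    private void walk(String node, List<String> prefix, List<List<String>> result) {
        prefix.add(node);
        List<String> next = successors.getOrDefault(node, new ArrayList<>());
        if (next.isEmpty()) {
            result.add(new ArrayList<>(prefix));
        } else {
            for (String n : next) {
                walk(n, prefix, result);
            }
        }
        prefix.remove(prefix.size() - 1);
    }

    // Hypothetical encoding of the control flow diagram of actionPerformed.
    static PathEnumerator figure3() {
        PathEnumerator g = new PathEnumerator();
        g.addEdge("e.getSource() == input3", "number3 > 0");
        g.addEdge("e.getSource() == input3", "e.getSource() == input4");
        g.addEdge("number3 > 0", "z = number3 + 10");
        g.addEdge("number3 > 0", "end");
        g.addEdge("e.getSource() == input4", "number4 > 0");
        g.addEdge("e.getSource() == input4", "end");
        g.addEdge("number4 > 0", "z = number4 - 10");
        g.addEdge("number4 > 0", "end");
        g.addEdge("z = number3 + 10", "end");
        g.addEdge("z = number4 - 10", "end");
        return g;
    }

    public static void main(String[] args) {
        for (List<String> p : figure3().paths("e.getSource() == input3")) {
            System.out.println(String.join(" -> ", p));
        }
    }
}
```

Under this encoding the method has five paths, of which only the two that reach an assignment to z correspond to the desired functions; the others are the paths on which no output is defined.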
Suppose process B is implemented as a Java method, given below, that receives the input data flow x or y
Having derived all the paths from the diagram, the next task is to build the correspondence between the desired functions in the specification and the paths in the program. To this end, we need to establish a mapping between the variables in the specification and those in the program, showing that each variable in the specification is represented by an appropriate variable in the program, because the specification and the program may adopt different variables for the same data items. Such a mapping can be expressed by a mapping table, which the reviewer needs to prepare in advance. We then compare the predicate expressions in the specification with the paths in the program, letting functional matching guide the discovery of the relation between the functions in the specification and the paths in the program. For example, variables x and y in the specification of process B are represented by the program variables number3 and number4, respectively. We then compare the conditions in the predicate expressions S1 and S2 with the program paths by examining the conditions and statements. As a result, the relation between functions and paths is established. Of course, this process can be more complicated and may require human judgement, but for programs that are rigorously implemented based on their specifications, establishing the function-path relation can be easier. Based upon the function-path relation, we can conduct a rigorous review by reading through and comparing both the predicate expressions and their corresponding paths. Figure 4 shows the review process that focuses on whether function S1 is correctly implemented by the corresponding program path. Examining the two paths in the program, we understand that the condition x > 0 is implemented by the test condition number3 > 0 and the definition z = x + 10 in function S1 is correctly implemented by the statement z = number3 + 10 in the corresponding path.
Similarly, the condition y > 0 and the definition z = y -10 are implemented by number4 > 0 and z = number4 -10 in the program, respectively.
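The function-path relation for process B can also be written down as a small executable check. The sketch below is only an illustration of the relation, since the review itself is carried out by reading; the class `FunctionPathCheck`, the helper names `pathViaInput3` and `pathViaInput4`, and the sample values are assumptions. Each program path is restated as a pure function, and the check confirms on the samples that whenever the guard of S1 or S2 holds, the path yields the value the specification defines.

```java
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

class FunctionPathCheck {
    // Program paths of actionPerformed, restated as pure functions (hypothetical names).
    static int pathViaInput3(int number3) { return number3 > 0 ? number3 + 10 : 0; }
    static int pathViaInput4(int number4) { return number4 > 0 ? number4 - 10 : 0; }

    // True if, for every sample satisfying the guard, the path output equals the specified value.
    static boolean implementsFunction(IntPredicate guard, IntUnaryOperator definition,
                                      IntUnaryOperator path, int[] samples) {
        for (int v : samples) {
            if (guard.test(v) && path.applyAsInt(v) != definition.applyAsInt(v)) {
                return false;
            }
        }
        return true;
    }

    // S1 : x > 0 and z = x + 10, with x mapped to number3.
    static boolean checkS1(int[] samples) {
        return implementsFunction(x -> x > 0, x -> x + 10, FunctionPathCheck::pathViaInput3, samples);
    }

    // S2 : y > 0 and z = y - 10, with y mapped to number4.
    static boolean checkS2(int[] samples) {
        return implementsFunction(y -> y > 0, y -> y - 10, FunctionPathCheck::pathViaInput4, samples);
    }

    public static void main(String[] args) {
        int[] samples = { -5, 0, 1, 7, 100 };
        System.out.println("S1 implemented: " + checkS1(samples));
        System.out.println("S2 implemented: " + checkS2(samples));
    }
}
```

A check on sample values can only reveal a mismatch; it does not replace the reviewer's comparison of each predicate with its path.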
[Figure 4. The mapping process: the predicate-tree of process B (x > 0 and z = x + 10; y > 0 and z = y - 10) is mapped onto the program path from B to E through e.getSource() == input3, number3 > 0, and z = number3 + 10.]
3 The Review Strategy
A formal specification in SOFL is usually composed of a CDFD (Condition Data Flow Diagram) and the associated module. The CDFD is a graphical notation for defining the communication among processes (similar to operations in VDM or procedures in Pascal), while the module provides a mechanism for defining all the components in the CDFD, such as processes, data flows, and data stores. To this end, necessary constants, types, and variables may be declared in the module, and the correctness of the declarations (e.g., values, types, constraints) may directly affect the correct implementation of both the CDFD of the module and all the processes occurring in the CDFD. To ensure that an entire specification is implemented by a program, we need to review not only the correctness of processes, but also the implementation of types and representations of variables, as well as the enforcement of the required type and variable invariants in the implementation. For this reason, reviews at three levels, namely module, process, and function, are desirable based on the general structure of SOFL specifications. Table 1 shows the items to be reviewed at each level.

Level    | Object               | Items to be checked
Module   | type, var, inv, CDFD | type consistency; external data structures; GUI and states; inter-relation of processes
Process  | process              | input data; output data; data structures and access; predicate-path consistency; implementation of operators; comment
Function | function             | input data; output data; predicate-path consistency; function calls

Table 1. Review items
3.1 Module level
A module is a collection of declarations, including types, store variables, type and state invariants, process specifications, and possibly function definitions. Figure 5 shows the general structure of a module.

    module  ModuleName;
    type    TypeDefinition;
    var     VariableDefinition;
    inv     TypeAndStateInvariants;
    behav   CDFD_No;
    process Init();
    process_1;
    …
    process_n;
    function_1;
    …
    function_m;
    end_process;

Figure 5. The general structure of a module

At the module level we need to check types, external data structures, and the GUI in order to ensure that all the abstract types and variables defined in the module are implemented properly in the program. For instance, the real number type real in the specification should not be implemented by the integer type int, because otherwise some values of real would have no representation in type int in the program. Two kinds of store variables can be declared in the var section, existing store variables and local store variables, and both should be implemented properly in the program. For example, an existing store variable is required to be implemented as a file, while a local store variable should be implemented as another kind of data structure (e.g., array, vector). A type invariant or an invariant on the CDFD is expected to be sustained throughout the entire program. Furthermore, since the CDFD associated with a module describes the integration of the components (e.g., processes) defined in the module, we also need to check the inter-relation among processes in the CDFD to ensure that the interface consistency of processes is not violated (e.g., the precondition of a process must be ensured by the postconditions of its preceding processes). To ensure that all of these issues are examined, the questions from Q2-1-1 to Q2-1-3 of the Table given in Appendix present a check-list for the module level review.

3.2 Process level
A process specification is composed of a name, input and output variables denoting incoming and outgoing data flows, external variables representing the data stores the process accesses, a precondition and postcondition defining the functionality of the process, and a comment that interprets the formal specification in natural language. Figure 6 shows the general structure of a process specification.

    process ProcessName(input) output
    ext     ExternalVariables
    pre     Precondition
    post    Postcondition
    comment InformalComments
    end_process;

Figure 6. The structure of a process

At the process level, we need to check input variables, output variables, external data structures, program paths, and the implementation of abstract operators of abstract data types (e.g., union, len, elems) to ensure that they are all implemented correctly.
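The review also covers how abstract operators on abstract data types, such as len, hd, elems, and union, are represented in the program. As a rough illustration only, the Java sketch below shows one possible representation that a reviewer could compare against the specification; the class `AbstractOperators` and the choice of `List` for sequences and `Set` for sets are assumptions, not a mapping prescribed by SOFL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class AbstractOperators {
    // len(s): the length of sequence s.
    static <T> int len(List<T> s) {
        return s.size();
    }

    // hd(s): the first element of a non-empty sequence s.
    static <T> T hd(List<T> s) {
        if (s.isEmpty()) {
            throw new IllegalArgumentException("hd is undefined on an empty sequence");
        }
        return s.get(0);
    }

    // elems(s): the set of elements occurring in sequence s (duplicates collapse).
    static <T> Set<T> elems(List<T> s) {
        return new LinkedHashSet<>(s);
    }

    // union(a, b): the set containing every element of a and of b; inputs are left unchanged.
    static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> result = new LinkedHashSet<>(a);
        result.addAll(b);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> s = new ArrayList<>(Arrays.asList(3, 1, 3));
        System.out.println("len   = " + len(s));
        System.out.println("hd    = " + hd(s));
        System.out.println("elems = " + elems(s));
        System.out.println("union = " + union(elems(s), elems(Arrays.asList(1, 2))));
    }
}
```

Points a reviewer would look at in such code include the precondition of hd (a non-empty sequence) and whether union leaves its arguments unmodified.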
3.2.1 Input and output variables checking
The types that are defined in the module may be used in many places to declare input and output variables of processes. The reviewer needs to check whether they are used appropriately in the declarations. It is necessary to ensure the existence, correctness, and consistency of the input and output variables. The questions from Q3-1-1 to Q3-1-2 in the Table given in Appendix provide a check-list about input variables, and those from Q3-2-1 to Q3-2-2 present a check-list concerning output variables.
3.2.2 External variables checking
External variables of a process may be used in its specification as readable or writable variables, indicated by the keywords rd and wr, respectively. If a variable is declared as rd, a method in the program needs to be defined and called in order to read the current value of the variable. If a variable is declared as wr, a method needs to be defined and called to read and update the variable. For this reason, the reviewer needs to pay attention to how the usage of the related external variables is implemented. The questions from Q3-3-1 to Q3-3-6 in the Table given in Appendix present a check-list for review of the representation of the external variables in the program. The purposes of the first and second questions are to ensure a correct implementation of the readable and writable external variables, while the third and fourth questions suggest a need to check the suitability of the methods that read from and update the external variables. The fifth question asks whether the definitions of the methods that read and update the external variables are correct. The question Q3-3-6 checks that the type of the variables used in a method call is acceptable.

3.2.3 Path checking
It is necessary to check that every functional requirement defined in a specification is implemented correctly in the corresponding program. As described in Section 2, the reviewer must try to ensure this point using the function-path method. In path checking, we need to check whether all the desired functions expressed in the specification of a process are correctly implemented by the paths in the corresponding program. The questions from Q3-4-1 to Q3-4-4 in the Table of Appendix show a check-list for review of the predicate-path consistency.

3.2.4 Operator checking
In process specifications the operators defined on abstract data types may be used in expressions, such as len(x) (the length of sequence x) and hd(x) (the head of a non-empty sequence x). It is necessary to check that the implemented program reflects the operators accurately. We prepare a question for checking such operators, which is intended to ensure that the required functions in the specification are implemented accurately in the program. For instance, if the abstract operator is hd, the argument of the operator must be implemented as an array or a similar structure, and the application of the operator must yield the first element of the array.

3.3 Function level
A function can be defined either in an explicit or implicit manner similar to that for a process, but it differs from a process in several ways. A function neither accesses nor updates any data stores (i.e., it has nothing to do with external variables), and it supplies only one output value as the result of function application, while a process allows more than one output value as a result of process execution. The function level review aims to check items similar to those for the process level, including input variable, output variable, function-path consistency, operators of data types, and other function calls. Since a similar check-list and principle to those for process level review can be applied to function level review, we omit the discussion of function level review for the sake of space.

4 Management of Review Process
It is important to document the review data for management. The manager needs to collect information on review results and evaluate the software quality. The document of review data provides the evidence for the presence of defects, the possible reasons, and the response to them. We provide a template, illustrated in Figure 7, for filling in the review data that are necessary and useful for the evaluation of review results. The template provides a guideline for documenting review data and for the traceability of defects. In the template we need to record the item number, question number, specification type of item, class name of program, line number of program, root cause, and improved statements. These items make clear what needs to be modified and what has been improved. All the detected defects that may damage the quality of the program are recorded in the template. The item number is assigned by the reviewer for the sake of management. The question number indicates where the question is raised on the check-list. A reviewer or a third party can track the defect by the specification type of item, the class name of program, and the line number in the program. For instance, the reviewer may write “the type of output variable is wrong” or “it does not call the loading method”. In this case the cause for the fact that the functional requirement defined in the specification is not implemented correctly in the corresponding program needs to be identified.

5 Case Studies
To assess the effectiveness of the proposed review method and uncover weaknesses to be improved, we have conducted two case studies with different programs based on the same formal specification, which defines Shaoying Liu’s Lab Research Award Policy for students and was published as a Hosei University technical report [13] by the second author. Figure 8 shows the CDFD of the specification for the “Research Management Policy” system. The specification contains one CDFD in which five processes, four data stores, and five data flows are used, and one associated module in which fifteen types are defined; four existing external variables
representing the four stores in the CDFD are declared; four type and data flow invariants are defined; and five processes occurring in the CDFD are specified. We use the programming language Java in each case study to implement the specification, using the strategy that each process is transformed into a class and the associated methods. This principle is also applied to the transformation of functions, data types, and data stores in the specification. We conducted the two case studies manually in order to examine the effectiveness and weakness of the proposed review method under different conditions. The first case study is intended to examine how effective the review method is in detecting faults in a program that is implemented by the reviewer himself or herself. For this purpose, we deliberately let the first author implement the specification written by the second author, and then let the first author review her own program against the specification. In contrast, the second case study is designed to examine the effectiveness of the review method under the condition that the reviewer is different from the programmer. In this case, we asked some students on our project to implement the specification and let the first author review the program. To ensure the credibility of the case studies, the second author inserted classified bugs in the program in each case study, without telling the first author what and where they were until the actual review process had finished.

[Figure 7. The template of review, with the columns Item No, Question No, Specification Type of Item, Class Name of Program, Line No of Program, Root Cause, and Improved statements.]

[Figure 8. Research Management Policy: a CDFD with the processes Register_Student, Remove_Student, Receive_Notification, Award_Money, and Update_Budget; the data stores 1 lab_student, 2 left_student, 3 unpaid_students, and 4 research_budget; and the data flows student, student_to_remove, accept_note, award_com, and grants.]

5.1 Case study 1
This case study focuses on examining the effectiveness of the review method in reviewing a program written by the reviewer herself based on the formal specification; in particular, we pay great attention to the application of the function-path method. In this study the second author inserted classified faults in advance into the program which the first author wrote, without telling the first author. Then the first author conducted a review of her own program independently. Table 2 summarizes the results of the case study. The table includes the following items: the number of faults inserted in the program, the number of faults found by review, and the fault detection rate. As indicated in the table, thirty-five faults in total were inserted in the program and we could detect twenty-one of them with our review method. The overall fault detection rate is 60%. Analyzing the details of the review results, we find that the review method is quite effective in detecting some kinds of faults, such as type consistency (86%), function-path consistency (75%), implementation of operators (100%), implementation of comment (100%), and function calls (100%), but less effective in identifying other kinds of faults that have less relevance to the specification, for example, faults in the display messages and in loop conditions of iteration statements (e.g., while, for, and do-while statements). Most of the undetected faults are related to boundary conditions used in the iteration statements. The specific distribution of the faults in those locations is shown in Table 3. The reason why many of the faults contained in loop conditions were not effectively detected is that the reviewer tends to believe her own program is correct in almost every aspect. Such over-confidence has prevented the reviewer from paying
great attention to complicated loop conditions. On the other hand, since the reviewer is familiar with her own program, she can easily find faults in uncomplicated situations. Therefore, apart from applying the proposed review method, there is a need to adopt a specific strategy for review (e.g., examine every loop condition) in order to enhance the fault detection rate.

Items                                | Inserted faults | Inserted faults found | Fault detection rate
type consistency                     | 7               | 6                     | 86%
display messages                     | 4               | 0                     | 0%
data structures and loop conditions  | 16              | 8                     | 50%
function-path consistency            | 4               | 3                     | 75%
implementation of operators          | 1               | 1                     | 100%
implementation of comment            | 1               | 1                     | 100%
function call                        | 2               | 2                     | 100%
method call                          | 0               | -                     | -
GUI constructor                      | 0               | -                     | -
Total                                | 35              | 21                    | 60%

Table 2. The result of case study 1

Location of the mistakes | Error numbers
for-loop                 | 4
while-loop               | 3
display messages         | 4
miscellaneous            | 3
Total                    | 14

Table 3. The distribution of errors in display messages and loop conditions

5.2 Case study 2
The second case study is an improved experiment applying our review strategy to a program implemented by a third party (a student) based on the same specification as in the previous case study. The first author conducted the review manually by following the redefined review plan. In case study 1, we learned that our strategy turns out to be less effective in detecting faults involved in the loop conditions of iteration statements, such as while, do-while, and for statements. We analyzed those faults and found that most of the loop statements are actually for the implementation of the GUI of the system, which is not directly related to the implementation of the desired functions. In the second case study, we put more emphasis on the examination of complicated conditions, including the conditions involved in iteration and selection statements. As in case study 1, the second author inserted the classified faults into the program written by one of the students on our research project, and the first author conducted the review independently. The review is done by taking the following specific steps:
1. The reviewer converts every conditional statement and loop statement in the program to a control flow diagram. Since the diagram presents the program paths comprehensibly, it helped the reviewer considerably in understanding the structure of the program and identifying all the program paths.
2. The reviewer evaluates whether every conditional branch in the control flow diagram is valid and accurate in meeting the specification functions. During the evaluation of conditional branches, the reviewer refers from time to time to the definitions of the methods and variables that are defined in the program.
3. The reviewer converts the conjunction of the pre- and postconditions of each process specification into an equivalent disjunctive normal form by applying the rules of the predicate calculus and identifies the desired functions.
4. The reviewer checks whether all the desired functions are correctly implemented in the program.
The result of the second case study is summarized in Table 4. Eighty-three faults in total were inserted in the program beforehand and we could detect sixty-seven of them with our review method. The overall fault detection rate is 80.7%, much higher than the 60% detection rate of the first case study. An important reason is that the reviewer tends to be more critical than in the first case study and more rigorous in examining every aspect of the program. As a result, the fault detection rate for loop conditions in iteration statements improved from 50% to 100%. We also find that the fault detection rate in reviewing the function-path consistency improved from 75% to 100%.
5.3 Evaluation of case study results
In comparison with the first case study, the higher fault detection rate in the second case study benefits
Items                                    Inserted faults   Inserted faults found   Fault detection rate
type consistency                         4                 4                       100%
display messages                         0                 0                       -
data structures and loop conditions      55                55                      100%
function-path consistency                5                 5                       100%
implementation of comment                1                 0                       0%
function call                            0                 0                       -
method call                              6                 3                       50%
GUI constructor                          12                0                       0%
Total                                    83                67                      80.7%
Table 4. The result of case study 2

their categories, and summarizing the review results. After completing the review, the first and second authors work together to determine which inserted faults were found and which were not, as well as the nature and location of both the detected and undetected faults. The documentation for the review includes classifying the detected faults, recording the workload of the review activities, and producing the final report for the analysis of the review result. According to the data in Table 5, the total time consumed in the second case study is 45 hours, 10 hours longer than that for the first case study. As mentioned before, this results from the need to understand both the structure and semantics of the program without being aware of the implementation idea beforehand. There is always a trade-off between effectiveness and time consumed in the review process. To help enhance the efficiency of the review method, a powerful software tool is important. We are constructing a software tool that automatically translates a program into a functionally equivalent control flow diagram, derives program paths, supports mapping from the components in the diagram to the corresponding components in the program, and analyzes the review results.
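Deriving program paths from a control flow diagram, one of the tasks the planned tool is meant to automate, can be sketched as a bounded depth-first search. This is an illustrative sketch only, not the authors' tool: the adjacency-dictionary representation, the node names, the `enumerate_paths` function, and the `max_visits` bound used to keep loops finite are all assumptions.

```python
def enumerate_paths(cfg, entry, exit_node, max_visits=2):
    """List all paths from entry to exit_node in a control flow graph.

    cfg maps each node to a list of successor nodes. A node may appear at
    most max_visits times on a single path, which bounds loop unrolling.
    """
    paths = []

    def walk(node, prefix):
        path = prefix + [node]
        if node == exit_node:
            paths.append(path)
            return
        for successor in cfg.get(node, []):
            if path.count(successor) < max_visits:
                walk(successor, path)

    walk(entry, [])
    return paths


# Hypothetical diagram for a loop that sums the positive elements of an array:
#   i = 0; while (i < n) { if (a[i] > 0) total += a[i]; i++; } return total;
cfg = {
    "entry": ["loop_test"],
    "loop_test": ["if_test", "exit"],   # loop condition true / false
    "if_test": ["add", "incr"],         # selection condition true / false
    "add": ["incr"],
    "incr": ["loop_test"],
}

for path in enumerate_paths(cfg, "entry", "exit"):
    print(" -> ".join(path))
```

With the default bound the loop body is traversed at most once per path, so the reviewer sees the path that skips the loop and one path for each branch of the selection statement. Raising `max_visits` yields longer unrollings at the cost of more paths to inspect.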
from the following activities in review:

• The translation of programs into control flow diagrams. Since understanding both the structure and semantics of the program under review is essential to ensure a trustworthy review, translating the program into a control flow diagram greatly helps the reviewer understand the meaning of the program, because the diagram is a widely recognized, comprehensible notation.
• The reviewer pays greater attention to the examination of all aspects of the program and takes a critical attitude toward its quality. Since the reviewer is reviewing another programmer's program, she tends to be suspicious about almost every aspect of it, which helps her detect more subtle faults in the conditions involved in iteration and selection statements.

Review activities    Time for case study 1 (hours)    Time for case study 2 (hours)
Planning             5                                5
Review               10                               20
Analysis             8                                9
Documentation        12                               11
Total                35                               45
Table 5. The time consumed for review activities
However, since the reviewer is unfamiliar with the structure of the program and the ideas behind its construction, she spent more time reading the definitions of the methods and variables in the program. This task is not necessarily easy either, because the inter-relations among methods tend to be more complicated in an object-oriented programming language like Java than in a structured programming language like Pascal. The workload, in terms of time consumed, for each kind of activity in the review process, such as planning, review, analysis, and documentation, is summarized in Table 5. The planning of the review includes the formation of check-lists and the insertion of faults in the program. The review includes reading the program, recording the detected faults and
6 Conclusion and future research
The specification-based review approach that we have proposed in this paper can be an effective technique for detecting faults in programs. The core idea of the approach is that the review needs to ensure that every desired function defined in the specification is implemented correctly by a path in the program. We have conducted two case studies to investigate the effectiveness and weaknesses of the proposed review approach and obtained quite positive results. In the first case study, the result shows that the proposed method is effective in uncovering faults in most cases, but may be conditionally less effective in detecting faults involved
Q3-1-1: Is each input variable implemented properly in the corresponding program?
Q3-1-2: Is the type of each input variable correct?
Q3-2-1: Is each output variable implemented properly in the corresponding program?
Q3-2-2: Is the type of each output variable correct?
Q3-3-1: If the specification uses rd type of external variables, is the definition of the loading method that reads from the external variable implemented in the corresponding program?
Q3-3-2: If the specification uses wr type of external variables, is the definition of loading and saving methods that read and write the external variables implemented properly in the corresponding program?
Q3-3-3: Is the loading method called properly?
Q3-3-4: Is the saving method called properly?
Q3-3-5: Are the definitions of loading and saving methods correct?
Q3-3-6: Are the types of the variables involved in calls of the methods correct?
Q3-4-1: How many function-path pairs does the specification have? How many program paths does the program have?
Q3-4-2: Can a function-path be mapped to a program-path?
Q3-4-3: Is each conditional branch statement correct?
Q3-4-4: Is each statement correct?
Q3-5-1: Are the operators mk, diff, union, conc, hd, tl, and others appropriately implemented?
in the loop conditions of iteration statements, such as while, do-while, and for statements. In the second case study, the modified strategy was applied to a program implemented by a third party. We found that it is effective in detecting faults owing to the adoption of the comprehensible control flow diagram derived by translating the program under review. Apart from this discovery, we also found that reviewing another person's program takes more time than reviewing a program written by the reviewer herself, and that tool support is important for enhancing the efficiency of the review process. The research reported in this paper provides a good basis for systematically studying the specification-based review approach. In future research we plan to concentrate on every step of the function-path review method, including the techniques for identifying desired functions from a formal specification of a process, extracting all the program paths from the program, establishing the mapping between the desired functions in the specification and the paths in the program, and supporting the verification of each path against its corresponding desired function. To enhance the trustability and efficiency of reviews, we will also continue our current work on building a software support tool for the review approach.
Acknowledgements
We would like to thank Tomoya Sano, Yota Taira, and Yoshihisa Sadahira for joining discussions during the research.
Appendix
Q1-1: Is every process implemented properly in the corresponding program?
Q1-2: Is every data store implemented properly in the corresponding program?
Q1-3: Is each store accessed properly by the implemented components of the related processes?
Q2-1-1: Is every type in the module implemented using appropriate data structures in the program?
Q2-1-2: Is every type invariant implemented correctly in the program?
Q2-1-3: Is every invariant of the data flows in the CDFD enforced in the program?