Generalized algorithmic debugging and testing

PETER FRITZSON, NAHID SHAHMEHRI, and MARIAM KAMKAR
Linköping University
and
TIBOR GYIMOTHY
Hungarian Academy of Sciences

This paper presents a method for semi-automatic bug localization, generalized algorithmic debugging, which has been integrated with the category partition method for functional testing. In this way the efficiency of the algorithmic debugging method for bug localization can be improved by using test specifications and test results. The long-range goal of this work is a semi-automatic debugging and testing system which can be used during large-scale development of nontrivial programs. The method is generally applicable to procedural languages and is not dependent on any ad hoc assumptions regarding the subject program. The original form of algorithmic debugging, introduced by Shapiro, was limited to small Prolog programs without side-effects, but has later been generalized to concurrent logic programming languages. Another drawback of the original method is the large number of interactions with the user during bug localization. To our knowledge, this is the first method which uses category partition testing to improve the bug localization properties of algorithmic debugging. The method can avoid irrelevant questions to the programmer by categorizing input parameters and then matching these against test cases in the test database. Additionally, we use program slicing, a data flow analysis technique, to dynamically compute which parts of the program are relevant for the search, thus further improving bug localization. We believe that this is the first generalization of algorithmic debugging for programs with side-effects written in imperative languages such as Pascal. These improvements together make it more feasible to debug larger programs. However, additional improvements are needed to make it handle pointer-related side-effects and concurrent Pascal programs. A prototype generalized algorithmic debugger for a Pascal subset without pointer side-effects and a test case generator for application programs in Pascal, C, dBase, and LOTUS have been implemented.

Categories and Subject Descriptors: D.2.5 [Software Engineering]: Testing and Debugging - debugging aids; D.2.6 [Software Engineering]: Programming Environments
General Terms: Algorithms, Experimentation, Performance
Additional Key Words and Phrases: Algorithmic debugging, automated debugging, category partition testing, program slicing

This work is supported by NUTEK, the Swedish National Board for Technical Development.
Authors' addresses: P. Fritzson, N. Shahmehri, M. Kamkar, Department of Computer and Information Science, Linköping University, S-581 83, Linköping, Sweden; email: {paf, nsh, mak}@ida.liu.se; T. Gyimothy, Research Group on the Theory of Automata, Hungarian Academy of Sciences, H-6720 Szeged, Aradi vertanuk tere 1, Hungary; email: [email protected].
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1992 ACM 1057-4514/92/1200-0303$1.50
ACM Letters on Programming Languages and Systems, Vol. 1, No. 4, December 1992, Pages 303-322.


1. INTRODUCTION

Debugging has always been a costly part of software development, and several attempts have been made to provide automatic computer support for this task [Seviora 1987]. The Algorithmic Debugging Technique, introduced by Shapiro [1983], was the first attempt to lay a theoretical framework for program debugging and to take this framework as a basis for a partly automatic debugger. In this system, the programmer supplies a partial specification of the program during the bug localization process by answering questions. However, Shapiro's original model did not handle side-effects or loops (although loops can be handled by transforming them to recursive functions) and was initially applied only to Prolog programs. This restriction prevents the system from being practically useful for programs written in imperative languages. Later the method was generalized to programs in concurrent logic programming languages [Lichtenstein and Shapiro 1989]. Another generalization of algorithmic debugging, called Rational Debugging, is described by Pereira [1986]. That method can handle side-effects in Prolog, but has lost the property of declarative debugging, since questions to the user are not only concerned with the declarative input/output semantics of procedures or functions. Recent work [Nilsson and Fritzson 1992] has extended algorithmic debugging to lazy functional languages.

A major drawback of algorithmic debugging is the great number of user interactions during the debugging process. Thus, an important improvement would be to supply the debugging system with some information which can reduce this number.

An algorithmic debugging method for imperative languages, which preserves the property of declarative debugging, was first presented in Shahmehri and Fritzson [1989]. A major improvement in the bug localization process is demonstrated in Shahmehri et al. [1990] by combining program slicing and algorithmic debugging. Program slicing, as presented by Weiser [1982], is a method for automatically decomposing programs by analyzing their data flow and control flow. This method isolates individual computation threads within a program. The size of a slice is usually program and input dependent. However, in practice, a slice is often much smaller than the original program, especially for block-structured languages.

In this paper we present a further improvement in the bug localization process by combining the category partition testing method [Ostrand and Balcer 1988] with the algorithm introduced in Shahmehri et al. [1990]. The main concept of the improvement is the following: During the debugging of a real program the user has to answer a great number of "difficult" questions. For example, suppose that the program contains a procedure which computes the sum of an array, and this procedure is called with an actual parameter which is an array with a hundred elements. The user cannot easily check the correctness of the result. However, if this procedure has already been exhaustively tested, the test results can be used in the debugging process. Of course, if the test results are incorrect the bug localization may go wrong, but the same may happen if the user answers a question incorrectly.
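The idea of answering a "difficult" question from stored test results can be illustrated with a small sketch. It is written in Python purely for illustration and is not the GADT implementation: the `categorize` function, its deviation threshold, the `TEST_REPORTS` table, and the `ask_user` callback are all hypothetical names and values introduced here.

```python
from statistics import pstdev

# Hypothetical test database: maps a coarse description of the input
# to the verdict recorded when the array-sum procedure was tested on
# inputs of that kind.
TEST_REPORTS = {
    ("more", "mixed", "large"): "passed",
    ("more", "positive", "small"): "passed",
}


def categorize(a):
    """Map an integer array to a (size, element type, deviation) triple."""
    size = {0: "zero", 1: "one", 2: "two"}.get(len(a), "more")
    if all(x >= 0 for x in a):
        kind = "positive"  # non-negative elements only
    elif all(x < 0 for x in a):
        kind = "negative"
    else:
        kind = "mixed"
    # The threshold 10 is an arbitrary cut-off chosen for this sketch.
    deviation = "large" if len(a) > 1 and pstdev(a) > 10 else "small"
    return (size, kind, deviation)


def answer_question(a, ask_user):
    """Answer 'yes' from the test database when possible, else ask the user."""
    if TEST_REPORTS.get(categorize(a)) == "passed":
        return "yes"
    return ask_user(a)
```

With this table, a hundred-element array of mixed sign and large spread is answered from the database, while an input kind that was never tested still produces a question to the user.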


We developed a Generalized Algorithmic Debugging and Testing method (GADT) which uses the program-slicing concept and an extended version of the category partition testing method during the bug localization process. This allows us to combine a rather formal testing approach with the more informal debugging needed during development of new program modules or when correcting bugs detected during testing.

In the remainder of this paper we first give a brief overview of the testing method used in GADT. Section 3 contains a short description of algorithmic debugging. Program slicing is briefly explained in Section 4. Then, the GADT method is described in Sections 5, 6, and 7. An example is given in Section 8 to show how the debugging system works from the user's point of view. We describe the implementation status in Section 9.

2. T-GEN: AN EXTENDED VERSION OF THE CATEGORY PARTITION TESTING METHOD

In this section we describe a method which provides assistance for functional testing of programs. This method has been implemented in the T-GEN test case generator tool. T-GEN is able to generate executable test cases for testing programs in C, Pascal, dBase, or LOTUS. A detailed description of T-GEN can be found in Szucs and Gyimothy [1990] and Toczki et al. [1990].

During functional testing, a program cannot be tested with all the possible properties of its input parameters. Hence, the tester's first task is to define the critical properties of the parameters. These critical properties, called categories, are investigated in the testing process. The categories can be divided into classes, called choices, presuming that the behavior of the elements of one choice is identical from the point of view of the test process.

Once the categories and choices for a program have been defined, T-GEN is able to generate all the possible test frames. A test frame contains exactly one choice from each category. In general, there are many superfluous frames among the generated test frames. These frames can be eliminated by associating selector expressions with the choices. A choice can be made in a test frame if the selector expression associated with the choice is true. The selector expressions contain property names. A property name is also associated with a choice and can be considered as a logical variable. The value of this variable is true if the given frame contains that choice.

A program usually produces a number of results, and the tester must define the "interesting" test frames. The results of a program can also be divided into categories and choices by selector expressions.

Running test cases in applications usually necessitates time-consuming installation of environment parameters. The test frames using the same environment can be grouped into test scripts by way of selector expressions. In the different parts of the test specification, the user can describe declarations and executable statements which are generated into the test cases.


    procedure arrsum(a: intarray; var b: integer);
    var i: integer;
    begin
      b := 0;
      for i := 1 to n do
        b := b + a[i];
    end; (* we suppose that the value of n is defined globally *)

    test arrsum;
    category size_of_array;
      zero     : property SINGLE;
      one      : property SINGLE;
      two      : ;
      more     : property MORE;
    category type_of_elements;
      positive : ;
      negative : ;
      mixed    : if MORE property MIXED;
    category deviation;
      small    : ;
      large    : if MIXED;
      average  : if MIXED;
    scripts
      script_1 : if MIXED;
      script_2 : if not MIXED;
    result
      result_1 : if MIXED;

Fig. 1. The definition of the procedure arrsum and a framework for a test specification for this procedure.

In Figure 1 we give the structure of a test specification for the procedure arrsum. This procedure computes the sum of the elements of an array. From the specification given in Figure 1, T-GEN is able to generate test frames integrated in test scripts. For example, under the label "scripts" in Figure 1, script_1 contains two frames: (more, mixed, large) and (more, mixed, average). Only one frame is generated for each choice associated with the SINGLE property.

By extending the test specification with declarations and executable statements, the system can generate executable test cases from test frames. During the execution of the test cases, test reports are produced in a database. These test reports can easily be accessed by using a coded form of the test frames.

The system presented in Ostrand and Balcer [1988] investigates only the generation of test frames with a restricted form of selector expressions. The new features implemented in T-GEN (test scripts, result categories, test cases, test reports) extend the application possibilities of the category partition method. The test specifications and test reports implemented for the procedures of a program can be used during the algorithmic debugging process (see 5.3.2).
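The generation of test frames from categories, choices, and selector expressions can be sketched as follows. This is a minimal Python illustration, not T-GEN itself: the data layout and the names `SPEC`, `generate_frames`, and `split_into_scripts` are hypothetical. Two assumptions go beyond the figure: the choice small is given the selector "not MIXED", so that script_1 holds exactly the two frames named in the text, and a SINGLE choice is represented as a frame containing only that choice.

```python
from itertools import product

# A choice is (name, selector, properties). The selector is a predicate over
# the properties contributed by choices from earlier categories; None means
# the choice is always selectable.
SPEC = [
    ("size_of_array", [
        ("zero", None, {"SINGLE"}),
        ("one", None, {"SINGLE"}),
        ("two", None, set()),
        ("more", None, {"MORE"}),
    ]),
    ("type_of_elements", [
        ("positive", None, set()),
        ("negative", None, set()),
        ("mixed", lambda p: "MORE" in p, {"MIXED"}),
    ]),
    ("deviation", [
        ("small", lambda p: "MIXED" not in p, set()),  # assumed selector
        ("large", lambda p: "MIXED" in p, set()),
        ("average", lambda p: "MIXED" in p, set()),
    ]),
]


def generate_frames(spec):
    """Return a list of (frame, properties) pairs."""
    frames = []
    # Simplification: each SINGLE choice yields one frame with only that choice.
    for _, choices in spec:
        for name, _, props in choices:
            if "SINGLE" in props:
                frames.append(((name,), frozenset(props)))
    ordinary = [[c for c in choices if "SINGLE" not in c[2]]
                for _, choices in spec]
    for combo in product(*ordinary):
        props = set()
        for _, selector, choice_props in combo:
            if selector is not None and not selector(props):
                break  # selector is false: this combination is superfluous
            props |= choice_props
        else:
            frames.append((tuple(c[0] for c in combo), frozenset(props)))
    return frames


def split_into_scripts(frames):
    """Group frames as in the figure: script_1 if MIXED, script_2 otherwise."""
    scripts = {"script_1": [], "script_2": []}
    for frame, props in frames:
        scripts["script_1" if "MIXED" in props else "script_2"].append(frame)
    return scripts


scripts = split_into_scripts(generate_frames(SPEC))
print(scripts["script_1"])
# [('more', 'mixed', 'large'), ('more', 'mixed', 'average')]
```

Evaluating each selector against the properties gathered so far prunes superfluous combinations as they are built, which is why the order of the categories matters in this sketch.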

3. PRINCIPLES OF THE ALGORITHMIC DEBUGGING TECHNIQUE

Algorithmic program debugging as originally defined by Shapiro [1983] is an interactive process where the debugging system acquires knowledge about the expected behavior of the debugged program and uses this knowledge to localize errors. The knowledge is collected by the system through a number of yes or no questions to the user. The target and implementation language for the algorithms presented by Shapiro is Prolog.

Drabent et al. [1988] suggest a generalization of the language used to communicate with the debugger. In addition to the usual yes and no answers, assertions are allowed. The assertions give formal specifications of some properties of the intended program. These specifications can be logic programs. This mechanism may reduce the number of user interactions.

We generalize the algorithmic debugging method to programs which may contain side-effects and which can be written in imperative languages, e.g., Pascal. Our implementation is currently limited to sequential programs without pointer-related side-effects, but including side-effects related to global variables, reference parameters, simple I/O, and goto statements. We follow Banning's [1978; 1979] definition of side-effects, which consists of both variable side-effects and exit side-effects. Assertions in this model are expressed in terms of Boolean expressions, which can refer to functions and procedures, parameters, and global variables. The current target and implementation language for our algorithmic debugging system is Pascal.

The algorithmic program debugger can be invoked by the user after noticing an externally visible symptom of a bug. The debugger executes the program and builds a trace execution tree at the procedure level while saving some useful trace information such as procedure names and input/output parameter values. The debugger can be used either when developing new code in a system consisting of both old and new modules or when localizing bugs during maintenance of existing code.

The algorithmic debugger traverses the execution tree and interacts with the user by asking about the expected behavior of each procedure. The user can answer yes or no, or give an assertion about the intended behavior of the procedure. The search finally ends, and a bug is localized in a procedure p when one of the following holds:

- Procedure p contains no procedure calls.
- All procedure calls performed from the body of procedure p fulfill the user's expectations.

The output from the debugger shows that, given the input parameter values and the user's assertions about expected results, an error has been isolated to a certain procedure body.

If there are multiple bugs in the program, the debugger will localize one of them. If the user knows that more bugs are present the search may continue, and additional bugs will be localized. The method can be applied to nondeterministic programs since the nondeterminism can be removed by performing the bug search on a full trace of the specific execution leading to the bug. As previously mentioned, this generalization of algorithmic debugging cannot handle concurrent programs. There are, however, implementations for debugging concurrent logic programming languages [Huntbach 1987; Lichtenstein and Shapiro 1989; Takeuchi 1987], where the unit of bug localization is a process instead of a procedure.
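The search described in this section can be sketched as a recursive walk over the execution tree. The following Python fragment is only an illustration under simplifying assumptions: it uses a plain top-down traversal, it replaces the interactive user with an `oracle` function, and the names `Call`, `localize`, and the tiny example program (`compute`, `inc`, `double`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """One node of the execution tree: a procedure call with its recorded I/O."""
    name: str
    inputs: dict
    outputs: dict
    children: list = field(default_factory=list)


def localize(call, oracle):
    """Return the call whose body contains a bug, or None if `call` is correct.

    `oracle(call)` stands in for the user: it returns True when the recorded
    outputs match the intended behavior for the recorded inputs.
    """
    if oracle(call):
        return None
    for child in call.children:
        culprit = localize(child, oracle)
        if culprit is not None:
            return culprit
    # The call is wrong, yet it made no calls or all of them behaved as intended.
    return call


# Hypothetical intended behavior, used here instead of asking a user.
intended = {
    "inc": lambda i: {"r": i["x"] + 1},
    "double": lambda i: {"r": 2 * i["x"]},
    "compute": lambda i: {"r": 2 * (i["x"] + 1)},
}


def oracle(call):
    return intended[call.name](call.inputs) == call.outputs


# Recorded trace of a faulty run: double(4) returned 9 instead of 8.
tree = Call("compute", {"x": 3}, {"r": 9}, [
    Call("inc", {"x": 3}, {"r": 4}),
    Call("double", {"x": 4}, {"r": 9}),
])
print(localize(tree, oracle).name)  # double
```

The two termination conditions from the text both end up at the final `return call`: a leaf has no children to examine, and an inner node reaches that line only after every child has been accepted by the oracle.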


3.1 Simple Examples of Algorithmic Debugging

We introduce a bug into an insertion-sorting algorithm by swapping the arguments of the predicate greater in a function insert. The actual behavior of the function sort on the given input [2, 1, 3] is sort([2, 1, 3]) = [3, 1]. Figure 2 illustrates the execution tree corresponding to this execution. However, the expected result of this execution, i.e., the intended behavior of this function, is sort([2, 1, 3]) = [1, 2, 3]. Any discrepancy between the actual and intended behavior of the function sort is the symptom of a bug. An interaction session with the algorithmic debugger on the sort example follows in Figure 3. In the interaction sessions presented in this paper, the boldface text stands for the debugging system's output, and normal text represents user input.

As another small example of algorithmic debugging, consider the procedure P which, given the two input parameters a and c, computes the value of the two output parameters b and d. The value of variable b is computed by calling procedure Q, and the value of variable d is computed by calling procedure R.

    procedure P(a, c: integer; var b, d: integer);
      procedure Q(a: integer; var b: integer);
        ...
      end;
      procedure R(c: integer; var d: integer);
        ...
      end;
    begin
      Q(a, b);
      R(c, d);
    end;

Assume that:

- A call to procedure P on input values a' and c' returns the output values b' and d'.
- There is a bug in procedure R which causes the wrong output value d'.

An interaction session with the algorithmic debugger will be as follows.

    P(In a: a', In c: c', Out b: b', Out d: d')? no
    Q(In a: a', Out b: b')? yes
    R(In c: c', Out d: d')? no
    An error is localized inside the body of procedure R.

    sort(In [2,1,3]) = [3,1]
      sort(In [1,3]) = [3,1]
        sort(In [3]) = [3]
          sort(In []) = []
          insert(In 3, In []) = [3]
        insert(In 1, In [3]) = [3,1]
          insert(In 1, In []) = [1]
      insert(In 2, In [3,1]) = [3,1]
        insert(In 2, In [1]) = [1]

Fig. 2. The execution tree of the function sort on input [2, 1, 3]. This erroneous execution of sort is caused by a bug in the function insert.

    sort(in: list=[2,1,3], out: sort=[3,1])?
    > no
    sort(in: list=[1,3], out: sort=[3,1])?
    > no
    sort(in: list=[3], out: sort=[3])?
    > yes
    insert(in: elem=1, in: list=[3], out: insert=[3,1])?
    > no
    insert(in: elem=1, in: list=[], out: insert=[1])?
    > yes
    An error has been localized inside the body of function "insert".

Fig. 3. This shows the user interactions during bug localization. The algorithmic debugger searches for the bug while asking questions to the user. For example, sorting [2, 1, 3] in increasing order should yield [1, 2, 3] and not [3, 1]. Therefore, the user answers no; this is incorrect. sort([1, 3]) = [3, 1] is also incorrect, whereas sort([3]) = [3] is correct. Finally the bug is localized to within the procedure insert.

4. THE CONCEPT OF PROGRAM SLICING

Program slicing, introduced by Weiser [1982], is a method for reducing the amount of code that needs to be inspected when debugging [Lyle and Weiser 1987] or understanding programs. If we consider a specific subset of a program's behavior, slicing reduces that program to a minimal form which
still produces that behavior. The reduced program, which is itself an independent program, is called a slice. A program slice at a program point p on a variable v consists of all statements and predicates of the program that might affect the value of v at point p. Figure 4 shows an example program and a slice of that program taken on the variable mul with respect to the last line of the program. In the presence of procedures and procedure calls, interprocedural slicing generates a slice of an entire program, where the slice crosses the boundaries of procedure calls [Horwitz et al. 1990; Weiser 1984].

5. FUNCTIONAL OVERVIEW OF GADT

We divide our algorithmic debugging methodology into three major phases: a transformation phase, a tracing phase, and a debugging phase (see Figure 5). This structure is similar to the implementation of the tool for debugging parallel programs presented in Choi et al. [1991]. The last phase consists of


    program p;
    var x, y, z, sum, mul: integer;
    begin
      read(x, y);
      mul := 0;
      sum := 0;
      if x
