Generation of a minimal set of test cases that is functionally equivalent to an exhaustive set, for use in knowledge-based system validation

Thomas Abel & Rainer Knauf

Faculty of Computer Science and Automation, Dept. of Artificial Intelligence, Technical University of Ilmenau, Germany [email protected] [email protected]

Abstract

Many of the AI systems created to date end up in the "wastebasket of academic solutions". The authors are convinced that the main reason for this dilemma is that nobody can really prove the validity of such a system. This is no surprise, because the expectations of users usually cannot be formalized. One way out of this dilemma is to use a test case methodology for system validation. This paper presents a method to find a set of "good" test cases for rule-based systems that use sensor data as input.

1 INTRODUCTION

One question that has always posed difficulties for knowledge-based system developers is how intensively to test the system when validating its performance [3]. A lack of suitable standards has in the past prevented this question from being answered appropriately. One standard that does exist, however impractical it may be in most cases, is the exhaustive testing of the knowledge-based system: generate a set of test cases which covers all contingencies possible in the operation of the system. For systems which have more than a few inputs, the number of combinations of input values can be very large, thus making exhaustive testing quite impractical [2]. Nevertheless, in most cases it is not necessary to have a truly exhaustive set of test cases in order to test the system in a functionally exhaustive fashion. As stated by Chandrasekaran [1], the test cases should reflect the problems to be seen by the system.


Avelino Gonzalez

College of Engineering Dept. of Electrical and Computer Engineering University of Central Florida, Orlando, FL [email protected]

A functionally exhaustive set of test cases can be made considerably smaller than a naively exhaustive set by eliminating functionally equivalent input values and combinations of input values which subsume other values. In this paper we describe an automated technique which generates a functionally exhaustive set of test cases. While we certainly do not claim that a functionally exhaustive set of test cases makes it practical to test exhaustively in most cases, it presents the highest standard of testing intensity, from which a developer can step back if necessary. Before describing our approach, we include a formal definition of our terminology and algebra in order to facilitate its understanding. We limit our discussion to rule-based systems used for diagnostic or classification tasks, and which use analog-type sensing devices (i.e., sensors) as inputs.

2 DATA DESCRIPTION

S = {s_1, s_2, ..., s_m} is the set of variables s_i designating the input sensor data, each ranging between s_i^min and s_i^max, with a "normal value" s_i^norm (a value read by the input device and considered unfaulty).

E ⊆ {[t_1, r, t_2] : t_1 ∈ S, t_2 ∈ (S ∪ ...)} is the set of expressions in which a sensor variable t_1 is compared by a relation r to a value or to another sensor variable t_2.

The first step - Sorting the set P into subsets P_i

P_i (0 < i ≤ n) contains those test cases t ∈ M_i that are positive for f_i, and P_0 contains all test cases t ∈ M_i being negative for f_i. We have to check for each t ∈ M_i whether the knowledge-based system creates a positive solution for f_i. If it does, t becomes an element of the set P_i of positive test cases for the proof of f_i; otherwise t becomes an element of the set of negative test cases P_0.⁵ Upon completion, P_0 contains all negative test cases t ∈ P, and the sets P_i (i > 0) contain all positive test cases t ∈ M_i. Thus we have sorted the set P into n + 1 disjoint subsets P_i. Of course, the cardinality of P has not yet decreased, but this partition will help us in the second step.
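This first step can be sketched in Python. The function `diagnose`, standing in for the knowledge-based system's inference engine, and the concrete shape of the candidate sets M_i are assumptions for illustration only.

```python
# Sketch of the first step: sorting each candidate set M_i into positive
# and negative subsets. `diagnose` stands in for the rule-based system's
# inference engine; it returns the index j of the malfunction f_j it
# concludes for a test case, or 0 if it reaches no conclusion.

def partition(candidate_sets, diagnose):
    """candidate_sets maps i (1..n) to the list M_i of test cases
    generated for malfunction f_i. Returns the sets P_0..P_n."""
    n = max(candidate_sets)
    P = {i: [] for i in range(n + 1)}          # P[0] = negative cases
    for i, M_i in candidate_sets.items():
        for t in M_i:
            if diagnose(t) == i:               # system confirms f_i
                P[i].append(t)                 # positive for f_i
            else:
                P[0].append(t)                 # negative test case
    return P

# Toy example: two malfunctions, a stand-in diagnoser that concludes
# f_1 whenever the first sensor reading exceeds 10.
demo_sets = {1: [(12, 0), (3, 0)], 2: [(5, 9)]}
demo_diagnose = lambda t: 1 if t[0] > 10 else 0
P = partition(demo_sets, demo_diagnose)
print(P[1])  # [(12, 0)]
print(P[0])  # [(3, 0), (5, 9)]
```

As in the text, the cardinality of the overall set is unchanged; the partition merely prepares the per-malfunction minimization of the second step.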

⁵ It is possible as well to collect all the test cases being a member of M_i which do not imply malfunction f_i but f_j in the corresponding set P_j. This would lead to a smaller set P_0 and more or less larger sets P_j. Nevertheless, we will not follow this way here. We think that these cases are useful for further knowledge acquisition or improvement phases of the AI system.

The second step - Minimizing all the sets P_i

The approach for minimizing each P_i individually (with the exception of P_0) is as follows:

1. Segregate from the set P_i the largest possible subset P_seg of m-tuples which differ in only one value (for instance, the value of the variable s_j).

2. With this segregated subset P_seg, look through R_i and gather all expressions in which s_j is compared to any value or to another variable s_k. The purpose is to find values of s_j that subsume other ones (depending on the relations used). All potential test cases that are subsumed by other ones are therefore removed from that subset. The remaining set is P_good.

3. Let the new set P_i be the minimized set: P_i^new := (P_i^old \ P_seg) ∪ P_good.

4. Repeat this (i.e., go to step 1) with P_i^new until no subset P_seg can be created from P_i^new.

5. The minimized subset P_i^* is the last set P_i^new computed in step 3.

The above procedure should be carried out for each of the sets P_i. Upon completion, there remain only 1 or 2 test cases per s_j and per expression that is part of a rule's premise and describes a relation between s_j and a certain value or another variable. In the end, the minimized set of test cases able to test the knowledge-based system exhaustively can be written as an n-tuple [P_1^*, ..., P_n^*], and the minimized set of all test cases looked for is the union of all sets P_i^* (0 < i ≤ n):

FES = ⋃_{i=1}^{n} P_i^*
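The minimization loop can be sketched as follows. Test cases are modelled as plain m-tuples of sensor values, and every premise expression on a sensor s_j as a pair such as `('>', 10)`; both the representation and the subsumption criterion are illustrative simplifications of the authors' formalism, not their exact algorithm.

```python
# Sketch of steps 1-4: repeatedly segregate a subset differing in one
# value, keep only non-subsumed representatives, and merge back.

OPS = {'>': lambda a, b: a > b, '<': lambda a, b: a < b}

def segregate(P):
    """Step 1: the largest subset of tuples differing in exactly one
    position j; returns (j, subset), or (None, []) if none exists."""
    best_j, best = None, []
    m = len(P[0])
    for j in range(m):
        groups = {}
        for t in P:
            groups.setdefault(t[:j] + t[j + 1:], []).append(t)
        for g in groups.values():
            if len(g) > len(best) and len(g) > 1:
                best_j, best = j, g
    return best_j, best

def minimize(P, premises):
    """premises maps a sensor index j to the (op, c) expressions of the
    rule base R_i in which s_j occurs."""
    P = list(P)
    while True:
        j, seg = segregate(P)
        if not seg:                          # step 4: nothing left to merge
            return P
        # if no rule mentions s_j, its value is irrelevant here and a
        # single representative suffices
        good = [] if premises.get(j) else seg[:1]
        for op, c in premises.get(j, []):
            sat = [t for t in seg if OPS[op](t[j], c)]
            unsat = [t for t in seg if not OPS[op](t[j], c)]
            # one representative on each side of the expression suffices;
            # the remaining cases are subsumed (step 2)
            good += sat[:1] + unsat[:1]
        good = list(dict.fromkeys(good))     # drop duplicates
        new_P = [t for t in P if t not in seg] + good   # step 3
        if set(new_P) == set(P):
            return new_P
        P = new_P

# Four candidate cases; the only premise on s_0 is s_0 > 10, so the
# three cases varying only in s_0 collapse to two representatives.
P_i = [(5, 1), (12, 1), (20, 1), (7, 2)]
reduced = minimize(P_i, {0: [('>', 10)]})
print(sorted(reduced))  # [(5, 1), (7, 2), (12, 1)]
```

Termination follows because `good` is always a subset of `seg`, so the set P can only shrink or stay fixed, and the loop stops as soon as it stays fixed.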

What to do with P_0?

The elements of P_0 describe a non-faulty system's behaviour, faulty states of the system undetected by the knowledge-based system, or faulty states of the system detected as having another f_j than the considered f_i. Our tendency is to ignore these negative test cases, as they do not explicitly test the ability of the knowledge-based system to identify problems or to classify into classes. However, it could be argued that negative test cases are as important as positive ones, since a misdiagnosis by omission (not identifying something when it should be identified) can be at least as serious as identifying the wrong problem. We did not deal with this issue during the course of this investigation, but feel strongly that it should be part of a future continuation of this work. In fact, [4] also shows a technique for developing negative test cases. One approach that could be taken is that negative cases for some malfunctions become positive cases for others. The only requirement would be to keep track of which test case belongs to which conclusion, something that could easily be done.

By the way, if we imagine all the test cases t ∈ FES as a map, then P_0 ⊄ FES can be considered an "ocean of ignorance" in the technical system's "map of malfunctions". All the known test cases t ∈ P_i (i > 0) of one known malfunction f_i would be situated "on land", just like a country with borders to other countries (i.e., known malfunctions) or with a coast to the ocean. All the test cases t ∈ P_0 would be situated in the "ocean" itself. The shores of that "ocean" are the borders of the regions of influence of known conclusions. In that case there is a chance to acquire new knowledge about the technical system by using P_0 for finding "isles of new, unknown or unconsidered conclusions in the ocean of ignorance". But this topic should be discussed in a future article.
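The bookkeeping suggested above, re-filing a negative case for f_i as a positive case for whichever f_j the system actually concludes, can be sketched as follows; `diagnose` again stands in for the system's inference engine and is an assumption for illustration.

```python
# Sketch: split P_0 into per-conclusion positive sets plus a residue of
# truly negative cases (diagnosis 0, i.e. no conclusion at all), which
# remain in the "ocean of ignorance".

def refile_negatives(P0, diagnose):
    refiled, residue = {}, []
    for t in P0:
        j = diagnose(t)
        if j:                               # t is positive for f_j
            refiled.setdefault(j, []).append(t)
        else:                               # t stays in the "ocean"
            residue.append(t)
    return refiled, residue

# Toy diagnoser: concludes f_2 whenever the first sensor reads negative.
demo_diagnose = lambda t: 2 if t[0] < 0 else 0
refiled, residue = refile_negatives([(-1, 4), (3, 4)], demo_diagnose)
print(refiled)   # {2: [(-1, 4)]}
print(residue)   # [(3, 4)]
```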

4 EVALUATION OF PROCEDURE

In order to evaluate the technique described above, we designed an example problem consisting of 6 sensor inputs, 12 final conclusions, and 19 rules. The relationships between the sensors and the final conclusions ranged from 2 sensors affecting one final conclusion to 5 sensors affecting one final conclusion. This provided a variety of complexity that was realistic rather than worst-case. The total set of test cases generated initially by the described procedure numbered 2394 cases. After minimizing the set P, the set P^* was reduced to 103 test cases, a far more manageable number. We consider these test cases functionally equivalent to the original 2394, the only difference being that the latter contain many test cases which subsume one another and are thus redundant. The negative test cases were also removed. We furthermore consider that these test cases represent an exhaustive coverage of the knowledge-based system, as each final conclusion was tested with all combinations of sensor values that affect it within the significant ranges.

5 SUMMARY AND CONCLUSION

The procedure described showed the ability to create a set of test cases that is capable of testing a knowledge-based system exhaustively, yet doing so in a minimal fashion to avoid repetitious effort. This sets the gold standard for test case validation of a knowledge-based system, something which has been lacking in the literature, and it does so in a manageable fashion. A drawback of the approach is its high time complexity, but that price can and should be paid, because

1. although the procedure of finding test cases is of large complexity, it is not as time-critical as expected, because computers can perform that task without any human assistance, nearly "over night";

2. humans who have to judge the results of test cases are usually not willing to spend much time doing such a job; and

3. work that can be done by "dumb" computers is mostly cheaper than work that has to be done by "smart" experts.

We are not suggesting that all knowledge-based systems should be tested using this test case set. In many situations it would be unnecessary, and in others the negative test cases would have to be added to avoid errors of omission. The appropriate testing intensity for a knowledge-based system is a topic of future research, and it should be based on the customer's criteria as defined in the specifications. The availability of a gold standard, however, permits a set level of testing intensity from which the developer can back off if appropriate.

References

[1] Chandrasekaran, B.: On Evaluating AI Systems for Medical Diagnosis. In: AI Magazine, 4:2, pp. 34-37.

[2] Gonzalez, A.J.; Dankel, D.D.: The Engineering of Knowledge-Based Systems: Theory and Practice. Englewood Cliffs, N.J.: Prentice Hall, 1993.

[3] O'Keefe, R.M.; O'Leary, D.E.: Expert System Verification and Validation: A Survey and Tutorial. In: Artificial Intelligence Review, Vol. 7, pp. 3-42.

[4] Zlatareva, N.P.: A Framework for Knowledge-Based System Verification, Validation and Refinement. In: Proceedings of the Fifth Florida Artificial Intelligence Research Symposium, St. Petersburg, FL, 1992, pp. 10-14.
