Siddhartha1: A Technique for Developing Domain-Specific Testing Tools

Arthur Alexander Reyes, Debra Richardson
Department of Information and Computer Science
University of California
Irvine, CA 92697-3425 U.S.A.
voice: +1 (949) 824 4043
facsimile: +1 (949) 824 4056
electronic mail: {artreyes, djr}@ics.uci.edu
World-Wide Web: http://www.ics.uci.edu/~artreyes

Abstract

Existing specification-based software testing tools are difficult to use in many real-world application domains. This occurs when a tool’s design is constrained along dimensions (which are often theoretically oriented) that are incompatible with the constraints at work in the application domain (which are often business oriented). When this problem occurs and the benefits to the software organization clearly outweigh the costs, test engineers develop custom, domain-specific, specification-based testing tools. When this problem occurs and the benefits do not clearly outweigh the costs, rigorous software testing may be forgone. This paper describes Siddhartha, a technique for constructing domain-specific, specification-based software testing tools. Siddhartha provides essential automation support for specification-based testing in application domains that are “untestable” by existing tools. Siddhartha represents a defined, disciplined alternative to ad hoc development of domain-specific testing tools. Domain-specific testing tools constructed under Siddhartha take advantage of available software tool support and allow existing (legacy) development processes to remain intact. By characterizing the natural variation that exists within each domain, and building testing tools that operate within those constraints, Siddhartha supports “outlying” application domains that cannot be supported by existing tools. Siddhartha is composed of a domain-independent library of software tool components, a reference architecture, and a process model. By following the process model, a test engineer extends the library to construct a domain-specific program synthesizer conforming to the reference architecture. The synthesizer is an essential, specification-based software testing tool. The synthesizer transforms test specifications written in formal, domain-specific languages into corresponding test driver-oracle programs. Test driver-oracle programs automate test case execution and test case evaluation. This paper discusses the results of applying Siddhartha in two testing domains: requirements testing of Ada2 procedures against formal specifications written in SCR and regression testing of Ada procedures against earlier versions, assumed functionally equivalent. Ada procedures in these application domains are characterized by insufficiently defined interfaces and ubiquitous reference and assignment to global variables. We evaluate these domain-specific test synthesizers by replicating tests performed by the NASA Software Independent Verification and Validation Facility on the Operational Flight Program for the Production Support Flight Control Computer used on NASA Dryden Flight Research Center’s F/A-18B Hornet System Research Aircraft. This evaluation clarifies the prospective costs and benefits of applying Siddhartha within a software development organization.

1. Siddhartha Gautama /si-’dar-te-’gaut-e-me/, ca 563-ca 483 B.C., the Buddha, Indian philosopher and founder of Buddhism. We chose the name Siddhartha as an extension of the acronym “SDRTA,” which stands for Signature Discovery, Reformulation, Translation, and Automation. SDRTA served as our early research paradigm. In Sanskrit, Siddhartha means “he who has attained.” Because this research began while Reyes was attaining his Ph.D., we thought Siddhartha was an exceptionally appropriate name for the technique.
1. Introduction

Specification-based program testing seeks to automate test case generation and test case evaluation by exploiting formal specifications [1, 2, 4, 5, 8, 12, 13, 16, 17]. In specification-based testing, a formal specification serves as an unambiguous standard against which the behavior of the program unit under test (UUT) can be automatically compared and as a mechanism by which tests can be developed. Adoption of specification-based testing tools in industry has been slow. We believe an essential problem inhibiting the adoption of specification-based testing tools in industry is that the tools are designed along constraints which are incompatible with domain-specific, industry constraints. Software tool designers must make tradeoffs between power and generality. By constraining the class of problems to be supported, a software tool designer can bring additional problem-solving knowledge to bear, producing a tool with greater automation support (i.e., more power). Specification-based testing tools are no exception to this principle. To make a specification-based testing tool more powerful, the choices of what constraints to apply can be informed by the theory of test data selection being supported, the formal specification language being supported, characterization of the targeted market segment, etc. Our experience indicates that designers of specification-based testing tools tend to emphasize constraints that arise from theoretical considerations and underemphasize application domain-specific constraints. For example, many specification-based testing tools constrain program units under test and their formal specifications of intended behavior to possess fully encapsulated and syntactically equivalent interfaces [1, 2, 4]. This constraint simplifies translation between test case specifications, which are written in the language and vocabulary of the formal specification, and ground (i.e., variable-free) test
2. Ada® is a registered trademark of the United States government.
case input and output data, which are represented in the vocabulary of the UUT. Specification-based test tools rely on these translations to support test case generation (i.e., refinement of test case specifications into ground test case input data) and test case evaluation (i.e., abstraction of test case output data into a corresponding representation in the vocabulary of the formal specification). Constraints such as these may not correspond to application domain-specific constraints. Domain-specific software development constraints must balance many functional and nonfunctional quality objectives, among which testability may be of relatively low importance. Thus a specification-based testing tool that is constrained along dimensions incompatible with domain-specific constraints may be unusable. When this problem occurs, test engineers in that application domain may need to develop their own specification-based testing tools that respect the constraints at work in their application domain. The constraint that the UUT and its specification possess equivalent interfaces is too restrictive in a number of interesting, real-world testing domains we have encountered. For example, in the software application domain of digital avionics and flight controls, our experience indicates that high-business-value qualities such as efficient run-time performance and performance analyzability have for a long time engendered program units with poorly encapsulated interfaces and ubiquitous global variables. Specification-based testing tools are difficult to apply in this domain because the UUT/specification interface equivalence constraint cannot be satisfied (in fact, the UUTs often have no defined interfaces!). It has been suggested that digital flight control programs be redesigned or that new programs be designed with better interfaces (i.e., interfaces which possess all formal parameters needed to control the input to the program and to observe the output of the program). Such a redesign would allow existing test tools to be more easily applied to these programs. Hence there is interest in program design for testability (DFT) [1] within the digital flight control community. Thus we notice that the constraints of existing test tools attempt to impose tacit DFT requirements on UUTs. DFT initiatives for digital flight control software (or digital avionic software in general), often championed by management, meet a number of significant technical and organizational challenges. Technical challenges are influenced by flight-qualified processors and memories, which are slow and small with respect to the state of the art for desktop computing applications. Organizational challenges are influenced by legacy software development processes and system acquisition/certification processes and their reporting requirements. Legacy software development processes engender modern software products that are based on software design patterns that were discovered in the earliest days of digital flight control. Acquisition/certification reporting requirements for program performance estimates engender software designs that are easy to analyze for performance (i.e., designs with an absolute minimum of run-time dynamism). In the final analysis, digital flight control software with satisfactory performance (i.e., software that “flies”),
that respects legacy processes, and which is certifiable, will always be chosen over software that is easy to test. Hence we are not very optimistic about DFT initiatives for digital flight control or avionic software. Given the challenging nature of the digital flight control and avionic software application domain, can specification-based testing tools be applied? The solution appears to be to construct specification-based test tools that are specific to the domain. An example, developed by Honeywell, is described in [18]. Such tools must apply a different set of constraints than those applied by general-purpose tools. A domain-specific, specification-based testing tool needs to apply the natural constraints that arise in the domain (i.e., the testing tool must define and support the parameters of variability that implicitly define the testing domain). This is the thesis of Siddhartha. This paper introduces Siddhartha, a technique for establishing essential automation support for specification-based program testing in domains which are not well supported by existing testing tools. Siddhartha is composed of a domain-independent library of software tool components, a reference architecture, and a process model. By following the process model, a test engineer extends the library to construct a domain-specific program synthesizer conforming to the reference architecture. A synthesizer transforms test case specifications written in one or more domain-specific languages into test driver-oracle programs. Test driver-oracle programs automate test case execution and test case evaluation. This paper is organized as follows. Section 2 discusses the need for test driver-oracle programs to evolve with the programs they test and with their specifications. We show how this problem fits within the Knowledge-Based Software Engineering (KBSE) paradigm. We summarize one particular KBSE technique, pattern-directed, conditional, source-to-source program transformation, and discuss how we use it to evolve test driver-oracle programs. Section 3 discusses the main parts of Siddhartha. We discuss two testing domains and the synthesizers we constructed to operate in each domain. Section 4 provides implementation details on the material discussed in Section 3 by presenting a number of code examples. The code examples help the reader understand how Siddhartha represents grammars, transformation rules, rule application control functions, and other artifacts. Section 5 presents our early evaluation of Siddhartha’s costs and benefits as they play out in the Ada/SCR testing domain. Section 6 presents our conclusions.
2. Transformational Test Synthesis

A program can be tested automatically by running a test driver. A test driver is another program that (1) establishes input values for the unit under test (UUT), (2) invokes the UUT, and (3) observes the UUT’s outputs. Test drivers that also decide whether the UUT’s input/output data pair satisfies the UUT’s specification are called test driver-oracles, because they include an oracle [17] that evaluates the test case results. Test driver-oracles must evolve with the UUTs they test and with the formal specifications of those UUTs. We view the problem of test driver-oracle development as a software
development problem in its own right. The discipline of knowledge-based software engineering (KBSE) [9] has been explored to reduce the cost of program evolution by raising program development to the level of formal specifications. We see test driver-oracle evolution as another good problem in which to apply the KBSE paradigm. Thus this research seeks to reduce the cost of test driver-oracle evolution by raising the development of test driver-oracles to the level of formal test specifications. An important technique explored in KBSE research is domain-oriented, pattern-directed, conditional, source-to-source transformational program synthesis, as exemplified by Draco [11] and Draco-PUC [14]. The idea of a transformation is to encapsulate a piece of programming knowledge in a formal rule. A rule states that whenever we have a fragment of code or specification that matches a syntactic pattern and satisfies a semantic condition, the fragment can be correctly replaced by a new fragment of code or specification. Program synthesis occurs by repeatedly applying a collection of rules to a problem-oriented formal specification. This process incrementally transforms the problem-oriented formal specification into a correct, solution-oriented implementation program. Transformation rules are reused by referencing them in rule application control functions. Transformation rules engender elegant designs for program synthesis tools. These tools are characterized by significantly reduced concern for control flow and data flow. Siddhartha applies the technique of domain-oriented, pattern-directed, conditional, source-to-source transformational program synthesis to the problem of test driver-oracle evolution. Given a formal test specification, a synthesizer produced via Siddhartha automatically generates a corresponding test driver-oracle program.
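Concretely, the kind of program a synthesizer must produce is simple in structure but tedious to write and maintain by hand. The following Ada sketch is purely illustrative (the nested Plant package, its Scale procedure, and the chosen values are hypothetical and not drawn from any system discussed later); it shows the three responsibilities of a test driver-oracle: establish input state, including a global variable, invoke the UUT, and apply an oracle check to the observed output.

with Text_IO;
procedure Driver_Oracle_Sketch is

   --  Hypothetical unit under test: a procedure that reads a global
   --  variable as well as a formal parameter (names are illustrative only).
   package Plant is
      Gain : Integer := 1;
      procedure Scale (X : in Integer; Y : out Integer);
   end Plant;

   package body Plant is
      procedure Scale (X : in Integer; Y : out Integer) is
      begin
         Y := Gain * X;
      end Scale;
   end Plant;

   Actual   : Integer;
   Expected : constant Integer := 6;  --  expected value taken from the test specification
begin
   --  (1) Establish input values, including global state.
   Plant.Gain := 2;
   --  (2) Invoke the UUT.
   Plant.Scale (X => 3, Y => Actual);
   --  (3) Observe the output and apply the oracle check.
   if Actual = Expected then
      Text_IO.Put_Line ("Test passed.");
   else
      Text_IO.Put_Line ("Test failed: unexpected output.");
   end if;
end Driver_Oracle_Sketch;

A synthesizer produced via Siddhartha automates the construction of programs of exactly this shape, one per test case, from a formal test specification.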
3. Siddhartha

Siddhartha provides essential automation support for specification-based testing of “low testability” program UUTs. Siddhartha is composed of a library of domain-independent software tool components, a process model, and an architectural style. By following the process model, a test engineer assembles library components and creates new components to construct a domain-specific program synthesizer in the specified architectural style. The resulting synthesizer transforms formal test specifications into test driver-oracle programs, which automate test case execution and test case evaluation.
3.1 General Description

Figure 1 represents the classes of artifacts and processes in Siddhartha’s view of specification-based program testing.
Figure 1: Siddhartha Test Development and Execution Artifacts and Processes

Nodes in the diagram represent classes of objects; hyper-edges (i.e., edges that can have more than one source node) represent processes. A process takes as input an object at each source class and produces as output an object of the target class. Spec represents the class of formal specifications. UUT represents the class of UUTs. TestBehavSpec represents the class of test behavior specifications. A test behavior specification specifies the order in which UUT operations are to be applied during a test, potentially without regard to the specific data to be provided to those operations. TestDataSpec represents the class of test data specifications. A test data specification specifies what ground data will be provided or generated for test inputs, regardless of the order in which that data is used. Driver represents the class of test driver-oracle programs. Report represents the class of test reports. The process labeled Test represents executing a test and producing a test report. The process SynthesizeDriver represents the action of the test driver-oracle program synthesizer. The process labeled MakeTBS represents the tool that produces a TestBehavSpec from a Spec, a UUT, and potentially a TestDataSpec. Siddhartha is not concerned with how the UUT was refined from Spec or how Spec was reverse engineered from UUT. Hence these process edges are shown as dashed lines. Figure 1 is divided into four quadrants. The upper portion contains artifacts which can be thought of as specifications. The bottom portion contains artifacts which can be thought of as programs (with the exception of Report). The left side contains artifacts that represent mission-oriented functions. The right side contains artifacts that represent tests of those functions.
3.2 Key Strategic Advantage

The key strategic advantage of Siddhartha is its ability to provide automation support for what have historically been classified as “low testability” program units. Siddhartha is a general technique with the potential to accommodate wide spectra of specification and implementation languages and design styles (potentially including languages and design styles developed and used by only one organization). Applying Siddhartha produces a testing domain-specific, driver-oracle program synthesizer.
3.3 Components of Siddhartha

Figure 2 presents a layered view of Siddhartha’s architectures. The lower portion of Figure 2 represents the architecture of Siddhartha’s domain-independent software tool component library. The upper portion of Figure 2 represents the reference architecture of testing domain-specific program synthesizers.
Figure 2: Architectures of Siddhartha

Siddhartha’s domain-independent software tool component library is built on top of Reasoning SDK3 [15] (formerly known as Software Refinery), a tool infrastructure for program re-engineering. On top of Franz Allegro4 Common LISP, Reasoning SDK provides REFINE and DIALECT. REFINE is a wide-spectrum formal specification language. DIALECT is a grammar definition language. Together REFINE and DIALECT enable the development of domain-specific languages. On top of REFINE and DIALECT, Reasoning SDK provides four predefined programming language models: REFINE/Ada, REFINE/C, REFINE/FORTRAN, and REFINE/COBOL. Each language model defines abstract and surface syntax and provides a collection of program analysis tools. Siddhartha adds to Reasoning SDK by providing a collection of domain-independent utilities (e.g., for testing programs written by the test engineer) and a collection of language-specific analysis functions5 (e.g., an intra-procedural data flow analyzer for Ada subprograms). The test engineer is free to extend the domain-independent Siddhartha library and language-specific libraries as appropriate. The upper portion of Figure 2 shows the reference architecture of synthesizers. A test engineer builds program synthesizers on top of the language-specific analysis functions. A synthesizer generates a test driver-oracle program from a domain-specific formal test specification. The reference architecture of synthesizers is founded upon a collection of abstract syntax models. Grammars and miscellaneous functions are defined on top of the abstract syntax models. These models, grammars, and functions define languages specific to the given testing domain (e.g., the language of test behavior specifications and the language of test data specifications). Transformation rules are pattern-directed and usually operate at a source-to-source level. Rule application control functions are defined on top of transformation rules. The synthesizer “main” function is usually a rule application control function.
3. Reasoning SDK, Software Refinery, REFINE, DIALECT, REFINE/Ada, REFINE/C, REFINE/FORTRAN, and REFINE/COBOL are trademarks of Reasoning Systems Incorporated, Mountain View, CA, U.S.A.
4. Franz and Allegro are trademarks of Franz Incorporated, Berkeley, CA, U.S.A.
5. Currently a subset of Ada is supported.
3.4 Domain-Specific Synthesizers

Figure 3 represents development of three test-domain-specific synthesizers. A test engineer extends the domain-independent software tool component library and composes components together, usually also writing new domain-specific and domain-independent grammars, rules, etc.
Figure 3: Building Domain-Specific Synthesizers
3.4.1 Ada/Ada Testing Domain: Ada Formal Regression Testing

Figure 4 represents a domain-specific instantiation of the generic artifacts and processes shown in Figure 1. The testing domain represented in Figure 4 is “Ada formal regression testing,” where the current version of subprograms in a library are tested against their counterparts in a previous, “gold” library. This testing domain is applicable when no functional changes have been made between two versions of a subprogram, and the tester wishes to be assured that no functional errors were accidentally introduced during evolution.
Figure 4: “Ada Formal Regression Testing” Domain Processes and Artifacts

The bindings associated with this testing domain are as follows: Both UUT and Spec are written in Ada, because Spec is an earlier version of UUT. The FwdEng process is instantiated by a restructuring process, indicating that no functional differences ought to exist between UUT and Spec. The MakeTBS process is instantiated by a data flow analyzer (DFA). The DFA examines the implementation details of both UUT and Spec in an effort to reverse engineer more complete interfaces for them (taking global variables and hidden state variables into account). After synthesizing more complete interfaces for UUT and Spec, DFA matches corresponding UUT and Spec formal parameters, global variables, hidden state variables, and function return values. The SynthesizeDriver process is instantiated by the function Regress. Regress takes a test behavior specification and a test data specification and produces a test driver-oracle program, Driver. Regress is developed by a test engineer by applying Siddhartha. Driver is written in Ada. The Test process is instantiated by the Ada Runtime Environment.
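A driver-oracle in this domain must establish identical input state for both versions, invoke each, and compare every output the DFA matched, including global variables. The following Ada sketch is purely illustrative of that shape (the subprogram and variable names are hypothetical and not drawn from the PSFCC libraries, and a synthesized driver would of course link against the real gold and current compilation units rather than local stand-ins):

with Text_IO;
procedure Regression_Driver_Sketch is

   --  Hypothetical stand-ins for the "gold" (Spec) and current (UUT)
   --  versions of a subprogram; each writes a global variable.
   Gold_Mode : Integer := 0;
   New_Mode  : Integer := 0;

   procedure Gold_Step (Input : in Integer; Output : out Integer) is
   begin
      Gold_Mode := Input mod 4;
      Output    := Input * 2;
   end Gold_Step;

   procedure New_Step (Input : in Integer; Output : out Integer) is
   begin
      New_Mode := Input mod 4;
      Output   := Input * 2;
   end New_Step;

   Gold_Out, New_Out : Integer;
begin
   --  Establish identical input state for both versions.
   Gold_Mode := 0;
   New_Mode  := 0;

   --  Invoke the gold version and the current version with the same inputs.
   Gold_Step (Input => 5, Output => Gold_Out);
   New_Step  (Input => 5, Output => New_Out);

   --  Oracle: the current version must agree with the gold version on every
   --  output the data flow analyzer matched, including global variables.
   if New_Out = Gold_Out and then New_Mode = Gold_Mode then
      Text_IO.Put_Line ("Regression test passed.");
   else
      Text_IO.Put_Line ("Regression test failed: versions disagree.");
   end if;
end Regression_Driver_Sketch;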
3.4.2 Ada/SCR Testing Domain: Ada Requirements Testing Against SCR Specifications

Figure 5 represents another domain-specific instantiation of the generic artifacts and processes shown in Figure 1. The domain represented in Figure 5 is “requirements testing of Ada subprograms against SCR formal specifications.”
Figure 5: Ada/SCR Test Domain Processes and Artifacts

The bindings associated with this testing domain are as follows: UUT is written in Ada. Spec is written in SCR6 [7]. The MakeTBS process is instantiated by the SCR* Toolset
6. In the testing domain for which we evaluated Siddhartha, Spec was reverse engineered from the UUT, but the same test driver-oracle synthesizer we developed would work if the UUT was forward engineered from Spec.
Simulator [6]. A test engineer creates an event file using whatever specification-based coverage criterion is of interest. The event file lists a sequence of assignments to input variables appearing in Spec. The event file serves the purpose of the test data specification. The test engineer uses the SCR* Toolset Simulator to interpret the Spec given an event file as input data. The SCR* Toolset Simulator produces a log file, which serves the role of test behavior specification. LogToAda is the name of the SynthesizeDriver process. LogToAda transforms log files into test driver-oracle programs (Drivers). Drivers are written in Ada.
4. Implementation Specifics

This section provides an idea of how a domain-specific synthesizer operates. We present a sequence of example grammars, rules, and rule application control functions drawn from the Ada/SCR testing domain. An important early step in building a synthesizer is identification of the test behavior specification language and the test data specification language. In the Ada/SCR testing domain, the test engineer identified SCR* Toolset Simulator log files as a promising formal test behavior specification language. The test engineer’s task was to build a synthesizer that transforms log files into corresponding test driver-oracle programs. The test engineer decided the test driver-oracle programs should be implemented in Ada to conveniently link with the UUTs. Because Siddhartha uses the REFINE/Ada language model, no new abstract syntax model or grammar was needed for Drivers. Abstract syntax models are written in REFINE and grammars are written in DIALECT [15]. Because no DIALECT grammar or REFINE abstract syntax model existed for log files, the test engineer built them. The grammar for log files is shown below.
grammar SCR
productions
  SCR-LOG ::= [SCR-LOG-STATES ++ ""] builds SCR-LOG,
  SCR-INITIAL-STATE ::= ["--- initial state --------------------------------------" SCR-INITIAL-STATE-EQUATIONS + ""] builds SCR-INITIAL-STATE,
  SCR-START-STATE ::= ["--- start state --------------------------------------" SCR-START-STATE-EQUATIONS + ""] builds SCR-START-STATE,
  SCR-NTH-STATE ::= ["--- state" SCR-NTH-STATE-NUM "-----------------------------------------------" SCR-NTH-STATE-EQUATIONS + ""] builds SCR-NTH-STATE,
  SCR-EDIT-STATE ::= ["--- state edit begin ----------------------" SCR-EDIT-STATE-EQUATIONS + "" "--- state edit complete ----------------------"] builds SCR-EDIT-STATE,
  SCR-UNCHANGED-STATE ::= ["--- state unchanged ----------------------" "warning:" SCR-UNCHANGED-STATE-VARIABLE-REF "stepped without change. pending event ignored."] builds SCR-UNCHANGED-STATE,
  SCR-EQUATION ::= [SCR-EQUATION-VARIABLE "=" SCR-EQUATION-VALUE] builds SCR-EQUATION,
  SCR-EDIT-STATE-EQUATION ::= ["edited new value of" SCR-EDIT-STATE-EQUATION-EQUATION] builds SCR-EDIT-STATE-EQUATION,
  SCR-INTEGER-LITERAL ::= SCR-INTEGER-LITERAL-VALUE builds SCR-INTEGER-LITERAL,
  SCR-REAL-LITERAL ::= SCR-REAL-LITERAL-VALUE builds SCR-REAL-LITERAL,
  SCR-ENUMERATED-LITERAL ::= SCR-ENUMERATED-LITERAL-VALUE builds SCR-ENUMERATED-LITERAL,
  SCR-VARIABLE-REF ::= SCR-VARIABLE-REF-SYMBOL builds SCR-VARIABLE-REF
start-classes SCR-LOG, SCR-EQUATION, SCR-STATE, SCR-EDIT-STATE-EQUATION
file-classes SCR-LOG
pattern-syntax ["#scr-state" SCR-STATE]
symbol-continue-chars "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_"
end
Figure 6: DIALECT Grammar for SCR* Toolset Simulator Log Files

Given this grammar, the DIALECT processor automatically generates a parser and printer for the log file language. After a test engineer creates a log file using the SCR* Toolset Simulator, the log file is parsed into Reasoning SDK’s object base for transformation. The test engineer next built transformation rules. Transformation rules are specified as pattern-directed, conditional, source-to-source replacements. An example transformation rule is listed below. The rule is written in REFINE using patterns from both the SCR log file language and Ada (##r SCR means that the fragment uses the SCR grammar listed in Figure 6 and ##r ADA means that the fragment uses the REFINE/Ada grammar).

rule SCR-NTH-STATE-EQUATION-TO-ASSIGNMENT-STATEMENT (X: SCR-STATE)
  X = `--- state @N ----------------------------------------------$BEFORE @V = @VAL $AFTER'
  & SCR-USAGE-OF-REF(V) = 'MONITORED
  -->
  X = `--- state @N ----------------------------------------------$BEFORE @P $AFTER'
  & P = `@V := @VAL;'
Figure 7: A Refinement Rule Mapping a Log File Fragment into an Ada Fragment

Siddhartha groups transformation rules according to whether they operate within a given language or translate fragments in one language into another language. A work in progress is to see if we can reconceptualize Siddhartha in the style of Draco [11]. In the Draco
architectural style, rules that operate within a given language (or “domain model” in Draco terminology) are called optimizations and rules that cross languages are called refinements. The transformation rule listed above is an example of a refinement. Specifically, the rule helps refine an SCR log file into a corresponding test driver-oracle program in Ada. The rule says that if a log file state description is found that contains an equation and the variable referenced in the equation is monitored (i.e., the variable serves as an input for the SCR model), then that equation must be transformed into an assignment statement in Ada (leaving everything else in the log file state alone). After transformation rules were defined, the test engineer wrote rule application control functions. These functions specify control over transformations via post-order traversal over the argument’s (i.e., the formal test behavior specification’s) abstract syntax tree (AST). Post-order traversal is used to preserve the specification’s original context as long as possible. At each node in the AST, a sequence of transformation rules is applied. If a rule’s antecedent and condition (i.e., the left-hand side of the rule) do not match the current AST node, the rule is not applied and its successor in the rule sequence is tested for applicability. If a rule is found which matches the current AST node, the rule is applied, and rule application resumes from the first rule in the rule sequence. If no rules in the sequence can be applied at the current node, the next node is visited in post-order and the rule application process is repeated at that node. Siddhartha uses REFINE’s built-in rule application function postorder-transform. Postorder-transform takes as arguments the AST node on which to begin (which may be the AST’s root node), a sequence of rule identifier symbols, and a predicate indicating whether rule application should be exhaustive. An example rule application function is listed below. It controls the refinement of an SCR log file into a corresponding test driver-oracle subprogram body.

function SCR-LOG-TO-ADA-SUBPROGRAM-BODY (X: SCR-LOG) : SUBPROGRAM-BODY =
  POSTORDER-TRANSFORM
    (X,
     ['SCR-EPDFAIL-EQUALS-TRUE-TRANSFORM,
      'SCR-EPDFAIL-EQUALS-FALSE-TRANSFORM,
      'SCR-VARIABLE-REF-TO-ADA-EXPRESSION,
      'SCR-INTEGER-LITERAL-TO-ADA-INTEGER-LITERAL,
      'SCR-REAL-LITERAL-TO-ADA-REAL-LITERAL,
      'SCR-ENUMERATED-LITERAL-TO-ADA-SIMPLE-NAME,
      'SCR-START-STATE-EQUATION-TO-VERIFY-PROCEDURE-CALL,
      'SCR-INITIAL-STATE-EQUATION-TO-VERIFY-PROCEDURE-CALL,
      'SCR-NTH-STATE-EQUATION-TO-VERIFY-PROCEDURE-CALL,
      'SCR-NTH-STATE-EQUATION-TO-ASSIGNMENT-STATEMENT,
      'SCR-EQUATION-TO-NULL-STATEMENT,
      'ENGAGE-OUT-POSITION-MAP,
      'EXPANDED-NAME-NEEDS-REF-TO,
      'SCR-INITIAL-STATE-TO-BLOCK-STATEMENT,
      'SCR-START-STATE-TO-BLOCK-STATEMENT,
      'SCR-UNCHANGED-STATE-TO-BLOCK-STATEMENT,
      'SCR-EDIT-STATE-EQUATION-TO-ASSIGNMENT-STATEMENT,
      'SCR-EDIT-STATE-TO-BLOCK-STATEMENT,
      'SCR-NTH-STATE-TO-BLOCK-STATEMENT,
      'SCR-LOG-TO-SUBPROGRAM-BODY],
     false)
Figure 8: A Rule Application Control Function
Figure 9 demonstrates how a state description appearing in an SCR log file is incrementally transformed into a corresponding Ada block statement in the test driver-oracle program. The arrow --> represents application of the rule shown in Figure 7. The starred arrow represents application of many other transformation rules (such as those referenced in Figure 8).

--- State 5 ----------------------------------------------
Disengage_Request = 1
PSFCC_State = Disengaged
Engage_Out_Other = Table_Number
Engage_Out_Position = 8
ARM_ENG_FAIL = 8
-->

--- State 5 ----------------------------------------------
Disengage_Request := 1;
PSFCC_State = Disengaged
Engage_Out_Other = Table_Number
Engage_Out_Position = 8
ARM_ENG_FAIL = 8
-->*

STATE_5:
declare
   STATE : constant STRING := POSITIVE'IMAGE (5);
begin
   DPR_INTERFACE.DISENGAGE_REQUEST := 1;
   PSFCC_MAIN.PSFCC_ADA_CLAW;
   VERIFY (PSFCC_MAIN.PSFCC_STATE, PSFCC_MAIN.DISENGAGED);
   VERIFY (ENGAGE_LOGIC.ENGAGE_OUT, ENGAGE_LOGIC.TABLE_NUMBER);
   VERIFY (ENGAGE_LOGIC.ENGAGE_OUT, 8);
   VERIFY (DPR_INTERFACE.PSFCC_STATE_MODE_ID.ARM_ENG_FAIL, 8);
exception
   when others =>
      TEXT_IO.PUT_LINE (("Exception raised in state " & STATE) & ".");
end STATE_5;
Figure 9: Example of How Transformation Rules Work

The first log file fragment in Figure 9 tells us that in state 5 the input Disengage_Request is assigned the value 1 and, in response to this assignment event, four variables changed value since state 4. In the second log file fragment, the equation for Disengage_Request has been changed into an Ada assignment statement. An important problem in automating specification-based testing is relating the interface of Spec to the (possibly reverse engineered) interface of UUT. O’Malley [12] refers to this as the representation mapping problem: How can we define translation of Spec’s expressions about behavior into UUT’s corresponding expressions and statements about behavior? In the Ada/SCR testing domain the test engineer designed an interface representation mapping from SCR Spec identifier symbols into corresponding Ada expressions that UUT
can understand. A portion of this interface representation mapping is listed in Figure 10 below.

var SCR-ID-MAPPING : map(symbol, tuple(SCR-USAGE : symbol, UUT-ID-REF : EXPRESSION)) =
  {| 'AILERON_CMD -> ,
     …
     'AFU -> ,
     …
     'ENGAGE_OUT -> ,
     …
     'ARM_ENG_FAIL -> ,
     …
  |}
Figure 10: SCR -> Ada Interface Representation Map for PSFCC Library (portion)

Building a domain-specific program synthesizer is not a trivial task. Siddhartha provides the test engineer with a process model to inform development of the synthesizer and associated software artifacts. Further information about the process model can be found in Reyes’s dissertation (in progress at the time of writing).
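For orientation, the synthesized block in Figure 9 performs its oracle checks through calls to a VERIFY routine whose body is not shown here. The following self-contained Ada fragment is only one plausible, hypothetical form such a routine could take, under the assumption that VERIFY compares an observed value against the expected value recorded in the log and counts and reports any mismatch:

with Text_IO;
procedure Verify_Sketch is

   Failure_Count : Natural := 0;

   --  In practice one VERIFY overloading per type would be needed
   --  (integers, reals, enumerations); only the Integer case is sketched.
   procedure Verify (Actual, Expected : in Integer) is
   begin
      if Actual /= Expected then
         Failure_Count := Failure_Count + 1;
         Text_IO.Put_Line
           ("FAILURE: expected" & Integer'Image (Expected) &
            " but observed" & Integer'Image (Actual) & ".");
      end if;
   end Verify;

begin
   Verify (Actual => 7, Expected => 8);  --  records and reports one failure
   Text_IO.Put_Line ("Failures detected:" & Natural'Image (Failure_Count));
end Verify_Sketch;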
5. Evaluation

We evaluate Siddhartha indirectly by evaluating test synthesizers constructed by applying Siddhartha. This section discusses our ongoing evaluation of the Ada/SCR domain’s test synthesizer discussed earlier. The evaluation plan is to replicate testing performed by the NASA IV&V Facility on the NASA Dryden Flight Research Center (DFRC) F/A-18B System Research Aircraft’s (SRA) Production Support Flight Control Computer (PSFCC) Operational Flight Program (OFP) and use manually replicated NASA IV&V tests as the baseline for comparison.
5.1 Background

In late 1997, we accepted an invitation from NASA to participate in the IV&V process for the PSFCC OFP. We were granted access to and reviewed many software artifacts, including documents, SCR models, the SCR* Toolset, and Ada libraries. We used the PSFCC OFP as a real-world example for developing Siddhartha. We were able to fix the SCR model so that the SCR* Toolset Simulator could be used on it7. We used a number of REFINE/Ada software tools to analyze the PSFCC OFP libraries. In order to get first-hand knowledge of the testing issues, we ported both the original and restructured versions of the PSFCC OFP to our workstation computer8. Although we
7. The SCR model produced by the IV&V contractor had a number of cyclic dependencies between internal variables. These cyclic dependencies prevented the SCR* Toolset Simulator from accepting the model. We acknowledge that our fix may not preserve all of the specification writer’s original intent.
8. We ported the PSFCC OFP libraries to compile and execute on a Sun Microsystems UltraSPARC 1 with SunAda compiler version 2.0.1 for Solaris.
reviewed both NASA IV&V’s software test procedure and software test results documents, we did not port NASA’s test support Perl scripts to our computer workstation. We replicated the NASA test processes by manually writing test drivers in the style generated by the NASA scripts, informed by NASA’s test specifications. The NASA scripts instantiate a general test driver template with the name of the subprogram UUT and assignment statements to establish the initial condition for the specific test.
5.2 Experiment Design

Our evaluation plan is to use the domain-specific test synthesizers constructed via Siddhartha to replicate testing performed during NASA IV&V of the PSFCC OFP libraries. We chose four factors, each of which represents a different dimension for differentiation. For each factor, we chose a set of two values (also known as “levels”) for it to take on. The factors and their associated sets of values are shown below.
• Technique Applied: {Siddhartha, Manual}
• NASA IV&V Found Fault?: {Yes, No}
• Type of Test: {Requirements, Regression}
• Spec or Log Size: {Small, Big}
The factor “Technique Applied” distinguishes between Siddhartha’s results and manually replicated NASA IV&V results. The factor “NASA IV&V Found Fault?” allows us to compare the fault detection effectiveness of each technique. The factor “Type of Test” distinguishes between the Ada/Ada and Ada/SCR testing domains. The factor “Spec or Log Size” allows comparison of scale effects. The 16 possible combinations define the part of the evaluation space under investigation. To date we have performed eight experiments for the Ada/SCR testing domain. Our evaluation of trends for this domain is summarized in Figure 11 below.

UUT = PSFCC_ADA_CLAW
Test Data Selection Criterion = Cover every cell of every table in the SCR specification with at least one test

Technique     Setup Time      Driver Production     Failure Detection Rate    Failure Detection Rate
Applied       (person days)   Rate (Drivers/day)    (avg. Failures/Driver)    (avg. Failures/day)
Manual        1               69                    0.5                       34.5
Siddhartha    109             53                    17.25                     914.25
Figure 11: Trends for Ada/SCR Testing Domain

Setup time is an important outcome measure. The manual technique needed a total setup time of one person day. Most of this day was used to define the interface representation mapping between the UUT and Spec. The Siddhartha technique also needed a day to define the interface representation mapping (we assume no sharing of information exists between the test engineers performing each technique). The Siddhartha technique needed an additional 108 days of setup time to develop the remaining parts of the domain-specific test synthesizer (i.e., AST models, grammars, transformation rules, etc.).
5.2.1 Manual Test Development and Failure Detection Trends

The manual technique is productive after only one day of setup. With practice, a test driver-oracle could be written manually from an SCR event table cell in an average of 7
minutes (which includes compiling and testing the test driver-oracle to ensure it meets the test engineer’s expectations). Of the manually constructed drivers, half detected a failure.
5.2.2 Siddhartha Test Development and Failure Detection Trends

Siddhartha could not become productive in developing test driver-oracles until after 109 days of setup time to build the test synthesizer. Surprisingly, the test driver-oracle development rate is lower than with the manual technique. As with the manual technique, driver-oracle development under Siddhartha also started by considering an SCR event table cell. In the Siddhartha technique, the test engineer then operated the SCR* Toolset Simulator to generate a log file representing a test of each cell. This activity was not performed by the test engineer in the manual technique. After each log file was generated, the Siddhartha technique invoked the LogToAda synthesizer function from within the Reasoning SDK interpreter. Under the Siddhartha technique, developing each driver-oracle took an average of nine minutes. This nine minutes includes the elapsed time for the synthesizer to execute, the elapsed time to compile the synthesized test driver-oracle, and the elapsed time to execute the synthesized test driver-oracle. Because the net development rate under the Siddhartha technique is lower than the net development rate for the manual technique, the manual technique will always be more productive than the Siddhartha technique in developing test driver-oracles. Figure 12 below depicts the average, comparative timelines for developing test driver-oracles manually and via the synthesizer.
Manual (7 minutes total):
  02:00  Understand SCR event table cell.
  03:00  Manually write test driver-oracle.
  01:00  Compile, link & debug test driver-oracle.
  01:00  Execute test driver-oracle & record test results.

Siddhartha (9 minutes total):
  02:00  Understand SCR event table cell.
  04:00  Manually use SCR* Toolset Simulator to create log file.
  01:00  Invoke synthesizer on log file.
  01:00  Compile & link test driver-oracle.
  01:00  Execute test driver-oracle & record test results.
Figure 12: Comparison of Average Test Driver-Oracle Development Timelines

Although it lost the productivity comparison, the Siddhartha technique far surpassed the manual technique in failure detection effectiveness. Driver-oracles developed under the Siddhartha technique detected an average of 17 failures. Because they are derived from SCR* Toolset Simulator logs, driver-oracles developed under Siddhartha are much more rigorous than drivers developed manually. Given these trends, we predict that the
Siddhartha technique will surpass the manual technique in failure detection after 113 working days (i.e., only four days after the test synthesizer becomes operational). At this tradeoff point, Siddhartha will have constructed 212 driver-oracles.
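This prediction follows from the rates in Figure 11, assuming each technique detects failures at its average rate once it is productive: the manual technique accumulates about 34.5 failures per day from day one, while the Siddhartha technique accumulates about 914.25 failures per day only after its 109-day setup. Solving 34.5 × t = 914.25 × (t − 109) gives t ≈ 113 working days, roughly four days after the synthesizer becomes operational; at about 53 driver-oracles per day, Siddhartha will have produced roughly 4 × 53 ≈ 212 driver-oracles by that point.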
6. Conclusions

We described Siddhartha, a technique providing essential automation support for specification-based testing of “low-testability” programs. Siddhartha applies the KBSE paradigm to developing test driver-oracle programs: it seeks to reduce test driver-oracle program evolution costs by raising the development of test driver-oracle programs to the level of formal test specifications. Siddhartha has been applied to produce domain-specific test driver-oracle program synthesizers that automate the development of functional regression tests of Ada procedures and requirements tests of Ada procedures against SCR formal specifications. By themselves, these domain-specific synthesizers contribute to specification-based testing technology in their ability to operate on program units characterized by insufficient interface definition and UUT/specification interface mismatch. Before Siddhartha, these problems inhibited adoption of many specification-based testing techniques. Siddhartha’s approach to test driver-oracle program synthesis is general in that it can potentially support broad spectra of programming languages and styles, formal test specification languages, and test driver-oracle reference architectures. Siddhartha’s approach is not very powerful for building synthesizers in completely new testing domains because it requires the test engineer to work hard at defining abstract syntax models, grammars, transformation rules, etc. (potentially) from scratch. We expect that reusing languages and other artifacts from earlier, supported testing domains will reduce the test engineer’s workload. Early evaluation of Siddhartha’s costs and benefits indicates that the test driver-oracle programs it engenders can possess significantly greater failure detection effectiveness than manually developed driver-oracles. However, this greater failure detection effectiveness comes at the price of a long setup time needed to build the domain-specific synthesizer. During this long setup time, the more effective driver-oracle programs cannot be developed and used.
7. Acknowledgments

This research was supported in part by the NASA Graduate Student Researchers Program (GSRP) grant NGC-70352 and by DARPA Evolutionary Development of Complex Software (EDCS) grant “Perpetual Testing.”
8. References

[1] Binder, R.V. Design for testability in object-oriented systems. Communications of the ACM, Sept. 1994, vol. 37, no. 9, pp. 87-101.
[2] Blackburn, M.R.; Busser, R.D.; Fontaine, J.S. Automatic generation of test vectors for SCR-style specifications. In: Proceedings of Computer Assurance 1997 (COMPASS ’97): Are We Making Progress Toward Computer Assurance? New York, NY, USA: IEEE, 1997.
[3] Carter, J. Production support flight control computers: research capability for F/A-18 aircraft at Dryden Flight Research Center. In: 16th DASC, AIAA/IEEE Digital Avionics Systems Conference: Reflections to the Future, Irvine, CA, USA, 26-30 Oct. 1997. New York, NY, USA: IEEE, 1997, pp. 7.2-8 to 7.2-23, vol. 2.
[4] Doong, R.-K.; Frankl, P.G. The ASTOOT approach to testing object-oriented programs. ACM Transactions on Software Engineering and Methodology, April 1994, vol. 3, no. 2, pp. 101-130.
[5] Gannon, J.; McMullin, P.; Hamlet, R. Data-abstraction implementation, specification, and testing. ACM Transactions on Programming Languages and Systems, July 1981, vol. 3, no. 3, pp. 211-223.
[6] Heitmeyer, C.; Bull, A.; Gasarch, C.; Labaw, B. SCR: a toolset for specifying and analyzing requirements. In: COMPASS ’95, Proceedings of the Tenth Annual Conference on Computer Assurance, Gaithersburg, MD, USA, 25-29 June 1995. New York, NY, USA: IEEE, 1995, pp. 109-122.
[7] Heninger, K.L. Specifying software requirements for complex systems: new techniques and their application. IEEE Transactions on Software Engineering, Jan. 1980, vol. SE-6, no. 1, pp. 2-13.
[8] Hughes, M.; Stotts, D. Daistish: systematic algebraic testing for OO programs in the presence of side-effects. In: 1996 International Symposium on Software Testing and Analysis (ISSTA), San Diego, CA, USA, 8-10 Jan. 1996. SIGSOFT Software Engineering Notes, May 1996, vol. 21, no. 3, pp. 53-61.
[9] Lowry, M.R.; McCartney, R.D. (eds.) Automating Software Design. Menlo Park: AAAI Press; Cambridge: MIT Press, 1991.
[10] Mettala, E.; Graham, M.H. The Domain Specific Software Architecture Program. Software Engineering Institute Special Report CMU/SEI-92-SR-9, June 1992.
[11] Neighbors, J.M. The Draco approach to constructing software from reusable components. IEEE Transactions on Software Engineering, Sept. 1984, vol. SE-10, no. 5, pp. 564-574.
[12] O’Malley, T.O. A Model of Specification-Based Test Oracles. Dissertation, University of California, Irvine, Department of Information and Computer Science, 1996.
[13] Poston, R.M. Automating Specification-Based Software Testing. Los Alamitos, Calif.: IEEE Computer Society Press, 1996.
[14] do Prado Leite, J.C.S.; Sant’Anna, M.; de Freitas, F.G. Draco-PUC: a technology assembly for domain oriented software development. In: Proceedings of the Third International Conference on Software Reuse: Advances in Software Reusability, Rio de Janeiro, Brazil, 1-4 Nov. 1994. Edited by Frakes, W.B. Los Alamitos, CA, USA: IEEE Computer Society Press, 1994, pp. 94-100.
[15] REFINE User Guide, DIALECT User Guide, REFINE/Ada User Guide. Reasoning Systems, Mountain View, CA.
[16] Richardson, D.J.; O’Malley, O.; Tittle, C. Approaches to specification-based testing. In: ACM SIGSOFT ’89 Third Symposium on Software Testing, Analysis and Verification (TAV3), Key West, FL, USA, 13-15 Dec. 1989. SIGSOFT Software Engineering Notes, Dec. 1989, vol. 14, no. 8, pp. 86-96.
[17] Richardson, D.J.; Leif Aha, S.; O’Malley, T.O. Specification-based test oracles for reactive systems. In: International Conference on Software Engineering, Melbourne, Vic., Australia, 11-15 May 1992. New York, NY, USA: ACM, 1992, pp. 105-118.
[18] Weber, R.; Thelen, K.; Srivastava, A.; Krueger, J. Automated validation test generation. In: AIAA/IEEE Digital Avionics Systems Conference, 13th DASC, Phoenix, AZ, USA, 30 Oct.-3 Nov. 1994. New York, NY, USA: IEEE, 1994, pp. 99-104.