SYNTHESIZING ABSTRACT DATA TYPE SPECIFICATIONS

Boumediene Belkhouche
Joseph E. Urban
Computer Science Department
University of Southwestern Louisiana
Lafayette, Louisiana 70504

Gregory A. Riccardi
Department of Mathematics and Computer Science
Florida State University
Tallahassee, Florida 32306

ABSTRACT

Several alternatives for implementing abstract data types exist. One approach is the definition and implementation of a new language which directly supports abstract data type constructs. Another approach is to incorporate abstract data type constructs in an existing programming language. This paper describes recent and current research in the incorporation of a specification language for abstract data types within an operational compilable programming language. The synthesis of implementations of abstract data types from their specifications is discussed.

Keywords and Phrases: abstract data types, specifications, synthesis.
CR Categories: 4.1, 4.2

1. Introduction

Several alternatives for implementing abstract data types exist. One approach is the definition and implementation of a new language which directly supports abstract data type constructs [Gog79,Ich79,Jon76,Lis77]. An alternate approach is to incorporate the same constructs in an existing language. This approach would utilize a preprocessor with the already existing compiler. The preprocessing activity can be as simple as macro expansion, or as complex as automatic synthesis of programs. The simplicity or complexity of the processor depends on the method chosen to define the abstract data types. An imperative definition borrowing from the programming language falls in the first category, whereas an applicative definition in terms of a specification language falls in the second category. This paper describes a system which incorporates a specification language for abstract data types within an operational compilable language and synthesizes implementations of an abstract data type from its specification [Sel81,Moi82,Sri80,Sub79].

Researchers in the field agree on the important role abstract data types must have in programming. However, their approaches differ with regard to the underlying theory. Some view an abstract data type as a class of models in an axiom system [Sta78] (axiomatic approach); others view it as an initial many-sorted algebra [Gog79,Gut78,Zil80] (algebraic approach); still others view it as a lattice [Sco76] (lattice-theoretic approach). However, there is a general consensus: an abstract data type is a set of objects with a set of operations such that the behavior of these objects can be observed only through the available operations.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1982 ACM 0-89791-071-0/82/0~00-0176 $00.?5

A program which uses a given abstract data type has two parts: a specification of the abstract data type, and the program part. This program is input to the preprocessor, which transforms the specification part into a subprogram. The transformation is not a monolithic process; it is accomplished in several stages. The first stage performs syntactic and semantic analysis, and generates an abstract parse tree. The second stage uses decomposition concepts to incrementally refine the abstract nodes of the tree into concrete nodes. The third stage performs data and flow analyses to determine data dependencies and categorize the operations according to space and time criteria. The fourth stage utilizes the output of the analysis to select a specific representation. The last stage generates code and integrates this code within the program.
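To make the representation-selection stage more concrete, the following Python sketch shows one plausible reading of it: each candidate representation carries a cost per operation, each operation carries a usage weight, and the candidate with the smallest weighted sum is chosen. The names `CANDIDATES`, `estimated_cost`, and `select_representation`, and all the numbers, are illustrative assumptions and do not come from the system described here.

```python
from typing import Dict

# Hypothetical per-operation cost estimates (arbitrary units) for three
# candidate representations of an abstract set. Illustrative values only.
CANDIDATES: Dict[str, Dict[str, float]] = {
    "linked_list":  {"insert": 1.0,  "member": 50.0, "delete": 50.0},
    "hash_table":   {"insert": 2.0,  "member": 2.0,  "delete": 2.0},
    "sorted_array": {"insert": 25.0, "member": 6.0,  "delete": 25.0},
}


def estimated_cost(op_costs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of operation costs; an operation without a weight counts as 0."""
    return sum(weights.get(op, 0.0) * cost for op, cost in op_costs.items())


def select_representation(candidates: Dict[str, Dict[str, float]],
                          weights: Dict[str, float]) -> str:
    """Return the name of the candidate with the smallest estimated cost."""
    return min(candidates, key=lambda name: estimated_cost(candidates[name], weights))


if __name__ == "__main__":
    insert_only = {"insert": 1.0}
    lookup_heavy = {"insert": 0.1, "member": 0.8, "delete": 0.1}
    print(select_representation(CANDIDATES, insert_only))
    print(select_representation(CANDIDATES, lookup_heavy))
```

Changing the weights can change the selected candidate, which is why a revision of the weights may call for a different representation.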

The following sections are an overview of the specification language and the synthesis system. The parsing phase is also part of this system, but since it is fairly straightforward, it is not described in this paper.

2. The Specification Language

This section briefly describes a specification language based on the abstract model [Ber79,Man80,Wul76]. The approach relies on input-output assertions and mathematical structures to specify abstract data types. The model provides primitive types (integer, real, boolean, string), and four basic abstract data structures with their operations. These mathematical structures are: sets, sequences, cartesian products and discriminated unions [Ber79,Hoa72,Lis75,Wul76]. To define an abstract data type, the specifier chooses an abstract representation and defines the semantics of the abstract data type in terms of the operations available on the abstract representation.

A specification in the language consists of six parts: header, interface, representation, initialization, operations and restrictions. The header represents the abstract data type name, which may have parameters and any restrictions associated with them. The interface lists the types and the operations which are visible to the users. For each operation, the types and modes of the parameters are listed in order after the operation name. The representation section defines the abstract structure of each new abstract data type in terms of one or more already defined abstract data types. The initialization section is used to initialize the abstract data type representation for each instantiation of the type. The operations are defined within the operations section. Each operation consists of a header, which contains the operation name and the parameters, and an operation body. The latter describes the effects of the operation in terms of input-output assertions. The operation body has basically the following format:

   find x such that R(a,x) where P(a).

P(a) is the input assertion(s) and R(a,x) is the output assertion(s). Normally the specifier "find" implies a systematic search through some universe, although in this case it is taken to mean "determine a specific value such that a relation is satisfied." The input assertion P(a) and the output assertion R(a,x) may be one or more relational expressions connected together by boolean connectives. However, a relational expression within an output assertion is restricted to the following form:

   <variable> = <expression>.

The restrictions section is used to signal exceptional conditions that may arise whenever an input assertion is not satisfied. The sole purpose of this section is to raise conditions, not to handle them.

3. The Synthesis System

A specification of an abstract data type is translated into an input assertion subtree and an output assertion subtree. Nodes of a subtree can be either abstract or concrete. An abstract node represents an entity (operation or operand) whose meaning cannot be realized directly by the programming language. Therefore, a subtree whose nodes are abstract reflects a complex assertion that is decomposable. Conversely, the nodes of a subtree representing a simple assertion are called concrete nodes. Initially, the synthesis system uses the input-output assertion subtrees and a knowledge base of rules to generate abstract programs in a high-level language. This process is basically a transformational process which makes use of the problem reduction paradigm by decomposing the assertions into simple intermediate assertions that are independently satisfiable. The knowledge base that is available to the system provides information about the application domain, and the semantics of the specification and target languages. For example, there exist rules which describe how a composite abstract structure can be decomposed into its immediate constituents. These transformation rules are applied recursively to an abstract node until a concrete node is generated. Transformation rules fall into two categories: refinement rules and distribution rules. A refinement rule is used to decompose a structured operand into unstructured operands of primitive type. A distribution rule is used to decompose a structured operator into suboperators whose collective effect on unstructured operands is equivalent to the original operator on the original operands. The result of the decomposition is a tree whose leaves are simple terms which can be translated into constructs of the target language by means of coding [BarY?] and efficiency [Kan79] rules. The coding rules describe how to map high-level constructs into target language constructs. The efficiency rules are used for data structure [Dew79,Sch81] and algorithm selection.

The second stage of the refinement process involves an analysis of the behavior of the abstract data type. A local analysis is performed in order to classify the operations of the abstract data type into one of several categories. Each category is determined according to some pre-computed complexity costs. This classification of operations is concerned with the operations, not the utilization, of the abstract data type. A global analysis is required to gather data describing the utilization of the abstract data type. This latter task is dependent on the availability of information on programs using the abstract data type. The results of these two analyses are used to assign weights to each category. Alternatively, the weight assignment may be carried out through interaction with the implementer.

The synthesis system has readily available data about the complexity of different algorithms on several data structures representing each abstract data structure. The collected data are combined with the weights to estimate the cost of each plausible implementation. The choice of the candidate representation is based on the estimated costs and is made such that a representation which minimizes the cost is selected [Low78]. This preliminary selection is based entirely on static information and predefined patterns of behavior of operations on abstract data structures. The selection of a good representation relies heavily on the available data describing the behavior of various implementation schemes. For each of the four abstract data structures, and for each of their operations, a set of implementations is generated beforehand and analyzed. The results of the analysis provide sufficient knowledge about time, space and cost of an implementation. The resulting cost estimation is fairly accurate, as the locality of definition of abstract data types allows a thorough analysis.

A mechanism is provided to allow the implementer and the synthesis system to reevaluate the weights after some usage and in light of information gathered through execution monitoring. The estimated costs may vary after adjustment of the weights. A significant variation of the costs may require a different representation to be selected. This reevaluation of weights and revision of choice may be carried out to fine-tune a model of an implementation before any implementation has been attempted. The monitoring is simple, and involves frequency counts at two different levels. The first level gathers data about the usage of the abstract data type, and the second level provides data about the behavior of the operation blocks implementing the abstract data type operations.

Optimization techniques for very-high-level languages such as those described in [Ear76,Fon79,Pag79] can be performed at this stage to generate more efficient code.

4. An Example of Decomposition

In this section, an example of the decomposition process of a specification into a decomposition tree is illustrated. The operation considered is leaveblock of the symbol table definition of a block-structured language [Aho77]. Informally, leaveblock decrements the block number and destroys the exited environment. Figure 1 is the specification of leaveblock.
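The two kinds of transformation rules from Section 3 can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the rule base of the synthesis system: the classes `Var`, `Tup`, and `Eq`, the table `STRUCTURED`, and the functions `refine` and `decompose` are all hypothetical names. The sketch treats every component variable as concrete, whereas a set-valued component such as info would itself be decomposed further.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Var:
    """An operand; concrete unless its name appears in STRUCTURED."""
    name: str


@dataclass(frozen=True)
class Tup:
    """A cartesian-product operand written <f1, ..., fn>; always abstract."""
    fields: Tuple["Term", ...]


Term = Union[Var, Tup]


@dataclass(frozen=True)
class Eq:
    """The assertion left = right."""
    left: Term
    right: Term


# Hypothetical representation table: structured variable -> component names.
STRUCTURED = {"st": ("block", "info")}


def refine(term: Term) -> Term:
    """Refinement rule: expand a structured operand into its components."""
    if isinstance(term, Var) and term.name in STRUCTURED:
        return Tup(tuple(Var(f"{term.name}.{f}") for f in STRUCTURED[term.name]))
    return term


def decompose(assertion: Eq) -> List[Eq]:
    """Return the concrete equalities whose conjunction equals the assertion."""
    left, right = refine(assertion.left), refine(assertion.right)
    if isinstance(left, Var) and isinstance(right, Var):
        return [Eq(left, right)]  # concrete leaf: nothing left to decompose
    if isinstance(left, Tup) and isinstance(right, Tup):
        if len(left.fields) != len(right.fields):
            raise ValueError("tuples of different arity cannot be equal")
        # Distribution rule: tuple equality becomes componentwise equality.
        leaves: List[Eq] = []
        for l, r in zip(left.fields, right.fields):
            leaves.extend(decompose(Eq(l, r)))
        return leaves
    raise ValueError("no rule applies to a structured/unstructured pair")
```

For example, `decompose(Eq(Var("st"), Tup((Var("b"), Var("i")))))` refines `st` into its two components and then distributes the equality, yielding one equality for `st.block` and one for `st.info`.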

   leaveblock: procedure (inout st: symtab)
      find st such that
         st = <block' - 1, info' - {x in info': x.level >= block'}>

   Figure 1. Specification of leaveblock.
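As a way to check the informal reading of this specification, the following Python sketch models the symbol table as a cartesian product of a block number and a set of entries, and implements leaveblock directly from the output assertion. The class names `Entry` and `SymTab`, the exception `BlockUnderflow`, and the precondition that the block number is at least 1 are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Entry:
    """One symbol table entry; only the declaring block level matters here."""
    name: str
    level: int


@dataclass(frozen=True)
class SymTab:
    """Abstract representation: the pair <block, info>."""
    block: int
    info: FrozenSet[Entry]


class BlockUnderflow(Exception):
    """Raised when the assumed input assertion block >= 1 does not hold."""


def leaveblock(st: SymTab) -> SymTab:
    """Decrement the block number and drop entries of the exited block."""
    if st.block < 1:  # assumed precondition, signalled but not handled here
        raise BlockUnderflow("no enclosing block to leave")
    # info' - {x in info': x.level >= block'} keeps entries with a lower level.
    remaining = frozenset(x for x in st.info if x.level < st.block)
    return SymTab(st.block - 1, remaining)
```

Leaving block 2 of a table that holds an entry at level 1 and an entry at level 2 yields a table at block 1 containing only the level 1 entry.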

[Figure: decomposition tree for leaveblock (st = ...)]