A Formal Semantics for Sisal Arrays
Isabelle Attali
Denis Caromel
INRIA Sophia Antipolis - BP 93 06902 Sophia Antipolis Cedex - France
[email protected]
I3S - 650, Route des Colles BP 145 06903 Sophia Antipolis Cedex - France
[email protected]
Yung-Syau Chen, Jean-Luc Gaudiot Dept of Electrical Engineering-Systems University of Southern California Los Angeles, CA 90089-2563, USA {yschen, gaudiot}@usc.edu
Andrew L. Wendelborn
Dept. Computer Science University of Adelaide 5005 Australia
[email protected]
Abstract. We present a formal definition of the dynamic semantics of arrays in the functional language Sisal 2.0. We adopt a logical setting: the structural operational style of Natural Semantics, using the Typol inference rules within the Centaur system, a generic programming environment. From the formal specifications, a development and visualization environment for Sisal programming is generated. This semantic definition should allow for a precise comparison of array facilities in similar languages. Moreover, this work is the basis for a formal description of program transformations (e.g. parallelizations) which are crucial in the compilation techniques of functional languages such as Sisal.
Keywords: compiler specification and construction, executable specifications, programming environments, functional language specification.
1 Introduction
We present a formal definition of the dynamic semantics of arrays in the functional language Sisal 2.0 in the structural operational style of Natural Semantics; more precisely, using Typol inference rules within the Centaur system, a generic programming environment. Sisal is a strongly typed, applicative, single assignment language in use on a variety of parallel processors, including conventional multiprocessors, vector machines and data-flow machines. Sisal 1.2
has been in use since 1984; Sisal 2.0 [4], a new language definition, is currently under development. Sisal research and use has demonstrated the effectiveness of the language and its implementation on a wide variety of applications [28, 26] and machines (the Optimizing Sisal Compiler (OSC) and other tools are freely available from Lawrence Livermore National Laboratory).
General functional programming research has shown that considerable power of expressiveness and abstraction is offered by higher-order functions, polymorphism, overloading and type inference. Noting this, the Sisal 2.0 proposal suggests inclusion (in a limited form for the sake of efficiency) of such features. Important issues thus raised are the expressiveness and soundness of these proposals, especially in the context of the powerful array operations characteristic of Sisal; here, we present an array semantics which will provide a foundation for investigation of these issues.
An important design goal for Sisal is efficient programming of large-scale scientific applications. Accordingly, expressive, high-level facilities for the description and manipulation of arrays are a major aspect of the Sisal language. Arrays in Sisal are dynamic and highly flexible; array bounds are determined by the expression which produces the array value. A powerful notation is provided for generating array values, for example by mapping to a subarray, specification of diagonal components, or placement according to a vector subscript. Arbitrary subarrays can be selected, or updated with new values. Arrays interact closely with loop expressions, with facilities to select appropriate values from an array as the range of a loop, and various means of packaging values produced by a loop.
Sisal has, from its earliest versions, given prominence to arrays as a primary data structure. More recently, other general-purpose functional languages have incorporated array facilities capable of implementation with constant-time access to elements.
Most notable of these are Id [20, 9] and Haskell [16] (although Id is not a purely functional language, it is built on a functional core which includes arrays). The design of Haskell arrays was influenced by that of Id, hence there are strong conceptual similarities. Fundamental Haskell array primitives are: a function which constructs an array from a specification of its bounds (possibly multidimensional) and an association list of index/value pairs; a function to select an element at a given index; and a function to return index bounds. Monolithic array definition is provided by array comprehensions (syntactically similar to list comprehensions), which separately define disjoint regions of the array.
Our approach to the study and analysis of arrays is based on a formal Natural Semantics of the Sisal language [2]. Natural semantics descriptions have been extensively used to express dynamic semantics of various languages [22, 3, 19, 1], define various translations [6], describe compiler optimizations [14], and formalize the semantics of data flow graphs [8]. We have drawn on a natural semantic characterization of Lassi [10], a preliminary investigation into the semantics of a small Sisal-like language. We expect to provide a firm foundation for understanding and evaluating parallel language design
issues, aid the elimination of ambiguities in the language, provide a valuable reference for both implementors and programmers, and facilitate comparison of Sisal with other parallel functional languages. From this semantic definition, we intend to formally define program transformations, particularly parallelizations.
The work presented here focuses on the behavior of Sisal programs involving arrays, specifically generation, reference, update, and operations. Array aspects of the development of a Sisal 2.0 compiler are described in [11]. In the next two sections, we introduce the Centaur system with its dedicated formalism to express semantics, and arrays in the Sisal programming language. Section 4 outlines the structure of the complete semantic definition. Section 5 presents the formal definition of arrays. Finally, we illustrate our approach with a view of the generated Sisal interactive environment, and conclude the paper with some directions for future work.
2 The Centaur system and Natural Semantics
The Centaur system is a formal tool to model and implement all aspects of programming languages. From the specifications of the syntax and the semantics of a given language, one can automatically produce a syntactic editor and interactive semantic tools (such as type checkers and interpreters) for this language. In the Centaur system, syntactic specifications describe both a concrete syntax (including language keywords) and an abstract syntax (language structural constructs) for a given language. From these specifications, a parser is generated which transforms a Sisal source into a well-typed abstract syntax tree. For instance, we give in Figure 1 concrete and abstract syntax for array generation.
Concrete Syntax: ::= "array" "[" ":" "]"; array_gen(, , )
Abstract Syntax:
array_gen : TYPE_SPEC SIZE_DESCR_LIST ARRAY_PART_LIST -> EXPRESSION
Figure 1: Concrete and abstract syntax for array generation.
Semantic specifications are in an operational style, using the Natural Semantics approach [17] and its implementation using the Typol formalism [7]. The general idea of a semantic definition in Natural Semantics is to provide axioms and inference rules that characterize semantic behaviors of language constructs. Behaviors are expressed with typed sequents in a logical style close to Natural Deduction [13] and Structural Operational Semantics [21]. Language constructs (abstract syntax operators) appear in a distinguished position in sequents, called the subject. Subjects are used for type-checking, according to the type definition of the sequent called a judgement. A semantic definition is identified with a logic; reasoning with the language is proving theorems within that logic. During the proof process, structural induction on subjects is performed. Axioms and rules are used as a proof-theoretic tool to generate new facts (proof trees) from existing facts in a non-deterministic manner due to the relational style of the formalism. Thus, semantic definitions are concise, mathematically precise and directly executable. A typical Typol rule from the Sisal dynamic semantics is shown below:

    function_definition(Fname |- System : function_def(function_header(_, For_params, _), Exps)) &
    bind_parameters(For_params, Eff_params -> Bind_params) &
    function_execution(System, Bind_params |- Exps : Values)
    ------------------------------------
    Fname, Eff_params |- System : Values ;
Figure 2: A Typol rule.

This rule expresses, reading from the denominator to the numerator, that the semantics of a Sisal program, given a function name Fname to execute and its effective parameters Eff_params, is a list of values Values, provided that the function is defined in the program and that the execution of the function body (a list of expressions Exps evaluated within the binding of formal and effective parameters) results in this list of values. As shown in this rule, sequents can refer to other sets (function_definition, function_execution), which makes it possible to structure the semantic definition into different sets, each of them dealing with related objects.
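To make the reading of this rule concrete, the three premises and the conclusion can be mirrored by a small Python sketch. It is only an illustration under simplifying assumptions: the system is modelled as a dictionary from function names to a pair (formal parameters, body), and a body is a Python callable rather than a list of Sisal expressions. The names run, system and add_mul are hypothetical and do not come from the Typol specification.

```python
# Hypothetical model: a "system" maps function names to (formal parameters, body).
# A body is a Python callable over the bindings, which stands in for the
# evaluation of a list of Sisal expressions.

def function_definition(system, fname):
    """First premise: the function must be defined in the program."""
    if fname not in system:
        raise LookupError(f"no function named {fname!r}")
    return system[fname]

def bind_parameters(formal_params, effective_params):
    """Second premise: pair formal parameters with effective parameters."""
    if len(formal_params) != len(effective_params):
        raise ValueError("arity mismatch")
    return dict(zip(formal_params, effective_params))

def function_execution(system, bindings, body):
    """Third premise: evaluate the body under the bindings, giving a tuple of values."""
    return body(system, bindings)

def run(system, fname, effective_params):
    """Conclusion of the rule: Fname, Eff_params |- System : Values."""
    formal_params, body = function_definition(system, fname)
    bindings = bind_parameters(formal_params, effective_params)
    return function_execution(system, bindings, body)

# A toy function returning two values, as Sisal expressions may do.
system = {"add_mul": (["x", "y"], lambda sys_, b: (b["x"] + b["y"], b["x"] * b["y"]))}
result = run(system, "add_mul", [3, 4])
```

Unlike a Typol rule, which is relational and may be used non-deterministically, this sketch fixes one direction of evaluation.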
3 Sisal and Arrays
Sisal programs are made up of separately compiled modules, which comprise one or more function definitions. Functions have no side-effects; computation is by application of functions to values in expressions, which may return multiple values. Basic data types are boolean, integer, real and
double real, complex and double complex, null and character. Values may be structured using array, stream, record and union types. Streams, which are potentially infinite sequences with non-strict semantics, provide pipelined parallelism. Parallel and sequential loops can be expressed. A parallel loop allows generation of a range of values, with instances of a loop body potentially executing in parallel for each value in the range. A variety of reduction and masking operators are provided to extract values from loop iterations. A sequential loop form expresses loops with data dependencies between iterations.

    program QuickSortExample
    type Info = array of integer ;

    function QuickSort (Data : Info returns Info)
      if size(Data) < 2 then Data                                   % (1)
      else
        let Pivot := Data[liml(Data)] ;                             % (2)
            Low, Mid, High := for E in Data do returns              % (3)
                                array of E when E < Pivot,
                                array of E when E = Pivot,
                                array of E when E > Pivot
                              end for ;
        in QuickSort(Low) || Mid || (QuickSort(High))
        end let
      end if
    end function

    function main ( n: integer returns Info)
      let A1 := array integer [ 1..5 : 53, 14, 92, 10, 65 ] ;       % (4)
          A2 := array integer [ 1..5 : [ 5..1..-1] A1 ] ;           % (5)
          A3 := array integer [ 1..3, -1..1:
                  [ i in 1..3, j in -1..1 {i dot j} ] 1;            % (6)
                  [ otherwise ] 0 ];
          A4 := A3 [ 2, 0 : 100];                                   % (7)
          A5 := A3 [ 1, -1..1..2: -100];                            % (8)
          A6 := A1 + A2 * n;                                        % (9)
      in QuickSort ( A1 || A2 || A5 [1,..] || A6)                   % (10)
      end let
    end function
    end program
Figure 3: A Sisal program.
Sisal attempts to provide a structural, or monolithic, view of arrays, so that operations can be expressed on arrays as a whole, subarrays, or slices of arrays, rather than by element-by-element iteration through an array, a computational structure that often obscures intent by requiring excessive use of conditionals. Syntactic elements which contribute to this expressiveness can be seen in Figure 3, presenting a Quicksort program. Array generation (lines 4, 5, 6) expresses a type specification (the element type), a size specification for each dimension of the possibly multidimensional (line 6) array specifying the lower bound, upper bound and stride (line 5) of that dimension, and an array-part list which specifies both placement and values of array elements. Placement specifications are quite expressive: they can be used to map the values of a subarray into a geometrically defined subset of the generated array (line 5), to describe diagonal components using a dot notation (line 6), or to specify arbitrary placement of values when an array is used as vector subscript. Similar notations are used in selecting from arrays (array selection and array update, lines 7, 8), at the level of individual elements, rectangular subarrays, arbitrary strided diagonals, and selection of elements at positions specified in a vector subscript. Similarly, sections of an array can be selected for update (provided the selections are disjoint) with other array values of corresponding shape. Other features of Sisal arrays illustrated by the Quicksort example are: predefined functions (lines 1, 2), array-of part (line 3), array infix operator (line 9), subarray selection (line 10), and array concatenation (line 10). Further discussion of this example appears in Section 5.8.
4 Structure of the Semantics
The dynamic semantic specification is written upon the abstract syntax we defined for the Sisal language. It assumes that the Sisal program is correctly typed (integration of a type-checker in our Sisal environment is planned). Computing the result of a Sisal program requires knowing which value is associated with each identifier (effective parameters, carried names in loops, etc.). All mappings between names and values are gathered in an environment. Execution of a Sisal program results in a tuple of values, even if that tuple only contains one value. Sisal values can be constants, arrays, streams, unions, and function values (modelled as closures, which are pairs of a λ-expression representing a function body and an environment). In the following, we describe the structure of the environment, and we give a synopsis of the entire semantic definition of the language.
4.1 Semantic Values and Environment
We define an environment as a list of pairs, each composed of a name and a value. Note that this structure is defined as part of the Sisal abstract syntax (see Figure 4), and uses Sisal operators (e.g. name, function_def), as well as new operators (e.g. pair, closure, etc.).

    CONSTANT              VALUE
    DIM_CONTENT, VALUE    ARRAY_ELT

    env         : PAIR *                 -> ENV
    pair        : NAME VALUE             -> PAIR
    closure     : FUNCTION_DEF ENV       -> VALUE
    stream      : STREAM_ELT *           -> VALUE
    array       : DIM DIM_CONTENT        -> VALUE
    dim_content : BOUNDARIES ARRAY_ELTS  -> DIM_CONTENT
    boundaries  : LOWER UPPER            -> BOUNDARIES
    array_elts  : ARRAY_ELT *            -> ARRAY_ELTS
    ...
Figure 4: Abstract Syntax for environment and values.

For environment manipulation, we need to specify: (1) the addition of a new pair composed of a name and a (single) value, and (2) the search for a name in the environment and the retrieval of its associated value. In order to reflect inheritance of surrounding scopes and hiding rules for inner scopes, new pairs are added at the front of the current environment. Because Sisal respects the principle of single assignment, there is no way of changing, in an environment, the value associated with a given name.
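The two environment operations can be sketched in a few lines of Python. This is an illustrative model only, not the Typol definition: the environment is represented as a nested tuple acting as an immutable linked list, None is assumed to denote the empty environment, and the names extend and lookup are hypothetical.

```python
# An environment is modelled as an immutable linked list of (name, value) pairs:
# either None (empty) or a tuple ((name, value), rest_of_environment).

def extend(env, name, value):
    """Add a new pair at the front; the existing environment is shared, never modified."""
    return ((name, value), env)

def lookup(env, name):
    """Return the value of the first (innermost) pair whose name matches."""
    while env is not None:
        (bound_name, value), env = env
        if bound_name == name:
            return value
    raise KeyError(name)

outer = extend(extend(None, "n", 3), "A1", [53, 14, 92, 10, 65])
inner = extend(outer, "n", 10)   # an inner scope hides the outer binding of n
```

Because extension only prepends, the outer environment still yields 3 for n after the inner scope is created, which is how hiding without modification is obtained in this model.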
4.2 A synopsis of the semantics
The behavior of a Sisal program is expressed in terms of evaluation of the body of a main function (starting point for execution). A function body is a list of expressions which are to be evaluated in sequence and return values. Most of the rules are instances of the following judgement:

    SYSTEM, ENV |- EXPRESSION : VALUES ;
which can be paraphrased as: an expression evaluates to a list of values, given a Sisal system and an environment. Only for and let constructs can add new bindings to the environment, through their declaration part; management of the environment reflects inheritance of surrounding scopes, and hiding rules when a redefinition occurs.
With the Typol formalism, it is possible to structure the semantic definition in separate modules, each of which deals with similar concerns. Therefore, we have a variety of Typol modules specifying the dynamic semantics of all constructs of the Sisal language. These modules can be summarized as follows:
function_execution: starting point of the semantics, this module evaluates the root function given its name and its arguments;
expressions: this module contains the specification of the core of the language (expressions); it contains sets for dealing with lists of expressions, identifiers, constants, evaluation of operations on scalar types, and calls to other specific modules for evaluation of function invocation, stream and array operations, for or let constructs;
loops: specification of the loop constructs, distribution and iterative control, and packaging of return values;
streams: specifies stream generation and reference;
arrays: specifies array generation, reference and update;
system_definition: specification of information to be retrieved from the program; this requires the program to be a parameter of most rules.
5 Semantic Definition for Arrays
In this section, we give an overview of the strategy adopted in developing the array semantics, and we highlight the challenges in doing so over and above more conventional array descriptions. Then, we present the principles of the semantic definition for arrays, give intuitive meanings of the main modules, and on several occasions, we give judgements, rules and axioms.
5.1 An Abstract Syntax for Sisal Arrays
The whole abstract syntax of the Sisal language contains 109 sorts and 172 operators. We detail in Figure 5 operator and sort definitions specifically related to array constructs. Note that, in some cases, the parsing process can be ambiguous; analyzing a source fragment such as a[b], we cannot know whether it is a reference to a stream value or an array value. In this case, we chose to define temporary operators like stream_or_array_ref. It is then the responsibility of the semantic definition to resolve ambiguities, according to naming and visibility rules.
    SELECT                               PLACEMENT
    EXPRESSION, TRIPLET, NAME_TRIPLET    SELECT_PART

    array_gen           : TYPE_SPEC SIZE_DESCR_LIST ARRAY_PART_LIST -> EXPRESSION
    array_ref           : EXPRESSION SELECT                         -> EXPRESSION
    stream_or_array_ref : EXPRESSION SELECT                         -> EXPRESSION
    array_update        : EXPRESSION UPDT_PART_LIST                 -> EXPRESSION
    size_descr_list     : NAMED_TRIPLET *                           -> SIZE_DESCR_LIST
    named_triplet       : NAME TRIPLET                              -> NAMED_TRIPLET
    triplet             : EXPRESSION EXPRESSION EXPRESSION          -> TRIPLET
    array_part_list     : ARRAY_PART *                              -> ARRAY_PART_LIST
    array_part          : PLACEMENT EXPRESSION_LIST                 -> ARRAY_PART
    no_placement        :                                           -> PLACEMENT
    otherwise           :                                           -> PLACEMENT
    select              : SELECT_PART_LIST DIAG_SPEC_LIST           -> SELECT
    select_part_list    : SELECT_PART +                             -> SELECT_PART_LIST
    diag_spec_list      : DIAG_SPEC *                               -> DIAG_SPEC_LIST
    diag_spec           : NAME NAME_LIST                            -> DIAG_SPEC
    updt_part_list      : UPDT_PART +                               -> UPDT_PART_LIST
    updt_part           : SELECT EXPRESSION_LIST                    -> UPDT_PART
    binary              : EXPRESSION REL_OP EXPRESSION              -> EXPRESSION
    name                : identifier                                -> EXPRESSION
    ...
Figure 5: Abstract Syntax for array constructs.
5.2 Comparison of Array Designs
The design of Sisal 1.2 arrays is based on the one-dimensional array model. Under its semantics, arrays are constructed by the concatenation of the inner dimension arrays to build a single one-dimensional array. This model enables optimizations such as vectorization to be easily exploited because a multi-dimensional array is represented as a vector containing vectors. Thus the construction of arrays is uniform. The disadvantages of this array design are (1) arrays must be stored in contiguous memory and (2) the inner vectors must always have the same size. For some applications, this is not flexible enough. To fully utilize the advantages of one-dimensional arrays and avoid the disadvantages, the design of Sisal 2.0 arrays should allow both one-dimensional arrays and real multi-dimensional semantics. In the semantics for multi-dimensional arrays, the subarrays inside arrays can have different sizes. For example, the following array can be handled in Sisal 2.0 array semantics:

    array real [1..2,1..2:
        [2,..] array real [1.0, 2.0];
        [1,..] array real [3.0, 4.0, 5.0]; ]
The result for the above Sisal 2.0 array in our Centaur environment is as follows:

    [ [1.0 2.0] [3.0 4.0 5.0] ]
This array cannot be represented in a Sisal 1.2 program. The design of Sisal 2.0 arrays contains many constructs which are convenient for users. These conveniences in turn pose challenges for the semantic specification of Sisal 2.0.
5.3 Design Challenges
The design of Sisal 2.0 arrays is intended to make the task of programming easier by allowing versatile syntax utilities. More convenience for programmers means more challenges in semantic specifications. Thus, using Centaur, we have to write careful Typol rules to handle these versatile array semantics. We show several array constructs to illustrate the challenges:

    B := array integer [7..8:30,40];
    C := array integer [7..8:50,60];
    A := array real [2..3, ..8:[2,..] B; [3,..] C];

In this case, our semantics must compute the size of B and C, then place their elements in the appropriate positions. Also, the number 7 omitted in the index descriptor of array A must be detected by the semantic rules.

    A := array real [2..3,7..8:[2,..] B; [3,..] 0];
Our semantics expressed in Typol rules must recognize the number '0' and put the appropriate number (here specified as 3) of zeroes in the array.

    A3 := array real [1..2,1..2,1..2,1..2:
        [1,1,1,1] 3.0; [1,1,1,2] 4.0; [1,1,2,1] 5.0; [1,1,2,2] 6.0;
        [1,2,1,2] 12.12; [2,2,1,2] 22.12; [2,2,2,2] 7.0;
        [otherwise] 3.1416 ];

Our Typol rules must avoid overwriting the positions which are filled with the available values when inserting the value (3.1416) specified in "otherwise". In our specification, the Typol rules fill the array construct with the otherwise value first, then overwrite the positions which are specified with values (e.g. 3.0) in the array.
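The fill-then-overwrite strategy described for the otherwise value can be sketched in Python as follows. This is a minimal illustration, assuming an array is represented as a dictionary from index tuples to values; the function name generate and this representation are hypothetical and are not the data structures of the Typol specification.

```python
from itertools import product

def generate(bounds, placed, otherwise):
    """Fill every index with the otherwise value, then overwrite the placed positions.

    bounds    : list of (lower, upper) pairs, one per dimension, both inclusive
    placed    : dict mapping index tuples to explicitly specified values
    otherwise : default value for all remaining positions
    """
    ranges = [range(lower, upper + 1) for lower, upper in bounds]
    array = {index: otherwise for index in product(*ranges)}   # step 1: default fill
    for index, value in placed.items():                        # step 2: overwrite
        if index not in array:
            raise IndexError(f"{index} is outside the declared bounds")
        array[index] = value
    return array

# The four-dimensional example A3 above.
a3 = generate(
    [(1, 2)] * 4,
    {(1, 1, 1, 1): 3.0, (1, 1, 1, 2): 4.0, (1, 1, 2, 1): 5.0, (1, 1, 2, 2): 6.0,
     (1, 2, 1, 2): 12.12, (2, 2, 1, 2): 22.12, (2, 2, 2, 2): 7.0},
    otherwise=3.1416,
)
```

Filling first and overwriting afterwards means that no bookkeeping of already placed positions is needed when the default is applied.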
    let A15 := array integer[1,4,2];
    in array real [1..3,1..4:
         [1,A15] 11.1, 14.4, 12.2;
         [2,A15] 21.1, 24.4, 22.2;
         [otherwise] 99.9]
    end let

In this example, array indices can be specified by a predefined array, e.g. A15. Our Typol rules must recognize that the index is an array name and then fetch the elements from that array (A15) to feed into the index-descriptor to serve as position indices.

    A2 := array real [2..3,7..8:[2,..] array real [1.0, 2.0];
                                [3,..] array real [3.0, 4.0]];

In this example, an inner array is created inside the array A2. Typol rules must identify that this is an array creation instead of one single element, so that the inner array elements will be inserted into the correct positions (there may be more than one position) in the array A2.
A one-dimensional array whose elements are one-dimensional arrays is never conformable with a two-dimensional array (p. 30 of the Sisal 2.0 Reference Manual [4]). We therefore have to distinguish the two in our description of Sisal 2.0 arrays. This makes it possible for a user to specify an array with subarrays of different sizes. For example, the following

    array integer [1..2,7..8:
        [1,..] array integer [1, 2, 3];
        [2,..] array integer [4, 5]; ]
is a two-dimensional array but not a vector-on-vector array. The execution result in our Centaur environment for the above example is:

    [ [1 2 3] [4 5] ]
5.4 Generation
The array generation semantics (Figure 3, lines 4, 5, 6) is invoked in the following rule from the expression program:

    array_generation(System, Env |- array_gen(Type, Sizedl, Arraypl) -> Values)
    ---------------------------------------------------------
    System, Env |- array_gen(Type, Sizedl, Arraypl) : Values;
Then, the semantics of array generation can be written as follows:

    dimension(System, Env |- Sizedl : Dim, Bounds) &
    build_array(System, Env, Type, Dim, Bounds |- Arraypl : Def_array) &
    fill_array(System, Env, Def_array |- Arraypl : Array)
    ----------------------------------------------------------------
    System, Env |- array_gen(Type, Sizedl, Arraypl) : values[Array] ;
The set dimension specifies the dimension and the bounds from the size descriptor. Note that, in the absence of a size descriptor (optional for a one-dimensional array), the set dimension returns 1 for the dimension, 1 for the lower bound and a free variable for the upper bound. This free variable will automatically be instantiated in the set build_array, via unification from the number of values defined in the list of array_parts. Then, we build the array itself with the set build_array, specified as:

    otherwise(System, Env, Type |- Arraypl : Value) &
    build_def(System, Env, Dim, Bounds, Value |- Arraypl: Def_array)
    -----------------------------------------------------
    System, Env, Type, Dim, Bounds |- Arraypl: Def_array;
The first premise (otherwise) returns a default value from the otherwise construct. Note that this default value might be a free variable if there is no otherwise specifier. Then, we actually build the array structure (Def_array) with this default value and bounds. At this point, the whole array value is built with dimension, lower and upper bounds, and all the elements set with a default value. Finally, the set fill_array examines each array_part in sequence: from the Placement son, a list of indices Indices is generated; from the Expl son, the evaluation process generates a list of values Values. Then the set update, called on Def_array with these indices and values, returns the final array. Note that, when placement is the otherwise specifier, the array is unchanged. Moreover, in the absence of placement (no_placement operator), the set select returns an empty list of indices, which means that values are filled in sequence. The semantics of fill_array is expressed as follows:

    set fill_array is
      judgement SYSTEM, ENV, VALUE |- ARRAY_PART_LIST : VALUE ;

      System, Env, Array |- array_part_list[] : Array ;

      System, Env, Array |- Ap : Array' &
      System, Env, Array' |- Apl : Array''
      -------------------------------------------------------
      System, Env, Array |- array_part_list[Ap.Apl]: Array'' ;

      System, Env, Array |- array_part(otherwise(), Expl): Array ;

      select(System, Env, Array |- Placement : Indices) &
      eval_expression_list(System, Env |- Expl : Values) &
      update(Indices, Values |- Array : Array')
      ----------------------------------------------------------
      System, Env, Array |- array_part(Placement, Expl): Array' ;
      provided diff(Placement, otherwise());
    end fill_array;
Possible position specifiers include expressions, triplets, named triplets, and optional diagonal components which indicate respectively components, subscripts, and vector subscripts. Because Sisal requires each element of an array to be set, and since bounds can be defined dynamically, this verification (as well as checking intersection of size descriptors) cannot be done statically and has to be specified in the dynamic semantics.
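The need for a dynamic check that every element is set can be illustrated with a small Python sketch, restricted to one-dimensional arrays for brevity. It is an assumption-laden model rather than the Typol rules: the sentinel UNSET plays the role of the free variable used when no otherwise part is given, each array part is assumed to be already evaluated to a pair (indices, values), and the function name fill_array merely echoes the Typol set of the same name.

```python
UNSET = object()   # stands for the free variable used when there is no otherwise part

def fill_array(bounds, parts, default=UNSET):
    """Process array parts in sequence, then verify that every element was set.

    bounds : (lower, upper), both inclusive
    parts  : list of (indices, values) pairs, one per array part
    """
    lower, upper = bounds
    array = {i: default for i in range(lower, upper + 1)}
    for indices, values in parts:
        if len(indices) != len(values):
            raise ValueError("placement and value list differ in length")
        for index, value in zip(indices, values):
            if index not in array:
                raise IndexError(f"index {index} is outside {lower}..{upper}")
            array[index] = value
    # The check below can only happen at run time, since bounds are dynamic.
    missing = [i for i, v in array.items() if v is UNSET]
    if missing:
        raise ValueError(f"elements not set: {missing}")
    return [array[i] for i in range(lower, upper + 1)]

complete = fill_array((1, 5), [([1, 2, 3], [10, 20, 30])], default=0)
```

Calling fill_array((1, 5), [([1, 2, 3], [10, 20, 30])]) without a default raises ValueError in this sketch, because positions 4 and 5 are never set.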
5.5 Reference
Array reference semantics (Figure 3, line 10) is invoked in the following rules from the expression program. The first rule is used to resolve the above-mentioned ambiguity between stream and array references and applies only if the referenced data structure is an array value (an alternative rule handles the stream case); the second rule deals with unambiguous cases:

    eval_expression(System, Env |- Exp : values[array(Dim, Dim_content)]) &
    array_reference(System, Env, array(Dim, Dim_content) |- Select -> Values)
    ---------------------------------------------------------
    System, Env |- stream_or_array_ref(Exp, Select) : Values;

    eval_expression(System, Env |- Exp : values[array(Dim, Dim_content)]) &
    array_reference(System, Env, array(Dim, Dim_content) |- Select -> Values)
    -----------------------------------------------
    System, Env |- array_ref(Exp, Select) : Values;
The set select returns a list of indices from which values can be extracted from an array.
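As an illustration of how a selector can be turned into a list of indices, the following Python sketch expands a triplet into explicit indices and extracts the corresponding elements of a one-dimensional array. It assumes that a triplet reads as lower..upper..stride with an inclusive upper bound, which matches the forms 5..1..-1 and -1..1..2 of Figure 3 but is an interpretation; the function names expand_triplet and select_values are hypothetical.

```python
def expand_triplet(lower, upper, stride=1):
    """Indices lower, lower+stride, ... up to and including upper when reached."""
    if stride == 0:
        raise ValueError("stride must be non-zero")
    stop = upper + (1 if stride > 0 else -1)   # make the upper bound inclusive
    return list(range(lower, stop, stride))

def select_values(elements, lower_bound, indices):
    """Extract the elements at the given indices of a one-dimensional array."""
    values = []
    for index in indices:
        offset = index - lower_bound
        if not 0 <= offset < len(elements):
            raise IndexError(f"index {index} is out of bounds")
        values.append(elements[offset])
    return values

a1 = [53, 14, 92, 10, 65]   # A1 from Figure 3, with lower bound 1
reversed_a1 = select_values(a1, 1, expand_triplet(5, 1, -1))
```

Separating index generation from extraction mirrors the description above, in which the same list of indices serves both reference and update.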
5.6 Update
In Sisal, the facilities for specifying a section of an array to be updated are almost as powerful as those for generation and subarray selection (Figure 3, lines 7, 8). The result of an array update is a new array differing in the specified positions. Array update semantics is invoked in the expression program as follows:

    eval_expression(System, Env |- Exp : values[array(Dim, Dim_content)]) &
    array_update(System, Env, array(Dim, Dim_content) |- Updtpl -> Array')
    ---------------------------------------------------------
    System, Env |- array_update(Exp, Updtpl) : values[Array'];
The array resulting from the evaluation of the expression has to be updated, with respect to the given selectors. Then the set array_update examines in sequence each update_part, and builds step by step a new array value. Because of the single assignment principle, the original value in the environment is unchanged.

    System, Env, Array |- Updtp : Array'
    -------------------------------------------------------
    System, Env, Array |- update_part_list[Updtp] : Array' ;

    System, Env, Array |- Updtp : Array' &
    System, Env, Array' |- Updtpl : Array''
    ---------------------------------------------------------------
    System, Env, Array |- update_part_list[Updtp.Updtpl] : Array'' ;
For each update_part, the modifications are achieved on the new array, according to the selectors and the values coming from the evaluation of the expression_list. Once again, the sets select and update are used.

    select(System, Env, Array |- Select : Indices) &
    eval_expression_list(System, Env |- Expl : Values) &
    update(Indices, Values |- Array : Array')
    ------------------------------------------------------
    System, Env, Array |- updt_part(Select, Expl) : Array';
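The step-by-step construction of a new array from a list of update parts can be sketched in Python as a fold. The sketch rests on assumptions: arrays are dictionaries from index tuples to values, each update part is already reduced to a pair (indices, values) with one value per index, and the array a3 below is an all-zero stand-in rather than the exact A3 of Figure 3.

```python
from functools import reduce

def update(array, indices, values):
    """Return a new array differing from `array` only at the given indices."""
    new_array = dict(array)            # copy: the original value is never modified
    for index, value in zip(indices, values):
        if index not in new_array:
            raise IndexError(f"{index} is not a position of the array")
        new_array[index] = value
    return new_array

def array_update(array, update_parts):
    """Examine each (indices, values) update part in sequence."""
    return reduce(lambda acc, part: update(acc, *part), update_parts, array)

# Hypothetical 1..3 by -1..1 array of zeros, shaped like A3 in Figure 3.
a3 = {(i, j): 0 for i in range(1, 4) for j in range(-1, 2)}
a4 = array_update(a3, [([(2, 0)], [100])])
a5 = array_update(a3, [([(1, -1), (1, 1)], [-100, -100])])
```

After these calls a3 still holds 0 at every position, which reflects the single assignment principle: an update yields a new value and leaves the binding in the environment untouched.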
5.7 Arrays in loops
The for expression is the fundamental control mechanism in Sisal, and can be used to express either parallel distribution of loop bodies, or functional iteration. Clearly, the relationship between these expressions and array structures is of paramount importance in Sisal programming. A for expression with distributed control establishes notionally separate instances of its body for each value in a sequence, where the sequence can be specified as an arithmetic progression, or by scattering values from a stream or array. The indices corresponding to the scattered elements (At-part) can be named and used in the loop body. Each execution of the body of a for expression can be seen as contributing to sequences of values, corresponding to the identifiers used in the loop body. Such a sequence can have its values combined (using array of) to form an array as a result of a for expression (Figure 3, line 3). The semantics of the array of part in a return part can be described in two steps.
In the first step, each execution of the body evaluates the bottom part (which may comprise several return parts) and builds sequences of values for the next step according to possible masking filters (eval_filter); possible position specifiers coming from the optional at-part are also evaluated (eval_at_list) and gathered for later packaging (using the || concatenation operator):

    eval_filter(System, Env |- Filter : true()) &
    eval_expression(System, Env |- Exp : Values')
    ---------------------------------------------
    System, Env, Values, _ |- return_array(Sizedl, Exp, Filter) : Values || Values', _;

    eval_filter(System, Env |- Filter : false())
    -------------------------------------------
    System, Env, Values, _ |- return_array(Sizedl, Exp, Filter) : Values, _;

    eval_at_list(System, Env |- Atpart : Positions') &
    eval_expression(System, Env |- Exp : Values')
    ---------------------------------
    System, Env, Values, Positions |- return_array(Sizedl, Exp, Atpart) :
        Values || Values', Positions || Positions';
The resulting array value can be defined using any of the mechanisms discussed above, so the scattering mechanism is quite expressive and powerful. In the second phase, after the last execution of the body, the packaging of these values takes place and results in the new array returned by the loop.
    dimension(System, Env |- Sizedl : Dim, Bounds) &
    build_from_values(System, Env, Dim, Bounds, _ |- Values : Array)
    ------------------------------------------------------------------
    System, Env, Values, _ |- return_array(Sizedl, Exp, Filter) : Array;

    dimension(System, Env |- Sizedl : Dim, Bounds) &
    build_from_values(System, Env, Dim, Bounds, Positions |- Values : Array)
    ------------------------------------------------------------------------
    System, Env, Values, Positions |- return_array(Sizedl, Exp, Atpart) : Array;
Rules dealing with filters and rules dealing with position specifiers are mutually exclusive, since a return array part may not have both a filter and a position specifier. The choice between these two rules results from a matching on the third son of return_array. It can be either a filter (when_expression or unless_expression) or an at_list. The set build_from_values builds an array from a list of values and a list of position specifiers (when provided).
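The two phases of an array of return part with a masking filter can be sketched in Python, using the QuickSort function of Figure 3 as the example. This is a sequential illustration under stated assumptions: the loop body instances are run one after another rather than in parallel, arrays are plain Python lists, position specifiers (at-part) are not modelled, and the helper name for_returns_array_of is hypothetical.

```python
def for_returns_array_of(source, body, keep):
    """Model of `for E in source returns array of body(E) when keep(E)`."""
    # Phase 1: each body instance contributes its value if the filter holds,
    # which corresponds to Values || Values' in the rules above.
    collected = []
    for element in source:
        if keep(element):
            collected = collected + [body(element)]
    # Phase 2: after the last instance, package the gathered values as an array.
    return list(collected)

def quicksort(data):
    """Python transcription of QuickSort from Figure 3."""
    if len(data) < 2:
        return data
    pivot = data[0]                    # Data[liml(Data)]: element at the lower bound
    low = for_returns_array_of(data, lambda e: e, lambda e: e < pivot)
    mid = for_returns_array_of(data, lambda e: e, lambda e: e == pivot)
    high = for_returns_array_of(data, lambda e: e, lambda e: e > pivot)
    return quicksort(low) + mid + quicksort(high)   # || is array concatenation

sorted_a1 = quicksort([53, 14, 92, 10, 65])
```

Since each filtered contribution is independent of the others, the first phase is the part that a parallel implementation could distribute across body instances.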
5.8 Discussion

Since the formal specification we defined is executable, we are able to generate an interactive programming environment for Sisal and array manipulations. This environment includes structured editing, interpretation, and animation tools such as step-by-step execution of the semantics (which makes it possible to see, in a given state, which inference rules currently apply). We give a flavor of our Sisal environment in Figure 6, where the Quicksort program (Figure 3) is being evaluated. The Examine window shows the environment during the elaboration of the declaration part of the let (with n, A1, A2).

We consider this example in more detail, in order to illustrate how an interpretable semantic definition can be useful to both a Sisal programmer and implementor. The function main defines several array values; the result it returns comes from the function QuickSort invoked with an argument formed by concatenating several array values. The definitions A1 to A6 illustrate (as mentioned in Section 2) different forms of array generation and selection. The function QuickSort exhibits parallelism in the parallel loop; an instance of the loop body is generated for each value in the array passed as argument, and three results are formed by filtering appropriate values and gathering them into each result. Section 5.7 explains how loop semantics complements array semantics.

Figure 6: The Sisal environment.

The interpreter generated by Centaur from the Typol semantics allows us to examine highlighted values and expressions and the associated semantic rule. The window Examine in Figure 6 shows the environment bindings appropriate to the evaluation of the highlighted expression. The window Semantics shows the Typol rule for the evaluation of the expression associated with the name A3. It is possible, using the Edit menu, to set control points (break, skip, hide) depending on what has been selected in the window Debug Options. The interpreter allows us to step through execution, showing the current semantic rule at each step; such facilities permit convenient development of both Sisal programs and semantic rules.

In the case of arrays in particular, the semantics can be readily augmented to provide more detailed visualizations of array structure, for example, a navigator for large arrays. The principles of such visualization in the context of Centaur have been explored by Lee and Wendelborn in [18], whereby the semantic rules are augmented with trace event generators to produce trace files suitable for visualization by standard visualization software such as ParaGraph [15].
6 Conclusions and Future Work

As pointed out above, our semantic definition in Typol provides, through Centaur, an interactive programming environment. Another interactive environment for Sisal is proposed in [23, 24]. However, both the means and the objectives are different from the work presented here. Regarding the method, their system is not based on a formal description of Sisal, but rather on classical programming techniques; this is explained by the fact that the intended goal does not include formal transformations or proofs, but instead focuses on program scheduling and assessment.

Much of the Sisal 2.0 definition is concerned with powerful array and sub-array descriptions. Id and Haskell arrays differ from those of Sisal in that they are non-strict, and can be defined recursively (Sisal requires that arrays be strict, and hence forbids recursive definition). This provides very concise definitions of array-based computations in some cases, but with an implementation penalty in that evaluation order must be determined at run-time (as dependencies evolve); insistence on strict arrays allows more extensive compile-time analysis (current highly efficient implementations of Sisal 1.2 owe much of their success to such analysis). Some work is underway to combine specification of evaluation order in a recursively-defined array [12]; it is hoped that such efforts will ultimately allow both recursive definition and compile-time optimization in many useful cases. An operational semantics based on reduction rules was developed for Id Nouveau, a slightly earlier version of Id.

The work reported here lays the foundation for all the analysis and optimizations required for an efficient implementation of these features. Future work will include optimization tools for Sisal: we aim at specifying the IF1 [25] and IF2 [27] intermediate formats, and the transformations from one to another, again using Centaur and the Typol formalism.
This comprises the translations from Sisal into the intermediate formats, the transformations themselves, and the proofs of their validity. Among possible optimizations on arrays, we wish to address the compile-time optimizations of OSC (specifically the preallocation phase IF2MEM and the update-in-place phase IF2UP), which minimize the run-time overhead due to frequent dynamic memory allocation for arrays and to array copying (caused by the referential transparency principle). There is also a need to experiment with program transformations which are dedicated to specific parallel architectures. Thanks to these formal specifications and proved transformations, we intend to permit the definition and programming of validated compilers for Sisal.
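To make the copying overhead mentioned above concrete, here is a minimal Python sketch contrasting a referentially transparent array update with an in-place one. It does not describe how the OSC phases work; the function names update_copy and update_in_place are hypothetical and only illustrate the cost that such optimizations aim to remove.

```python
def update_copy(array, index, value):
    """Referentially transparent update: the input array is left untouched."""
    result = list(array)      # full copy, linear in the array size
    result[index] = value
    return result


def update_in_place(array, index, value):
    """Same result, but the storage is reused.

    This is only safe when nothing else reads the old contents of `array`
    afterwards, which is the property a compiler would have to establish.
    """
    array[index] = value
    return array


a = [1, 2, 3]
b = update_copy(a, 0, 9)        # a is still [1, 2, 3]; b is a fresh list
c = update_in_place(b, 1, 7)    # b is dead after this point, so reuse is harmless
```

In this sketch the programmer decides by hand that b is no longer needed; the point of a compile-time analysis is to make that decision automatically while preserving the single-assignment semantics seen by the program.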
References

[1] Attali I., Caromel D., Oudshoorn M. "A Formal Definition of the Dynamic Semantics of the Eiffel Language", Sixteenth Australian Computer Science Conference (ACSC-16), Brisbane, Australia, 1993, also Research Report I3S 92.52.
[2] Attali I., Caromel D. and Wendelborn A. "A Formal Semantics and an Interactive Environment for Sisal", pp 231-258, in A. Zaky, editor, Tools and Environments for Parallel and Distributed Systems, to appear, Kluwer Academic Publishers, 1995.
[3] Bertot Y. "Implementation of an Interpreter for a Parallel Language in Centaur", 3rd European Symposium on Programming, Copenhagen, Denmark, 1990, LNCS 432.
[4] Böhm A. P. W., Cann D.C., Feo J.T., Oldehoeft R.R. "Sisal Reference Manual (language version 2.0)", Draft Report, 1992.
[5] Borras P., Clément D., Despeyroux T., Incerpi J., Kahn G., Lang B., Pascual V. "CENTAUR: the system", in Proc. of SIGSOFT'88, Third Annual Symposium on Software Development Environments, Boston, 1988.
[6] Clément D., Despeyroux J., Despeyroux T., Kahn G. "A simple applicative language: Mini-ML", Symp. on Functional Programming Languages and Computer Architecture, 1986.
[7] Despeyroux T. "Typol: a formalism to implement Natural Semantics", INRIA research report 94, 1988.
[8] Dion B., Angeli L., Bravo Lastra A. "PARAGRAPH: an interactive environment for parallelizing FORTRAN programs", INRIA research report 1920, 1993.
[9] Ekanadham, K. "A Perspective on Id", in "Parallel Functional Languages and Compilers", B.K. Szymanski (ed), ACM Press, 1991.
[10] Errington L. "Lassi Semantics", Sisal project internal report, Department of Computer Science, University of Adelaide, August 1991.
[11] Fitzgerald, S. M. "Copy Elimination for True Multidimensional Arrays in SISAL 2.0", in Proc. of the Third SISAL Users and Developers Conference, San Diego, October 1993.
[12] Gao, G.R., Yates, R.K., Dennis, J.B., and Mullin, L.R. "An Efficient Monolithic Array Constructor", ACAPS Technical Memo 19, McGill University School of Computer Science, 1990.
[13] Gentzen G. "Investigation into Logical Deduction", Thesis 1935, reprinted in The collected papers of Gerhard Gentzen, E. Szabo, North-Holland, Amsterdam, 1969.
[14] Gopinath K. "Copy Elimination in Single Assignment Languages", PhD Thesis, Stanford University, 1989 (Technical Report CSL-TR-89-384).
[15] Heath M.T. and Etheridge J.A. "Visualizing the Performance of Parallel Programs", IEEE Software, vol. 8, num. 5, pp 29-40, 1991.
[16] Hudak P., Peyton Jones S. L., and Wadler P. L. (editors). "Report on the programming language Haskell, a non-strict purely functional language (version 1.2)", SIGPLAN Notices, 27(5), 1992.
[17] Kahn G. "Natural Semantics", in Proc. of Symp on Theoretical Aspects of Computer Science, Passau, Germany, LNCS 247, 1987.
[18] Lee, K.P. "IVIS: An Interpreter and a Visualizer for a Parallel Functional Programming Language", Honours Report, Department of Computer Science, University of Adelaide, 1993.
[19] Milner R., Tofte M., Harper R. "The Definition of Standard ML", MIT Press, 1990.
[20] Nikhil, R.S. "Id Language Reference Manual", TR CSG Memo 284-2, Lab. for Computer Science, MIT, 1991.
[21] Plotkin G.D. "A Structural Approach to Operational Semantics", Report DAIMI FN-19, Computer Science Department, Aarhus University, Aarhus, Denmark, 1981.
[22] Prasad S., Giacalone A., Mishra P. "Operational and Algebraic Semantics of Facile: A Symmetric Integration of Concurrent and Functional Programming", in Proc. of 17th International Colloquium on Automata, Languages and Programming, Warwick University, England, 1990. Lecture Notes in Computer Science, Volume 443, Springer-Verlag, pp 765-780.
[23] Shirazi B., Chen H., Kavi K., Marquis J., Hurson A.R. "A Software Development Tool for Parallel Program Scheduling and Assessment", in Proc. of International Parallel Processing Symposium, 1994.
[24] Shirazi B., Chen H., Yeh J. S. "A visualization tool for display and interpretation of Sisal Programs", in Proc. of International Conference on Parallel and Distributed Computing Systems, 1994.
[25] Skedzielewski S., Glauert J. "IF1 - An intermediate form for applicative languages", Manual M-170, Lawrence Livermore National Laboratory, Livermore, Calif., 1985.
[26] Sohn A. "A parallel implementation of the Traveling Salesman problem on a Sequent Symmetry multiprocessor", in Proc. of the IFIP WG 10.3 Working Conference on Architectures and Compilation Techniques for Fine and Medium Grain Parallelism, Jan 1993.
[27] Welcome M.L., Skedzielewski S., Yates R.K., Ranelletti J. E. "An applicative language intermediate form with explicit memory management", Manual M-195, Lawrence Livermore National Laboratory, Livermore, Calif., 1986.
[28] Wendelborn A. L., Gardsen H., Irlam G., McDonald I., Smith G. "The development and efficient execution of Sisal programs", in Jean-Luc Gaudiot and Lubomir Bic, editors, Advanced Topics in Dataflow Computing, Prentice Hall, 1991.