Contribution to Semantics of a Data-Parallel Logic Programming Language

Arnaud Lallouet (1) and Yann Le Guyadec (2)

(1) Université d'Orléans - LIFO, 4, Rue Léonard de Vinci - BP 6759, F-45067 Orléans Cedex 2 - France.
e-mail: [email protected]
(2) LAMSI, 8, Rue Montaigne - BP 1104, F-56014 Vannes - France.
e-mail: [email protected]
December 8, 1995
Abstract. We propose an alternative approach to the usual ways of introducing parallelism in logic programming. Instead of detecting the intrinsic parallelism by an automatic and complex data-flow analysis, or upgrading standard logic languages with explicit concurrent control structures leading to task-oriented languages, we tightly integrate the concepts of the data-parallel programming model and of logic programming in a kernel language, called DP-Log. It offers a simple, centralized and synchronous vision to the programmer. We give this language a declarative semantics and a distributed asynchronous operational semantics. The equivalence theorem relating these semantics establishes the soundness of the implementation. The expressiveness of the language is illustrated by examples.
Keywords: Logic programming | Data-parallel languages | Design of programming languages | Semantics | MIMD architectures
Introduction

The introduction of parallelism in programming languages makes it possible to extend the expressiveness of scalar sequential languages in temporal and/or spatial directions. Temporal extensions lead to control-parallel (or task-oriented) languages, where parallelism is expressed by the addition of a parallel control structure. In contrast, spatial extensions concern data types, which are promoted from scalars to vectors. Basic objects then have parallel data types, and the model is called data-parallel. Across programming paradigms, a large variety of programming languages has been proposed to handle parallelism. Moreover, these languages have benefited from formal studies leading to the design of programming environments. Such environments consist of tools such as interpreters, compilers and debuggers, whose goal is to fill the gap between the programming model and a
particular execution model, depending on the target computer. Another part of the programming environment consists of formal tools, whose goal is to help the programmer reach a better understanding of the programming model. For instance, a proof system may offer a privileged simple vision, by relying on properties such as determinism or compositionality. Although data-parallel languages [9], [20], [5], [28], [10] can only express a particular subset of problems, they can exploit inherent parallelism by applying the same operation to a set of data distributed on the target computer. In contrast to control-parallel languages, where the programmer has to manage multiple interacting control flows, data-parallel languages are based on a synchronous and centralized semantics, providing a privileged simple vision to the programmer. Applications like numerical simulation, dynamic discrete system modeling or high speed computing handle complex data structures and use algorithms that can be easily defined and efficiently manipulated by data-parallel languages. Less attention has been paid to the design of data-parallel logic programming languages (see figure 1) and their semantics. We place ourselves in this framework by proposing a tight integration of the concepts of the data-parallel programming model and of logic programming in a kernel language, called DP-Log, briefly sketched in [19].

             Sequential   Control-parallel     Data-parallel
Imperative   Pascal, C    ADA, CSP, Occam      C*, HPF
Functional   Lisp, ML     Parallation-Lisp     CM-Lisp, DPML
Logic        Prolog       Parlog, Gödel, CC    ?
Fig. 1. Programming paradigms vs introduction of parallelism

This kernel language handles arrays with parallel access, and provides an atomic general communication primitive which makes it possible to obtain remote information, namely the result of a relocated proof. From the programmer's point of view, it is based on a synchronous semantics, where processing elements that are able to evaluate a goal may do so at the same time. This synchronous semantics yields deterministic executions. Nevertheless, the evaluation process is based on an asynchronous execution model, adapted to parallel MIMD architectures. In the first part of this paper, we give an informal presentation of the DP-Log language and discuss related work. In the second part, we define a declarative semantics for our language as the classical least fixed point of a suitable operator. The third part of the paper is devoted to building a proof for a given query in DP-Log. This section is also motivated by obtaining efficient executions, formalized by a distributed asynchronous operational semantics. Finally, we present an equivalence theorem for these semantics. This establishes the soundness of the implementation proposed by the operational semantics.
1 Informal presentation of DP-Log

In this section we present the concepts of the data-parallel programming model and of logic programming, and we show how they have been integrated into a kernel language. The data-parallel programming model [3] provides two complementary abstract visions to the programmer. The synchronous and centralized vision, also called macroscopic (processor of arrays), offers an attractive model which is at the origin of the success of data-parallel languages. The asynchronous and distributed vision, also called microscopic (array of processors), is dedicated to the execution model. The first vision is suited to an informal presentation of a data-parallel language. The evaluation process of a DP-Log program will be expressed by translating the macroscopic vision into a microscopic execution. A DP-Log program is a set of definite Horn clauses (i.e. without negation) of the form h ← b1, ..., bn where h, b1, ..., bn are atoms. Instead of handling scalar (mono-dimensional) objects, a DP-Log program may handle vectorial (multidimensional) objects. This workspace is described by particular objects called indexes. The most usual index domain is a finite subset of ℕ^n (n-dimensional arrays). The topology of the index domain is also called the geometry, as in most data-parallel languages. For the sake of simplicity, in this paper we only consider one-dimensional arrays; other domains are discussed in the conclusion. As in other data-parallel languages, the same program is attributed to each index, and each computation occurs locally, starting with its own query. For instance, let us consider the program P1 of figure 2, and the sequence of indexes {0, 1, 2, 3}. Then, we use the vectorial notation [a, b, a, a]_{0,1,2,3} to express the query a on the set of indexes {0, 2, 3} and the query b at index 1 at the same time.
P1 :  a ← b.      P2 :  a ← b.              P3 :  a ← get b from This - 1.
      b.                b ← This = 3.             b.
Fig. 2. Three small DP-Log programs

This query succeeds because each goal can be locally deduced from the program. The fact that the programs are identical does not mean that the local computations are the same. The behavior of the program may also depend on the value of the actual index. This is achieved by a special vectorial constant, This, whose value is precisely the index of the computation. Considering the same query and the program P2 of figure 2, the query fails because the goal b can only succeed at index 3.
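To make this pointwise reading explicit, the local evaluations of the query [a, b, a, a]_{0,1,2,3} against P2 can be unfolded as follows (a worked trace we add for illustration; it is not part of the original figure):

  index 0 : query a  ->  a ← b,  b ← This = 3  :  fails, since This = 0
  index 1 : query b  ->  b ← This = 3          :  fails, since This = 1
  index 2 : query a  ->  a ← b,  b ← This = 3  :  fails, since This = 2
  index 3 : query a  ->  a ← b,  b ← This = 3  :  succeeds, since This = 3

Since the vectorial query succeeds only if every local query succeeds, the whole query fails, whereas against P1 every local query reduces to the fact b and the query succeeds.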
A computation at a given index may depend on results computed at other indexes. This is the purpose of the general communication primitive "get p from j" where p is an atom to be proven at index j. This is illustrated by the program P3 of figure 2. If we suppose that the domain is actually a torus (i.e. computations on the index domain are done modulo 4 to avoid communication with a non-existent index), every proof of a at index i will use an auxiliary proof of b at index i - 1. There is no "send" communication in DP-Log because this kind of communication primitive may induce non-determinism, due to collisions (although nothing forbids communication from actually being implemented this way for practical purposes). Unlike other data-parallel languages, especially imperative ones, a communication does not bring the value of a variable to the receiver. Indeed, logical variables could not cope with this operation since their value cannot change after being bound. Instead, a communication asks for a relocated proof of an atom. Declaratively speaking, an atom can be "get" from an index i if it belongs to the model of the program at index i. Operationally, when asked for a "get", an index launches a new proof process to obtain a proof of this atom.
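As a further illustration, the get primitive can be combined with This to express recursive patterns across the index domain. The following program P4 is a sketch of ours, not taken from the paper; it uses only the constructs introduced above (definite clauses, This, get ... from), a hypothetical predicate name chain, and the same torus of four indexes:

P4 :  chain ← This = 0.
      chain ← get chain from This - 1.

With the vectorial query [chain, chain, chain, chain]_{0,1,2,3}, the first clause lets index 0 succeed directly, and the second clause makes every other index delegate the proof to its left neighbour, so a proof of chain at index i cascades through i relocated proofs down to index 0. At index 0 the second clause would wrap around the torus and ask index 3, but the first clause already provides a finite local proof there, so the whole vectorial query succeeds.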