Distributed Programming with Logic Tuple Spaces
Paolo Ciancarini
Technical Report UBLCS-93-7 April 1993
Laboratory for Computer Science University of Bologna Piazza di Porta S. Donato, 5 40127 Bologna (Italy)
The University of Bologna Laboratory for Computer Science Research Technical Reports are available via anonymous FTP from the area ftp.cs.unibo.it:/pub/TR/UBLCS in compressed PostScript format. Abstracts are available from the same host in the directory /pub/TR/UBLCS/ABSTRACTS in plain text format. All local authors can be reached via e-mail at the address
[email protected].
Abstract
From the point of view of multiparadigm distributed programming, one of the most interesting communication mechanisms is associative communication based on a shared dataspace, as exemplified in the Linda coordination language. In fact, Linda has been used as a coordination layer to parallelize several sequential programming languages, like C and Scheme. In this paper we study the combination of Linda with a logic language, whose result is the language Extended Shared Prolog (ESP). We show that ESP is based on a new programming model called PoliS, which extends Linda with Multiple Tuple Spaces. A class of applications for ESP is discussed, introducing the concept of “open multiple tuple spaces”. Finally, we show how the distributed implementation of ESP effectively uses the network version of Linda’s tuple space.
1 Introduction
In a distributed system, several loosely coupled computing elements are interconnected by a communication network. The potential computing power of a typical distributed system with scores or hundreds of general-purpose workstations is comparable to the power offered by an expensive parallel supercomputer. However, to take advantage of these systems, we must give developers languages for writing distributed programs easily and effectively.
Neither designing nor efficiently implementing a distributed language is simple. One of the problems that increases the complexity of designing a distributed programming language is that several different paradigms for interprocess communication are available [6], like remote procedure calls, asynchronous message passing, and CSP-like synchronous rendezvous. Especially if the language being designed includes an existing programming language as its sequential component, it is not easy to choose communication mechanisms that fit well with the existing mechanisms.
From the point of view of a language designer, one of the most interesting paradigms is associative communication based on a shared dataspace. In language design for distributed systems, the idea of a shared dataspace is useful because it neatly separates the issues concerning the coordination of several activities from the issues pertinent to controlling a single activity. Linda [22] is an example of this kind of language: in fact, Linda has been used as a “coordination layer” [14] to parallelize a number of sequential programming languages, notably C, Modula-2, Scheme, and Eiffel.
Such “language marriages” are interesting from a theoretical point of view, because the resulting combinations usually offer new programming idioms [17]; from an implementation point of view, because Linda’s Tuple Space offers many opportunities for optimization [13]; and from a practical point of view, because the resulting language broadens the spectrum of applications in which the sequential language can be used [5].
In this paper we study the combination of Linda with a logic programming language. We are interested in such a combination for the following reasons. Associative communication as exemplified in Linda fits well in the logic paradigm, which is based on unification. A shared dataspace of tuples is similar to the dynamic knowledge base of Prolog, where logic atoms are asserted and retracted. Linda operations provide a clear model for distributed implementations of logic programs, and in fact several implementations of Prolog have recently added Linda-like primitives to their inter-process communication libraries, e.g., [1,41]. There is also a more profound relationship that is currently a subject of research: a special form of logic, called Linear Logic [25], can offer a general framework for studying concurrency issues in both Linda and logic languages [3].
Our proposal for a logic programming language directly based on Linda is the language Extended Shared Prolog (ESP), initially defined in [11]. ESP is a Multiple Tuple Space language [23]: it extends the Linda concept, which is based on a unique Tuple Space, with a plurality of tuple spaces, aiming at simplifying the design of distributed systems. ESP is being used in two classes of applications: the first includes distributed versions of symbolic programs based on Prolog that need large computing power to be executed; the second includes interactive distributed programs that coordinate several users.
The distributed implementation of ESP that we discuss in this paper is directly based on the network version of Linda. The use of Linda in the design of the run-time system of a Linda-based distributed language can seem an obvious solution, yet most proposals of Linda dialects do not explore it. We found that Linda is a flexible and powerful tool for this kind of system programming work.
This paper has the following structure: in Section 2 we briefly present the main features of Linda programming. In Section 3 we discuss how a Linda dialect is designed, and introduce an extension of the Linda model that includes multiple tuple spaces. In Section 4 we describe the syntax and semantics of ESP, a logic language that includes a notion of Multiple Tuple Spaces. In Section 5 we show how ESP can be used to program the distributed version of a Prolog program. In Section 6 we show how “open” Multiple Tuple Spaces offer a flexible framework to program the coordination of interactive distributed systems. In Section 7 the ESP distributed run-time system based on Network Linda is described, and some preliminary results concerning the performance of such an implementation are gathered. Finally, in Section 8 we compare ESP with other coordination languages.
2 A crash course in Linda
Linda is a coordination language, i.e., a set of language-independent operators for parallel programming. Linda permits cooperation between independent processes using tuple operators to control access to a shared multiset of tuples. These operators have been added to the syntax of several sequential programming languages, such as C [37], Modula-2 [9], Scheme [28], Eiffel [29], Joyce [33], and Russell [12], resulting in new parallel programming languages. What follows is a short overview of the Linda syntax and semantics, based on the C-Linda combination; a more abstract presentation of the Linda concept is contained in [17].
2.1 Tuples and Tuple matching
Tuples are finite sequences of fields; the number of fields is the arity of the tuple. Every field has a value and a type drawn from the host language. The type of a tuple is the cross product of the types of its fields.
Example: In C-Linda, ("array",1,3) is a tuple of arity 3. The first field has type string, the second and third fields have type int. The type of the tuple is string × int × int.
The tuple space is a multiset of tuples, i.e., identical tuples may exist in the tuple space. Processes communicate by inserting, removing and examining tuples in the tuple space. Thus the tuple space is a shared data object. All processes having access to a tuple space have access to all tuples in it. Access is associative, i.e., processes use pattern matching to access tuples.
Pattern matching is based on the concept of antituple: an antituple is similar to a tuple, except that some fields can be typed variables; variables are prefixed by a “?” symbol. We say that a tuple t and an antituple a match if:
i. both t and a have the same arity;
ii. values in corresponding fields are identical;
iii. a variable in a and a corresponding value in t have the same type.
The result of a successful matching operation is that variables in antituple a obtain the values contained in the corresponding fields of tuple t.
Example: The antituple ("array",?x,?y) matches the tuple ("array",1,3); conversely, the antituple ("array",?x,?x) does not match the tuple ("array",1,3), since x would have to take two different values.
Actually, several Linda implementations also allow variables in tuples (i.e., in the tuple space), but they do not match a corresponding variable in an antituple. Thus variables in tuples play the role of wild cards, matching any value in an antituple, and no side effect is intended.
2.2 Operations on tuples
Processes in C-Linda execute ordinary C programs that can include Linda operators. These are out, in, rd, and eval.
The out operator inserts a tuple into the tuple space. This is a failure-free non-blocking operation.
Example: out("array",1,3) inserts the tuple ("array",1,3) into the tuple space.
The in operator removes a tuple from the tuple space. Its argument is an antituple to match tuples against. If more than one tuple in the tuple space matches the antituple, one is chosen nondeterministically. If no matching tuple can be found, the process blocks, waiting for a matching tuple to be inserted by another process with an out operation.
Example: in("array",1,?c) matches the tuple ("array",1,3); variable c assumes the value 3.
The rd operation is similar to in, but leaves the matched tuple in the tuple space.
Many, but not all, Linda implementations offer two more operators that are predicates on the tuple space, namely inp and rdp. They are equivalent to in and rd but are non-blocking and return a boolean value that indicates the success or failure of the operation. These operators are not standard because they can be difficult to implement in a distributed architecture: conceptually, to assert that a tuple is not present in the Tuple Space, one has to search the whole tuple space in mutual exclusion, freezing its activities.
The last operation provided by Linda is eval. This is similar to out, except that some fields in the tuple may be function invocations: such fields are called active fields, and the tuple is called an active tuple. A new process is created for each active field. When the evaluation of all fields has terminated, the active tuple becomes an ordinary tuple in the tuple space, i.e., eventually an eval is equivalent to an out.
Example: Let f(x) and g(x) be two functions declared in the program. The operation eval("array",f(1),g(1)) creates two new processes, one to evaluate each of the active fields. When both processes finally terminate, for instance resulting in f(1) = 2 and g(1) = 3, the tuple ("array",2,3) will appear in the tuple space.
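The matching rules of Section 2.1 and the operators described above can be illustrated with a small single-machine sketch in Python. It is not C-Linda and says nothing about how Network Linda is implemented: the names Var, match, TupleSpace, in_ and eval_ are hypothetical, threads stand in for processes, the first matching tuple is taken instead of a nondeterministic choice, and inp and rdp return the bindings (or None) rather than a boolean.

```python
import threading

class Var:
    """A typed formal of an antituple; Var("x", int) plays the role of ?x."""
    def __init__(self, name, type_):
        self.name, self.type = name, type_

def match(anti, tup):
    """Return the variable bindings if antituple `anti` matches `tup`, else None."""
    if len(anti) != len(tup):                       # i. same arity
        return None
    bindings = {}
    for a, t in zip(anti, tup):
        if isinstance(a, Var):
            if type(t) is not a.type:               # iii. variable and value have the same type
                return None
            if bindings.get(a.name, t) != t:        # a repeated variable must agree
                return None
            bindings[a.name] = t
        elif type(a) is not type(t) or a != t:      # ii. identical values
            return None
    return bindings

class TupleSpace:
    def __init__(self):
        self._tuples = []                           # a list stands in for the multiset
        self._changed = threading.Condition()

    def out(self, *tup):
        """Insert a tuple; never blocks."""
        with self._changed:
            self._tuples.append(tup)
            self._changed.notify_all()

    def _access(self, anti, remove, block):
        with self._changed:
            while True:
                for i, t in enumerate(self._tuples):
                    bindings = match(anti, t)
                    if bindings is not None:
                        if remove:
                            del self._tuples[i]
                        return bindings
                if not block:
                    return None
                self._changed.wait()                # sleep until some out() occurs

    def in_(self, *anti): return self._access(anti, remove=True, block=True)
    def rd(self, *anti):  return self._access(anti, remove=False, block=True)
    def inp(self, *anti): return self._access(anti, remove=True, block=False)
    def rdp(self, *anti): return self._access(anti, remove=False, block=False)

    def eval_(self, *fields):
        """Evaluate callable (active) fields concurrently, then out the resulting tuple."""
        def run():
            values = list(fields)
            def work(i):
                values[i] = fields[i]()
            workers = [threading.Thread(target=work, args=(i,))
                       for i, f in enumerate(fields) if callable(f)]
            for w in workers:
                w.start()
            for w in workers:
                w.join()
            self.out(*values)
        runner = threading.Thread(target=run)
        runner.start()
        return runner
```

Because a successful match without variables yields an empty dictionary, callers of inp and rdp in this sketch should compare the result with None instead of testing its truth value.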
2.3 Network programming with Linda
Linda has been implemented on several multiprocessor architectures, both distributed and shared memory based. In our experiments, we have used Network C-Linda (Network C-Linda is a trademark of Scientific Computing Associates, New Haven), an implementation distributed over a network of homogeneous workstations (Sun Sparc) [5]. The implementation we used consists of a C-Linda compiler, called clc, that analyzes the program and optimizes the associative accesses to the distributed tuple space. The output of the compiler is then executed using the command tsnet, which uses a database of workstation names to distribute the code to all the workstations involved. The current implementation has an important limitation: only one worker per workstation can be used (this limit should be lifted in future versions). There is also a tool that supports the creation and debugging of Linda programs, called TupleScope.
3 Extending Linda
Linda is a coordination language, which means that it has to be combined with a sequential programming language (called the host language) to offer a complete language for parallel programming.
3.1 Combining Linda with another language
Linda has been combined with several programming languages, like C, Scheme, Modula-2, and Eiffel. Every combination has to specify the following issues:
- The type system for data values; Linda has neither a type system nor a set of basic data values. Tuples are defined as ordered sequences of data values inherited from the host language. Not all data types of the host language are easily embedded in Linda. For instance, in C-Linda tuples cannot include pointer values, because it would be meaningless to pass such references from one process to another.
- The matching rules between tuples and antituples; the matching is based on the type system of the host language.
- The control constructs allowed to combine Linda tuple operators; control constructs are inherited from the host language, and not every construct is compatible with Linda operators on tuples. For instance, it is difficult to combine the tuple operators with backtracking, as in Prolog.
- The semantics and possible constraints on active tuples, i.e., on eval. For instance, in Lucinda [12] only one function invocation per tuple is allowed; moreover, in some implementations only one eval per processor is allowed.
- The closures for active tuples; when an active tuple is put in the tuple space, the environment assumed for variables in the code that it executes must be specified. For instance, in C-Linda under Unix the closure is empty.
- Syntax and semantics of multiple tuple spaces; in the original Linda definition only one tuple space is allowed. A natural extension consists of relaxing this constraint, defining a language based on Multiple Tuple Spaces.
In this paper we are interested in the combination of Linda with Prolog, and we will discuss all the above problems in this context.
Even leaving aside concurrency and communication mechanisms, Linda is apparently very different from a logic language like Prolog.
- The main data structures in Prolog are typeless terms; Linda’s tuples include typed values.
- Prolog uses unification to access clauses in the knowledge base; Linda uses typed pattern matching to access tuples in the Tuple Space.
- All Prolog procedures are predicates; Linda’s predicates on the Tuple Space, i.e., operations rdp and inp, are not supported in every implementation of the language [37].
- Prolog relies upon backtracking and recursion as main control mechanisms; Linda operations are not backtrackable, i.e., the tuple space is definitively modified after an in or an out.
Linda is even more different from the family of parallel logic languages described in [39]; in fact, most of the latter cannot be classified as coordination languages at all, because they do not include any sequential language component. For a detailed comparison between Linda and stream-based parallel logic languages, see [21,38].
Still, we believe that developing a “Linda Prolog” language would be advantageous for both the Linda and the logic programming communities. Such a language could be defined in at least four different ways:
- Adding the Linda primitives in, rd, out, eval, inp, rdp to Prolog as non-backtrackable built-in predicates, and upgrading Linda’s typed pattern matching to full unification of logic terms. This solution is not trivial to implement [1,41], and from a theoretical point of view it is also difficult to integrate with traditional logic programming semantics. In fact, a Linda Prolog defined in this way would have new Linda-like predefined predicates whose semantics would be declaratively obscure and difficult to merge with the semantics of Prolog.
- Adding backtrackable Linda primitives to Prolog. This solution consists of defining backtrackable communication primitives. An approach of this kind was followed in the definition and implementation of DeltaProlog [32], whose communication primitives were borrowed from Hoare’s CSP: they were backtrackable communication events. To define the semantics of DeltaProlog it was necessary to introduce a temporal logic framework. Moreover, backtracking of communications among distributed logic processes was very complex to implement. Similar problems would have to be addressed in the definition of truly backtrackable Linda-like primitives.
- A more viable alternative consists of the definition of a new abstract machine for Prolog based on the Linda model. Such a solution would imply building a fine-grained parallel interpreter for sequential Prolog using Linda’s distributed data structures and mechanisms. A possible model for the design of such an abstract machine could be an OR-parallel Prolog, like Aurora [30].
- A last but not least possibility is the definition of a brand new parallel logic language having a “Linda flavour”. This is what we did when we defined Shared Prolog [10], a logic language based on a logic tuple space called “blackboard”.
In this paper we continue to explore the last possibility, extending Linda with a concept of Multiple Tuple Spaces [23,31].
3.2 Multiple Tuple Spaces
We now introduce PoliS, a programming model that extends Linda with Multiple Tuple Spaces; the name derives from the fact that we call the resulting systems “polispaces”. A polispace is a distributed system that is a collection of tuple spaces. A tuple space is a named multiset of tuples; a tuple is simply a sequence of fields. More precisely, in PoliS there are three key concepts: tuples, agents, and places.
A tuple is a structured data object that is a sequence of values. It is produced by some agent in some space, and it remains there until some agent consumes it. A tuple can be “copied” (read) or “consumed” (read and deleted) only by an agent included in the same place. Access to a tuple is associative, i.e., it is done “by contents”. The particular access mechanism chosen is a degree of freedom: e.g., PoliS can accommodate either a mechanism based on typed pattern matching, as in Linda [22], or a mechanism based on unification, as in a logic language.
An agent is an execution thread, i.e., it is an abstraction of a running program completely independent of other agents. An agent is contained in a particular place and is able to perform some operations on the tuples that the place contains. The semantics of an agent can be described as follows: an agent looks continuously for some tuples; when they are found, it executes a computation consisting of instructions written in some sequential programming language; finally, it creates new entities (tuples or places). The sequential language chosen for programming the internal working of the agent is left outside the scope of the model as a degree of freedom, so that agents written in many different sequential languages can coexist.
A place is a named multiset of tuples (in this paper we will use the terms tuple space and blackboard as synonyms for “place”). Places are containers in the sense that the universe of tuples and agents is partitioned into a number of places. Places can be dynamically created by agents. A place is both a computing space and a communication channel, i.e., a shared data structure on which agents read and write data; in fact, an agent can produce a tuple inside a place and it has access to every tuple in its own place. An agent cannot directly read the contents of an external place.
Syntactically, an agent is represented by a tuple and executes the program contained in another (special) tuple, called program-tuple. An agent can use the following abstract tuple operations for its interaction with the landscape it lives in:
- associative test of a tuple contained in the same place as the agent;
- associative consumption of a tuple from the same place as the agent;
- asynchronous creation of a place or a tuple inside the landscape the agent knows.
These operations are an abstraction of the Linda operations presented in Section 2. We need a new operation to deal with the creation of a tuple space: tsc(Name) creates a new tuple space. Moreover, we need a new notation for tuples sent to other tuple spaces: out(Tuple)@TupleSpace.
What happens if an out operation targets an external tuple space that does not exist? PoliS extends the Linda semantics: out is a non-blocking operation (i.e., the agent that issues it does not wait for any result or error code) that never fails. Communications among places are supported by a meta Tuple Space where undelivered tuples remain deposited; whenever a place comes into existence, the undelivered tuples addressed to it “pop up” in the new tuple space.
PoliS agents have a reactive semantics defined by a fixed protocol of tuple operations. The basic protocol is the following (we borrow some syntax from regular expressions: with op* we intend a sequence of indefinite length of tuple operations):
test*; consume*; loc_eval; out*
Syntactically, such a protocol is written inside a program-tuple:
(Heading: (Test; Consume; Loc_Eval; Out))
The Heading is a normal tuple. Instead, Test, Consume, and Out are actually sequences of tuple operations, whereas Loc_Eval is a sequential computation that has no side effect on the place to which the agent belongs. An agent is activated when the place contains both a program-tuple and a normal tuple matching the heading in the program-tuple. The second component of a program-tuple is also called a pattern. Executing a pattern, an agent will do the following actions:
- it reads associatively something from its place using any number of test operations; actually the PoliS test operation has a broader semantics than rd in Linda: a number of predefined tests on the place are allowed, depending on the chosen type system for tuple arguments. Some useful general predefined tests are: relational (binary) predicates, a var predicate to check if an argument inside a tuple is a variable, and a self predicate returning the name of the place in which an agent is located;
- it deletes some tuples using any number of consume operations. When an agent has finished testing and deleting tuples from the place, it “reacts” and starts a computation that ends by creating some new objects in the landscape;
- it executes a “local evaluation” that has no effect on the place and is invisible from outside the agent insofar as no operations on the place are allowed; this local computation is expressed in a sequential programming language;
- it outputs the results obtained in a number of places it “knows”; these outputs can consist of tuples or places; at the end of the sequence the agent “dies”, terminating its thread of evaluation; however, we can specify an ever-lasting agent by inserting among its outputs the creation of a copy of itself.
What is the computing model underlying agents’ computations? The idea is that agents are stateless and reactive, i.e., they compute when a “molecule” can be built inside the Tuple Space. A molecule is composed of a program-tuple, a normal tuple matching the first field of a program-tuple, and all the tuples to be consumed as specified by the consume section in the program-tuple. The agent “reacts” to its environment, “burning” the molecule, and as a result creates new entities as specified in the out section.
Even if the relationship among places, agents, program-tuples, and local evaluations can look slightly contrived, their relative meaning is actually quite simple:
- a place defines an AND-parallel computation of agents;
- an agent executes the computation defined by a program-tuple;
- the agent reacts to the contents of its place with a local evaluation followed by the creation of new entities, either tuples or places.
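To make the behaviour of places, the meta Tuple Space, and the reactive protocol more concrete, the following Python fragment gives a minimal sequential sketch. It rests on several assumptions that are not part of PoliS: tuples are ground Python tuples accessed by equality rather than by pattern matching or unification, the program-tuple is passed as an argument instead of living in the place, and the class Polispace and its method react are hypothetical names.

```python
from collections import Counter, defaultdict

class Polispace:
    def __init__(self):
        self.places = {}                    # place name -> multiset of tuples
        self.meta = defaultdict(Counter)    # meta tuple space: undelivered tuples by target

    def tsc(self, name):
        """Create a place; tuples sent to it before its creation pop up inside it."""
        if name not in self.places:
            self.places[name] = self.meta.pop(name, Counter())

    def out(self, tup, at):
        """Non-blocking and never failing: tuples for unknown places wait in the meta space."""
        target = self.places[at] if at in self.places else self.meta[at]
        target[tup] += 1

    def react(self, place, program):
        """Run one agent (test*; consume*; loc_eval; out*); return False if it cannot react."""
        heading, tests, consumes, loc_eval = program
        space = self.places[place]
        molecule = Counter(consumes)        # tuples burnt by the reaction...
        molecule[heading] += 1              # ...including the tuple matching the heading
        if any(space[t] < 1 for t in tests):
            return False
        if any(space[t] < n for t, n in molecule.items()):
            return False
        self.places[place] = space - molecule
        # loc_eval does not touch the place; it only computes (tuple, target place) pairs
        for tup, target in loc_eval():
            self.out(tup, target)
        return True
```

For example, after p = Polispace(), the call p.out(("msg", 1), at="b") deposits the tuple in the meta space, and a later p.tsc("b") makes it appear in the new place b.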
4 Overview of ESP
An ESP program consists of a set of program modules called theories; these define the behaviour of active tuples inside a logic tuple space. The execution of an ESP program sets up a dynamic set of logic tuple spaces; each tuple space has a set of attributes, and in particular a name. The tuple spaces form the operating environment in which activities dictated by theories take place. A logic tuple space is simply a multiset of tuples that are Prolog terms. These can include variables, but the scope of a variable is limited to the term it belongs to. A tuple is active if it matches the name of a theory. The ESP language was first defined in [11].
<theory> ::= <heading> <interface> <implementation>
<heading> ::= theory <name> ( <formal parameters> )
<interface> ::= eval <rule> # ... # <rule>
<implementation> ::= with <sequential program>
<rule> ::= <precondition> → <call seq program> { <out set> }
<precondition> ::= <test set> { <consume set> }
Figure 1. (Simplified) ESP syntax
4.1 Syntax
Fig. 1 shows the simplified ESP syntax definition. A theory is a programming module that includes a heading, an interface, and an implementation. The heading includes the name of the theory and a set of formal parameters that scope over the interface. The interface includes a set of activation rules. Each rule has two parts: a preactivation (the <precondition> of Fig. 1) and a postactivation. The preactivation defines a list of tests on the tuple space; the postactivation includes calls to the theory implementation, and a multiset of tuples called Out Set. The implementation is a Prolog program whose predicates are directly or indirectly invoked by the postactivation of the rules. For lack of space, we do not describe the predefined tests that can be used in the preactivation, nor the restrictions that the implementation of a theory must obey.
Example: The theory that defines an unbounded buffer is the following:
theory buffer(B):-
eval
    {put(Item)} → append(B,[Item],NewB) {buffer(NewB)}
#
    buffer([Item | B]) {get(TS)} → {buffer(B), Item@TS}
with
append(L1,L2,L):-
...
This theory includes two rules that define the behavior of an active tuple buffer(B): the first rule is activated if the tuple space contains the tuple put(Item), whereas the second rule is activated if the tuple space contains the tuple get(TS). In the first case the buffer is modified by appending the item to the end of the list; in the second case the buffer is modified by deleting its first item and sending that item to the tuple space from which the request came.
4.2 Semantics
An ESP tuple space is a multiset of logic tuples; it is conceptually similar to a goal made of atomic goals, except that tuples do not share variables. There is no eval operation in ESP (recall that eval is used in Linda to create an active tuple). Activities inside a tuple space are defined by a “reactive” activation mechanism: if the tuple space contains a tuple that matches the heading of a theory, a process starts. A library of theories is visible from each tuple space. Tuples that match theory headings are called active: they are agents that perform computations. For instance, a tuple foo(1,[a,b,c]) matches a theory whose heading is foo(X,Y).
An active tuple tries to satisfy one of the “preconditions” of the rules of the theory it matches. Each precondition includes a test set T of tuples, meaning that tuples in T have to be found inside the tuple space, and a consume multiset C of tuples, meaning that tuples in C have to be found and deleted from the tuple space. If a precondition of a rule R is satisfied, the active tuple commits on R and it starts the execution of the corresponding Prolog goal; when this goal terminates the out set is produced in the tuple space.
Formally, for specifying the activities inside a tuple space we define a transition system. Abstractly, each rule R has the following structure:
\[ \mathit{head}(R)\;\; \mathit{test}(R)\;\; \mathit{consume}(R) \;\mapsto\; \mathit{prolog}(R)\;\; \mathit{out}(R) \]
where head(R) is the name of the theory which R belongs to, test(R) is a set of tuples, consume(R) is a multiset of tuples, prolog(R) is a Prolog goal, and out(R) is a multiset of tuples. The symbol $\mapsto$ is a symbol of commitment, meaning that the actions denoted on the left must be atomically evaluated.
This abstract structure allows us to express concisely a transition system including one inference rule for each rule in the program (the operator $\uplus$ denotes multiset construction; the symbol is overloaded for multiset union and insertion):
\[ \frac{\mathit{head}(R) \in TS; \quad TS \Rightarrow \mathit{test}(R); \quad \mathit{consume}(R) \subseteq (TS \setminus \{\mathit{head}(R)\})}{TS \;\xrightarrow{\mathit{prolog}(R)}\; (TS \setminus (\mathit{consume}(R) \uplus \{\mathit{head}(R)\})) \uplus \mathit{out}(R)} \]
A configuration TS including a tuple T reacts if the program includes a theory named head(R) matching tuple T, and the theory includes a rule R such that
i. from TS we can infer the goal test(R) (i.e., all the tests are satisfied by the current contents of TS), and
ii. consume(R) is a subset of $TS \setminus \{\mathit{head}(R)\}$.
After executing the evaluation of goal prolog(R) according to Prolog semantics, which we denote by labeling the transition with prolog(R), the result is a new configuration where tuple head(R) and the multiset of tuples consume(R) have been “consumed”, and the multiset of tuples out(R) has been added. To simplify the presentation, we have not taken into account fields’ matching; a formal study of matching in a shared dataspace is discussed in [17].
4.3 Multiple Tuple Spaces in ESP
In logic programming, a program usually starts its computation when a user issues a goal to be evaluated. The same happens in ESP. Formally, an ESP goal is an <out set>. Such a set can include either normal tuples, which are added to the tuple space, or special tuples of two kinds:
i. a tuple of the form tsc(Name_of_TS) creates a new tuple space; for instance, the following out set creates three tuple spaces named t1, t2, t3:
{tsc(t1), tsc(t2), tsc(t3)}.
ii. a tuple of the form Tuple@TS is sent to the tuple space named TS. For instance, the following out set sends a tuple alphabet([a,b,c,d,e,f]) to the tuple space named coder:
{alphabet([a,b,c,d,e,f])@coder}.
A major difference between ESP and Linda is that an ESP program can specify several tuple spaces. Each Tuple Space has two attributes: a name, and a set of invariants. Agents can send tuples outside their own tuple space using the name of another tuple space. Tuple space names can be freely passed as arguments of tuples, so that it is possible to dynamically build complex communication flows.

Whereas agents represented by active tuples are ephemeral and stateless, in ESP (as in PoliS) a Tuple Space can be seen as an object, i.e., as an entity that is persistent and has a state. In fact, a Tuple Space is not a passive entity, a mere repository of tuples or a channel for messages: there is a way of controlling the activities that take place inside a polispace. For each Tuple Space we can define one or more invariants, i.e., constraints that must hold for the whole life span of the Tuple Space. Whenever an invariant is violated, the Tuple Space stops all activities and terminates. A garbage collector for tuple spaces can then reclaim all the resources allocated to the Tuple Space. Invariants are defined inside special theories in which the keyword theory is replaced by the keyword invariant.

Example: The following is an invariant:

    invariant sendresult(User)
    eval
    result(R) → { R@User }.
When the tuple result(R) is produced, the tuple space terminates, communicating the tuple R to the tuple space whose name is bound to the variable User. The invariant concept is not present in Linda, because there a program terminates when all activities in the Tuple Space terminate. In ESP the invariant is a flexible mechanism to specify the intended semantics of a Tuple Space: when the condition specified by the invariant is verified, a “result” has been obtained; it can then be passed to some other Tuple Space.
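The behaviour of an invariant such as sendresult can be modelled as follows. This is a minimal Python sketch and not the ESP runtime: the class TupleSpace, the deliver callback and the representation of an invariant as a Python function are assumptions made for illustration.

```python
class TupleSpace:
    """Toy tuple space with invariants (hypothetical model, not the ESP runtime)."""

    def __init__(self, name, invariants=()):
        self.name = name
        self.tuples = []
        self.invariants = list(invariants)  # each maps tuples -> out set or None
        self.alive = True

    def out(self, tup, deliver):
        """Add a tuple, then check every invariant.

        `deliver(target, tup)` forwards a tuple to another tuple space.
        Returns False if this tuple space has already terminated.
        """
        if not self.alive:
            return False
        self.tuples.append(tup)
        for invariant in self.invariants:
            violation = invariant(self.tuples)
            if violation is not None:
                # The invariant is violated: emit its out set and terminate.
                for target, result in violation:
                    deliver(target, result)
                self.alive = False
                self.tuples.clear()         # resources can now be reclaimed
                break
        return True


def sendresult(user):
    """Model of `invariant sendresult(User)`: fires when result(R) appears."""
    def invariant(tuples):
        for tup in tuples:
            if tup[0] == "result":
                return [(user, tup[1])]     # corresponds to R@User
        return None
    return invariant


mailbox = []                                # stands in for the User tuple space
ts = TupleSpace("worker", [sendresult("alice")])
ts.out(("partial", 1), lambda target, t: mailbox.append((target, t)))
ts.out(("result", 42), lambda target, t: mailbox.append((target, t)))
print(mailbox, ts.alive)
```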
5 Parallelizing a Prolog program with ESP
In this section we describe the ESP parallel version of a Prolog program that plays Mastermind. In our version, the game is as follows: given an alphabet A of k symbols, a coder secretly builds a code Code, which is a string of length L over A without repetitions. A decoder tries to discover Code by issuing guesses that are answered by the coder. Each answer states how many symbols of the guess are in the right place (bulls) and how many are in a wrong place (cows). The decoder wins when the answer to a guess is L bulls and no cows. We have chosen this example because it has been used by several authors in the field of logic programming [40,30,42,16,43]. Moreover, it is well suited for performance evaluation and comparisons, as will be shown in the subsequent sections: for instance, it is simple to change the size of the problem, which simplifies the study of the performance of the program.

5.1 The Prolog version
The following Prolog version of Mastermind is taken from [40]. It plays the role of a decoder in a match against a user who plays the role of the coder. The reply to a guess consists of a pair of integers: bulls (the number of symbols that appear in identical positions in the guess and in the code), and cows (the number of symbols that appear in both the guess and the code, but in different positions). The decoder knows he has discovered the code if there are as many bulls as there are symbols in the code.

    % mastermind in Prolog
    mastermind(N) :-                                   % N is the length of the code
        assert(tried([])),                             % initialize database of guesses
        guess(Guess, Alphabet),                        % generate new guess (see below)
        tried(Tried_Guesses),                          % read old guesses
        check(Tried_Guesses, Guess, N),                % is Guess consistent with old guesses?
        ask(Guess, [B,C]),                             % ask the user for answer (B,C)
        abolish(tried/1),                              % delete old database
        assert(tried([[[B,C],Guess]|Tried_Guesses])),  % new database
        B = N.                                         % the code has been broken
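The reply rule and the consistency test used by the decoder can be made concrete with a short Python sketch. The function names bulls_cows and check echo the predicates mentioned in the text, but their bodies are an assumed reading of the rules of the game and are not taken from [40].

```python
def bulls_cows(code, guess):
    """Return (bulls, cows) for a guess against a code without repetitions."""
    bulls = sum(1 for c, g in zip(code, guess) if c == g)
    common = len(set(code) & set(guess))    # symbols shared, in any position
    return bulls, common - bulls


def check(tried, guess):
    """True if `guess` is consistent with every earlier (guess, answer) pair.

    A candidate is admissible only if, were it the secret code, each old
    guess would have received exactly the answer it actually got.
    """
    return all(bulls_cows(guess, old) == answer for old, answer in tried)


secret = "aebf"
answer = bulls_cows(secret, "abcd")         # one bull (a), one cow (b)
tried = [("abcd", answer)]
print(answer, check(tried, "abcd"), check(tried, secret))
```

The secret code is always consistent with the database of answers, which is why a generate-and-test decoder that only issues consistent guesses eventually finds it.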
This program plays the role of a decoder in a game of Mastermind. The main goal in the case of a code with four symbols is

    ?- mastermind(4).

Such a goal starts a generate-and-test process that is interfaced via read/write predicates with the user, who plays the role of the coder and answers the guesses. The guesses are generated by permutation of an alphabet and tested for consistency with respect to older guesses, which are stored dynamically using assert. We do not give the whole program, which can be found in [40]. This program is interesting because it demonstrates a general scheme used in logic programming: it relies heavily on backtracking to generate and test the guesses. Also interesting is the use of assert/retract. Conceptually, this program is simply the client of a database that could be a completely separate application, which makes the program suitable for parallelization. There is also another opportunity for parallelization: even if the original game is a match between a coder and one decoder, it is easy to generalize it by introducing many independent decoders that cooperate to break the code.

5.2 An ESP version of the mastermind program
Suppose that several decoders cooperate to guess the code: they could use a common database to store their guesses. The idea is that each decoder sequentially builds permutations over the alphabet A used for the secret code. Each permutation is compared with the preceding answers; if it is compatible, i.e., no preceding answer is logically inconsistent with the present permutation, it is passed as a guess to the coder to be evaluated. The coder stores the answer and plays the role of the common database. In ESP we write two theories: one for the coder, and one for the decoders.

    theory(coder).
    eval
    alphabet(A), length(L)
      ↦ { build_secret_code(A,L,Code) }
         { secret_code(Code), db([]) }
    #
    secret_code(C), try(Decoder,G), db(OldGuesses)
      ↦ { answer(C,G/Answer),
           append([(G,Answer)],OldGuesses,NewGuesses) }
         { db(NewGuesses)@Decoder, db(NewGuesses) }
    with
    answer(Code, Try/[B,C]) :-           % see [40]
        bulls_cows(Code,Try,B,C).
The coder obeys two rules: the first rule initializes the tuple space by building the secret code; the other rule is used to answer the guesses from the decoders.

    theory(decoder(D)).
    eval
    alphabet(A), db(Tried)
      ↦ { compute(A,Tried,Guess) }
         { try(D,Guess)@coder }
    with
    compute(A, Tried, Guess) :-
        subset(Guess,A),        % generates the permutation
        check(Tried,Guess).     % tests if it is an admissible guess
The decoder obeys one rule only: when a database of guesses is obtained, a new guess is built. A goal that sets up a system with three decoders is the following:

    ?- { tsc(c), tsc(d(1)), tsc(d(2)), tsc(d(3)) }

The system starts the computation after the following tuples are inserted in the proper tuple spaces:

    ?- { coder@c, decoder(1)@d(1), decoder(2)@d(2), decoder(3)@d(3) }
The decoders also need an initial alphabet tuple; in order to make them generate different sequences of permutations, we simply send them different initial permutations of the same alphabet.

    ?- { alphabet([a,b,c,d,e,f])@d(1),
         alphabet([c,d,e,f,a,b])@d(2),
         alphabet([e,f,a,b,c,d])@d(3) }
The program terminates when the coder receives a guess that matches the secret code. This violates the following invariant:

    invariant guessed
    eval
    length(L), db([(G,L/0)|_]) → print('the secret code is', G).
This rule states that if L is the length of the secret code, and a guess G obtained an answer with L bulls and no cows, then G is the secret code.
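The cooperation scheme of this section can be simulated sequentially to see how the shared database prunes the search of every decoder. The Python sketch below is an illustration and not a translation of the ESP program: the round-robin turn order, the function names score, consistent and play, and the use of itertools.permutations to enumerate guesses are all assumptions. It also assumes that the secret code has no repeated symbols and that every decoder receives a permutation of the same alphabet.

```python
from itertools import permutations


def score(code, guess):
    """Return (bulls, cows) for codes without repeated symbols."""
    bulls = sum(c == g for c, g in zip(code, guess))
    return bulls, len(set(code) & set(guess)) - bulls


def consistent(db, guess):
    """True if `guess` would have produced every answer stored in `db`."""
    return all(score(guess, old) == answer for old, answer in db)


def play(secret, alphabets):
    """Round-robin simulation: one coder, one decoder per starting alphabet.

    Returns (winning guess, number of the winning decoder, total guesses).
    """
    length = len(secret)
    db = []                                  # shared database kept by the coder
    # Each decoder enumerates permutations starting from its own alphabet order.
    streams = [permutations(a, length) for a in alphabets]
    while True:
        for d, stream in enumerate(streams, start=1):
            # Next permutation that is consistent with the shared database.
            guess = next(g for g in stream if consistent(db, g))
            answer = score(secret, guess)    # the coder answers the guess
            db.insert(0, (guess, answer))    # newest entry at the head of db
            if answer == (length, 0):        # the invariant `guessed` is violated
                return guess, d, len(db)


alphabets = ["abcdef", "cdefab", "efabcd"]
guess, winner, total = play("fade", alphabets)
print("".join(guess), winner, total)
```

A candidate skipped as inconsistent stays inconsistent as the database grows, while the secret code is never inconsistent, so every decoder still has the secret ahead of it in its enumeration until some decoder guesses it.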
6 Open coordination programs
The dynamic nature of Multiple Tuple Spaces as defined by ESP is easily supported by an interpretive operating environment. ESP is introduced as a language for rapid prototyping of distributed applications, and can be considered a useful model for applications that coordinate several users. In this section we show some examples of this kind of application.

6.1 Using ESP for Modeling a Bridge Game
The bridge players problem is an interesting instance of a distributed interactive program, and we discuss it here as a major ESP example. The coordination of the players is not difficult. Players operate sequentially in an anticlockwise way, except that the first player of each trick is the winner of the preceding trick. This means that the scheduling of players has to be done dynamically. A more important problem is enforcing security: we want to be sure that no player can take a hidden glance at another player’s cards. The idea, shown in Fig. 2, consists of having four private tuple spaces and a common one. The following out set initializes the state of the public tuple space:

    { referee, m(0), player(n), player(e), player(s), player(w),
      succ(n,e), succ(e,s), succ(s,w), succ(w,n) }@bridge_table
The tuple space is called bridge_table and initially contains: an active tuple representing a referee; an active tuple for each player; the relation succ(X,Y), which defines the anti-clockwise order of play; and the tuple m(0), meaning that round 1 is still to come.
Figure 2. A system to play bridge modeled with multiple tuple spaces (diagram with the nodes north, east, south and west)
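The dynamic scheduling implied by the succ relation can be sketched in a few lines of Python. The dictionary SUCC mirrors the succ tuples of the out set above; the function names play_order and schedule and the list of trick winners are hypothetical and serve only to illustrate the rule that each trick is led by the winner of the preceding one.

```python
SUCC = {"n": "e", "e": "s", "s": "w", "w": "n"}    # the succ(X, Y) tuples


def play_order(first):
    """Order in which the four players act in a trick led by `first`."""
    order, player = [], first
    for _ in range(len(SUCC)):
        order.append(player)
        player = SUCC[player]
    return order


def schedule(dealer, winners):
    """Yield (trick number, order of play) given the winner of each trick.

    Trick 1 is led by the dealer, every later trick by the winner of the
    preceding one.
    """
    leader = dealer
    for n, winner in enumerate(winners, start=1):
        yield n, play_order(leader)
        leader = winner


rounds = list(schedule("s", ["w", "n", "n"]))
for n, order in rounds:
    print(n, order)
```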
The referee agent has to schedule players dynamically. This is the theory that defines its behaviour:

    theory referee
    eval
    m(0)
      → { setup(Dealer,N,E,S,W) }
         { hand(n,N), hand(e,E), hand(s,S), hand(w,W),
           to_play(Dealer,[_,_,_,_]) }
    #
    played(B,Position), m(N)
      → { referee(B,P,Winner), N1 is N+1 }
         { trick(N1,Winner), m(N1), to_play(Winner,[_,_,_,_]) }
    with
    setup(Dealer,N,E,S,W) :- ...    % distributes initial hands and decides Dealer
    referee(B,_,Winner) :- ...      % decides the trick winner
The first rule sets up the players’ hands and requires that the player designated as dealer plays a card (tuple to_play(Dealer,[_,_,_,_])); the second one is used to decide who wins the current trick. Four player agents represent the players in the common tuple space. The theory player has one parameter: the player’s position n, e, s, or w.

    theory player(Position).
    eval
    succ(Position,Succ), hand(Position,H)
      → { }
         { hand(H)@Position }
    #
    to_play(Position,Trick)
      → { }
         { play(Before,After)@Position }
This theory is very simple: it has no local program. It is used mainly to insulate the public tuple space from the agents representing real players; for instance, a malicious player could try to peep at the hands of other players. This would be quite possible if the communication space were monolithic, as in Linda. By confining the real players to their own tuple spaces we obtain greater security. The players actually use a shell, which is an agent in a private tuple space called Position. This tuple space contains the player’s cards and an active tuple called shell.

    theory shell(Tty)
    eval
    self(Position),      % predefined ESP predicate asking for current bb name
    hand(OldHand), play(Before,After)
      → { play(Tty, OldHand, Before, After, NewHand) }
         { to_play(Position, After), played(Before,Position) }
    with
    play(Tty, Before, After) :-
        ...    % fails if Before is a trick composed of 4 cards
        ...    % else ask the user on Tty for choosing a card
The following goal starts one tuple space for each player. It also says that the player issuing this goal will be player north, using device /dev/ttyp1 to access the system.
    { tsc(n), tsc(e), tsc(s), tsc(w), shell('/dev/ttyp1')@n }

Other players willing to play have to send similar tuples to the referee, to establish contact with the system.

6.2 Applications of Open Multiple Tuple Spaces
Several organizations today own hundreds or even thousands of computers, usually interconnected by a network. Thus, there is the problem of designing and implementing systems that exploit the tremendous potential offered by such networked systems. It has been claimed that these systems should be programmed with the aim of implementing “software societies”, whose members cooperate to perform distributed tasks and to fulfill global goals [24]. One of the central problems in designing these complex organizations is that of achieving cooperation among a set of independent agents: the design and the implementation of coordination protocols involving several independent agents must be addressed whenever their activities have to be coordinated. In fact, the design of these “software societies” should be supported by specific tools, and recently a new word, groupware, has been introduced to designate this kind of software [19].

Enforcing coordination is essential when designing open systems [27]. Groupware systems are important instances of open systems; so are multi-user software development environments, office information systems, and distributed operating systems. They all include a number of heterogeneous agents (e.g., users, automatic tools, processors) that share a number of heterogeneous services (e.g., file systems, databases, schedulers), and that compete for distributed resources (e.g., CPU time, mass storage space, shared data). In many cases the activities of agents modify the system itself irreversibly, e.g., by adding new agents or services, or by consuming non-replaceable resources. These applications have to coordinate collections of separate activities that refer to each other asynchronously, without central control.

We suggest that open multiple tuple spaces are a suitable model for designing this kind of system. In fact, we have developed a number of applications aimed at demonstrating the effectiveness of this model for such a task. The bridge playing system is an application of this kind; other applications are a scheduler for coordinating the meeting of a committee, a financial simulation of stock exchanges, and a distributed multiuser programming system described in the next subsection.

Oikos is a distributed software development environment written in ESP [2]. Oikos provides a number of standard facilities that can be easily configured using ESP itself. The overall approach consists of offering mechanisms that can be easily composed, so that different environment designs can be explored with little effort. An ESP polispace offers a natural way of structuring a software development environment. The polispace has a hierarchical structure based on a hierarchical naming scheme for tuple spaces. Such a structure is used to reflect the decomposition of the environment into sub-environments, according to a top-down refinement strategy. The hierarchy does not really constrain the communication patterns among the agents participating in a software development process, since space names can be exchanged in tuples, and an agent can put tuples in any space, provided that it knows the name of the destination. Therefore, highly dynamic communication patterns can be set up, even connecting spaces at different levels of the hierarchy, if this is convenient. The Oikos prototype has been implemented on top of a local network connecting some Sun workstations and a Vax mainframe. ESP provides the basic mechanisms for physical distribution and dynamic activation of communicating processes. For a more detailed exposition see [11].
7 A Linda-based run time system for ESP
The ESP operating system is able to execute ESP programs over a network of workstations. It is composed of three main subsystems: the interpreter of a logic tuple space, the meta tuple space, and the shell. A logic tuple space is a Prolog process that implements coroutine evaluation of active logic tuples. The meta tuple space (metaTS) is the communication medium through which inter-tuple space communications take place. The shell is a special tuple space in which an external user can directly insert tuples.

7.1 The interpreter of a logic tuple space
A logic tuple space is implemented by a standalone Prolog process that connects to the metaTS. The basic functionalities are obtained by extending Prolog with a number of new predicates that implement Linda operations. The external predicates added to Prolog are the following:

    % This module defines the following Linda-like operations:
    % inb, inp, rdb, rdp, out, del, tsc, start_conn, end_conn for BIM Prolog
    % (terms become strings in the metaTS)
    % external definitions required by BIM Prolog to interface a C program:
    :- extern_predicate(start_conn(integer:r)).
    :- extern_predicate(end_conn(integer:i)).
    :- extern_predicate(out(string:i,string:i,integer:i)).
    :- extern_predicate(del(string:i,integer:i)).
    :- extern_predicate(tsc(integer:r,string:i,string:m,integer:i)).
    :- extern_predicate(inb(string:i,string:m,integer:i)).
    :- extern_predicate(inp(string:i,string:m,integer:i)).
    :- extern_predicate(rdb(string:i,string:m,integer:i)).
    :- extern_predicate(rdp(string:i,string:m,integer:i)).
These predicates are implemented by C-Linda modules that are used to connect via a socket to the metaTS. The main control structure of the interpreter is the following:

    go :-
        start_conn,            % open connection
        schedule(Agents).      % start execution
Basically, the scheduler implements a forward chaining strategy of computation: it picks an active tuple to be executed and compares it with the rules of the theory it is associated with. If no rule applies, another agent is chosen. If no agent can be activated, the scheduler blocks waiting for messages from the metaTS. Moreover, the scheduler uses a coroutine mechanism to alternate between rule execution and testing for messages incoming from the metaTS. When a tuple arrives, it is immediately classified as active if it matches one of the theories valid in that tuple space.
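The control loop just described can be sketched as follows. This Python model is an assumption-laden illustration, not the BIM Prolog interpreter: the function run, the representation of a theory as a Python function, the rule that a fired agent is consumed, and the choice to return instead of blocking when nothing can fire are all hypothetical.

```python
from collections import deque


def run(theories, inbox):
    """Run until no agent can fire and no message is pending.

    theories: dict mapping a theory name to a function
              rule(agent, store) -> list of produced tuples, or None when
              no rule of the theory applies in the current store
    inbox:    deque of tuples arriving from the metaTS
    Returns (store, blocked_agents).
    """
    agents = deque()    # active tuples
    store = []          # passive tuples
    stuck = 0           # consecutive agents for which no rule applied

    def classify(tup):
        # A tuple whose functor names a known theory becomes an agent.
        (agents if tup[0] in theories else store).append(tup)

    while True:
        while inbox:                        # test for incoming messages
            classify(inbox.popleft())
            stuck = 0
        if not agents or stuck >= len(agents):
            return store, list(agents)      # a real scheduler would block here
        agent = agents.popleft()
        produced = theories[agent[0]](agent, store)
        if produced is None:                # no rule applies: try another agent
            agents.append(agent)
            stuck += 1
        else:                               # a rule fired: the agent is consumed
            for tup in produced:
                classify(tup)
            stuck = 0


def doubler(agent, store):
    """Example theory: consume one num(X) and produce double(2*X)."""
    for tup in store:
        if tup[0] == "num":
            store.remove(tup)
            return [("double", 2 * tup[1]), agent]   # the agent re-creates itself
    return None


store, blocked = run({"doubler": doubler},
                     deque([("doubler",), ("num", 3), ("num", 5)]))
print(store, blocked)
```

In this example the doubler agent fires twice and is then left blocked, waiting for further num tuples that would arrive through the inbox.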
7.2 The meta tuple space
The meta tuple space is a C-Linda program that is executed by the Network C-Linda system described in Sect. 2.3. It activates a worker for each new tuple space requested by an ESP goal tsc(TS), and it takes care of communications across different tuple spaces. The metaTS is a distributed server that accepts TCP socket connections; it takes care of allocating new tuple spaces over the network (which includes a number of SUN workstations). When it starts, a list of processor names is passed to it.

    /* metaTS: a Linda server using sockets over the TCP protocol
     * It is a connection-oriented concurrent server */
    real_main(argc,argv)                    /* Linda main */
    int argc;
    char *argv[];                           /* a list of valid processor names */
    {
        extern int worker();
        /* socket data structures */
        char processorname[MAX_NAME_LEN];   /* processors names */
        /* socket initialization */
        for (i=1; i