Theoretical Computer Science 402 (2008) 234–248

Contents lists available at ScienceDirect

Theoretical Computer Science journal homepage: www.elsevier.com/locate/tcs

A timed semantics of Orc

Ian Wehrman ∗, David Kitchin, William R. Cook, Jayadev Misra

The University of Texas at Austin, United States

Article info

Keywords: Orc; Computation orchestration; Semantics; Time; Concurrency; Process algebra; Web services

Abstract

Orc is a kernel language for structured concurrent programming. Orc provides three powerful combinators that define the structure of a concurrent computation. These combinators support sequential and concurrent execution, and concurrent execution with blocking and termination. Orc is particularly well-suited for task orchestration, a form of concurrent programming with applications in workflow, business process management, and web service orchestration. Orc provides constructs to orchestrate the concurrent invocation of services while managing time-outs, priorities, and failures of services or communication. Our previous work on the semantics of Orc focused on its asynchronous behavior. The inclusion of time or the effect of delay on a computation had not been modeled. In this paper, we define an operational semantics of Orc that allows reasoning about delays, which are introduced explicitly by time-based constructs or implicitly by network delays. We develop a number of identities among Orc expressions and define an equality relation that is a congruence. We also present a denotational semantics in which the meaning of an Orc program is a set of traces, and show that the two semantics are equivalent.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Orc is a language for structured concurrent programming. It is based on the premise that structured concurrent programs should be developed much like structured sequential programs, by decomposing a problem and combining the solutions with combinators of the language. Naturally, Orc combinators support concurrency: parallel subcomputations, spawning of computations and blocking or termination of subcomputations.

Expressions in an Orc program are either primitive or a combination of two expressions. A primitive expression is a call to an existing service, a site, to perform its computations and return a result to the caller. There are only three combinators for Orc expressions, which allow sequential and concurrent executions of expressions, and concurrent execution with termination.

Orc is particularly well-suited for task orchestration, a form of concurrent programming in which multiple services are invoked to achieve a goal while managing time-outs, priorities, and failures of services or communication. Unlike traditional concurrency models, orchestration introduces an asymmetric relationship between a program and the services that constitute its environment. An orchestration invokes and receives responses from the external services, which do not initiate communication. In this paper, we illustrate the use of Orc in implementing some traditional concurrent computation patterns; larger examples have also been developed [20,9]. Orc has also been used to study service-level agreements for composite web services [24].

∗ Corresponding address: The University of Texas at Austin, Department of Computer Sciences, Taylor Hall, 1 University Station, C0500, 78712-0233 Austin, TX, United States. E-mail address: [email protected] (I. Wehrman). 0304-3975/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.tcs.2008.04.037


Fig. 1. Fundamental sites.

Fig. 2. Syntax of Orc.

Time is an essential aspect of many orchestrations; time-critical business workflows, for example, are naturally expressed as orchestrations [1]. Time is introduced in Orc implicitly by delays resulting from remote service calls, and explicitly by the site Rtimer, which waits a given amount of time when invoked before continuing execution. Previous accounts of the semantics of Orc [14,20,23,13] have not covered the semantics of time.

In Section 4, an operational semantics of Orc is given that includes time. The semantics shown here is based on an asynchronous semantics of Orc [14]. The transition relation of the asynchronous operational semantics is extended to include the time at which an event occurs. The corresponding executions are changed from a sequence of events to a sequence of time-event pairs. The semantics allows multiple events to occur at a single instant of time. An important feature of the semantics presented here is that time can be considered either discrete or continuous.

We have shown for the asynchronous semantics that equality of trace sets defines a congruence on programs, in that programs with equivalent trace sets are interchangeable [14]. We establish the same result for the timed semantics. Additionally, we give a number of identities in Section 7, similar to those of Kleene algebra [16], that hold for the Orc combinators in the timed semantics.

In Section 8, we show that traces form a denotation, which allows us to reason about the operational behavior of an Orc expression compositionally. In particular, the denotational semantics shows that the traces of a recursively defined expression can be computed as the limit of a sequence of traces. In Section 9, we show that the operational and denotational semantics agree.

Detailed proofs of all the results stated in this paper can be found in a companion technical report, Wehrman et al. [27]. Portions of Sections 2 and 3 appeared previously in [14].

2. Overview of Orc

An Orc program consists of a goal expression and a set of definitions. The goal expression is evaluated in order to run the program. The definitions are used in the goal and in other definitions.

An expression is either primitive or a combination of two expressions. A primitive expression is a call to an existing service, a site, to perform its computations and return a result; we describe sites in Section 2.1. Additionally, 0 is a primitive described in Section 4.1, which has no observable transitions. Two expressions can be combined to form a composite expression using Orc combinators; we describe the combinators in Section 2.2. We allow expressions to be named in a definition, and these names may then be used in other expressions. Naming permits us to define an expression recursively by using its own name in the definition. Definitions and recursion are treated in Section 2.3. We give a complete formal syntax in Fig. 2 of Section 2.4.

During its evaluation, an Orc expression calls sites and publishes values. Below, we describe the details of calls and publications.

2.1. Sites

A primitive Orc expression is a site call M(p̄), where M is a site name and p̄ a list of actual parameters. A site is an external program, like a web service. The site may be implemented on the client's machine or a remote machine. A site call elicits at most one response; it is possible that a site never responds to a call. For example, evaluation of CNN(d), where CNN is a news service site and d is a date, calls CNN with parameter value d; if CNN responds (with the news page for the specified date), the response is published.

Site calls are strict, i.e., a site is called only if all its parameters have values. Fig. 1 lists a few sites that are fundamental to effective programming in Orc (in the figure, a signal is a unit value and has no additional information). Signal is a site which responds immediately with a signal (it is the same as if(true)). Site Rtimer is used to introduce delays and impose time-outs, and is essential for time-based computations. Examples appear in Section 3.


2.2. Combinators

There are three combinators in Orc for combining expressions f and g: symmetric parallel composition, written as f | g; sequential composition with respect to variable x, written as f >x> g; and asymmetric parallel composition with respect to variable x, written as f <x< g. […]

[…] f >x> g, expression f is evaluated and each value published by it initiates a fresh instance of g as a separate computation. The value published by f is bound to x in g's computation. Evaluation of f continues while (possibly several) instances of g are run. If f publishes no value, g is never instantiated. The values published by f >x> g are the ones published by all the instances of g (values published by f are consumed within f >x> g). This is the only mechanism in Orc similar to spawning threads.

As an example, the following expression calls sites CNN and BBC in parallel to get the news for date d. Responses from either of these calls are bound to x and then site email is called to send the information to address a. Thus, email may be called 0, 1 or 2 times.

(CNN(d) | BBC(d)) >x> email(a, x)

Expression f ≫ g is short-hand for f >x> g, where x is not free in g. As a short example of time-based computation, Rtimer(2) ≫ M delays calling site M for two time units, and M | (Rtimer(1) ≫ M) | (Rtimer(2) ≫ M) makes three calls to M at unit time intervals.

To evaluate (f […] >x> let(x, false)).
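The Rtimer example in this section can be sketched in a few lines of Python. This is only a toy model under loud assumptions: an expression is represented solely by the times at which it calls a site, sites respond instantly, and the helper names `call`, `delay` and `par` are hypothetical and not part of Orc.

```python
# Toy model (an assumption, not the paper's semantics): an expression is
# represented only by the list of (time, site) calls it makes.

def call(site):
    """A bare site call happens at time 0."""
    return [(0, site)]

def delay(t, calls):
    """Model Rtimer(t) >> f: every call made by f is postponed by t units."""
    return [(t + s, site) for (s, site) in calls]

def par(*branches):
    """Model f | g: the calls of all branches, merged in time order."""
    return sorted(c for branch in branches for c in branch)

# M | (Rtimer(1) >> M) | (Rtimer(2) >> M)
schedule = par(call("M"), delay(1, call("M")), delay(2, call("M")))
```

Under these assumptions `schedule` lists one call to M at each of the times 0, 1 and 2, matching the "three calls at unit time intervals" reading of the expression.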

Fork-join parallelism

In concurrent programming, one often needs to spawn two independent threads at a point in the computation, and resume the computation after both threads complete. Such an execution style is called fork-join parallelism. There is no special construct for fork-join in Orc, but it is easy to code such computations. Below, we define forkjoin to call sites M and N in parallel and publish their values as a tuple after they both complete their executions.

forkjoin ≜ (let(x, y) […]

[…] g for some variable x not free in g. So if f t→ f′ then, by rule (Seq1V), f ≫ g → (f′ ≫ g) | g.

[…] $(s,a)(t,b)$ […] $f''$, because $b$ occurs $s + t$ units after the evaluation of $f$ starts. We define $u^{t}$ as the sequence that results from increasing each time component of $u$ by $t$. The definition of $u^{t}$ is also lifted to sets pointwise: $U^{t} = \{u^{t} \mid u \in U\}$.
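The time-shift operation on sequences and on sets of sequences can be sketched as follows. The representation is an assumption made for illustration: a sequence is a Python tuple of (time, event) pairs, and the function names `shift` and `shift_set` are hypothetical.

```python
def shift(u, t):
    """Return u^t: the sequence u with every time component increased by t."""
    return tuple((s + t, a) for (s, a) in u)

def shift_set(U, t):
    """Return U^t = {u^t | u in U}, the pointwise lifting to sets."""
    return {shift(u, t) for u in U}
```

For instance, shifting the two-event sequence (0, a)(2, b) by 3 gives (3, a)(5, b), and shifting the empty sequence leaves it empty.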



Define relation ⇒ as the reflexive–transitive closure of relation → except that the time components accumulate. 

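\[
f \stackrel{\epsilon}{\Rightarrow} f \qquad \text{(Ex-Refl)}
\]

\[
\frac{f \xrightarrow{(t,a)} f'' \qquad f'' \stackrel{u}{\Rightarrow} f'}
     {f \stackrel{(t,a)\,u^{t}}{\Longrightarrow} f'} \qquad \text{(Ex-Trans)}
\]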

Call $u$ an execution of $f$ if $f \stackrel{u}{\Rightarrow} f'$ for some $f' \neq \bot$. Note that the empty sequence $\epsilon$ is an execution of any expression by rule (Ex-Refl). The definition of executions requires $f' \neq \bot$ so that all intermediate expressions in an execution (such as $f''$) are reachable: if any intermediate expression is unreachable, the final expression, $f'$, would be unreachable because $\bot$ has no transitions.

Example. The example of Section 4.3, (Rtimer(3) ≫ M) | N, has an execution shown below:

\[
u = (0,\tau)\,(0,\tau)\,(2,!n)\,(3,\tau).
\]
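The way rules (Ex-Refl) and (Ex-Trans) accumulate time can be sketched as a small recursive function. The sketch assumes that each single-step transition is given as a (time, event) pair whose time is measured from the start of the expression it leaves; the function name `execution` and the string encoding of events are hypothetical.

```python
def execution(steps):
    """Build an execution from a chain of single-step transitions.

    (Ex-Refl):  no steps yield the empty sequence.
    (Ex-Trans): a step (t, a) followed by an execution u of the residual
                expression yields (t, a) followed by u shifted by t.
    """
    if not steps:
        return ()
    (t, a), rest = steps[0], steps[1:]
    u = execution(rest)
    return ((t, a),) + tuple((s + t, b) for (s, b) in u)

# Relative delays 0, 0, 2, 1 (chosen here for illustration) accumulate to
# the absolute times 0, 0, 2, 3 of the example execution above.
u = execution([(0, "tau"), (0, "tau"), (2, "!n"), (1, "tau")])
```

The relative delays in the usage line are an illustrative choice that reproduces the times of the example execution; the paper does not list the individual steps in this excerpt.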

A trace $\overline{u}$ is obtained from execution $u$ by removing each internal event $(t, \tau)$. The definition is also lifted pointwise to sets: $\overline{U} = \{\overline{u} \mid u \in U\}$.

Example. Execution $u$ and its trace $\overline{u}$ are shown below:

\[
u = (0,\tau)(0,\tau)(2,!a)(3,\tau), \qquad \overline{u} = (2,!a).
\]
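Extracting a trace from an execution amounts to filtering out the internal events. A minimal sketch, assuming executions are tuples of (time, event) pairs and that the internal event is encoded by the hypothetical constant `TAU`:

```python
TAU = "tau"  # assumed encoding of the internal event

def trace(u):
    """Return the trace of execution u: u with every (t, tau) removed."""
    return tuple((t, a) for (t, a) in u if a != TAU)

def trace_set(U):
    """Lift trace pointwise to a set of executions."""
    return {trace(u) for u in U}

u = ((0, TAU), (0, TAU), (2, "!a"), (3, TAU))
```

Applied to the example execution, `trace(u)` keeps only the publication at time 2.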

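Notation. The execution set and trace set of $f$ are written $[\![ f ]\!]$ and $\langle f \rangle$, respectively:

\[
[\![ f ]\!] = \{u \mid f \stackrel{u}{\Rightarrow} f', \text{ for some } f'\}, \quad \text{and} \quad \langle f \rangle = \overline{[\![ f ]\!]}.
\]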

We define $f \sim g$ to mean $[\![ f ]\!] = [\![ g ]\!]$ and $f \cong g$ to mean $\langle f \rangle = \langle g \rangle$. We will show that $\sim$ and $\cong$ are congruence relations, so that related expressions can be replaced by each other in all contexts. This claim, however, is not true with the theory developed so far. For example, we can prove that 0 ∼ let(x) because neither has an observable transition. Yet these two expressions display different behaviors in the same context: let(1) >x> 0 never publishes, whereas let(1) >x> let(x) always publishes. Our goal is for traces to represent the observable behavior of an expression, so the semantics must be extended to distinguish these two cases.

6. Substitution events

We introduce another kind of event, called a substitution event, to represent the binding of a value to a free variable in an expression. Substitution events have the form $(t, [m/x])$, where $m$ is a value and $x$ is a variable. The following transition rule introduces substitution events at the top level of expression derivations.

\[
f \xrightarrow{t,[m/x]} [m/x].(f^{t}) \qquad \text{(Subst)}
\]

Henceforth, we write $[m/x].f^{t}$ to mean $[m/x].(f^{t})$, i.e., the time-shift operator binds more strongly than substitution. Thus, using rule (Subst) and the definitions of time shifting and substitution we get

\[
f \mid g \xrightarrow{t,[m/x]} [m/x].f^{t} \mid [m/x].g^{t}.
\]

A substitution event differs from the base events described in Section 4 in a crucial way: the rules in Fig. 3 are defined only over base events. Therefore, given that $f \xrightarrow{t,[m/x]} [m/x].f^{t}$, (Sym1) cannot be applied to deduce

\[
f \mid g \xrightarrow{t,[m/x]} [m/x].f^{t} \mid g^{t}.
\]

Introducing substitution events allows us to distinguish between 0 and let(x). Both 0 and let(x) have transitions due to (Subst), e.g., with event $(0, [1/x])$. However, $[1/x].0^{0} = 0$ still has no observable transitions, while $[1/x].\mathit{let}(x)^{0} = \mathit{let}(1)$ publishes 1.

Example. In the example from Section 4, (Rtimer(3) ≫ M(x)) | N was shown to evaluate to expression ((0 ≫ M(x)) | M(x)) | 0, which had no further transitions. The rule (Subst) can now be applied, e.g., with event $(0, [1/x])$ to yield expression ((0 ≫ M(1)) | M(1)) | 0, which can be evaluated further.

Summary of notation. A summary of notation used in the sequel is shown in Fig. 7.
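The distinction between 0 and let(x) drawn in this section can be illustrated with a tiny expression model. This is a sketch under stated assumptions: only the two expressions 0 and let(·) are modelled, the time shift is ignored (the substitution event is taken at time 0), and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str        # a free variable such as x

@dataclass(frozen=True)
class Zero:
    pass             # the expression 0

@dataclass(frozen=True)
class Let:
    arg: object      # either a Var or a concrete value

def subst(m, x, e):
    """Apply [m/x] to e: replace the free variable x by the value m."""
    if isinstance(e, Let) and e.arg == Var(x):
        return Let(m)
    return e

def publications(e):
    """Values e can publish: let(v) publishes v once its argument is a value."""
    if isinstance(e, Let) and not isinstance(e.arg, Var):
        return [e.arg]
    return []
```

Before substitution neither `Zero()` nor `Let(Var("x"))` publishes anything; after applying `subst(1, "x", ...)` only the let expression publishes 1, which is the observable difference the substitution event exposes.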

Fig. 7. Summary of notation.

7. Identities

In this section, we list certain identities over arbitrary expressions (i.e., with or without free variables), some of them similar to the laws of Kleene algebra [16]. Proofs of the identities, using strong bisimulation, are given in the technical report [27].

(i) f | 0 ∼ f
(ii) f | g ∼ g | f
(iii) f | (g | h) ∼ (f | g) | h
(iv) f >x> (g >y> h) ∼ (f >x> g) >y> h, if x is not free in h
(v) 0 >x> f ∼ 0
(vi) (f | g) >x> h ∼ f >x> h | g >x> h
(vii)–(ix) (f | g) […] g) […] g, if x is not free in g; (f […]