Int. J. Business Process Integration and Management, Vol. 1, No. 2, 2006
Scalable peer-to-peer process management
Christoph Schuler* ERGON Informatik AG, CH–8008 Zurich, Switzerland E-mail:
[email protected] *Corresponding author
Can Türker Functional Genomics Center Zürich, CH–8057 Zurich, Switzerland E-mail:
[email protected]
Hans-Jörg Schek Swiss Federal Institute of Technology (ETH), CH–8092 Zurich, Switzerland University for Health Sciences, Medical Informatics and Technology (UMIT), Hall in Tyrol, Austria E-mail:
[email protected]
Roger Weber Platform Architecture, Credit Suisse, CH-8070 Zurich, Switzerland E-mail:
[email protected]
Heiko Schuldt Database and Information Systems Group, University of Basel, CH-4056 Basel, Switzerland E-mail:
[email protected]

Abstract: The functionality of applications is increasingly being made available by services. General concepts and standards such as SOAP, WSDL and UDDI support the discovery and invocation of single web services. State-of-the-art process management is conceptually based on a centralised process manager. The resources of this coordinator limit the number of concurrent process executions, especially as the coordinator has to persistently store each state change for recovery purposes. In this paper, we overcome this limitation by executing processes in a peer-to-peer way, exploiting all peers of the system. By distributing the execution and navigation costs, we achieve a higher degree of scalability, allowing for a much larger throughput of processes compared to centralised solutions. This paper describes our prototype system OSIRIS, which implements such a true peer-to-peer process execution. We further present very promising results verifying the advantages over centralised process management in terms of scalability.

Keywords: process management; peer-to-peer execution; service orientation; scalability; hyperdatabases (HDB); OSIRIS.

Reference to this paper should be made as follows: Schuler, C., Türker, C., Schek, H-J., Weber, R. and Schuldt, H. (2006) ‘Scalable peer-to-peer process management’, Int. J. Business Process Integration and Management, Vol. 1, No. 2, pp.129–142.

Biographical notes: Dr. Christoph Schuler is a Senior Software Engineer at Ergon Informatik AG, Zurich, Switzerland. He studied Computer Science at the Swiss Federal Institute of Technology (ETH) and received his PhD in 2004 from ETH Zurich for his work on peer-to-peer process management. Currently, he is working in the area of J2EE middleware and application frameworks, tooling for software factories, process management and architecture of large business applications.
Copyright © 2006 Inderscience Enterprises Ltd.
Dr. Can Türker studied Computer Science at the Technical University of Darmstadt, Germany, from which he received a Diploma in September 1994. From 1994 to 1999, he was a Teaching and Research Assistant in the database research group of Dr. Gunter Saake at the University of Magdeburg, Germany, from which he received his PhD in October 1999. From August 1999 to March 2005, he worked as a Senior Research Assistant in the database research group of Dr. Hans-Jörg Schek at ETH Zurich, Switzerland. From October 2001 to March 2002, he held a part-time professorship (as a temporary replacement) at the University of Constance, Germany. Since April 2005, he has been leading the Data Integration Group of the Functional Genomics Center Zurich.

Prof. Dr. Hans-Jörg Schek is Emeritus Professor of Computer Science at the Swiss Federal Institute of Technology (ETH), Zurich, where he led the Database Group between 1988 and 2005. Since 2002, he has also been a Professor at the University for Health Sciences, Medical Informatics and Technology (UMIT) in Hall, Austria. He holds a Diploma in Mathematics (University of Stuttgart, 1968) and a PhD (1972) and Habilitation degree (1978) in Engineering from the University of Stuttgart. The milestones of his career are the IBM Scientific Center Heidelberg (1972–1983), Full Professor at the Technical University of Darmstadt (1983–1988) and research sabbaticals at IBM Almaden Research (1987), at the ICSI Berkeley (1995) and at the Medical Informatics Department of the University of Heidelberg (1995). He has been an ACM Fellow since 2000.

Dr. Roger Weber studied Computer Science at the Swiss Federal Institute of Technology (ETH), where he graduated in September 1996. He received his PhD in 2000 from ETH. From 2001 to 2004, he was a Senior Researcher at the Database Research Group at ETH. In 2004, he joined the IT Architecture team at Credit Suisse, the second largest Swiss bank, with responsibilities in the areas of Java development, standardisation of platforms and optimisation of the technical infrastructure.

Dr. Heiko Schuldt is a Professor of Computer Science at the University of Basel, Switzerland, where he has led the Database and Information Systems group since 2005. Previously, he held a Professor position at the University for Health Sciences, Medical Informatics and Technology (UMIT) in Hall, Austria. He received a Diploma Degree in Computer Science from the University of Karlsruhe, Germany, in 1996 and his PhD from ETH Zurich, Switzerland, in 2001. From 2001 to 2003, he was a Senior Researcher at the Database Research Group of ETH Zurich.
1 Introduction
In the last few years, information technology has undergone several major changes. Especially the current trend towards service orientation, where services encapsulate and provide access to data and/or applications, has had a strong impact on information systems and middleware and has radically changed the way information processing takes place. In particular, web services that can be invoked by common web protocols (SOAP over HTTP) have led to the recent proliferation of service-oriented computing. System support for the invocation of single web services is widely available. However, one of the most important tasks when dealing with web services is to combine existing services into a coherent whole. Such applications spanning several (web) service invocations are usually realised by processes. The platform-independent definitions of XML (W3C, 1998), SOAP (2003) and WSDL (2001) further simplify such a composition. In many application scenarios, we can use processes for maintenance purposes to implement replication or to enforce consistency constraints over different sources. This is done automatically by triggering the execution of a process whenever a constraint violation occurs. Furthermore, it is also natural for such systems to implement accesses to data and queries from users by processes that gather the information via different service calls and aggregate the retrieved data according to the user’s demand. As a result, the infrastructure of such systems has to cope with large numbers of concurrent processes and to ensure the same quality of service as traditional database applications. For all these reasons, service orientation and process support come along with a set of new challenges:
• Dynamics: new services and service providers may enter or leave the system at any time and the system must keep track of such changes in parallel to service invocations and process execution.

• Optimal routing: the system needs sophisticated routing strategies for service invocations to distribute requests among service providers at run-time, using approximate knowledge about availability and load.

• Reliability and correctness: processes have to be executed reliably and with dedicated correctness guarantees even in the case of failures.

• Scalability: the infrastructure must scale with the number of services, processes and users. Clearly, centralised systems will soon reach their limitations. Therefore, in our approach, each peer of the community is equipped with a small software layer that is part of the overall infrastructure. This layer implements process support in a peer-to-peer fashion and thereby gets rid of a monolithic process management system.
In view of these challenges, a number of groups have started new projects (e.g. Infopipes (Pu et al., 2001), ObjectGlobe (Braumandl et al., 2001), ISEE (Meng et al., 2002), METUFLOW (Dogac et al., 1997), MARCAs (Dogac et al., 2000) or eFlow (Casati et al., 2000)) breaking out of conventional technology. At ETH Zurich, the hyperdatabase (HDB) vision (Schek et al., 2002a,b) was established several years ago with the objective to identify a new middleware infrastructure based on well-understood concepts evolving from database technology. While database systems handle data records, a HDB system deals with services and service invocations. Services in turn may use a database system. In short, an HDB takes care of optimal routing similar to query optimisation in a conventional database and it provides process support with transactional guarantees over distributed components using existing services as a generalisation of traditional database transactions (Schuldt et al., 2002). Most importantly and in contrast to traditional database technology, a HDB does not follow a monolithic system architecture but is fully distributed over all participating peers in a network. Every peer is equipped with an additional thin software layer, a so-called HDB layer, as depicted in Figure 1. The HDB layer extends existing layers like the TCP/IP stack with process-related functionality. As such, the HDB layer abstracts from service routing much like TCP/IP abstracts from data packet routing. Moreover, while the TCP/IP protocol guarantees correct transfer of bytes, the HDB layer guarantees the correct shipment of process instances. Of course, a distributed process infrastructure requires that each service provider locally installs this additional software layer. Ideally, this layer comes together with the operating system much like the TCP/IP stack does (comparable to the .NET framework, Microsoft.NET (2005)).

Figure 1 Peer-to-peer process execution
We have implemented a prototype system called OSIRIS (short for Open Service Infrastructure for Reliable and Integrated Process Support) (ETH Database Research Group, 2005) following these principles. While previous papers describe the HDB vision (Schek et al., 2002a), the process model (Schuldt et al., 2002) and the general architecture (Schuler et al., 2003), in the following we concentrate on the discussion and evaluation of OSIRIS as a reliable and scalable infrastructure for the management of services and processes. The contributions of this paper are:
• we describe how to provide process execution in a true peer-to-peer manner

• we present a solution with sophisticated failure handling mechanisms (note that failure handling in a distributed environment is more complex than in a central process engine)

• we present evaluation results with OSIRIS using a first benchmark and show significant scalability improvements compared to central solutions.
In the following, we summarise our earlier work on the topics of process models (Schuldt et al., 2002), execution guarantees (Schuldt, 2001) and the architecture of large-scale peer-to-peer systems offering web services (Schuler et al., 2003) in Section 2. Section 3 contains the details of our peer-to-peer process execution. Extending this previous work, we describe and evaluate true peer-to-peer process execution and discuss the advantages of this approach compared to traditional centralised solutions in terms of scalability (Section 4). We discuss related work in Section 5 and conclude with open problems and future work in Section 6.
2 Process model and architecture of OSIRIS
In this section, we summarise the OSIRIS architecture and discuss the main architectural decisions.
2.1 The OSIRIS process model
A process encompasses a set of activities. Each activity corresponds to an invocation of a service; this can either be a basic (web) service or again a process. On the set of activities, two different orders are defined. The partial precedence order specifies the regular order in which activities are executed (i.e. in which the associated services are invoked). An activity can only be executed when all its pre-ordered activities have successfully finished and when the conditions on its execution are fulfilled. As the precedence order is a partial order, intra-process parallelisation can be realised by parallel branches (fork/join). However, we assume a process to have (at least conceptually) always exactly one start activity and one final activity.1 We define the data flow with mappings from a data space in the process instance, called the whiteboard, to the service request parameters, and back from the response values to the whiteboard after execution. Note that this implies that each branch of the process has its own version of the whiteboard (no data flow between parallel branches is possible) and that these versions have to be merged when joining two or more branches. Our model also allows for loops and conditional execution (i.e. conditions associated with orders). In addition, the preference order specifies alternative executions that can be chosen when an execution path fails.
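To make the model concrete, the following Java sketch shows one possible in-memory representation of such a process definition. All class and field names are illustrative assumptions and are not taken from the actual OSIRIS code base.

```java
// Illustrative sketch of the process model described above (not the
// actual OSIRIS API): activities with termination semantics, a partial
// precedence order, a preference order and whiteboard data-flow mappings.
import java.util.*;

enum Termination { COMPENSATABLE, RETRIABLE, PIVOT }

class Activity {
    final String name;                       // e.g. "V", "W", "X", "Y"
    final String serviceType;                // logical type, bound late at run-time
    final EnumSet<Termination> properties;   // termination semantics (Section 2.1)

    Activity(String name, String serviceType, EnumSet<Termination> properties) {
        this.name = name;
        this.serviceType = serviceType;
        this.properties = properties;
    }
}

class ProcessDefinition {
    final Map<String, Activity> activities = new LinkedHashMap<>();
    // Partial precedence order: an activity may start only after all of
    // its predecessors have finished successfully.
    final Map<String, Set<String>> successors = new HashMap<>();
    // Preference order: ordered alternatives tried on failure, e.g. after
    // V prefer the branch starting with W over the one starting with X.
    final Map<String, List<String>> alternatives = new HashMap<>();
    // Data flow: whiteboard slots mapped to request parameters before a
    // call, and response values mapped back to the whiteboard afterwards.
    final Map<String, Map<String, String>> inputMappings = new HashMap<>();
    final Map<String, Map<String, String>> outputMappings = new HashMap<>();
}
```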
OSIRIS follows the model of transactional processes (Schuldt et al., 2002) that aims at providing transactional execution guarantees for processes. It is based on the flexible transaction model (Elmagarmid et al., 1990; Mehrotra et al., 1992; Zhang et al., 1994). Essentially, it considers the termination semantics of individual activities. Each activity is either compensatable, retriable or pivot. The effects of compensatable activities can be semantically undone after the corresponding service invocation has successfully returned. An activity is a pivot when it is not compensatable. Retriable activities are guaranteed to terminate correctly, even if they have to be invoked repeatedly; in this case, the last invocation succeeds while all previous invocations of this activity do not leave any effects.

On the basis of the termination guarantees and the precedence and preference orders, it can be verified already at build-time whether or not a process can be executed correctly. This is the case when all failures that may occur during the execution can be resolved by either complete compensation or partial compensation until a point is reached from where an alternative execution path can be followed (according to the preference order). All activities preceding the first pivot activity have to be compensatable. Each pivot activity in a process (there might be several pivots in a process) must be succeeded by at least one execution path that consists only of retriable activities, that is, an execution path whose correct execution can be guaranteed.

According to the model of transactional processes, a correctly defined process has the guaranteed termination property. This means that it is guaranteed that exactly one out of a set of possible execution paths is executed correctly (or the effects of all activities that have been invoked are completely undone) while all other execution paths do not leave any effects. As guaranteed termination allows choosing exactly one out of possibly several executions, it guarantees the well-known all-or-nothing semantics of transactions. OSIRIS comes together with O’GRAPE (Weber et al., 2003), our Java-based process modelling tool that supports a designer in defining and validating transactional processes.
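The build-time criterion can be sketched in code for a single candidate execution path; this is a simplification of the full verification, which considers all paths of the partial order, and it reuses the illustrative types from the previous sketch.

```java
// Simplified build-time check for one candidate execution path: every
// activity before the first pivot must be compensatable, and everything
// after a pivot must be retriable, so that the path is guaranteed to
// terminate once the pivot has committed.
static boolean hasGuaranteedTermination(List<Activity> path) {
    boolean pivotPassed = false;
    for (Activity a : path) {
        if (pivotPassed) {
            // After the pivot, the remainder of this path must consist
            // of retriable activities only.
            if (!a.properties.contains(Termination.RETRIABLE)) return false;
        } else if (a.properties.contains(Termination.PIVOT)) {
            pivotPassed = true;   // point of no return reached
        } else if (!a.properties.contains(Termination.COMPENSATABLE)) {
            return false;         // the prefix must be undoable
        }
    }
    return true;
}
```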
2.2 Principles of the architecture
The architecture of OSIRIS is driven by the goal of implementing a true peer-to-peer engine for executing transactional processes. Thereby, it follows our HDB vision: process execution involves only those machines of the network that offer a service required by the process definition. There should be no central synchronisation and navigation peer to drive the process instances. As we will see, true peer-to-peer process execution is strongly related to efficient meta-data management and replication.

The architecture of OSIRIS is split into two parts. Firstly, each peer of the network runs a HDB layer that provides local functionality for process instance navigation and routing as envisioned in the introduction. Secondly, OSIRIS runs a number of global system services for meta-data management. These repositories hold information about process definitions, subscription lists, service providers, load information, etc. (cf. Figure 1, middle box). They store meta-information in semi-structured documents, which are essential to run the system. However, the HDB layers that actually execute processes should not have to query meta-information from the global repositories. Rather, a push mechanism replicates those parts of the meta-information towards the HDB layers that they require to fulfil their tasks. For instance, if a peer is involved in executing process P1, the definition and any changes to that definition are pushed from the process repository to the HDB layer. This dramatically reduces the probability that this information has to be requested at run-time. From another perspective, the HDB layers perform their tasks based mainly on local versions of the global meta-information. This decouples process execution from meta-data management and allows for efficient distributed process execution.

Moreover, it is often sufficient to hold approximate versions of meta-data: for instance, load information of peers is required by the HDB layer to balance the process workload among the available service providers. As this is an optimisation, it is sufficient if local load information is only approximately accurate. Note that this does not lead to incorrect execution; in the worst case, the load is not optimally distributed among all service providers. On the other hand, changes of process definitions have to be propagated immediately. However, even this information can be replicated in a lazy way, as process instances are explicitly bound to a dedicated version of a process definition.
2.3 Replication of meta-information
It is important to understand that the global meta-data repositories are not directly involved in the execution of processes. They just maintain meta-information and distribute this information to the HDB layers when needed. In previous work, we have introduced two measures to reduce the amount of data being transferred to the peers (see Schuler et al. (2003) for more details).

Firstly, we use a publish/subscribe scheme with path predicates on the XML document for replication. The path predicate of the subscription selects those parts of the document that a peer locally requires. For instance, assume that there are two processes P1 and P2, but an HDB layer is only involved in the execution of process P1, that is, none of the services required by process P2 is provided by this peer. Obviously, this HDB layer only subscribes for the portion of process meta-data that includes the definition of P1 but not the one of P2. The same holds true for other pieces of meta-information: an HDB layer only requires load information of peers it eventually will send requests to. The publish/subscribe scheme then ensures that changes at the central XML document are transferred if and only if they are covered by the path predicate. In the previous example, the process repository would not push updates of process definition P2 to that specific HDB layer.

Secondly, we use so-called freshness predicates to further reduce the number of publications. Freshness predicates are especially useful for the replication of very dynamic meta-data like load information: small changes of the load of a peer are not significant for load balancing. To avoid such unnecessary updates, each subscription at a meta-data repository further comprises a freshness predicate that defines under what circumstances a new data item has to be published. An ‘eager’ freshness, for instance, denotes that the repository has to publish every change as soon as possible.2 In the case of load information, a freshness predicate may denote that updates are only published if the difference between the global version and the local version is above 20%. Conceptually, the freshness predicate is a piece of code sent with the subscription that decides, based on the global version and the local version at the peer, whether the update has to be published. As an optimisation, the global repositories keep track of the current states of the subscribed peers and of the changes performed meanwhile.
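Since a freshness predicate is conceptually a piece of code shipped with the subscription, it can be sketched as follows; the interface and constant names are illustrative assumptions, not OSIRIS interfaces.

```java
// Illustrative sketch of freshness predicates shipped with a subscription.
interface FreshnessPredicate<T> {
    // Decide, given the repository's current value and the subscriber's
    // replicated value, whether an update must be published.
    boolean mustPublish(T globalVersion, T localVersion);
}

class Freshness {
    // 'Eager' freshness: publish every change as soon as possible.
    static final FreshnessPredicate<Double> EAGER =
        (global, local) -> !global.equals(local);

    // Load freshness: publish only if the global and the local load
    // differ by more than 20%, as in the example above.
    static final FreshnessPredicate<Double> LOAD_20_PERCENT =
        (global, local) ->
            Math.abs(global - local) > 0.20 * Math.max(Math.abs(local), 1e-9);
}
```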
2.4 Dynamics of OSIRIS
The architecture of OSIRIS takes the dynamics of a large-scale distributed computing environment into account. Among others, it handles the following dynamics:

• service providers may enter or leave the network

• new services can be offered, existing ones may be temporarily or even permanently unavailable

• new processes can be defined, old ones deleted and

• the load of the service providers is changing continuously.
As mentioned before, OSIRIS implements a sophisticated replication technique to ensure the consistency and freshness of the corresponding meta-information over the entire network. In the following, we illustrate this technique for the provision of a new service instance of a service type B. As depicted in Figure 2, the following steps are performed (a code sketch of these steps follows at the end of this subsection):

1 The service provider (peer) registers the new service instance by invoking a ‘registration’ service that comes with the HDB layer.

2 From its configuration parameters, this registration service knows how to publish the new service over the network. It will offer the new service instance by subscribing for processes in which the corresponding service type appears. This subscription is performed on the process repository.

3 As response, the peer will receive the part of all process definitions in which the service B is involved. These parts contain the information about the activities that have to be executed after the execution of B.

4 To be able to trigger this next activity directly in a peer-to-peer style, the peer must know which peers provide a corresponding service instance. For that, the peer implicitly subscribes for providers of instances of that service type (in our example C). This subscription goes to the (service) subscription repository.

5 The peer will then receive a list of corresponding service providers. In our example, these are the peers 3 and 4.

6 In case of several service providers, the peer may choose the most appropriate depending on various parameters. An important parameter is the load of the corresponding peers. Therefore, the peer subscribes for this load information.

7 The peer will then receive the load information of the corresponding peer(s).

Figure 2 Replication of meta-data
If a service instance is not offered anymore, it has to be deleted from the subscription repository. This deletion will then be propagated to the subscribers of this service such that they can update their local information about service providers and their load. If a service instance or a service provider goes offline for a while, it distributes this information in the same way such that the corresponding peers are aware of this fact when they choose a service provider to invoke the next activity of a process.

If a change occurs in the process repository – due to an insertion, update or deletion of a process definition – all service providers that are affected by this change are determined via the subscriptions and are informed about it. Assume that process P1 is deleted. Then, all peers that provide an instance of a service that corresponds to an activity in P1 will receive the information about this deletion. Consequently, they will remove all replicated data related to P1. Analogously, if a new process definition is inserted into the system, the peers that provide a service that corresponds to an activity in this new process will receive the relevant process part. If this process part includes a service for which no data is already available at that peer, steps 4–7 of the procedure described above will be triggered implicitly.
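The registration steps 1–7 above can be summarised in the following hedged sketch. The repository interfaces and the path-predicate strings are illustrative assumptions; the paper does not specify OSIRIS’s concrete subscription syntax.

```java
// Hedged sketch of registration steps 1-7 (all names illustrative).
import java.util.List;

interface Repository {
    void subscribe(String peerId, String pathPredicate);  // push-based replication
}

class RegistrationService {
    private final Repository processRepo, subscriptionRepo, loadRepo;

    RegistrationService(Repository processRepo, Repository subscriptionRepo,
                        Repository loadRepo) {
        this.processRepo = processRepo;
        this.subscriptionRepo = subscriptionRepo;
        this.loadRepo = loadRepo;
    }

    void register(String peerId, String serviceType, List<String> successorTypes) {
        // Steps 1+2: offer the new instance by subscribing at the process
        // repository for all process parts in which the service type appears.
        processRepo.subscribe(peerId,
            "//process//activity[@type='" + serviceType + "']");
        // Step 3: the repository pushes the matching process parts back (not shown).
        for (String successor : successorTypes) {
            // Steps 4+5: subscribe for providers of each successor service type.
            subscriptionRepo.subscribe(peerId,
                "//provider[@type='" + successor + "']");
            // Steps 6+7: subscribe for the load information of those providers.
            loadRepo.subscribe(peerId,
                "//load[@provider-type='" + successor + "']");
        }
    }
}
```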
3 Peer-to-peer process execution
As described earlier, we deploy a true peer-to-peer system for process navigation, but gather and maintain metadata in centralised repositories (although such a repository may
run on a cluster to distribute the load). To benefit from the peer-to-peer approach in terms of scalability, OSIRIS completely separates process navigation from metadata replication. This section describes in detail our peer-to-peer-based process execution system. Especially, we want to address the following key problems:

• How do peers know where to migrate a process instance to next? What information is needed for this?

• How do peers execute processes? What minimal information is needed to reduce replication costs?

• How can the system overcome severe failures while still providing transactional guarantees?
3.1 Routing by late service binding
To benefit from load balancing over different service providers, application designers do not encode the service bindings of activities at development time. Rather, they only specify the type or class of a service to be invoked together with its parameters. The concrete service binding is selected at run-time depending on the load of machines and the costs of invoking a particular service instance. For this purpose, OSIRIS requires each service provider to subscribe for task requests it can fulfil. Alternatively, an application developer can register these subscriptions on behalf of the service provider. Each subscription contains two XSLT scripts to map the generic interface of a service type to the concrete interface of the service provider, as the concrete interface may have different parameters and result values. At run-time, the HDB layer selects a service provider based on costs, parameters and conditions on its services. This is similar to the work on automatic service discovery, with the exception that OSIRIS requires, for performance reasons, that service discovery is pre-computed and repeatedly updated for each activity in each process type. However, selection among the discovered services is still performed at execution time.

In addition to service selection, process navigation and thus instance migration also requires the selection of a peer where the next activity should be executed. In many cases, we may want to replicate a service over the entire network following the ideas of grid computing (Foster and Kesselman, 2004). Hence, it is not enough to select only a concrete service type. We also have to select an appropriate peer to execute that service. OSIRIS uses traditional load balancing methods to select a peer, trying to advance process instances as fast as possible.

The tasks described above require that each HDB layer knows about subscriptions of service providers to service types (including the mappings) and the load of machines providing these services. OSIRIS maintains this information in centralised repositories (see Figure 1) and uses its inherent replication feature to distribute this meta-information over the network (see Figure 2). We can deploy path predicates and freshness predicates to dramatically reduce the number of updates propagated by the repositories to the peers.
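The run-time selection can be sketched as follows: a peer picks a provider for the next activity from its locally replicated subscription and (approximate) load information. All types and names are illustrative assumptions.

```java
// Sketch of late service binding from locally replicated metadata.
import java.util.*;

class Provider {
    final String peerId;
    volatile double approximateLoad;  // replicated lazily, may be slightly stale

    Provider(String peerId, double approximateLoad) {
        this.peerId = peerId;
        this.approximateLoad = approximateLoad;
    }
}

class LateBinder {
    // service type -> candidate providers, pre-computed by the replication
    // mechanism (subscriptions plus XSLT interface mappings, not shown).
    private final Map<String, List<Provider>> candidates = new HashMap<>();

    // Pick the least-loaded provider at execution time. Stale load values
    // only degrade balancing, never correctness.
    Optional<Provider> bind(String serviceType) {
        return candidates.getOrDefault(serviceType, List.of()).stream()
                .min(Comparator.comparingDouble(p -> p.approximateLoad));
    }
}
```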
3.1.1 Communication
Once the HDB layer has identified the next peer(s) to migrate the current process instance to, it deploys the Two-Phase Commit (2PC) protocol (Özsu and Valduriez, 1999) to ship the instance data. If transportation is not possible, for instance because of a disconnection of a mobile device or because of network problems, the HDB layer tries to find an alternative peer. If this also fails, the instance data remains at the peer until a suitable service becomes available or, if alternative branches in the process model exist, process execution is continued with an alternative branch. Note that the 2PC protocol is rather simple as it involves exactly two peers. It guarantees that instance data is neither lost nor duplicated in case of network, hardware or software problems. The only exception is the case that a peer completely disappears. Then, all the process instances that are currently executed on that peer would also disappear and hence would not be able to finish. Such situations are difficult to resolve automatically, as the difference between a normal disconnection of a mobile device and the permanent disconnection of a peer is hard to detect.
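The following hedged sketch shows the essence of this two-party 2PC; the types and calls are illustrative, not the OSIRIS wire protocol. The key invariant is that the instance is deleted at the source only after the destination has durably accepted it.

```java
// Hedged sketch of migrating a process instance between exactly two peers.
interface PeerEndpoint {
    boolean prepare(String instanceId, byte[] instanceData); // durably stage data
    void commit(String instanceId);                          // activate the instance
}

class InstanceMigrator {
    boolean migrate(String instanceId, byte[] instanceData, PeerEndpoint destination) {
        if (!destination.prepare(instanceId, instanceData)) {
            return false;                // caller picks another peer or delays
        }
        logCommitDecision(instanceId);   // forced to the local persistent queue;
                                         // recovery re-sends commit after a crash
        destination.commit(instanceId);  // destination takes ownership
        deleteLocal(instanceId);         // the instance now exists exactly once
        return true;
    }

    private void logCommitDecision(String id) { /* write durable commit record */ }
    private void deleteLocal(String id)       { /* remove from local queue */ }
}
```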
3.1.2 Fork/Join problem
A fork is simply the migration of a process instance to different peers (or the same peer, but executing independent activities). However, to join these branches, the peers executing the last activity of a branch must know where to meet the other branch(es). Of course, a costly broadcast communication protocol to detect the partners of the join should be avoided. Instead, OSIRIS assigns each process instance a dedicated, reliable peer for joining branches at instance creation time. The join is performed by an additional join activity J, which is automatically inserted before the first activity K that operates on the joined control flow. The service of this join activity is stateful and decides when to advance the instance to activity K (e.g. some branches may have conditions or denote unused alternatives; see below). It also merges the whiteboards of the different branches according to the process model. Afterwards, the merged process instance is routed to a peer providing the service that implements activity K. Note that each join in a process instance is handled by the same peer. However, different instances can of course have different join nodes.
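A minimal sketch of such a stateful join service follows; the names are illustrative, and the trivial putAll merge stands in for the model-defined whiteboard merge rules.

```java
// Sketch of the stateful join activity J assigned to a reliable peer at
// instance creation time: it buffers arriving branches and releases the
// merged instance once all expected branches have arrived.
import java.util.*;

class JoinActivity {
    private final int expectedBranches;
    private final List<Map<String, Object>> arrived = new ArrayList<>();

    JoinActivity(int expectedBranches) {
        this.expectedBranches = expectedBranches;
    }

    // Called when one branch reaches the join. Returns the merged
    // whiteboard once the last branch has arrived, empty otherwise.
    synchronized Optional<Map<String, Object>> branchArrived(Map<String, Object> whiteboard) {
        arrived.add(whiteboard);
        if (arrived.size() < expectedBranches) {
            return Optional.empty();
        }
        Map<String, Object> merged = new HashMap<>();
        for (Map<String, Object> wb : arrived) {
            merged.putAll(wb);        // simplification of the model's merge rules
        }
        return Optional.of(merged);   // route on to activity K afterwards
    }
}
```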
3.2 Process execution plan and execution units
In principle, each peer of the network could replicate all process information. However, this would require large amounts of data to be replicated over the entire network. Rather, we only want to have that information on a peer that is needed to drive the execution of those process instances that potentially visit this peer. Our approach within OSIRIS divides each process execution plan into a set of execution units. Each execution unit contains the information to execute the corresponding activity and to navigate the process depending on the result of the local service invocation. A peer only subscribes itself for those execution units of processes that invoke a locally available service. Consequently, the amount of replicated data is significantly reduced. When a designer
deploys a new process definition, the peers that may execute one of its activities automatically receive the corresponding execution units.

OSIRIS distinguishes between a logical and a physical system view. Process definition takes place with respect to the logical view: processes are defined over a set of existing service types. However, as OSIRIS aims at realising a completely distributed process execution, an optimised form of process definition is required. Therefore, OSIRIS introduces the process execution plan, which reflects the physical view of the architecture and which consists of a sequence of execution units. This plan defines the navigational behaviour of a process instance while executed on a peer-to-peer network. It materialises all navigation rules including normal forward navigation as well as failure handling and the synchronisation of parallel execution paths.

The generation of the process execution units involves the two phases sketched in Figure 3. In the first phase, the process definition is transformed into the corresponding process execution plan. For this, the transformation algorithm relies on a graphical representation of the process definition. It generates a new graph, which corresponds to the process execution plan. A second phase finally generates a set of execution units. Each process has a single source activity (α) as well as a single end activity (ω). These activities represent the instantiation and the termination of the process instance, respectively. Furthermore, in a distributed environment, parallel execution paths have to be joined to synchronise the parallel execution of branches. This is achieved by a special join activity (as discussed in the previous subsection). The process definition depicted in Figure 3 contains an alternative execution path starting with activity X, which is executed if the execution path starting with activity W fails.

A process execution plan materialises all relevant decisions for process instance navigation. The navigation mainly has to react on three types of activity events:

• Finished: the activity call has been executed successfully. This event is represented by an ε-edge in the process execution plan (regular control flow).

• Failed: the execution of the activity has failed. This requires the (re)start of the same activity if possible; otherwise, the previously finished activities must be compensated. These options are represented by a ϕ-edge or a ←ϕ-edge, respectively.

• Compensated: the compensation of the activity is completed. A κ-edge initiates the start of an activity of an alternative execution branch, while a ←κ-edge continues compensation (in backward order with respect to the ε-edges).
Consider again the process definition depicted in Figure 3: the activity V is followed by the activities W and X. The dashed arrow between the two edges from V to W and from V to X denotes the preference order of the execution paths, that is, if service W fails, the alternative path over service X is followed. The specified transactional properties of the activities are depicted to the right of each activity: all activities of the process definition are specified as compensatable; activity X is, in addition, retriable. The derivation of the process execution plan depicted in the centre of Figure 3 is quite simple for the activities V, W and X. The problem arises at Y: the join expects two incoming edges. However, only one of them arrives at Y, as the analysis of the process definition shows that the edges from W to Y and from X to Y exclude each other. In other words, no special join activity has to be inserted, because no process instance must be synchronised at this point. This is marked by a corresponding sign above activity Y in Figure 3.

In our example, activity W is preferred to X. Therefore, a process navigator would decide to execute W right after V. The process execution plan respects this fact by adding an ε-edge from V to W. Note that no ε-edge from V to X exists in the process execution plan, since X is never started right after V. However, a ϕ-edge from W to X exists, which starts X after W has failed. After X has finished, Y is started. Assume that Y finally fails. Then, the navigation would follow the ←ϕ-edge to compensate X. After this, the ←κ-edge continues the compensation at V. Remember that W has failed, and therefore the navigation can directly move back from X to V. Finally, if the compensation reaches activity α, the complete process instance is compensated.

Figure 3 Decomposition of a process definition into a set of distributed process execution units (left: the process definition with activities α, V, W, X, Y, ω and their compensatable/retriable annotations; centre: the process execution plan with ε-, ϕ- and κ-edges generated in the first phase; right: the process execution units generated in the second phase)
3.2.1 Generation of the process execution plan
Run-time process execution relies on a correct process execution plan. However, specifying a process execution plan directly is out of the scope of a user defining a process. This task has to be performed by the system itself at process deployment time. Therefore, the user can model a process on a global scope, using a graphical interface such as O’GRAPE (a process modelling tool we have built for OSIRIS) (Weber et al., 2003) or any similar front-end that produces a specification for transactional processes. The process definition service will generate a process execution plan for each process definition in the system. The following algorithm transforms a correct process definition into a correct process execution plan (a condensed code sketch follows after the correctness discussion below):

1 All activities defined in the process definition are taken over as nodes in the process execution plan.

2 The existing edges in the process definition are copied as ε-edges to the process execution plan.

3 The process execution plan materialises the process instance initialisation. This is done by adding an additional activity α at the root of the process execution plan.

4 An activity ω is needed to terminate the process instance. An ε-edge is inserted that connects each existing end-activity of the process definition with ω.

5 For each retriable activity, a ϕ-edge is inserted that points to itself. This will restart the activity in case of a failure.

6 For each ε-edge that connects a compensatable and a non-retriable activity, a ←ϕ-edge in the opposite direction is inserted.

7 For each ε-edge that connects two compensatable activities, a ←κ-edge in the opposite direction is inserted.

8 In case of a preference order, the activity that spans the alternative paths triggers the alternative with the highest preference. If this alternative fails, it directly triggers the next ordered alternative and so on. Thus, the ε-edges from the activity that spans the alternative paths to the activities that are not the first alternative are never used. Consequently, they are removed from the process execution plan. Besides, the outgoing ←ϕ-edge of each of these alternatives is replaced by a ϕ-edge that directly connects the totally ordered alternatives, such that an alternative triggers the next ordered one in case of a failure. To handle compensation, κ-edges are added analogously.

9 A failure of a non-retriable activity within a parallel section will hinder the current path from reaching the next join node. Therefore, an additional η-edge has to be inserted for these activities. This edge points to the subsequent join activity. In the case of a failure, this edge ensures that a notification message is sent to the join activity, which will then initiate the compensation of the parallel paths.
The main property of a correct process definition is guaranteed termination (Schuldt et al., 2002). Each valid execution path guarantees that the system reaches a consistent state when the path is executed completely. This property must be preserved by any transformation: the correctness of the process execution plan is an essential criterion for a system-wide guaranteed execution. The correctness of the algorithm presented above can be proven in two steps, showing that the generated plan is i) complete and ii) free of valid execution cycles.

A complete process execution plan handles all possible execution events (finished, failed and compensated) by corresponding edge types; each activity features the corresponding outgoing edges as depicted in Figure 3. A complete process execution will reach a well-defined end state if there are no cycles. The execution will end at activity ω, as ω is the only activity having no further forward navigation edge ε.

In addition, no valid execution cycles must exist. Process navigation follows strict rules: the current state of an activity defines the edge type the navigation has to follow next (ε, ϕ or κ), which results in a strict order of edge types. Therefore, although cycles containing different edge types might be present in the graph, they cannot be followed during a correct execution, as certain state changes, for example, directly from finished to failed, are not allowed. Taking this into account, Figure 3 features no valid execution cycle. A complete process execution plan that contains no valid cycle is correct. The proof presented by Schuler (2005) shows the correctness of the algorithm in detail.
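The condensed sketch below covers steps 1–8 of the algorithm; step 9 (the η-edges for parallel sections), the removal of unused ε-edges in step 8 and the κ-edges for alternatives are omitted for brevity. The accessors on ProcessGraph are illustrative assumptions, not OSIRIS interfaces.

```java
// Condensed, hedged sketch of the plan-generation algorithm above.
import java.util.*;

enum EdgeType { EPSILON, PHI, PHI_BACK, KAPPA_BACK }

record Edge(String from, String to, EdgeType type) {}

interface ProcessGraph {
    Set<String> activities();
    Set<Edge> precedenceEdges();            // already typed as EPSILON (step 2)
    Set<String> rootActivities();
    Set<String> endActivities();
    boolean isRetriable(String activity);
    boolean isCompensatable(String activity);
    List<List<String>> preferenceChains();  // ordered alternatives (step 8)
}

class PlanGenerator {
    List<Edge> generate(ProcessGraph def) {
        List<Edge> plan = new ArrayList<>(def.precedenceEdges());  // steps 1+2
        for (String root : def.rootActivities())                    // step 3
            plan.add(new Edge("alpha", root, EdgeType.EPSILON));
        for (String end : def.endActivities())                      // step 4
            plan.add(new Edge(end, "omega", EdgeType.EPSILON));
        for (String a : def.activities())                           // step 5
            if (def.isRetriable(a)) plan.add(new Edge(a, a, EdgeType.PHI));
        for (Edge e : def.precedenceEdges()) {                      // steps 6+7
            if (def.isCompensatable(e.from()) && !def.isRetriable(e.to()))
                plan.add(new Edge(e.to(), e.from(), EdgeType.PHI_BACK));
            if (def.isCompensatable(e.from()) && def.isCompensatable(e.to()))
                plan.add(new Edge(e.to(), e.from(), EdgeType.KAPPA_BACK));
        }
        for (List<String> chain : def.preferenceChains()) {         // step 8
            for (int i = 0; i + 1 < chain.size(); i++)              // ϕ-chain of
                plan.add(new Edge(chain.get(i), chain.get(i + 1),   // alternatives
                                  EdgeType.PHI));
        }
        return plan;
    }
}
```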
3.2.2 Partitioning the process execution plan
After the generation of the process execution plan, it can be partitioned into execution units, which contain all information needed to execute a single activity and to continue navigation. These execution units are depicted on the right-hand side of Figure 3. An execution unit holds all information about the activity and the local service call. It also contains information about the data flow: service call parameters are initialised, when a service is invoked, using information from the whiteboard of the process instance (process instance data), and after the successful execution, result parameters are stored back to the process whiteboard. Moreover, as an execution unit contains the part of the process execution plan that is relevant for local navigation to the next service provider, control is transferred to this provider, which then locally invokes the subsequent service.
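The following sketch shows what a peer’s HDB layer conceptually does with an execution unit: map whiteboard data to the service call, invoke the local service, store the results back and migrate along the matching edge of its plan fragment. All types and method names are illustrative assumptions.

```java
// Sketch of local navigation driven by an execution unit.
import java.util.Map;

interface ExecutionUnit {
    Map<String, Object> toRequest(Map<String, Object> whiteboard);  // input mapping
    Map<String, Object> invokeLocalService(Map<String, Object> request) throws Exception;
    void storeResults(Map<String, Object> response, Map<String, Object> whiteboard);
    String successorOnFinished();  // target of the local ε-edge
    String successorOnFailed();    // target of the local ϕ- or ←ϕ-edge
}

class LocalNavigator {
    void execute(ExecutionUnit unit, Map<String, Object> whiteboard) {
        try {
            Map<String, Object> response =
                unit.invokeLocalService(unit.toRequest(whiteboard));
            unit.storeResults(response, whiteboard);
            migrateTo(unit.successorOnFinished(), whiteboard);  // regular control flow
        } catch (Exception failure) {
            migrateTo(unit.successorOnFailed(), whiteboard);    // retry, alternative
        }                                                       // or compensation
    }

    private void migrateTo(String activity, Map<String, Object> whiteboard) {
        /* late service binding plus 2PC migration, as in Section 3.1 */
    }
}
```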
3.3 Failure handling
Failure handling in a distributed environment is more difficult than in a centralised one. The main reason is that there is no global knowledge about where a process instance is currently executed. In addition, costly broadcast messages to detect the whereabouts of an instance have to be avoided. In OSIRIS, we handle failures at three different levels, where the first two levels also apply for centralised process management.

At the first level, the process model, we provide alternatives, compensation and restarting of activities to react to failures. Our modelling tool O’GRAPE verifies the correctness of a process based on the transactional process model. If the definition is correct, our navigation model guarantees that process execution ends in a well-defined final state regardless of failures of service invocations. For the compensation of activities, we further have to keep a history of where an activity has been executed. Compensation is then routed back on the same path along which the process instance has advanced.

At the second level, the HDB layer, persistent queues and the 2PC protocol guarantee that state changes of process instances and results of services are made persistent and are recoverable,3 that is, queued transactions (Bernstein and Newcomer, 1997) are supported. If the HDB layer fails (hardware or software problem), it recovers the current state of its active process instances using traditional database technologies, re-instantiates service calls if necessary and continues with the process navigation of instances currently residing at this peer.

At the third level, the global level, we have to deal with the disconnection of peers. A disconnection can have several reasons: network splits or failures, peer crashes, disconnection of mobile devices or permanent removal of peers. When migrating a process instance from one peer to another, a 2PC protocol between the two peers guarantees that the instance data is safely transferred. If migration fails, the source peer simply selects another destination for the migration, or delays migration if the failed peer was the only provider of that service. More difficult to handle is the situation when a peer permanently disappears or is disconnected for too long a period. In OSIRIS, we allow application developers to specify timeouts and assign special observer peers to watch critical peers. Critical peers are non-reliable peers that are likely to disconnect, for example, mobile devices. If a critical peer disconnects and the timeout is reached, the process instance is migrated to another currently available peer. When the formerly disconnected peer reconnects, it is informed by the observer that its version of the process instance is no longer valid. However, this approach does not work in all cases. For example, a non-compensatable activity cannot have a timeout, as we cannot invalidate the process instance if the peer reconnects and has already executed the activity. Again, O’GRAPE supports the application developer in validating the correctness of the process model.
4 Performance evaluations
In this section, we investigate the potential of peer-to-peer process execution with a set of experiments. The goal is to show that, at least for a basic setup and workload, peer-to-peer execution scales better than a centralised approach and that metadata management does not hinder scalability. In addition, we show that neither distributed navigation nor late service binding negatively affects process response times.
4.1 Evaluation settings
To examine the scaling potential of peer-to-peer process execution in OSIRIS, we compare the following three configuration settings:

1 Central: this setting contains a dedicated (centralised) process management peer. All process coordination, navigation and system metadata management takes place at that peer. This represents a typical set-up with a coordination of distributed (web) services by a central process management system such as IBM’s WebSphere Application Process Choreographer (IBM, 2005). Because of the limited resources of the dedicated peer, process management cannot scale with respect to an increasing number of peers. This set-up provides us with a lower bound for scalability.

2 Replicated: this configuration realises a complete replication of the process manager on each peer of the network. Note that this set of peers cannot be seen as one process management system, as all navigation on the peers is completely independent. However, the scaling characteristics of this setting seem to be optimal. Therefore, this setting provides us with an upper bound for scalability.

3 Peer-to-Peer: finally, OSIRIS’s peer-to-peer process execution is performed in this setting. Each peer can instantiate and advance a process using the peer-to-peer approach described in Section 3.
We have enhanced our prototype OSIRIS to be able to also act as a centralised process management system. Therefore, all three configurations can be realised by the same process execution infrastructure. They share the same implementation, modules and configurations and differ only in the way they execute processes. In this way, the results reflect the plain effects of the different architectures.

Our experimental set-up determines the scalability characteristics of the three approaches under investigation. We use a linear process invoking generic activities that are available at all peers. This process corresponds to a typical short-running data request process of clients. Each activity accrues costs of 2 sec on average and the overall execution cost of a process is 10 sec on average (five activities on average). Clients are simulated with a queuing model triggering a process instantiation every t ms on average. To visualise the scalability characteristics, we decrease t every minute by 25%. Thereby, the load of the system steadily increases and, at some point, we may observe an overload situation, that is, processes are significantly delayed due to full work queues at the peers and at the process manager. This experiment is repeated for different numbers of peers. Theoretically, the system should perform better the more peers are available for activity execution, that is, we should observe the break-down at a later point in time.
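A client simulation of this kind could look as follows. The sketch assumes exponentially distributed inter-arrival times as a plausible reading of the queuing model, and it raises the request frequency by 25% per minute, matching the description in Section 4.2; all names are illustrative.

```java
// Sketch of the client load generator: one process instance is started
// every t ms on average, with the frequency ramped up every minute.
import java.util.Random;

class LoadGenerator implements Runnable {
    private double meanInterArrivalMs;        // the parameter t
    private final Random rng = new Random();
    private final Runnable startProcessInstance;

    LoadGenerator(double initialT, Runnable startProcessInstance) {
        this.meanInterArrivalMs = initialT;
        this.startProcessInstance = startProcessInstance;
    }

    @Override
    public void run() {
        long nextRampUp = System.currentTimeMillis() + 60_000;
        while (!Thread.currentThread().isInterrupted()) {
            startProcessInstance.run();
            try {
                // Exponential waiting time with mean t.
                long wait = (long) (-meanInterArrivalMs * Math.log(1 - rng.nextDouble()));
                Thread.sleep(wait);
            } catch (InterruptedException e) {
                return;
            }
            if (System.currentTimeMillis() >= nextRampUp) {
                meanInterArrivalMs /= 1.25;   // +25% request frequency per minute
                nextRampUp += 60_000;
            }
        }
    }
}
```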
4.2 Scalability of peer-to-peer navigation
The main focus of the following comparisons lies on the number of concurrent processes each system configuration
can handle without a dramatic increase of the overall process execution time. In our evaluation setting, a number of client gateways simulate process execution requests. For the measurements, we use a set of different process types. These process types consist of four to six activities. Each activity is mapped onto a set of simple function calls, which implement an idle wait. The overall waiting time for every process instance is 10 sec. To determine the maximum number of concurrent process instances, the client gateways start processes with increasing frequency. As depicted in Figure 4, the measurement starts with 60 processes per minute. This frequency is kept stable for the first minute. Then, it is increased by 25% each minute. After 10 min, the frequency of process executions reaches 550 processes per minute.

Figure 4 Graph for peer-to-peer configuration

Figure 4 shows the results for the peer-to-peer configuration from 2 to 24 peers. A perfect system would execute all started processes immediately; its graph would therefore be close to the graph for the number of started processes. However, in reality, if the number of processes increases, the number of executed processes per minute reaches a maximum. The peer-to-peer configuration on 8 peers cannot perform more than 200 processes per minute. When starting even more than 200 processes per minute, the system tries to distribute the load over all peers. However, this hinders the execution of already running instances. Therefore, the number of executed processes decreases after the maximum. To find this maximum number of processes per minute, we have evaluated the overall execution time of each process instance. In the beginning, all processes can be executed immediately. However, when the request frequency increases, at one point in time the system cannot execute the process instances in a reasonable time anymore. At this point, the execution times of the processes grow significantly. Therefore, the current frequency is the maximum to be handled by the configuration tested.

Figure 5 depicts the throughput for all three configurations from 1 to 64 peers. The figure nicely shows the scaling limitation of the central setting. Although the centralised setting is capable of improving significantly from 2 to 8 peers, it does not scale any further beyond 8 peers. The reason for this behaviour is the central process navigation: it only allows for a limited number of concurrent process executions (logging, auditing, persistence, navigation, etc.) at a time. The throughput of the centralised setting could be improved by using faster machines or by using a clustered approach such as in IBM WebSphere (IBM, 2005). However, the limitation on the number of concurrent processes always remains regardless of how many peers we add to the system.

Figure 5 Comparing different architectures
The peer-to-peer setting, on the other hand, appears to scale almost perfectly with the number of peers in the network. In fact, it is comparable to the completely replicated setting, although it realises a distributed process management system that acts as one instance. If the workload becomes too large, we can simply increase the number of peers to cope with the additional load. In the central solution, one has to replace the central machine with a much faster one to cope with an increased workload (if such a machine exists).
4.3 Dynamic changes of system configuration
The execution of process instances is distributed in OSIRIS over the peer-to-peer network. This allows for a parallel execution that increases the overall system performance. Therefore, a workload balancing algorithm is needed, which distributes approximations of the current load of service providers within the system. The goal is that each peer holds a replica of the load information of all providers to which it might send service invocations in the context of a process instance. This metadata allows for routing process instances through the network on the basis of the load of the recipient of a process instance. Local replicas containing workload information are updated:

1 by explicit replication of global knowledge

2 by a direct feedback cycle of the local load balancer and

3 by local failure handling (e.g. connection failures).
Workload information is highly dynamic. For this reason, a replicated snapshot can only represent an approximation of the real situation:

1 a lazy feedback cycle alone would lead to a degenerated workload distribution, where all work is sent to the first node

2 the additional direct feedback cycle is based on local estimations of the expected workload of a certain service call; this allows for faster feedback and improves the quality of the workload balancing

3 finally, an integration of the local failure handling helps to improve the robustness of the system.
All these update cycles are completely decoupled from process execution. This allows for efficient process execution without global synchronisation.

Figure 6 shows the throughput of OSIRIS while changing the number of connected peers. Initially, 32 peers handle a constant workload of 15 process instances per second. After 20 sec, half of the peers disappear (e.g. because of a crash or network problems), while still the same number of new process instances is generated. In our experiment, this is realised by explicitly killing the executables of the local HDB layers in a random sequence within a time interval of about 10 sec. This does not allow the peers that have disappeared to properly sign off from the infrastructure. In addition, as not all peers crash at the same time, a lot of messages about configuration changes are propagated. The current implementation realises several mechanisms to handle such failure situations. While running process instances at crashed peers freeze in the local persistent storage, running peers are informed by the infrastructure about the configuration change. In addition to this notification, if a peer detects a broken connection, it locally changes its metadata. This allows for a fast adaptation to the new situation. In this way, the system finally changes the configuration automatically and continues executing new process instances with the remaining peers. During a first adaptation phase, the overall throughput drops to two instances per second. The system stabilises at a low level after a short time. However, a stable throughput of not more than five instances per second is reached.
Figure 6 Reaction to a system reconfiguration

At time t = 80 sec, the peers are restarted. After local recovery, all process instances continue their execution. The OSIRIS system again inserts the peers that have come back into the active configuration and distributes the workload to them. While the client load is still left unchanged at 15 process instances per second, the system has to execute a number of postponed instances as well. This includes processes persistently stored at offline peers as well as instances which could not be executed, due to the temporary overload situation, during the downtime of half of the peers. Figure 6 shows a short phase where more than 15 processes per second are executed to work off the additional load. After this phase, the system has completely recovered from the special situation and stays in a stable state at 15 processes per second.

4.4 Process navigation overhead

An important aspect of distributed process navigation in OSIRIS is that all information required to navigate process instances is replicated and available as process execution units at the local HDB layers. Therefore, peer-to-peer navigation avoids the additional communication overhead otherwise needed to access a central process definition repository. In contrast to central process navigation, in a peer-to-peer environment each navigation step of a process instance causes a migration to the peer providing the subsequent service. However, besides the migration between peers, distributed process navigation is not more expensive than centralised navigation. Both approaches must store the same number of process instance states to a (local) database. Because of local navigation, the communication overhead to remotely call a service instance is minimised, while on the other hand the process instance must be migrated to the provider’s peer. In addition to the persistent storage of the process state, the peer-to-peer navigation of OSIRIS requires a persistent migration of process instances. This can be realised by a simple 2PC handshake protocol.

5 Related work
The presented OSIRIS infrastructure provides a scalable distributed process navigation platform. To achieve this, it combines a rich set of aspects. On the basis of the HDB vision (Schek et al., 2002a,b), we have combined ideas from process management, peer-to-peer networks (Bawa et al. (2003) gives an overview), database technology and grid (Foster and Kesselman, 2004) infrastructures. All parts are glued together into a value-added, coherent whole.

Processes in OSIRIS are running within a peer-to-peer community that is established by the individual service providers (in Dogac et al. (2000), service providers acting as peers are called MARCAs). However, in contrast to relatively simple file sharing applications, the execution of processes over services requires a significantly richer set of functionality. Therefore, OSIRIS implements process management concepts like state-of-the-art systems such as IBM WebSphere Application Process Choreographer (IBM, 2005) or BizTalk (Metha, 2001). These systems realise a centralised approach, where every call to a service provider returns to the process engine. Although navigation tasks can be distributed in a cluster, the storage of process instances is usually done by using a single,
140
C. Schuler et al.
centralised database instance (products such as Oracle 10g (Oracle Database 10g, 2005) can also support clustered databases). Some prototype systems feature similar architectures (e.g. Mentor-lite (Gillmann et al., 2002), SDM (Altintas et al., 2003)). Although executing processes in a peer-to-peer way without involving a centralised component, OSIRIS provides transactional guarantees, following the model of transactional processes (Schuldt et al., 2002). METUFLOW (Dogac et al., 1997) and TransCoop (Aberer et al., 1997) also provide transactional semantics for workflows in a distributed environment. In the distributed Mentor-lite (Gillmann et al., 2002) approach, the setting of the process engine in a cluster can be changed actively using a configuration tool, while OSIRIS distributes process execution over all available service providers and locally runs processes on provider hosts. This requires a middleware layer – the local HDB layer – on every participating component and service provider, respectively. A wide range of middleware solutions such as .NET (Microsoft.Net, 2005), J2EE implementations or CORBA follow similar ideas. In terms of process management, the OSIRIS middleware however provides much more functionality; in particular, metadata replication, auditing and system workload monitoring are also part of the OSIRIS middleware. Besides peer-to-peer process execution, dynamics both in terms of the current service providers as well as in terms of changes of their load are addressed by OSIRIS. Even dynamic changes of process models are supported. OSIRIS guarantees that each process instance is executed consistently, following the version of the process at instance creation time. For several reasons, for example, medical treatment running processes instances have to migrate to the newest process definition, therefore systems in this field such as ADEPTflex (Reichert and Dadam, 1998) or HematoWork (Müller and Rahm, 1999) have to deal with the process instance evolution. Ideas for migrating running process instances are currently not implemented in OSIRIS. However, these concepts are orthogonal and could be seamlessly integrated. The execution of process instances in OSIRIS follows late service binding, which realises a run-time look-up for running instances of a certain type. Other systems like, for instance, ServiceGlobe (Keidl et al., 2002) implement a service discovery based on tModel types, or include service discovery into the process navigation (e.g. eFlow (Casati et al., 2000), ISEE (Meng et al., 2002) or CrossFlow (Grefen et al., 2001)). OSIRIS’ service discovery relies on precomputed tables, available at every peer. These tables allow for a very fast decision at run-time between several semantic equivalent service types. The content of these tables can be maintained manually or using an automated semantic approach (Casati and Shan, 2002; Sivashanmugam et al., 2003). Each service type is bound to a publish- and-subscribe topic that provides a channel to the actual providers. Similar ideas can also be found at several other process management systems (see, for instance, Dayal et al., 2001). However, conventional implementations of publish- and-subscribe techniques either require a centralised publish/subscribe broker or use broadcast technologies. In contrast, the OSIRIS
peer-to-peer solution needs neither of them, but uses transparent routing mechanisms similar to those of MIRS (Giladi et al., 2000) or DBCache (Bornhövd et al., 2003).
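As an illustration of such table-driven late binding, the following sketch shows a run-time look-up over a locally replicated routing table. The types and the least-loaded selection rule are illustrative assumptions rather than the actual OSIRIS mechanism.

import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/**
 * Illustrative sketch: each peer holds a replicated table mapping a
 * service type (publish/subscribe topic) to candidate providers, and
 * binds a call at run time to the least-loaded provider currently known.
 */
public class LateBindingRouter {

    record Provider(String peerId, double load) {}

    // Service type (topic) -> locally replicated list of candidate providers.
    private final Map<String, List<Provider>> routingTable;

    public LateBindingRouter(Map<String, List<Provider>> routingTable) {
        this.routingTable = routingTable;
    }

    /** Pick a provider for the given service type, if any is known locally. */
    public Optional<String> bind(String serviceType) {
        return routingTable.getOrDefault(serviceType, List.of()).stream()
                .min(Comparator.comparingDouble(Provider::load))
                .map(Provider::peerId);
    }
}

Because the table is replicated to every peer, the look-up is a purely local operation; no broker or broadcast is needed at binding time.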
6 Conclusions and future work
Service-oriented computing not only necessitates the ability to combine existing services into applications at a higher level of abstraction (processes); these applications also impose certain requirements on the infrastructure for process support. In addition to the dynamics of service providers, scalability in terms of the number of services and processes the infrastructure can support is an important issue. State-of-the-art process management systems feature highly optimised algorithms to provide a certain degree of throughput and performance. However, they require specialised, powerful hardware for this. Even then, the performance and throughput of a centralised approach to process management remain limited when the number of concurrent process instances further increases. In contrast, a true peer-to-peer approach, where process execution is distributed among the peers of the entire network, is able to significantly outperform a centralised approach in terms of scalability. In this paper, we have presented the OSIRIS system as an implementation of an HDB that provides process support in a true peer-to-peer way. Peer-to-peer process execution is achieved by collecting metadata on the available processes, the providers and their load in global repositories, and by applying sophisticated replication algorithms to distribute this metadata. Local HDB layers associated with each service provider manage replicas of the global metadata that are sufficient to locally drive the execution of a process. Moreover, to provide a reliable infrastructure for process execution even without central control, OSIRIS applies sophisticated failure handling strategies. Another important aspect is the support of correct concurrent process executions. To achieve this – also in a truly distributed way – we are currently investigating distributed concurrency control protocols (Haller and Schuldt, 2003). These protocols will enable process synchronisation without introducing a centralised component through which each process instance has to be routed. Finally, we have presented first experiments underlining the scalability features of OSIRIS. To extend this analysis, we are currently developing a benchmark suite for various application scenarios (including failure handling and concurrency) which should reveal the strengths and weaknesses of different system architectures.
Acknowledgements

We want to thank Frank Leymann for supporting this work on OSIRIS and IBM for providing us with equipment enabling further studies on the benefits of the peer-to-peer approach of OSIRIS. The work presented in this paper was performed while all authors were with the Database Research Group at ETH Zurich.
References

Aberer, K., Klingemann, J., Tesch, T., Wäsch, J., Neuhold, E., Even, S., Faase, F., Apers, P., Kaijanranta, H., Lehtola, A. and Pihlajamaa, O. (1997) ‘Transaction support for cooperative work: an overview of the TRANSCOOP project’, Proceedings of the Workshop on Extending Data Management for Cooperative Work, 5–6 June, Darmstadt, Germany.

Altintas, A., Bhagwanani, S., Buttler, D., Chandra, S., Cheng, Z., Coleman, M.A., Critchlow, T., Gupta, A., Han, W., Liu, L., Ludäscher, B., Pu, C., Moore, R., Shoshani, A. and Vouk, M. (2003) ‘A modeling and execution environment for distributed scientific workflows’, Proceedings of the 15th International Conference on Scientific and Statistical Database Management, SSDBM’03, 9–11 July, Cambridge, USA: IEEE Computer Society Press, pp.247–250.

Bawa, M., Cooper, B.F., Crespo, A., Daswani, N., Ganesan, P., Garcia-Molina, H., Kamvar, S., Marti, S., Schlosser, M., Sun, Q., Vinograd, P. and Yang, B. (2003) ‘Peer-to-peer research at Stanford’, ACM SIGMOD Record, Vol. 32, No. 3, pp.23–28.

Bernstein, P.A. and Newcomer, E. (1997) Transaction Processing – For the Systems Professional, San Francisco, CA: Morgan Kaufmann Publishers.

Bornhövd, C., Altinel, M., Krishnamurthy, S., Mohan, C., Pirahesh, H. and Reinwald, B. (2003) ‘DBCache: middle-tier database caching for highly scalable e-business architectures’, Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, 9–12 June, San Diego, CA, p.662.

Braumandl, R., Keidl, M., Kemper, A., Kossmann, D., Kreutz, A., Seltzsam, S. and Stocker, K. (2001) ‘ObjectGlobe: ubiquitous query processing on the internet’, The VLDB Journal, Vol. 10, No. 1, pp.48–71.

Casati, F., Ilnicki, S., Jin, L., Krishnamoorthy, V. and Shan, M. (2000) ‘Adaptive and dynamic service composition in eFlow’, in B. Wangler and L. Bergman (Eds). Advanced Information Systems Engineering, Proceedings of the 12th International Conference, CAiSE 2000, Stockholm, Sweden, 5–9 June, Lecture Notes in Computer Science, Berlin: Springer-Verlag, Vol. 1789, pp.13–31.

Casati, F. and Shan, M-C. (2002) ‘Semantic analysis of business process executions’, in C.S. Jensen, K.G. Jeffery, J. Pokorný, S. Saltenis, E. Bertino, K. Böhm and M. Jarke (Eds). Advances in Database Technology – EDBT 2002, Proceedings of the Eighth International Conference on Extending Database Technology, Prague, Czech Republic, March, Lecture Notes in Computer Science, Berlin: Springer-Verlag, Vol. 2287, pp.287–296.

Dayal, U., Hsu, M. and Ladin, R. (2001) ‘Business process coordination: state-of-the-art, trends, and open issues’, VLDB 2001, Proceedings of the 27th International Conference on Very Large Data Bases, 11–14 September, Roma, Italy, pp.3–13.

Dogac, A., Gokkoca, E., Arpinar, S., Koksal, P., Cingil, I., Arpinar, B., Tatbul, N., Karagoz, P., Halici, U. and Altinel, M. (1997) ‘Design and implementation of a distributed workflow system: METUFlow’, in A. Dogac, L. Kalinichenko, T. Özsu and A. Sheth (Eds). Workflow Management Systems and Interoperability, Proceedings of the NATO Advanced Study Institute on Workflow Management Systems (WFMS), Istanbul, Turkey, 12–21 August, NATO ASI Series, Berlin: Springer-Verlag, pp.60–90.
Dogac, A., Tambag, Y., Tumer, A., Ezbiderli, M., Tatbul, N., Hamali, N., Icdem, C. and Beeri, C. (2000) ‘A workflow system through cooperating agents for control and document flow over the internet’, Proceedings of Cooperative Information Systems, Seventh International Conference, CoopIS 2000, Eilat, Israel, 6–8 September, Lecture Notes in Computer Science, Berlin: Springer-Verlag, Vol. 1901, pp.138–143.

Elmagarmid, A.K., Leu, Y., Litwin, W. and Rusinkiewicz, M. (1990) ‘A multidatabase transaction model for interbase’, in D. McLeod, R. Sacks-Davis and H-J. Schek (Eds). Proceedings of the 16th International Conference on Very Large Data Bases, VLDB’90, Brisbane, Australia, 13–16 August, Palo Alto, CA: Morgan Kaufmann Publishers, pp.507–518.

ETH Database Research Group (2005) ‘OSIRIS: an open services infrastructure for reliable and integrated process support’, Available at: http://www.osiris.ethz.ch.

Foster, I. and Kesselman, C. (Eds) (2004) The Grid: Blueprint for a New Computing Infrastructure, 2nd edition, San Francisco: Morgan Kaufmann.

Giladi, R., Glezer, C., Melamoud, N., Ein-Dor, P. and Etzion, O. (2000) ‘The metaknowledge-based intelligent routing system (MIRS)’, Data and Knowledge Engineering, Vol. 34, No. 2, pp.189–217.

Gillmann, M., Weikum, G. and Wonner, W. (2002) ‘Workflow management with service quality guarantees’, in M.J. Franklin, B. Moon and A. Ailamaki (Eds). Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, 3–6 June, Madison, WI: ACM Press, pp.228–239.

Grefen, P., Aberer, K., Ludwig, H. and Hoffner, Y. (2001) ‘CrossFlow: cross-organizational workflow management for service outsourcing in dynamic virtual enterprises’, IEEE Data Engineering Bulletin, Vol. 24, No. 1, pp.52–57.

Haller, K. and Schuldt, H. (2003) ‘Consistent process execution in peer-to-peer information systems’, Proceedings of the 15th Conference on Advanced Information Systems Engineering, CAiSE 2003, Klagenfurt/Velden, Austria, Lecture Notes in Computer Science, Berlin: Springer-Verlag, Vol. 2681, pp.289–307.

IBM WebSphere Application Server Enterprise Process Choreographer (2005) Available at: http://www-128.ibm.com/developerworks/websphere/zones/was/wpc.html.

Keidl, M., Seltzsam, S., Stocker, K. and Kemper, A. (2002) ‘ServiceGlobe: distributing e-services across the internet’, Proceedings of the 28th International Conference on Very Large Data Bases, VLDB 2002, Hong Kong, China, San Francisco, CA: Morgan Kaufmann Publishers, pp.1047–1050.

Mehrotra, S., Rastogi, R., Korth, H.F. and Silberschatz, A. (1992) ‘A transaction model for multidatabase systems’, Proceedings of the 12th International Conference on Distributed Computing Systems (ICDCS), Yokohama, Japan, Washington, DC: IEEE Computer Society Press, pp.56–63.

Meng, J., Su, S., Lam, H. and Helal, A. (2002) ‘Achieving dynamic inter-organizational workflow management by integrating business processes, events and rules’, Proceedings of the 35th Hawaii International Conference on System Sciences (HICSS-35 2002), Abstracts Proceedings, 7–10 January, Big Island, HI: IEEE Computer Society, p.7.

Metha, B., Levy, M., Meredith, G., Andrews, T., Beckman, B., Klein, J. and Mital, A. (2001) ‘BizTalk server 2000 business process orchestration’, IEEE Data Engineering Bulletin, Vol. 24, No. 1, pp.35–39.
Microsoft .NET (2005) Available at: http://www.microsoft.com/net/.

Müller, R. and Rahm, E. (1999) ‘Rule-based dynamic modification of workflows in a medical domain’, in A.P. Buchmann (Ed). Datenbanksysteme in Büro, Technik und Wissenschaft, BTW’99, GI-Fachtagung, Freiburg, März 1999, Informatik aktuell, Berlin: Springer-Verlag, pp.429–448.

Oracle Database 10g (2005) ‘The database for the grid’, Available at: http://otn.oracle.com/products/database/oracle10g/.

Özsu, M.T. and Valduriez, P. (1999) Principles of Distributed Database Systems, 2nd edition, Upper Saddle River, NJ: Prentice Hall.

Pu, C., Schwan, K. and Walpole, J. (2001) ‘Infosphere project: system support for information flow applications’, ACM SIGMOD Record, Vol. 30, No. 1, pp.25–34.

Reichert, M. and Dadam, P. (1998) ‘ADEPTflex – supporting dynamic changes of workflows without losing control’, Journal of Intelligent Information Systems, Vol. 10, No. 2, pp.93–129.

Schek, H-J., Schuldt, H., Schuler, C. and Weber, R. (2002a) ‘Infrastructure for information spaces’, Advances in Databases and Information Systems, Proceedings of the Sixth East-European Symposium, ADBIS’2002, Bratislava, Slovakia, September, Lecture Notes in Computer Science, Berlin: Springer-Verlag, Vol. 2435, pp.23–36.

Schek, H-J., Schuldt, H. and Weber, R. (2002b) ‘Hyperdatabases: infrastructure for the information space’, Visual and Multimedia Information Management, IFIP TC2/WG2.6 Sixth Working Conference on Visual Database Systems, IFIP Conference Proceedings, 29–31 May, Brisbane, Australia: Kluwer, Vol. 216, pp.1–15.

Schuldt, H. (2001) ‘Process locking: a protocol based on ordered shared locks for the execution of transactional processes’, Proceedings of the 20th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, PODS 2001, 21–23 May, Santa Barbara, CA, New York: ACM Press.

Schuldt, H., Alonso, G., Beeri, C. and Schek, H-J. (2002) ‘Atomicity and isolation in transactional processes’, ACM Transactions on Database Systems, Vol. 27, No. 1, pp.63–116.

Schuler, C. (2005) ‘Distributed peer-to-peer process management – the realisation of a hyperdatabase’, PhD Thesis, Swiss Federal Institute of Technology (ETH) Zurich, Switzerland (in German).
Schuler, C., Weber, R., Schuldt, H. and Schek, H-J. (2003) ‘Peer-to-peer process execution with OSIRIS’, Service-Oriented Computing – ICSOC 2003, Proceedings of the First International Conference, Trento, Italy, 15–18 December, Lecture Notes in Computer Science, Berlin: Springer-Verlag, Vol. 2910, pp.483–498.

Sivashanmugam, K., Verma, K., Sheth, A. and Miller, J. (2003) ‘Adding semantics to web services standards’, Proceedings of the IEEE International Conference on Web Services, ICWS ’03, 23–26 June, Las Vegas, NV: CSREA Press, pp.395–401.

SOAP – Simple Object Access Protocol (2003) Available at: http://www.w3.org/TR/SOAP/.

W3C (World Wide Web Consortium) (1998) ‘Extensible markup language (XML) 1.0, W3C recommendation’, Available at: http://www.w3.org/TR/1998/REC-xml-19980210.

Weber, R., Schuler, C., Schuldt, H., Schek, H-J. and Neukomm, P. (2003) ‘Web service composition with O’GRAPE and OSIRIS’, VLDB 2003, Proceedings of the 29th International Conference on Very Large Data Bases, 9–12 September, Berlin, Germany: Morgan Kaufmann, pp.1081–1084.

WSDL – Web Service Description Language (2001) Available at: http://www.w3.org/TR/wsdl/.

Zhang, A., Nodine, M., Bhargava, B. and Bukhres, O. (1994) ‘Ensuring relaxed atomicity for flexible transactions in multidatabase systems’, Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data, Minneapolis, ACM SIGMOD Record, New York: ACM Press, Vol. 23, pp.67–78.
Notes

1 This is required for the typical request-reply model of service infrastructures (a process can again be considered as a service with a dedicated entry point).

2 We do not provide a freshness predicate with transactional guarantees, that is, the update at the central repository is separated from the updates at the HDB layers. Otherwise, the costs for updates would negatively affect the scalability of our approach.

3 This actually only holds if the service supports the 2PC protocol. Otherwise, the result might get lost if the host fails immediately after the delivery of the result and before it is made persistent by the HDB.