2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum

A Framework for Nonrepudiatable and Scalable Cross-Enterprise Workflow Management Systems in the Cloud

Gwan-Hwan Hwang

Yu-Cheng Hsiao

Yi-Chan Kao

Heng-Yi Lin

Department of Computer Science and Information Engineering National Taiwan Normal University Taipei, Taiwan

Abstract—Cloud computing is gaining tremendous momentum in both academia and industry, with more and more people and enterprises migrating their data and applications into the cloud. Cloud computing provides a new computing model with elastic scaling, a resource pool of unprecedented size, and an on-demand resource provisioning mechanism, which bring numerous challenges in implementing workflow management systems (WfMSs) in the cloud. Establishing scalable and cross-enterprise WfMSs in the cloud requires the adaptation and extension of existing concepts for process management. This paper presents a framework for how cross-enterprise processes can be implemented, secured, controlled, and scaled. We also explain why existing engine-based centralized and distributed WfMSs cannot guarantee the nonrepudiation requirement. The proposed framework is a document-routing system that implements major required security features such as authentication, confidentiality, data integrity, and nonrepudiation in the cloud computing environment. Its security framework is built by applying element-wise encryption and a cascade-based method of embedding digital signatures. The implementation and experimental results demonstrate the feasibility of the proposed framework.

Keywords: cloud; workflow management system; WfMS; groupware; security; XML

1. INTRODUCTION

Workflow management systems (WfMSs) are software systems that support coordination and cooperation among members of an organization whilst they perform complex business tasks [1–3]. Business tasks are modeled as workflow processes that are automated by the WfMS. The workflow model (also referred to as the workflow process definition) is the computerized representation of the business process that defines the starting and stopping conditions of the process, the activities in the process, and the control and data flows among these activities. An activity is a logical step within a workflow, which includes information about the starting and stopping conditions, the user who is allowed to participate (the participant), the tools and/or data needed to complete the activity, and the constraints on how the activity should be completed. The activities in a process are usually organized into a directed graph that defines the order of their execution, where nodes and edges in the graph represent activities and control flow, respectively. A workflow process instance represents a state of execution of a workflow process definition by the WfMS, and is usually controlled by the workflow engine.

Cloud computing provides computation, software, data access, and storage services that do not require end-user knowledge of the physical location and configuration of the system that delivers the services. Some companies and organizations are becoming increasingly concerned about the costs of creating and maintaining applications in traditional ways, and are considering running their applications in a cloud computing environment maintained by cloud providers. The cloud computing environment can scale to any throughput requirement, although the cost will increase with its size. The associated WfMSs need to be durable and resilient to any failures.

Figure 1. Working models of (A) the centralized WfMS and (B) the ordinary engine-based distributed WfMS. Legend: start of workflow, end of workflow, activity, flow control edge, workflow engine, user communication, process instance migration, participant.

The corresponding author is Gwan-Hwan Hwang.

978-0-7695-4676-6/12 $26.00 © 2012 IEEE. DOI 10.1109/IPDPSW.2012.271

A popular way to construct WfMSs in the cloud is to deploy workflow engines in the cloud computing infrastructure; examples of this approach include RunMyProcess [4], Visual Workflow [5], Aneka [6], Azure Services Platform [7], and Google App Engine [8]. Below we give an overview of the working models of the centralized WfMS and the ordinary engine-based distributed WfMS, and we show the kinds of security problems that arise when applying WfMSs in the cloud computing environment.

Centralized WfMSs focus on executing workflow processes within a single organization at one location in a single workflow engine (see Figure 1A). A workflow process is executed by a single workflow engine that communicates with all of the participants in the activity. To fit with the rapidly changing business environment and with the highly varying load of e-business applications, WfMSs should be scalable and should provide the required flexibility to cope with peaks in the system load and with a distributed environment. Consider the working model of the engine-based distributed WfMS shown in Figure 1B [9–12]. The workflow process has six activities (numbered A1–A6) that are designed to be executed in three workflow engines deployed at three locations. Establishing multiple workflow engines provides the abilities to (1) balance the load among the workflow engines as the number of users increases and (2) reduce the communication time between the participants in the activity and the workflow engines, since the latter are close to the user (in terms of network transmission). Note that the workflow engines must be managed by administrators in this model, since the workflow process instances are usually stored in them or in database servers close to them. The user communicates with the administrated workflow engines via the network to participate in the execution of the WfMS. The engine-based distributed WfMS can be used to build a cross-enterprise WfMS that controls the execution of cross-enterprise workflow processes. The participants of activities in a cross-enterprise workflow process belong to different companies or organizations. These participants usually connect to the workflow engines in their own companies or organizations due to authentication requirements.

In the operation of engine-based distributed WfMSs it is essential for a workflow engine to store the state of a workflow process (i.e., the process instance). The workflow process instances must also be transmitted during their execution, because the large physical separation between the workflow engines prevents them from directly sharing these instances. As shown in Figure 1B, workflow engine 1 must send the workflow process instance that contains the execution result of activity A1 to workflow engines 2 and 3. It is also possible that workflow engines in a distributed WfMS are executed on heterogeneous machines. Thus, the interchange of workflow process instances between different types of workflow engine must be supported in this type of environment. The working model shown in Figure 1B suffers from serious security problems because the workflow process instances might be transmitted on a public network such as the Internet. The execution results of activities might therefore be eavesdropped, or malicious users might intercept the process instances and alter their contents before passing them on to the next workflow engine. Although these two problems can easily be solved by applying common methods used to secure electronic transactions with secure sockets, such as the SSL (Secure Sockets Layer) protocol [13], such methods still cannot guarantee the nonrepudiation requirement. That is, it is difficult for the system to prevent a participant from denying what he/she has done in an activity. In this working model, participants connect to a workflow engine in order to execute an activity. The communication between participants and workflow engines is very similar to a client/server computing model, and the workflow engines are easy to discover since they are always associated with a fixed domain name. Due to security requirements such as confidentiality, the workflow engine shows only parts of the data in the process instance (e.g., some forms, texts, pictures, and files) to the participant, who then sends the execution results (e.g., some texts or files) to the connected workflow engine. The workflow engine stores the obtained execution results in a database or some other storage device as part of the process instance. After the execution of an activity, the participant might repudiate his/her previous execution by claiming that the stored execution results have been altered, or that the forms shown to him/her by the workflow engine during the execution of the activity were inconsistent, wrong, or illegally modified. Such a claim cannot be refuted because superusers exist in the administration domain of WfMSs. For example, the administrator of a relational database always has the privilege to update the contents and logs in the database. For the same reason, the centralized WfMS cannot guarantee the nonrepudiation requirement either.

In addition, engine-based WfMSs (either centralized or distributed) are confronted by several difficulties. The first is that their architecture lacks scalability. Although we can increase the number of workflow engines to enhance the capability of the system, the access to and coherence of shared workflow process instances form a bottleneck. If a process instance is replicated on multiple servers, we have to use a coherence protocol to maintain consistency between concurrent accesses to it. Also, the load should be balanced between these workflow engines [14]. The second difficulty is that an engine-based system is vulnerable to denial-of-service attacks because the workflow engine always has a fixed location (or domain name). In this form of attack the attacker interferes with the system by making excessive and pointless invocations of services or message transmissions in a network, overloading physical resources (e.g., network bandwidth or server processing capacity). Such attacks are usually made with the intention of delaying or preventing actions by other, legitimate users. The third difficulty is that the security of a process instance is assured by the server rather than by the process instance itself. The workflow engines in cross-enterprise WfMSs may be administrated by different companies or organizations in a distributed network environment, and the overall security is insufficient if the security mechanism is broken in any one of the servers that may access or store the process instance. The question then arises of how to design a trust model for workflow engines in cross-enterprise (or cross-organization) WfMSs.

Applying engine-based WfMSs in the cloud computing environment aggravates the security problems because the workflow engines are usually installed on machines that are not administrated by the company or organization that uses them. Users of any cloud computing service are trusting it with sensitive information, whether personal, regulated, or proprietary, and recent prominent data security breaches demonstrate risks that are common to any cloud service. For instance, the enterprise providing cloud computing services will need to manage and maintain data, and the associated existence of superusers represents a serious threat to user privacy. Thus, customers using a cloud computing infrastructure should realize that security holes might be present. For example, the workflow process instances containing the execution results of activities in the workflow engine might be eavesdropped, and malicious users may intercept the process instances and then alter their contents. Thus, we need to develop a WfMS that implements major required security features including authentication, confidentiality, data integrity, and nonrepudiation in the cloud computing environment.

In this paper we propose a secured, nonrepudiatable, and scalable WfMS for the cloud computing environment. Our goal is to solve the security problems of engine-based WfMSs using an architecture that we call the Document Routing Architecture for WfMS (DRA4WfMS [15,16]). It is not only a distributed WfMS but also a totally engine-less WfMS: the activity execution is not controlled by any workflow engine, and is instead carried out by a software agent on the local computer of the participant. The software agent can be located anywhere and can run on any type of computing device. We employ the element-wise encryption that is commonly used in XML security [17,18] and propose a cascade-based way to embed digital signatures in the routed documents. This method enables a DRA4WfMS document to be secured without requiring an access-control server. A DRA4WfMS document contains the secured workflow definition, the process instance, and some digital signatures. During the execution of an activity, the software agent parses the received DRA4WfMS document to verify the previously embedded digital signatures, decrypts and shows information in the process instance to the participant, appends the input of the participant to the received DRA4WfMS document, embeds a new digital signature of the participant, and sends the modified DRA4WfMS document to the subsequent participant(s) according to the protocol defined for DRA4WfMS documents. The DRA4WfMS has the following features:

- It can work without a centralized workflow engine that must be maintained by the administrator of the WfMS.
- The workflow process instance can be stored in a format that implements required security features such as authentication, confidentiality, data integrity, and nonrepudiation.
- It can support dynamic flow control and a dynamic security policy in its run-time environment.
- Although the execution of an activity is not controlled by a workflow engine, the system still supports workflow monitoring.

This paper is organized as follows. In Section 2 we present the basic and advanced operational models of the DRA4WfMS. The way to apply the DRA4WfMS in the cloud computing environment is presented in Section 3. The implementation details and experimental results are given in Section 4. Conclusions are drawn in Section 5.

2. OPERATIONAL MODEL OF DRA4WFMS

The DRA4WfMS is a document-routing-based, engine-less WfMS. DRA4WfMS documents are routed to the participants of activities in different locations. During the routing, the execution result of each activity is appended to the document. Thus, a DRA4WfMS document contains both static and dynamic information.

Figure 2. The basic operational model of DRA4WfMS. Legend: AEA (activity execution agent), the secured initial document, workflow definition, digital signature embedded by the workflow designer, execution result of the activity, digital signature embedded by the workflow participant, synchronous communication.

The static information is the initial document. It contains the workflow definition as well as a digital signature embedded by the workflow designer. This digital signature signs the workflow definition in the initial document, which allows the participants of the activities to verify that it is a legal workflow process. The first part of the workflow definition includes the starting and stopping conditions of the workflow process, the activities in the process, the control and data flows among these activities, and the requests and responses of each activity. The second part of the workflow definition describes the security policy during the execution of the workflow process, which includes how to encrypt the data in the workflow process instance. Different portions of the workflow process instance may need to be encrypted using different keys, since each activity may be executed by a different participant. This requires so-called element-wise encryption [17,18]. An obvious way to encrypt an XML document is to employ existing cryptography to encrypt the document as a whole. The receiver of an encrypted XML document then decrypts it with the appropriate key and algorithm. However, some researchers considered encrypting a whole XML document without dividing it up to be cumbersome, and consequently proposed element-wise encryption, in which encryption is performed at the element level.

The execution result of each activity is part of the dynamic information in a DRA4WfMS document. To fulfill security requirements such as authentication, confidentiality, data integrity, and nonrepudiation, execution results should be element-wise encrypted, and digital signatures that sign these execution results should be embedded. When the monitoring of workflow processes is necessary, timestamps should also be embedded in the DRA4WfMS document. All the data items that are appended to or embedded into the DRA4WfMS document during the execution of the workflow process are considered dynamic information. In the following, we first present the basic operational model and then the advanced operational model of DRA4WfMS.

2.1 THE BASIC OPERATIONAL MODEL

First, we show the basic operational model of DRA4WfMS in Figure 2. Before the execution of an activity $A_i$, the DRA4WfMS document $X_{A_i}$ is sent to the participant of $A_i$, who employs a software tool called the activity execution agent (AEA) to execute the activity. First, the AEA parses $X_{A_i}$ and verifies all the digital signatures embedded therein, so as to ensure that the workflow definition in $X_{A_i}$ is legal and that all the stored execution results of previously executed activities are valid. Second, the AEA checks whether the participant is the correct executor of this activity. Third, the AEA looks at the requests and responses of this activity and shows them to the participant in a graphical user interface in which the participant can perform the execution of this activity. Fourth, the AEA appends the execution result (i.e., the response) of the participant to $X_{A_i}$ (the resulting document is $X'_{A_i}$). Fifth, the AEA embeds a digital signature that signs the execution result of the participant and some of the digital signatures embedded in previous activities (the resulting document is $X''_{A_i}$). Finally, the AEA checks the control flow information defined in the workflow definition and forwards $X''_{A_i}$ to the participant (or AEA) of the next activity (or activities).

In the following, we describe the manipulation of the DRA4WfMS document formally. We employ some notation to represent the application of cryptographic operations. First, we use $\overline{O}$ to denote that data object $O$ is element-wise encrypted. Second, $[O]_{\mathrm{Pri}(A_i)}$ denotes a digital signature on data object $O$ that is generated with the private key of the participant of activity $A_i$. Third, data objects enclosed in square brackets and separated by commas are first concatenated, and the cryptographic operation is then performed on the result. For example, $[O_1, O_2, O_3]_{\mathrm{Pri}(A_i)}$ represents a digital signature that signs data objects $O_1$, $O_2$, and $O_3$ with the private key of the participant of activity $A_i$.

In the following, we illustrate the structure of DRA4WfMS documents. First, the initial document is denoted as $X_{A_0} = \mathrm{Def}$, where "Def" contains a unique process id, the definition of the workflow process, and the security policy of the workflow process. The unique process id supports multiple instances of a workflow process and resists replay attacks [13]. The definition of the workflow process includes the starting and stopping conditions of the workflow process, the activities in the process, the control and data flows among these activities, and the requests and responses of each activity. Since the workflow definition may contain confidential information, it may need to be secured. We have

$$X''_{A_0} = \overline{\mathrm{Def}},\ [\overline{\mathrm{Def}}]_{\mathrm{Pri}(A_0)},$$

which is the secured initial DRA4WfMS document and is sent to the participants of the following activities to start the execution of a workflow process. Note that $\mathrm{Pri}(A_0)$ is the private key of the workflow designer. We assume that there is only one following activity, $A_1$. Then, we have $X_{A_1} = X''_{A_0}$. According to $X_{A_1}$, the participant of $A_1$ employs an AEA to start the execution of $A_1$ and produces the execution result $R_{A_1}$. Adding the element-wise encrypted execution result $\overline{R_{A_1}}$ to $X_{A_1}$, we have $X'_{A_1} = X''_{A_0},\ \overline{R_{A_1}}$. Finally, the AEA embeds a digital signature in $X'_{A_1}$ and generates

$$X''_{A_1} = X''_{A_0},\ \overline{R_{A_1}},\ [\overline{R_{A_1}},\ \mathrm{Sig}(X''_{A_0})]_{\mathrm{Pri}(A_1)},$$

where $\mathrm{Sig}(X''_{A_0})$ is the digital signature embedded in $X''_{A_0}$, i.e., $\mathrm{Sig}(X''_{A_0}) = [\overline{\mathrm{Def}}]_{\mathrm{Pri}(A_0)}$. The pair $\overline{R_{A_1}},\ [\overline{R_{A_1}},\ \mathrm{Sig}(X''_{A_0})]_{\mathrm{Pri}(A_1)}$ contains the encrypted execution result of activity $A_1$, $\overline{R_{A_1}}$, and the digital signature that maintains nonrepudiation, $[\overline{R_{A_1}},\ \mathrm{Sig}(X''_{A_0})]_{\mathrm{Pri}(A_1)}$. We call it the characteristic execution result of activity $A_1$. For simplicity, we let $\mathrm{CER}(A_i)$ denote the characteristic execution result of activity $A_i$. Since the execution of a workflow process consists of many executions of activities, a DRA4WfMS document may contain multiple characteristic execution results. For example,

$$X''_{A_1} = X''_{A_0},\ \overline{R_{A_1}},\ [\overline{R_{A_1}},\ \mathrm{Sig}(X''_{A_0})]_{\mathrm{Pri}(A_1)} = \overline{\mathrm{Def}},\ [\overline{\mathrm{Def}}]_{\mathrm{Pri}(A_0)},\ \overline{R_{A_1}},\ [\overline{R_{A_1}},\ \mathrm{Sig}(X''_{A_0})]_{\mathrm{Pri}(A_1)} = \mathrm{CER}(A_0),\ \mathrm{CER}(A_1).$$

Without loss of generality, we use $\mathrm{CER}(A_0)$ to denote $X''_{A_0} = \overline{\mathrm{Def}},\ [\overline{\mathrm{Def}}]_{\mathrm{Pri}(A_0)}$, although "Def" is not the execution result of an activity. We use $\mathrm{Set\_of\_CER}(d)$ to denote the set of characteristic execution results in a DRA4WfMS document $d$. For example, we have $\mathrm{Set\_of\_CER}(X''_{A_1}) = \{\mathrm{CER}(A_0), \mathrm{CER}(A_1)\}$.

If an activity $A_j$ has only a single predecessor $A_i$, then we have

$$X''_{A_j} = X''_{A_i},\ \overline{R_{A_j}},\ [\overline{R_{A_j}},\ \mathrm{Sig}(X''_{A_i})]_{\mathrm{Pri}(A_j)} = \mathrm{Set\_of\_CER}(X''_{A_i}),\ \mathrm{CER}(A_j).$$

Note that a workflow process may need to support the AND-split and AND-join in the flow control [19]. In this case, an activity has multiple predecessors. If an activity $A_j$ has $n$ predecessors, $A_{p_1}, A_{p_2}, \ldots, A_{p_n}$, we have

$$X''_{A_j} = \mathrm{Set\_of\_CER}(X''_{A_{p_1}}) \cup \mathrm{Set\_of\_CER}(X''_{A_{p_2}}) \cup \cdots \cup \mathrm{Set\_of\_CER}(X''_{A_{p_n}}),\ \overline{R_{A_j}},\ [\overline{R_{A_j}},\ \mathrm{Sig}(X''_{A_{p_1}}),\ \mathrm{Sig}(X''_{A_{p_2}}),\ \ldots,\ \mathrm{Sig}(X''_{A_{p_n}})]_{\mathrm{Pri}(A_j)}.$$
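The cascade of characteristic execution results defined above can be illustrated with a small sketch. It is not the authors' implementation: the dictionary layout, the function names, and the key table are hypothetical, execution results are treated as already-encrypted opaque strings, and HMAC from the Python standard library is used only as a stand-in for a digital signature. An HMAC relies on a shared secret and therefore does not itself provide nonrepudiation; a real system would use asymmetric signatures.

```python
import hashlib
import hmac
import json

# Hypothetical per-participant secrets (assumption). A real deployment would
# use private/public key pairs; HMAC only stands in for sign/verify here.
KEYS = {f"A{i}": f"secret-{i}".encode() for i in range(5)}


def sign(signer, parts):
    """Stand-in for [parts]_{Pri(signer)}: a MAC over the joined parts."""
    payload = json.dumps(parts).encode()
    return hmac.new(KEYS[signer], payload, hashlib.sha256).hexdigest()


def initial_document(definition):
    """X''_{A0}: the definition plus the designer's signature, i.e. CER(A0)."""
    return [{"activity": "A0", "result": definition, "signed": [],
             "sig": sign("A0", [definition])}]


def append_cer(predecessor_docs, activity, result):
    """Build X''_{Aj} from the documents received from all predecessors."""
    merged = {}
    for doc in predecessor_docs:              # union of the Set_of_CER(...)
        for cer in doc:
            merged[cer["activity"]] = cer
    # Sig(X''_{Ap}) is taken to be the signature of the last CER in each
    # predecessor document.
    prev = [(doc[-1]["activity"], doc[-1]["sig"]) for doc in predecessor_docs]
    cer = {"activity": activity, "result": result,
           "signed": [name for name, _ in prev],
           "sig": sign(activity, [result] + [s for _, s in prev])}
    return list(merged.values()) + [cer]


def verify(doc):
    """Recompute every signature in the document and compare."""
    by_name = {cer["activity"]: cer for cer in doc}
    for cer in doc:
        if any(name not in by_name for name in cer["signed"]):
            return False
        prev_sigs = [by_name[name]["sig"] for name in cer["signed"]]
        expected = sign(cer["activity"], [cer["result"]] + prev_sigs)
        if not hmac.compare_digest(expected, cer["sig"]):
            return False
    return True


x0 = initial_document("Def")
x1 = append_cer([x0], "A1", "enc(R1)")
x2 = append_cer([x1], "A2", "enc(R2)")
x3 = append_cer([x1], "A3", "enc(R3)")
x4 = append_cer([x2, x3], "A4", "enc(R4)")    # an AND-join of two branches
print([cer["activity"] for cer in x4], verify(x4))
```

Because each new signature covers the signatures of the predecessor documents, altering any earlier execution result invalidates the signature of the CER that carries it, which `verify` detects.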

Note that the resulting document of activity $A_j$, $X''_{A_j}$, must include a digital signature that signs the embedded digital signatures of all the predecessors. This is the basis for building nonrepudiation in DRA4WfMS. See Section 3.1 for details. In the following, we offer some examples to demonstrate the structure of the DRA4WfMS document. Figure 3A shows the control flow of a workflow process. First, we have

$$X''_{A_0} = \overline{\mathrm{Def}},\ [\overline{\mathrm{Def}}]_{\mathrm{Pri}(A_0)} = \mathrm{CER}(A_0),$$

$$X''_{A_1} = \overline{\mathrm{Def}},\ [\overline{\mathrm{Def}}]_{\mathrm{Pri}(A_0)},\ \overline{R_{A_1}},\ [\overline{R_{A_1}},\ \mathrm{Sig}(X''_{A_0})]_{\mathrm{Pri}(A_1)} = \mathrm{CER}(A_0),\ \mathrm{CER}(A_1),$$

$$X''_{A_2} = X''_{A_1},\ \overline{R_{A_2}},\ [\overline{R_{A_2}},\ \mathrm{Sig}(X''_{A_1})]_{\mathrm{Pri}(A_2)} = \mathrm{CER}(A_0),\ \mathrm{CER}(A_1),\ \mathrm{CER}(A_2),$$

and

$$X''_{A_3} = X''_{A_1},\ \overline{R_{A_3}},\ [\overline{R_{A_3}},\ \mathrm{Sig}(X''_{A_1})]_{\mathrm{Pri}(A_3)} = \mathrm{CER}(A_0),\ \mathrm{CER}(A_1),\ \mathrm{CER}(A_3).$$

Activity $A_4$ has two predecessors, $A_2$ and $A_3$. Thus, we have

$$X''_{A_4} = \mathrm{Set\_of\_CER}(X''_{A_2}) \cup \mathrm{Set\_of\_CER}(X''_{A_3}),\ \overline{R_{A_4}},\ [\overline{R_{A_4}},\ \mathrm{Sig}(X''_{A_2}),\ \mathrm{Sig}(X''_{A_3})]_{\mathrm{Pri}(A_4)} = \mathrm{CER}(A_0),\ \mathrm{CER}(A_1),\ \mathrm{CER}(A_2),\ \mathrm{CER}(A_3),\ \mathrm{CER}(A_4).$$

Figure 3. The control flow of some workflow processes: (A) a process with activities A0 to A4; (B) a loop in which activity Aj is followed by a branch on the predicate b, returning to Ai when b is True and proceeding to Ak when b is False.

Another situation that should be taken into consideration is that the execution of a workflow process may contain loops. See Figure 3B for an example. After the execution of activity $A_j$, the control flow branches to either $A_i$ or $A_k$, depending on the evaluation of a Boolean predicate $b$. If $b$ evaluates to True, then $A_i$ and $A_j$ are executed again. In our previous discussion, we used $X''_{A_i}$ to represent the DRA4WfMS document produced after the execution of activity $A_i$. To support loops in process execution, we extend our notation and use $X''_{A_i^k}$ to represent the generated DRA4WfMS document: $X''_{A_i^0}$, $X''_{A_i^1}$, and $X''_{A_i^k}$ represent the first, second, and $(k+1)$th DRA4WfMS documents after the execution of activity $A_i$, respectively. Similarly, $R_{A_i^k}$ and $\mathrm{CER}(A_i^k)$ represent the execution result and the characteristic execution result in $X''_{A_i^k}$, respectively. For convenience, we denote $X''_{A_i^0} = X''_{A_i}$, $R_{A_i^0} = R_{A_i}$, and $\mathrm{CER}(A_i^0) = \mathrm{CER}(A_i)$.

If an activity $A_j$ has only a single predecessor $A_i$ and the received document is $X''_{A_i^l}$, which is the DRA4WfMS document generated in the $(l+1)$th execution of activity $A_i$, then for the $(k+1)$th execution of $A_j$ we have

$$X''_{A_j^k} = X''_{A_i^l},\ \overline{R_{A_j^k}},\ [\overline{R_{A_j^k}},\ \mathrm{Sig}(X''_{A_i^l})]_{\mathrm{Pri}(A_j)} = \mathrm{Set\_of\_CER}(X''_{A_i^l}),\ \mathrm{CER}(A_j^k).$$

If an activity $A_j$ has $n$ predecessors, $A_{p_1}, A_{p_2}, \ldots, A_{p_n}$, we have

$$X''_{A_j^k} = \mathrm{Set\_of\_CER}(X''_{A_{p_1}^{l_1}}) \cup \mathrm{Set\_of\_CER}(X''_{A_{p_2}^{l_2}}) \cup \cdots \cup \mathrm{Set\_of\_CER}(X''_{A_{p_n}^{l_n}}),\ \overline{R_{A_j^k}},\ [\overline{R_{A_j^k}},\ \mathrm{Sig}(X''_{A_{p_1}^{l_1}}),\ \mathrm{Sig}(X''_{A_{p_2}^{l_2}}),\ \ldots,\ \mathrm{Sig}(X''_{A_{p_n}^{l_n}})]_{\mathrm{Pri}(A_j)}.$$

2.2 THE ADVANCED OPERATIONAL MODEL

In this section we show the advanced operational model of DRA4WfMS. It supports the monitoring of the WfMS and a more sophisticated security architecture. There are good reasons to think that the basic operational model is insufficient in some situations. First, the monitoring of workflow processes is sometimes necessary. This monitoring encompasses the tracking of individual processes so that information on their state can be easily seen and statistics on the performance of one or more processes can be provided [20]. We must therefore record a timestamp of the finish time of each activity, and there should be a way to query the progress or status of the execution of workflow processes. Second, decentralized execution of interorganizational workflows may raise a number of security issues, including those related to conflicts of interest among competing organizations [21]. Intuitively, it may be necessary to apply element-wise encryption to the activity execution results to maintain the security policy during the execution of the workflow process. An example in [21] shows that the control flow information sometimes should not be revealed to the participant who is responsible for forwarding the workflow document. Since the participant cannot see the control flow information, he/she is unable to decide how to forward the document for subsequent processing. This leads to another problem: the AEA does not know how to encrypt the document because it does not know who the next participant will be.

We use Figure 4 to illustrate the problem caused by concealing flow information. Figure 4A shows a partial flow definition of a workflow process and Figure 4B is the corresponding dataflow illustration of this flow definition. In activities $A_1$ and $A_2$, the participants Peter and Tony are responsible for giving inputs for variables X and Y, respectively. However, the security policy requires that variable X, input by Peter, be encrypted with the public key of Amy for later use. The participant Tony in activity $A_2$ is not allowed to see the value of variable X. After activity $A_2$ there is a conditional branch: one of two successor activities is executed, depending on the value of a Boolean function Func(X). The variable Y is also confidential and thus should be encrypted during transmission. Depending on which successor activity is going to be executed, Y should be encrypted with the public key of either John or Mary. Since Tony in activity $A_2$ cannot check the value of X, he cannot perform the correct encryption and forwarding of the document. The problem is easy to solve in an engine-based WfMS, because the workflow engine that controls the flow always has the right to refer to all the information manipulated in the workflow execution. In DRA4WfMS, however, there is no workflow engine to control the execution of activities, and the only way to protect the data is to apply element-wise encryption.

Figure 4. An example to illustrate the problem of data concealing: (A) partial flow definition with an OR-split on Func(X); (B) the corresponding dataflow, in which X is encrypted for Amy and Y is encrypted for either John or Mary.

Figure 5. The advanced operational model of DRA4WfMS. Legend: AEA, TFC server, timestamp embedded by the timestamp server.

The advanced operational model shown in Figure 5 is an improvement on that in Figure 2 that addresses the problems mentioned above. Since an AEA may not be able to correctly encrypt and forward the document, the DRA4WfMS document processed by an AEA is first sent to a timestamp and flow control server (TFC server), which is analogous to a notary public and has legal authority to witness the finish time of the activity. Note that a TFC server is not a workflow engine, as it only embeds timestamps in DRA4WfMS documents and helps with their forwarding. Since the participant may not have enough information to perform correct element-wise encryption on its execution result, it may be impossible for the AEA to construct a document like the $X''_{A_i}$ presented in Section 2.1. The TFC server helps to generate a proper DRA4WfMS document and sends it to the following activities. Furthermore, for the monitoring of workflow processes, the TFC server could keep a copy of each forwarded document and make a record of the document processing, so that the status of workflow process executions can be obtained by examining these records.

In the advanced operational model, we have the same secured initial document as in the basic operational model, i.e., $X''_{A_0} = \overline{\mathrm{Def}},\ [\overline{\mathrm{Def}}]_{\mathrm{Pri}(A_0)} = \mathrm{CER}(A_0)$. If an activity $A_j$ has only a single predecessor $A_i$, as shown in Figure 6A, then the AEA of $A_j$ has to generate an intermediate document $X_{A_j^k}$:

$$X_{A_j^k} = X''_{A_i^l},\ \{R_{A_j^k}\}_{\mathrm{Pub(TFC)}},\ [\{R_{A_j^k}\}_{\mathrm{Pub(TFC)}},\ \mathrm{Sig}(X''_{A_i^l})]_{\mathrm{Pri}(A_j)} = X''_{A_i^l},\ \mathrm{CER}^{it}(A_j^k). \tag{1}$$

Note that $\{R_{A_j^k}\}_{\mathrm{Pub(TFC)}}$ is the execution result of $A_j$, $R_{A_j^k}$, encrypted with the public key of the TFC server; it can only be decrypted by the TFC server. We call $\{R_{A_j^k}\}_{\mathrm{Pub(TFC)}},\ [\{R_{A_j^k}\}_{\mathrm{Pub(TFC)}},\ \mathrm{Sig}(X''_{A_i^l})]_{\mathrm{Pri}(A_j)}$ the intermediate characteristic execution result of the $(k+1)$th execution of activity $A_j$ and use $\mathrm{CER}^{it}(A_j^k)$ to denote it. The intermediate document $X_{A_j^k}$ is sent to the TFC server for further processing and forwarding. When the TFC server receives $X_{A_j^k}$, it first checks the digital signatures and then decrypts the document to obtain $R_{A_j^k}$. Finally, it prepares a document as follows:

$$X''_{A_j^k} = X_{A_j^k},\ \overline{R_{A_j^k}},\ t,\ [\overline{R_{A_j^k}},\ t,\ \mathrm{Sig}(X_{A_j^k})]_{\mathrm{Pri(TFC)}} = X''_{A_i^l},\ \mathrm{CER}^{it}(A_j^k),\ \mathrm{CER}(A_j^k). \tag{2}$$

Figure 6. Document routing through the TFC server: (A) an activity with a single predecessor; (B) an activity with multiple predecessors.

After the TFC server obtains $R_{A_j^k}$, it encrypts it according to the security policy defined in $\mathrm{CER}(A_0)$ and generates $\overline{R_{A_j^k}}$. Here $t$ is the timestamp of the time at which the TFC server processed this document.
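The forwarding step performed by the TFC server can be sketched as follows. This is a simplified model under stated assumptions rather than the authors' code: encryption is modeled by a `Sealed` wrapper that merely records the intended reader, signature checking and signing are omitted, and the names `Sealed`, `tfc_forward`, and `policy` are hypothetical. The `policy` callable stands in for the security policy stored in CER(A0).

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Sealed:
    """Models a value encrypted for `recipient`; no real cryptography."""
    recipient: str
    value: str

    def open(self, reader):
        if reader != self.recipient:
            raise PermissionError(
                f"{reader} cannot read data sealed for {self.recipient}")
        return self.value


def tfc_forward(intermediate, policy, now=None):
    """Turn an intermediate document into the document that is forwarded.

    `intermediate` holds the sender's activity name and a result sealed for
    the TFC server. `policy(activity, result)` returns the participants who
    are allowed to read the result (assumption: derived from CER(A0)).
    """
    result = intermediate["result"].open("TFC")      # only the TFC can read it
    readers = policy(intermediate["activity"], result)
    t = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "activity": intermediate["activity"],
        # element-wise re-encryption: one sealed copy per permitted reader
        "result": [Sealed(reader, result) for reader in readers],
        "timestamp": t,
        "witness": "TFC",     # placeholder for the TFC server's signature
    }


doc = {"activity": "A2", "result": Sealed("TFC", "Y=7")}
final = tfc_forward(doc, lambda activity, result: ["John"])
print(final["result"][0].open("John"))
```

The sending AEA only needs the public key of the TFC server, so it can produce a valid intermediate document without knowing who the next participant will be.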

2196 2190

If an activity A has n predecessors, A , A , ..., A as shown in Figure 6B, we have the intermediate document which should be sent to TFC server is Set_of_CER X′′A

XA Set_of_CER

XA′′

RA

l P

RA

TFC

P

TFC

… , Sig XA′′

∪ Set_of_CER X′′A

l

l

cannot repudiate the execution result of himself and predecessors. Also, all his/her predecessors must sign embedded signatures of their predecessors. According to recursive process, each participant cannot repudiate execution of all his ancestors.

A DRA4WfMS document contains many CERs. Each CER has its own nonrepudiation scope, which is a set of CERs. If a CER α has the nonrepudiation scope Γ, then the participant who generated α cannot deny having received a DRA4WfMS document containing the CERs in Γ and having generated α accordingly. Algorithm 1 shows how to compute the nonrepudiation scope of a CER in a DRA4WfMS document.

The final document that the TFC server should send to the subsequent activities is X''_A. [Equation: the surviving terms are Set_of_CER(X''_A_pl) ∪ ... ∪ Set_of_CER(X''_A_pn), R_A, the timestamp t, a signature term Sig(X_A), CERit(A), and CER(A); the full expression is not recoverable from the extracted text.]

Algorithm 1. Deriving the nonrepudiation scope of a CER in a DRA4WfMS document.
Input: A DRA4WfMS document D and one of the CERs in D, α.
Output: A set Γ containing CERs in D.
(1) Let Γ = ∅.
(2) Γ = Γ ∪ {α} and Changes = True.
(3) While (Changes == True) {
      Changes = False.
      For each CER β in Γ {
        Let Δ be the set of CERs whose signatures are signed by β.
        If (Δ − Γ ≠ ∅) then { Changes = True. Γ = Γ ∪ Δ. }
      }
    }
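Algorithm 1 is a fixed-point (transitive closure) computation over the "signs" relation between CERs. The following sketch shows one way to express it in Python. The representation of a document as a dictionary that maps each CER name to the set of CERs whose signatures it signs is an assumption made for illustration, as are the CER names.

```python
from typing import Dict, Set


def nonrepudiation_scope(signs: Dict[str, Set[str]], alpha: str) -> Set[str]:
    """Return the set of CERs reachable from alpha via the 'signs' relation.

    signs[beta] is the set of CERs whose signatures are signed by beta.
    """
    scope = {alpha}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot because the scope grows inside the loop.
        for beta in list(scope):
            delta = signs.get(beta, set())
            if not delta <= scope:
                scope |= delta
                changed = True
    return scope


# Hypothetical document for a workflow A -> (B1, B2) -> C -> D.
signs = {
    "CER_A": set(),
    "CER_B1": {"CER_A"},
    "CER_B2": {"CER_A"},
    "CER_C": {"CER_B1", "CER_B2"},
    "CER_D": {"CER_C"},
}
scope_c = nonrepudiation_scope(signs, "CER_C")
print(sorted(scope_c))
```

Here the scope of CER_C contains CER_C itself and every CER it transitively signs, but not CER_D, which was generated later.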

2.3 IMPLEMENTING INFORMATION SECURITY IN THE DRA4WFMS

In this section we discuss the types of information security that are implemented in the DRA4WfMS. The operational model and the document structure can be used to implement various types of information security, including authentication, confidentiality, data integrity, and nonrepudiation.

2.3.1 CONFIDENTIALITY AND INTEGRITY

Confidentiality means preventing the disclosure of information to unauthorized individuals or systems. In the DRA4WfMS, confidentiality is maintained by the element-wise encryption specified in the secured initial document X''_A. A DRA4WfMS document is an XML document, and we employ XML encryption [22] to perform the element-wise encryption, so the confidentiality of a DRA4WfMS document rests on the element-wise encryption of XML encryption. An XML element in a DRA4WfMS document can be encrypted with the public keys of different users or groups, generating several XML elements in the secured DRA4WfMS document, so that only a limited number of users can read the data.

3. APPLYING THE DRA4WFMS IN THE CLOUD COMPUTING ENVIRONMENT

Figure 7 shows how to apply the DRA4WfMS in the cloud computing environment using the advanced operational model presented in Section 1.2. A user connects to one of the portal servers to access the DRA4WfMS cloud system, and that server checks which activities are to be executed by the user. Since the workflow process instances are stored as DRA4WfMS documents, the portal server simply sends a copy of the DRA4WfMS document, X''_A, to the user. The user employs an AEA to execute the activity and generate a DRA4WfMS document, X'_A, by embedding a CER, and then sends it back to the portal server. When an AEA sends the resulting document (X'_A) to the portal server, the portal server verifies it, embeds a timestamp into it, and stores X''_A in the pool of DRA4WfMS documents. By checking this document, the DRA4WfMS cloud system can inform the subsequent participant(s) that they are the participant(s) of the next activity.

Integrity means that data cannot be modified without authorization. The execution result embedded in a CER must be signed. Illegal alteration of elements in a DRA4WfMS document will cause errors when the digital signatures that sign them are verified. Thus, the integrity requirement is satisfied.

2.3.2 NONREPUDIATION

In the DRA4WfMS, we propose a cascade-based way to embed digital signatures in a DRA4WfMS document, as shown in Sections 2.1 and 2.2. With the help of the AEA, each participant must embed a digital signature that signs his or her execution result and the digital signatures embedded by the predecessor activity or activities. Thus, the participant

The nonrepudiation requirement is obtained automatically as shown in Section 2.3.2, and the confidential data in workflow process instances are encrypted (see Section 2.3.1). To start a workflow process, the participant has to download a


and contains two subsections: workflow definition and security definition, and a digital signature, Def P A . The last activity execution result section consists of some CERs appended after the activities have been executed. The detailed syntax of a DRA4WfMS document, which is an XML document, is described elsewhere [16].

secured initial DRA4WfMS document from the DRA4WfMS cloud system. The secured initial DRA4WfMS documents can be prepared by the system or uploaded to the system by the user. The DRA4WfMS cloud system should provide interfaces for users to search and manage DRA4WfMS documents.

We propose and implement the DRA4WfMS API to give programmers an easy way to build an AEA. We implemented the DRA4WfMS API in the Java programming language and then constructed an AEA to conduct experiments on the operational models presented in Section 1. All of the experiments were performed on a PC with a 2.4 GHz Intel Core 2 Quad processor, 4 GB of RAM, the Microsoft Windows 7 operating system, and Java Development Kit 6. The digital signature operations in the DRA4WfMS API employ the Java XML Digital Signature API [23], which follows the XML signature syntax and processing standard [25]. The element-wise encryption and decryption in the DRA4WfMS API use the Apache Santuario library [24], which follows the XML encryption syntax and processing standard [22].

Figure 7. Applying the DRA4WfMS in the cloud computing environment. [Diagram labels: AEA, portal servers, DRA4WfMS cloud system, DRA4WfMS documents pool; interactions numbered (1) to (6).]

4. IMPLEMENTATION AND EXPERIMENTAL RESULTS


The implementation is divided into two parts: (1) the DRA4WfMS API, which should be used to construct an AEA and the portal server, and (2) the pool of DRA4WfMS documents, which is capable of storing a huge number of DRA4WfMS documents.

Figure 9. Two workflow processes for conducting experiments. [Diagram labels: panels (A) and (B); initial document; activities A, B1, B2, C, and D; and-split; and-join; conditions "Attachment is insufficient." and "Accept". Legend: start of workflow, end of workflow, condition, activity, connection edge, TFC server.]

We successfully conducted experiments on some workflow processes comprising numerous activities and complex flow control mechanisms. However, because of space limitations, herein we only report experimental results for a workflow process with a smaller number of activities. Figure 9A shows a workflow that contains five activities and representative flow control mechanisms such as sequence, loop, split, and join. We first employ the basic operational model of the DRA4WfMS as shown in Figure 2 to execute this workflow process. We measured the following parameters: (1) the time required for the AEA to decrypt cipher data and verify digital signatures when it received a DRA4WfMS

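Figure 8. Architecture of a DRA4WfMS document. [Diagram labels: header section (unique process id); application definition section (workflow definition section, security definition section, a digital signature); activity execution result section.]

4.1 THE DRA4WFMS API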

The syntax of the DRA4WfMS document is designed according to the security requirements described in Section 1 as well as the operational models shown in Section 1. Referring to Figure 8, a DRA4WfMS document consists of three sections: the header, application definition, and the result of the activity execution. The application definition section actually represents a secured initial DRA4WfMS document, Def , Def , as defined in Section 1, P A


document, (2) the time required for the AEA to encrypt the document and embed digital signatures after the participant finished executing an activity, and (3) the size of the generated DRA4WfMS documents after the execution of each activity. As indicated in Table 1, it took more time to verify all of the embedded digital signatures than to actually embed them after the participant finished executing the activity, since in most cases only a single digital signature was embedded after executing the activity. The size of the DRA4WfMS document and the time for decrypting and verifying signatures were proportional to the numbers of CERs and signatures in the document, whereas only a constant time was needed to encrypt and embed signatures. A workflow with five activities required less than 23 kB of space to record the workflow definition, security definition, activity execution results, timestamps, and digital signatures.
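The proportionality claim can be checked with a short calculation. The sketch below assumes the pairs of CER counts and document sizes as they can be read from the extracted Table 1, with the 7,119-byte initial document as the baseline; the variable names are ours, not the paper's.

```python
# Size of the secured initial document in bytes (Table 1).
initial_size = 7119

# (number of CERs, document size in bytes) for the ten documents in Table 1.
rows = [
    (1, 8667), (2, 10184), (2, 10184), (4, 13503), (5, 15015),
    (6, 16562), (7, 18079), (7, 18079), (9, 21398), (10, 22910),
]

# Average growth per embedded CER relative to the initial document.
per_cer = [(size - initial_size) / cers for cers, size in rows]
average = sum(per_cer) / len(per_cer)

print(f"bytes per CER: min {min(per_cer):.1f}, max {max(per_cer):.1f}, "
      f"mean {average:.1f}")
```

Under these assumptions each CER adds between roughly 1,530 and 1,600 bytes, which is consistent with a document size that grows linearly in the number of CERs and stays below 23 kB after ten CERs.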

XA XB XB XB XB XC XC XD XD XA XA XB XB XB XB XC XC XD XD

Document           Number of CERs   Number of signatures to verify   α        β        Σ
Initial document   -                -                                -        -        7,119
X_A^0              1                1                                0.0030   0.0156   8,667
X_B^0              2                2                                0.0041   0.0167   10,184
X_B^0              2                2                                0.0049   0.0145   10,184
X_C^0              4                4                                0.0055   0.0148   13,503
X_D^0              5                5                                0.0072   0.0147   15,015
X_A^1              6                6                                0.0079   0.0130   16,562
X_B^1              7                7                                0.0088   0.0132   18,079
X_B^1              7                7                                0.0093   0.0116   18,079
X_C^1              9                9                                0.0133   0.0118   21,398
X_D^1              10               10                               0.0132   0.0121   22,910

Figure 9B is the other workflow process investigated in our experiments; it executes the process in Figure 9A according to the advanced operational model of the DRA4WfMS shown in Figure 5. The results are presented in Table 2. Similar to the results for the basic operational model, the size of the DRA4WfMS document and the time for decrypting and verifying signatures in the AEA and the TFC server were proportional to the numbers of CERs and signatures in the document. Moreover, again only a constant time was needed to encrypt and embed signatures. Although the AEA and the TFC server had very similar total processing times, the TFC server did not need to maintain a connection-oriented session with the participant. Thus, the TFC server was not the bottleneck in the operation of the DRA4WfMS.

TABLE 2. EXECUTION TIMES FOR THE WORKFLOW SHOWN IN Figure 9B. α

β

1

1

0.0021

0.0145

Initial Document

XA

0

γ

0.0141 0.0135

0.0119

0.0080

0.0178 0.0121

0.0120

0.0156 0.0159

0.0110

0.0241 0.0219

0.0107

0.0294 0.0206

0.0116

0.0291 0.0228

0.0141

0.0350 0.0227

0.0108

0.0080 0.0078 0.0084 0.0089 0.0083 0.0094

0.0336

0.0082

0.0269

0.0123

0.0407 0.0289

0.0116

0.0431

0.0100 0.0090

11,147 13,508 15,109 13,508 15,109 21,717 23,312 25,669 27,263 29,661 31,290 33,651 35,252 33,651 35,252 41,860 43,455 45,812 47,406

- Search DRA4WfMS documents: The user can obtain a list of links to the DRA4WfMS documents for which s/he is one of the participants of the subsequent activities. A very similar procedure is used to obtain the TO-DO list in a WfMS.
- Retrieve a DRA4WfMS document: According to the search result, the user can retrieve a DRA4WfMS document and then execute the activity in an AEA.
- Store a DRA4WfMS document: The resulting DRA4WfMS document produced by an AEA should be stored in the pool of DRA4WfMS documents.
- Notify the subsequent participants: After a resulting DRA4WfMS document is stored, the portal server should inform the participants of the next activities.
- Perform workflow monitoring or statistical analyses: The user can display information about the activity status of a workflow process. Also, the MapReduce computing model supported

The binaries of the DRA4WfMS API, sample code that uses it to construct an AEA, and all of the DRA4WfMS documents generated in the two workflow processes can be downloaded at http://www.csie.ntnu.edu.tw/~ghhwang/DRA4WfMS/DRA4WfMS_EXAMPLES.zip.

2 3 4 3 4 7 8 9 10 11 12 13 14 13 14 17 18 19 20

4.2 THE POOL OF DRA4WFMS DOCUMENTS

We employed Apache Hadoop [26] to implement the pool of DRA4WfMS documents (shown in Figure 7), which was built on the HBase database. HBase is a distributed column-oriented database built on top of HDFS (the distributed file system of Hadoop), and it is well suited to applications that require real-time random read/write access to very large datasets. A DRA4WfMS document is stored as a cell in a row of an HBase table. An AEA connects to one of the portal servers in the DRA4WfMS cloud system. The portal server authenticates the ID of the user and then interacts with an HBase cluster (the pool of DRA4WfMS documents) to perform the following operations:

α: Time for decrypting and verifying signatures (in seconds) β: Time for encrypting and embedding signatures (in seconds) Σ: Size of the generated file (in bytes)

Number of signatures to verify

2 3 4 3 4 7 8 9 10 11 12 13 14 13 14 17 18 19 20

α: Time for decrypting and verifying signatures in AEA and TFC (in seconds) β: Time for encrypting and embedding signatures in AEA (in seconds) γ: Time for encrypting and embedding signatures in TFC (in seconds) Σ: Size of the generated file (in bytes)

TABLE 1. EXECUTION TIMES FOR THE WORKFLOW SHOWN IN Figure 9A.

0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1

Σ 7,119 9,518


in the HBase system can apply some statistical analyses to workflow processes or instances stored in the DRA4WfMS cloud system.
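The search, retrieve, store, and notify operations that the portal server performs against the pool can be sketched as follows. This is only an illustration under stated assumptions: a Python dictionary stands in for the HBase table (one document per row key), and the class names, field names, process ids, and user names are hypothetical rather than part of the DRA4WfMS API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class StoredDocument:
    process_id: str               # row key (unique process id)
    xml: str                      # serialized DRA4WfMS document
    next_participants: List[str]  # users who execute the next activities


@dataclass
class DocumentPool:
    """In-memory stand-in for the HBase-backed pool of documents."""
    rows: Dict[str, StoredDocument] = field(default_factory=dict)
    outbox: List[Tuple[str, str]] = field(default_factory=list)

    def store(self, document: StoredDocument) -> None:
        # Store the resulting document, then notify the next participants.
        self.rows[document.process_id] = document
        for user in document.next_participants:
            self.outbox.append((user, document.process_id))

    def search(self, user: str) -> List[str]:
        # TO-DO list: documents whose next activity belongs to this user.
        return sorted(pid for pid, doc in self.rows.items()
                      if user in doc.next_participants)

    def retrieve(self, process_id: str) -> str:
        return self.rows[process_id].xml


pool = DocumentPool()
pool.store(StoredDocument("p-001", "<dra4wfms>...</dra4wfms>", ["bob"]))
pool.store(StoredDocument("p-002", "<dra4wfms>...</dra4wfms>", ["carol", "bob"]))
todo = pool.search("bob")
print(todo, pool.outbox)
```

Keeping one document per row key mirrors the storage layout described in the text, so a retrieval is a single keyed lookup, while the search operation would need a secondary index or a scan in a real HBase deployment.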


5. CONCLUSION


In this paper we propose a secured WfMS for the cloud computing environment. Engine-based WfMSs – which use the workflow engines to secure or protect sensitive data in workflow process instances – are confronted with severe security problems in the cloud computing environment. The architecture of the DRA4WfMS does not require a workflow engine to control the execution of activities, which avoids the security problems that may arise in engine-based distributed WfMSs. The application of element-wise encryption and a cascade-based method of embedding digital signatures makes a DRA4WfMS document self-protected without requiring an access-control server. As a result, security requirements such as authentication, confidentiality, data integrity, and nonrepudiation do not need to rely on service-level agreements between users and cloud service providers. The user does not have to worry about alteration to the contents of workflow process instances because any illegal modification of a process instance will be detected by cryptographic algorithms. Thus, different enterprises or organizations can simultaneously use a single DRA4WfMS cloud system. It is easy to implement a cross-enterprise WfMS in the DRA4WfMS cloud system. The proposed framework can be used to construct WfMSs in private, community, hybrid, and public clouds.


Our implementation of the DRA4WfMS API and the DRA4WfMS cloud system in the HBase database of Apache Hadoop [26] has demonstrated the feasibility of the proposed framework. The current Hadoop cluster has only a small number of data nodes. We are working on extending the number of data nodes in our system and measuring the performance of querying, storing, monitoring, and statistical analyses when the pool of DRA4WfMS documents contains a huge number of documents.


REFERENCES

[1] D. Georgakopoulos, M. Hornick, and A. Sheth. Overview of Workflow Management: From Process Modeling to Workflow Automation Infrastructure. Distributed and Parallel Databases, Vol. 3, No. 2, 1995, pp. 119–153.
[2] Shi Meilin, Yang Guangxin, Xiang Yong, and Wu Shangguang. Workflow Management Systems: A Survey. International Conference on Communication Technology, 1998.
[3] Workflow Management Coalition. Workflow: An Introduction. Workflow Handbook, 2002.
[4] Workflow Software via Cloud Computing Service - RunMyProcess. http://www.runmyprocess.com/.
[5] Visual Workflow: experience the speed of visual app development. http://www.salesforce.com/platform/cloud-platform/workflow.jsp.
[6] Aneka: Enabling .NET-based Enterprise Grid and Cloud Computing. http://www.manjrasoft.com/products.html.
[7] Azure Services Platform. http://en.wikipedia.org/wiki/Microsoft_Azure#Azure_Platform_Components.
[8] Implementing Workflows on Google App Engine with Fantasm. http://code.google.com/intl/zh-TW/appengine/articles/fantasm.html.
[9] S. Ceri, P. Grefen, and G. Sánchez. WIDE: A Distributed Architecture for Workflow Management. The 7th Int. Workshop on Research Issues in Data Engineering, Birmingham, 1997.
[10] P. Muth, D. Wodtke, J. Weißenfels, A. Kotz-Dittrich, and G. Weikum. From Centralized Workflow Specification to Distributed Workflow Execution. Journal of Intelligent Information Systems, 10(2):159-184, 1998.
[11] H. Schuster, J. Neeb, and R. Schamburger. A Configuration Management Approach for Large Workflow Management Systems. Proc. Joint Conf. on Work Activities Coordination and Collaboration, San Francisco, 1999.
[12] T. Bauer and P. Dadam. Efficient Distributed Workflow Management Based on Variable Server Assignments. B. Wangler, L. Bergman (Eds.): CAiSE 2000, LNCS 1789, pp. 94-109, 2000.
[13] George Coulouris, Jean Dollimore, and Tim Kindberg. Distributed Systems: Concepts and Design (3rd Edition). Addison Wesley, 2000.
[14] Li-jie Jin, Fabio Casati, Mehmet Sayal, and Ming-Chien Shan. Load balancing in distributed workflow management system. Proceedings of the 2001 ACM Symposium on Applied Computing (SAC '01).
[15] Gwan-Hwan Hwang, Yu-Cheng Hsiao, and Sheng-Ho Chang. "XDWfMS: An XML-Based Distributed Workflow Management System." The Fifth International Workshop on XML Technology and Applications (XMLTech'07), June 25-28, 2007, Las Vegas, Nevada, USA.
[16] Gwan-Hwan Hwang and Yu-Cheng Hsiao. "A Security Framework for Decentralized Workflow Management Systems." Technical Report, National Taiwan Normal University, 2011. http://www.csie.ntnu.edu.tw/~ghhwang/TR/DRA4WfMS_Technical_Report_2011_12_01.pdf
[17] Gwan-Hwan Hwang and Tao-Ku Chang. An Operational Model and Language Support for Securing XML Documents. Computers & Security, Volume 23, Issue 6, pp. 498-529, 2004.
[18] Gwan-Hwan Hwang and Tao-Ku Chang. Towards Attribute Encryption and a Generalized Encryption Model for XML. The 4th International Conference on Internet Computing 2003 (IC'03), Las Vegas, Nevada, USA.
[19] OMG. "Business Process Modeling Notation (BPMN) 1.2." 2009.
[20] WFMC. Workflow Management Coalition Workflow Standard: Workflow Process Definition Interface – XML Process Definition Language (XPDL) (WFMC-TC-1025). Technical report, Workflow Management Coalition, Lighthouse Point, Florida, USA, 2002.
[21] V. Atluri, S. Chun, and P. Mazzoleni. Chinese Wall Security for Decentralized Workflow Management Systems. Journal of Computer Security, Volume 12, Number 6, 2004.
[22] Donald Eastlake, Joseph Reagle, Takeshi Imamura, Blair Dillaway, and Ed Simon. "XML Encryption Syntax and Processing." W3C Recommendation, 10 December 2002. http://www.w3.org/TR/xmlenc-core/.
[23] XML Digital Signature APIs. http://jcp.org/en/jsr/detail?id=105.
[24] The Apache Santuario library. http://santuario.apache.org/Java/index.html.
[25] Donald Eastlake, Joseph Reagle, David Solo, Mark Bartel, John Boyer, Barb Fox, Brian LaMacchia, and Ed Simon. "XML-Signature Syntax and Processing." W3C Recommendation, 12 February 2002. http://www.w3.org/TR/xmldsig-core/.
[26] The Apache Software Foundation. "Welcome to Apache Hadoop!" http://hadoop.apache.org/
