2011 IEEE International Conference on Web Services
Modelling collaborative services for business and QoS compliance Jinhui Yao∗† , Shiping Chen∗ , Chen Wang∗ , David Levy† and John Zic∗ ∗ Information Engineering Laboratory, CSIRO ICT Centre, Australia {Firstname.Lastname}@csiro.au † School of Electrical and Information Engineering, University of Sydney {jinhui, dlevy}@ee.usyd.edu.au
will have their individual priorities and interests. Given that admitting to violations may lead to penalties in some form, it is conceivable that they may try to deceive and hide such violations. Therefore, a mechanism to detect and prove incompliance is needed for this collaboration paradigm to prosper. As a solution, we have proposed to enforce strong accountability to enhance the trustworthiness [17][19]. While this is elaborated in later sections, in short, accountability provides the means to verify compliance according to evidence in a provable and undeniable way. In our past work, we described the overall architecture of using an Accountability Service (AS) to aggregate evidence and verify compliance. To use this design in practice, a detailed methodology, in both theory and implementation, needs to be developed. This requires a careful study of the nature of the collaboration or, more specifically, a generic model of the problem domain. This model serves as the basis for conceptualizing higher-level objects, and as a tool for analyzing and reasoning about the compliance issues in the collaboration. In this paper we extend our work by developing a quantitative model to represent the horizontal and vertical structures of collaborations involving multiple un-trusted parties. Using this model, we classify four types of compliance and determine the logging needed for their verification. Based on the compliance types, we analyze and reason about the extent to which each type can be verified in a provable way using the evidence logged. We then evaluate the practical effectiveness of the proposed model and methodology by implementing them in a collaborative business process.
Abstract: In recent years, we have witnessed a range of innovations in ‘service’ related technologies, such as Software as a Service, Platform as a Service and Infrastructure as a Service. Along with the Service Oriented Architecture, companies can wrap their technological products as services in order to collaborate with others. Facing ever-escalating global competition, such collaboration is crucial. The viability of this paradigm depends heavily on the compliance, and therefore the trustworthiness, of all collaborators. However, it is challenging to achieve trustworthiness in such a dynamic cross-domain environment, as each participator may deceive for individual benefit. As a solution, we have proposed to enforce strong accountability to enhance the trustworthiness. With this accountability, incompliance can always be determined in a provable and undeniable way. In this paper, we extend our work by proposing a novel model of the collaborative business process. Based on this model, we analyze the evidence and proving procedure needed for different types of compliance, and evaluate the extent to which each type of compliance can indeed be proved. We have implemented a demonstrative system to show its effectiveness in practice. Index Terms: accountability, compliance, trustworthiness, service oriented architecture, service collaboration
I. INTRODUCTION
In recent years, we have witnessed a range of innovations in ‘service’ related technologies and concepts. Following Software as a Service (SaaS), many more “as a Service” concepts have been proposed, such as Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). Along with the widely adopted Service Oriented Architecture (SOA), companies and organizations can wrap the various kinds of technological products they offer as services, and collaborate with services provided by others to form new value-added business products. Facing ever-escalating global competition, such collaboration is crucial for their survival. The correctness of the inter-organizational collaboration relies on the correctness of all participators, that is, on whether each collaborator is compliant with the pre-defined business logic or Service Level Agreement (SLA). It follows that the viability of this paradigm depends heavily on the trustworthiness of the behaviors of all collaborators. Here we adopt the definition of trustworthiness from the IETF [15]: a trustworthy system is a system that is already trusted, and continues to warrant that trust because the system’s behaviors can be validated in some convincing way. It is a challenging task to preserve trustworthiness in such a dynamic cross-domain environment, as each participator
978-0-7695-4463-2/11 $26.00 © 2011 IEEE DOI 10.1109/ICWS.2011.44
II. SERVICE COMPLIANCE: A MOTIVATING SCENARIO
Following our previous work, we use a loan application process as the running example in this paper. As shown in Fig. 1, the process requires the collaboration of five entities. First, a loan application web portal allows customers to lodge an application and fill in their personal information. This information is first used to obtain a credit score from the credit rating authority, and the score is then attached to the other personal information and sent to the loan bidding company. The bidding company forwards the application to multiple loan companies (Star Loan & Ocean Loan), and selects the cheapest offer available to return to the applicant.
Figure 1. Loan application service composition
In this typical collaboration scenario, the overall correctness of the system depends on the correctness of all participants. As each of them may be tempted to violate the collaboration rules for their own benefit and/or to avoid possible penalties, the cause of a failure may be extremely difficult to determine. Therefore, a mechanism is required to prevent this denial of failure. This mechanism is essential to control the correctness of the established business process. Accountability is a concept that makes the system accountable and trustworthy by binding each activity to the identity of its actor [21]. Such binding should be achieved under the assumption that all actors may lie according to their own interest. Therefore the bindings must be supported by provable or non-disputable evidence. In our approach, accountability can be incorporated into an activity-based process by requiring the actor (conductor) of the process to log non-disputable evidence about its activities in a domain separate from its own. Fig. 2 shows an example of such incorporation. In the example, domain A is required to perform logging operations before and after conducting the activity in its process. The evidence to be logged should contain enough information to describe the conducted activity. In the simple case in our example, the evidence should include the states of the factors concerning the start of the activity (e.g. input variables) and the factors concerning its completion (e.g. output value). As mentioned above, the logged evidence needs to be non-disputable so as to undeniably link the activity to its actor. To achieve this, we assume the use of PKI in all the domains in our example, so that each of them has its own public-private key pair issued by certificate authorities. Any evidence logged must be signed by the logger and receipted by the loggee (i.e. Domain B).
Through this procedure, since the digital signature is unforgeable, the signature on the evidence and the receipt enable both domains to prove the fact that domain A has logged such evidence at domain B. With this concept, we propose to use a central accountability service (AS) to enforce accountability on all the participating business services (BS). The space in the Cloud is split into two domains: the Accountability Service Domain (ASD) and the Business Service Domain (BSD). In the BSD, business services (BS) compose with each other to conduct complicated business processes. Each service in the BSD keeps a close association with the accountability services (AS)
Figure 2. Example of incorporating accountability into process
in ASD so as to ensure that the BS are held accountable. In this setting, the AS continuously receives logs and analyzes the evidence to verify the compliance of all the underlying participating services in the collaboration. In our past work, we illustrated the logging protocols and demonstrated how this design can be used to verify compliance effectively. In the following sections, we describe the model we have developed on top of this design to define the detailed methodology, namely what evidence needs to be submitted and how compliance can be determined.
III. MODELING THE COLLABORATION
A collaborative business process may involve many service nodes from different administration domains. To clearly describe the settings of a collaboration, one needs to look at both its horizontal and vertical structure. With respect to a participating service, in the horizontal structure of the collaboration, this service interacts with all other participating services. In the vertical structure, this service may first belong to a specific trust domain (e.g. a company) together with some other collaborating services, and second have its physical service node(s) deployed in an infrastructure provided by other entities (apart from this company). This vertical structure contains much information essential for verifying one’s compliance. For instance, the service provider should not be blamed for the fault of the infrastructure provider. Our modeling intends to capture both the horizontal and vertical structure of the collaboration. Different service providers collaborate with each other to form business processes, and small processes are integrated to form larger ones. We model the business process (P) formed through collaboration as a tuple
$$P = (N, P, V) \tag{1}$$
where N is the service node involved in the process, P is the sub-process and V is the directed edge connecting them. The building blocks of a business process are the service nodes involved. The “node” here refers to the physical computing instance on which the service application is deployed. A service node is modeled as
$$N = (D_{in}, D_{out}, F, L) \tag{2}$$
where $D_{in}$ and $D_{out}$ are the input and output data of the service during execution (if there is any). $L$ is the physical location where the node is deployed (e.g. in the company, in the cloud). $F$ refers to the function of this service node, the internal computational logic, which is a sequence of activities ($A$) with the first action to take input ($A_{in}$) and the last action to emit output ($A_{out}$) (if there is any):
$$F = \prod_{i=0}^{n} A_i = A_{in} * \prod_{i=1}^{n-1} A_i * A_{out} \tag{3}$$
in the infrastructure view. It displays the locations of the deployed service nodes in different computing provisioning clouds, Amazon Web Service¹ and Windows Azure² in our example. By capturing both horizontal and vertical aspects of the collaboration, many specific details can be taken into consideration when analyzing the compliance of participants. For instance, the location of the deployment may assist the diagnosis process, in which the actual computing instance can be probed to test its availability; the logging amount can be reduced for interactions between entities belonging to the same trust domain.
The product symbol ($\prod$) in (3) refers to sequential or non-sequential (e.g. parallel) relationships between the activities. A service node may have a range of functions (computational logic) designed for different input data; here, for simplicity, we model each node as dedicated to only one function. Every participator (e.g. company, organization) in the collaboration is regarded as a service entity ($En$) which owns a group of service nodes, that is,
$$En = \{N_1, N_2, \ldots, N_i\} \tag{4}$$
and each entity may have its own trusted partners
$$T = \{En_1, En_2, \ldots, En_i\} \tag{5}$$
Figure 3. Modeling the horizontal and vertical structure of collaboration
In short, a business process consists of the service nodes provided by different service entities and inter-connected by directed edges, while certain entities may be in the same trust domain (e.g. belong to the same financial group). Corresponding to the ownership of service nodes and the trust structure in the collaboration, directed edges ($V$) can be of many types. Broadly, they can be classified into three: a) inner edges ($V_{inner}$), edges between nodes owned by the same entity; b) external trusted edges ($V_{trust}$), edges between nodes owned by two mutually trusted entities; and c) external un-trusted edges ($V_{ext}$), edges between two nodes whose owners barely trust each other. For instance, the term $A^{N_a}_{out} \xrightarrow{inner} A^{N_b}_{in}$ denotes the inner edge from node a ($N_a$) to node b ($N_b$). We can elaborate this model with our running example. For instance, the One-stop Loan App Company is our $En_1$, which has only one service node deployed in Amazon EC2. It can be expressed as $En_1 = \{N_1\}$ and the node
IV. APPLYING FOR SERVICE COMPLIANCE
Compliance is the correctness of the activities conducted by entities. Evidence of the compliance of a given function can be saved by logging before and/or after the activity or activities (as shown in Fig. 2). However, as compliance requirements span a very wide spectrum, the evidence needed and the verification process can differ considerably. In order to systematically analyze various kinds of compliance, one has to identify the different types of compliance and tackle them accordingly. Here we classify compliance into four generic types: procedural compliance, content compliance, computational compliance and QoS compliance. For each of them, based on the collaboration model described in the last section, we provide a formal definition and specify the logging method and the evidence content for their verification. Detailed analysis of their verification methodology is given in later sections.
$N_1 = \{D_{application}, D_{result}, F_{loan}, \text{EC2-184.72.253.241}\}$
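The node tuple in (2), the entity and trust sets in (4) and (5), and the three edge types can be sketched as plain data structures. This is only an illustrative sketch: the class names, the field names and the `classify_edge` helper are assumptions made for this example and are not part of the original model or implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceNode:
    """N = (D_in, D_out, F, L); fields hold descriptive labels only."""
    name: str
    d_in: str
    d_out: str
    function: str
    location: str


@dataclass
class Entity:
    """En: the nodes an entity owns, plus the names of its trusted partners (T)."""
    name: str
    nodes: set = field(default_factory=set)
    trusted: set = field(default_factory=set)


def owner_of(node, entities):
    """Return the entity that owns the given node."""
    for entity in entities:
        if node in entity.nodes:
            return entity
    raise ValueError(f"no owner found for node {node.name}")


def classify_edge(src, dst, entities):
    """Classify a directed edge as 'inner', 'trust' or 'ext'."""
    a, b = owner_of(src, entities), owner_of(dst, entities)
    if a is b:
        return "inner"
    # A trusted edge requires mutual trust between the two owners.
    if b.name in a.trusted and a.name in b.trusted:
        return "trust"
    return "ext"


# The two nodes of the running example.
n1 = ServiceNode("N1", "D_application", "D_result", "F_loan", "EC2-184.72.253.241")
n2 = ServiceNode("N2", "D_person", "D_rating", "F_credit", "Azure-192.78.24.33")
```

With `En1 = Entity("En1", {n1})` and `En2 = Entity("En2", {n2})` and no trust declared, the edge from `n1` to `n2` is classified as external un-trusted.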
A. Procedural compliance
Procedural compliance aims to verify whether the procedures of the process have been correctly carried out or, more generally, whether a service node has been invoked (e.g. activity sequence compliance). Suppose we have an inner edge $N_a \xrightarrow{inner} N_b$; the sub-process consisting of node a and node b can then be described as
$$P: A^{N_a}_{in,t_0} * \ldots * A^{N_a}_{out,t_1} \rightarrow A^{N_b}_{in,t_2} * \ldots * A^{N_b}_{out,t_3} \tag{6}$$
Similarly, suppose the Credit Check Company is our $En_2$, whose only service node is deployed in Microsoft Azure. It can be expressed as $En_2 = \{N_2\}$ with the node $N_2 = \{D_{person}, D_{rating}, F_{credit}, \text{Azure-192.78.24.33}\}$. Suppose the two entities are in different trust domains; the edge between them is then an external un-trusted edge ($V_{ext}$). A complete example is shown in Fig. 3. The vertical structure of a collaboration is presented in different views. The process view displays the horizontal structure of the collaboration. Service entities from different (or the same) trust domains interact with each other to form the business process. The deployment structure of the services is shown
¹ Amazon Web Service http://aws.amazon.com/
² Windows Azure Platform http://www.microsoft.com/windowsazure/
Procedural compliance for node a ($Comp^{N_a}_{procedure}$) entails that: in a process $P$, $\exists V^{N_a,N_b} \in P$, if $\exists A^{N_a}_{in,t_0} \in P$ and $\exists A^{N_a}_{out,t_1} \in P$, then there must exist $A^{N_b}_{in,t_2} \in P$ where $t_2 > t_1 > t_0$. To save evidence for this compliance, we need to insert two logging actions: i) $A^{N_a}_{log,t_x}$ for node a, with signed evidence $E^{N_a}_{procedure} = \{processId\} \otimes K^{N_a}$ at time $t_x$, where $t_0 < t_x < t_1 + T_{thres}$ and $T_{thres}$ is the threshold time defined for this logging activity; and ii) $A^{N_b}_{log,t_y}$ for node b, with signed evidence $E^{N_b}_{procedure} = \{processId\} \otimes K^{N_b}$ at time $t_y$, where $t_2 < t_y < t_3 + T_{thres}$. To elaborate, in our example, if the evidence is not received from $En_2$ within the time frame, $En_1$ will be considered procedurally incompliant regarding the action to invoke $En_2$.
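The timing conditions on the two logging actions can be checked mechanically once the timestamps are known. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, timestamps are plain numbers in a common unit, a missing log is represented by `None`, and signature verification of the evidence is left out.

```python
def within_log_window(t_start, t_end, t_log, t_thres):
    """True if the log falls in the open window t_start < t_log < t_end + t_thres."""
    return t_start < t_log < t_end + t_thres


def procedural_evidence_ok(t0, t1, tx, t2, t3, ty, t_thres):
    """Check the ordering t2 > t1 > t0 and both logging windows.

    tx and ty are the logging times of node a and node b; pass None
    when the corresponding evidence has not been received.
    """
    if tx is None or ty is None:
        return False
    ordered = t0 < t1 < t2
    return (ordered
            and within_log_window(t0, t1, tx, t_thres)
            and within_log_window(t2, t3, ty, t_thres))
```

For example, with `t0=0, t1=1, t2=2, t3=3` and `t_thres=1`, logs at `tx=0.5` and `ty=2.5` satisfy both windows, whereas `ty=5` falls outside the window for node b.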
2) $A^{N_b}_{logOut,t_z}$ with evidence $E^{N_b}_{compute} = \{processId, D^{N_b}_{out,t_3}, S^{N_b}_{t_3}\} \otimes K^{N_b}$ at time $t_z$ where $t_2 < t_y \le t_z < t_3 + T_{thres}$
If the input to node b involves output from multiple nodes, all of those nodes need to log their output. The verification of computational compliance of node b requires the content compliance of its input and output edges.
D. QoS compliance
QoS compliance focuses on the performance of the business process; in general, it verifies whether certain activities have been conducted within the pre-defined time frame. Using the expression in (7), QoS compliance on the response time of node b ($Comp^{N_b}_{QoS}$) entails: in a process $P$, $\exists V^{N_a,N_b} \in P$ and $\exists V^{N_b,N_c} \in P$, if $\exists A^{N_b}_{in,t_2} \in P$ with $D^{N_b}_{in,t_2}$, and $\exists A^{N_b}_{out,t_3} \in P$ with $D^{N_b}_{out,t_3}$, given $Comp^{N_a,N_b}_{content}$ and $Comp^{N_b,N_c}_{content}$, then $t_3 - t_2 \le T^{N_b}_{SLA}$, where $T^{N_b}_{SLA}$ is the SLA on the response time provided by node b. Evidence for QoS compliance is very similar to that for computational compliance, except that the evidence must be logged right after the activity is conducted and local states are not needed in the evidence; instead, the time at which the activity occurred needs to be provided. For instance, $E^{N_b}_{QoS} = \{processId, D^{N_b}_{in,t_2}, t_2\} \otimes K^{N_b}$.
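The response-time condition $t_3 - t_2 \le T^{N_b}_{SLA}$ can be illustrated with a small check. This is a sketch under assumptions: the function name and the convention of returning `None` when the content-compliance precondition is not established are choices made for this example, and timestamps are taken as integers in milliseconds.

```python
def qos_response_time_ok(t2_ms, t3_ms, t_sla_ms,
                         input_content_ok=True, output_content_ok=True):
    """Check t3 - t2 <= T_SLA for node b.

    Returns None when content compliance of the input or output edge
    has not been established, since the definition presupposes it.
    """
    if not (input_content_ok and output_content_ok):
        return None
    return (t3_ms - t2_ms) <= t_sla_ms
```

A node that receives its input at 1000 ms and emits its output at 1400 ms meets a 500 ms SLA; emitting at 1600 ms does not.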
B. Content compliance
Content compliance aims to verify whether the content has been corrupted during transmission by requiring the service nodes to submit digital signatures of the content at certain stages (e.g. integrity compliance and undeniable data history). Using the expression in (6), content compliance for a transmission edge between node a and node b ($Comp^{N_a,N_b}_{content}$) entails that: in a process $P$, $\exists V^{N_a,N_b} \in P$, if $\exists A^{N_a}_{out,t_1} \in P$ with $D^{N_a}_{out,t_1}$, and $\exists A^{N_b}_{in,t_2} \in P$ with $D^{N_b}_{in,t_2}$, then we require $D^{N_a}_{out,t_1} = D^{N_b}_{in,t_2}$. The logging activity required for content compliance is very similar to the two actions for procedural compliance but with different evidence: $E^{N_a,out}_{content} = \{processId, D^{N_a}_{out,t_1}\} \otimes K^{N_a}$ and $E^{N_b,in}_{content} = \{processId, D^{N_b}_{in,t_2}\} \otimes K^{N_b}$. The content compliance of $V^{N_a,N_b}$ implies the procedural compliance of node a. Note that, with a sufficiently large $T_{thres}$, a node may choose to log the input as well as the output in one logging action if content compliance is required for both its input and output edges.
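The equality requirement on the two evidence records can be sketched as follows. The dictionary layout with `processId` and `data` keys is an assumption for illustration, signature checking against $K^{N_a}$ and $K^{N_b}$ is omitted, and treating mismatched process identifiers as an error is a design choice of this sketch rather than something the text prescribes.

```python
def content_incompliance(e_out, e_in):
    """Compare the sender's output evidence with the receiver's input evidence.

    Returns 0 when the logged data match, 1 when they differ, and None
    when either evidence record is missing.
    """
    if e_out is None or e_in is None:
        return None
    if e_out["processId"] != e_in["processId"]:
        raise ValueError("evidence records belong to different processes")
    return 0 if e_out["data"] == e_in["data"] else 1
```

For instance, two records carrying the same `processId` and the same payload yield 0, while a payload altered in transit yields 1.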
In our past work, we found that the computing cost of preparing the evidence at the business service nodes is negligible and that the main latency comes from the transmission time to/from the AS node. Therefore, according to the logging method and the evidence size, we can roughly estimate the cost that may be incurred due to logging for the four types of compliance. To simplify the matter, we assume that all service nodes have the same transmission bandwidth (bit rate) to the AS and that this bandwidth is constant over time. The cost of the logging can be described in terms of the number of transmissions needed and the size of the evidence. For example, procedural compliance on one node requires 2 logging actions, one from the sender and one from the receiver. If N nodes are required to log for procedural compliance, the number of logging actions will be N + 1 (when a node logs to acknowledge the compliance of the previous node, it can also acknowledge its intention to invoke the next). In the same way, the costs of the logging are shown in Table I.
C. Computational compliance
Computational compliance intends to record evidence that can indicate the computational process conducted by a service entity. Here, both the input/output of a certain activity and the prior/post local states will be logged. For example, when $En_2$ offers the credit rating for a person, it shall log the person’s information as well as the credit history of that person that $En_2$ possesses, in order to verify that the rating is correctly computed. We here extend the business process to involve three service nodes, node a, b and c:
$$P: A^{N_a}_{in,t_0} \ldots \rightarrow A^{N_b}_{in,t_2} \ldots \rightarrow A^{N_c}_{in,t_4} \ldots A^{N_c}_{out,t_5} \tag{7}$$
Computational compliance for node b ($Comp^{N_b}_{compute}$) entails: in a process $P$, $\exists V^{N_a,N_b} \in P$ and $\exists V^{N_b,N_c} \in P$, if $\exists A^{N_b}_{in,t_2} \in P$ with $D^{N_b}_{in,t_2}$, and $\exists A^{N_b}_{out,t_3} \in P$ with $D^{N_b}_{out,t_3}$, given $Comp^{N_a,N_b}_{content}$ and $Comp^{N_b,N_c}_{content}$, then
$$F^{N_b}\left(D^{N_b}_{in,t_2}, S^{N_b}_{t_2}\right) = \left(D^{N_b}_{out,t_3}, S^{N_b}_{t_3}\right)$$
where $F^{N_b}$ is the function of node b, and $S^{N_b}_{t_2}$ and $S^{N_b}_{t_3}$ are the internal states of node b at $t_2$ and $t_3$. For computational compliance, all three nodes need to log their output/input; moreover, node b needs to attach its internal states to the log being submitted. That is
1) $A^{N_b}_{logIn,t_y}$ with evidence $E^{N_b}_{compute} = \{processId, D^{N_b}_{in,t_2}, S^{N_b}_{t_2}\} \otimes K^{N_b}$ at time $t_y$ where $t_2 < t_y < t_3 + T_{thres}$
| Compliance type | Number of logging actions | Evidence | Timing |
|---|---|---|---|
| Procedural | N + 1 | Notification | $T_{thres}$ |
| Content | N + 1 | Full message | $T_{thres}$ |
| Computational | 2N + 2 | Full message & local states | $T_{thres}$ |
| QoS | 2N + 2 | Full message | Immediately |

Table I. Estimation of the logging costs
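The entries of Table I can be turned into a rough estimator. The sketch below is illustrative only: the function names, the default notification size of 64 bytes, and the choice to estimate transfer time as (number of logging actions) × (evidence size) divided by a single fixed bandwidth are assumptions of this example, following the equal-bandwidth simplification stated in the text.

```python
def logging_count(kind, n_nodes):
    """Number of logging actions for N nodes, as listed in Table I."""
    if kind in ("procedural", "content"):
        return n_nodes + 1
    if kind in ("computational", "qos"):
        return 2 * n_nodes + 2
    raise ValueError(f"unknown compliance type: {kind}")


def evidence_size(kind, msg_bytes, state_bytes=0, notification_bytes=64):
    """Approximate size in bytes of one evidence record."""
    if kind == "procedural":
        return notification_bytes          # a short notification only
    if kind == "computational":
        return msg_bytes + state_bytes     # full message plus local states
    if kind in ("content", "qos"):
        return msg_bytes                   # full message
    raise ValueError(f"unknown compliance type: {kind}")


def transfer_seconds(kind, n_nodes, msg_bytes, bandwidth_bps,
                     state_bytes=0, notification_bytes=64):
    """Rough total transmission time of all logs to the AS node."""
    total_bits = 8 * logging_count(kind, n_nodes) * evidence_size(
        kind, msg_bytes, state_bytes, notification_bytes)
    return total_bits / bandwidth_bps
```

For 15 nodes this gives 16 logging actions for procedural or content compliance and 32 for computational or QoS compliance.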
where $t$ is the current time, $\varepsilon(t_y)$ is the expected time at which node b should log (the threshold time), which can be estimated from the logging history (i.e. the time node a logs plus the average transmission latency between a and b), and $\varepsilon(T_{N_b,AS})$ is the expected latency for the log to be received by the AS node. $f_{conf}$ is a function that increases the confidence according to the unusual delay that has occurred. Note that, if node a’s log is also missing, times obtained from nodes prior to node a can be used for the estimation.
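The terms above can be combined into a small numeric sketch. The text does not fix the shape of $f_{conf}$, so the clamped linear form used here, the `scale` parameter (the delay at which the confidence saturates at 1) and the function names are all assumptions made for illustration.

```python
def f_conf(unusual_delay, scale):
    """Hypothetical clamped linear confidence: 0 for no delay, 1 at delay >= scale."""
    if unusual_delay <= 0:
        return 0.0
    return min(1.0, unusual_delay / scale)


def missing_evidence_confidence(t_now, expected_log_time, expected_latency, scale):
    """Incompliance confidence while an expected log is still missing.

    Mirrors f_conf{t - eps(t_y) - eps(T_{Nb,AS})}: the delay beyond the
    expected logging time plus the expected transmission latency.
    """
    return f_conf(t_now - expected_log_time - expected_latency, scale)
```

With an expected logging time of 8, an expected latency of 1 and `scale=4`, the confidence is 0 at `t=5`, 0.25 at `t=10`, and saturates at 1 for large `t`.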
From the table, we can see that, apart from procedural compliance, the other three types of logging incur large overhead which grows with the size of the data transferred between the nodes. Among those three, QoS compliance requires evidence to be logged immediately after the action; this stringency brings further costs in practice.
V. COMPLIANCE ANALYSIS
As mentioned above, even though the participating service providers are instructed to submit evidence to demonstrate their compliance, it is quite possible that they will i) choose not to submit certain evidence or ii) submit bogus evidence to avoid possible penalties. In our design, we make very limited assumptions about the logging and the truthfulness of the evidence submitted. Until a service node has been concluded to be compliant for a specific action, the AS node constantly presumes it is incompliant. The confidence of this hypothesis, the incompliance hypothesis confidence $Conf \in [0, 1]$, is a continuous value, with 1 being definitely incompliant and 0 being definitely compliant. It approaches 1 as more and more evidence suggests incompliance (in some cases, the fact that no evidence is submitted itself indicates a violation). The AS node continuously updates $Conf$ for all observed activities in the collaboration. Evidence logged by a node can reflect its compliance, and sometimes a service node’s compliance can be inferred indirectly from evidence logged by other nodes. This section discusses the inference and reasoning the AS node conducts to verify the four types of compliance we have previously defined.
B. Content compliance analysis
Content compliance can be proved when both $E^{N_a,out}_{content}$ and $E^{N_b,in}_{content}$ have been received; they reflect the output and the input of the two service nodes:
$$E^{N_a,out}_{content}, E^{N_b,in}_{content} \Rightarrow A^{N_a}_{out,t_1}, A^{N_b}_{in,t_2} \Rightarrow D^{N_a}_{out,t_1}, D^{N_b}_{in,t_2}$$
Compliance is predicated on the consistency between the output and input data. Assuming $\exists E^{N_a,out}_{content}$ and $\exists E^{N_b,in}_{content}$, the compliant and incompliant states can be expressed as
$$Conf^{N_a,N_b}_{\neg content} = \begin{cases} 0 & \text{if } D^{N_a}_{out,t_1} = D^{N_b}_{in,t_2} \\ 1 & \text{if } D^{N_a}_{out,t_1} \neq D^{N_b}_{in,t_2} \end{cases}$$
Unlike procedural compliance, content compliance cannot be inferred indirectly; it can only be verified by using the evidence from the sender and the receiver. When the evidence is missing, the confidence can be derived through
$$Conf^{N_a,N_b}_{\neg content} = \begin{cases} f_{conf}\{t - \varepsilon(t_x) - \varepsilon(T_{N_a,AS})\} & \text{if } \neg\exists E^{N_a,out}_{content} \\ f_{conf}\{t - \varepsilon(t_y) - \varepsilon(T_{N_b,AS})\} & \text{if } \neg\exists E^{N_b,in}_{content} \end{cases}$$
A. Procedural compliance analysis
To prove that node a has invoked node b, the AS node needs to receive the evidence logged by node b. Receipt of this evidence unambiguously proves that node b logged it, which in turn proves that node b has been invoked, that is
C. Computational compliance analysis
As stated previously, an approximation function is needed to simulate the computation of the service node, mapping the input and prior states to the output and post states. Sometimes the approximation function is difficult to obtain in practice, or domain experts may be needed to conduct the analysis. However, as long as the relevant evidence is recorded, it provides a means to investigate what actually happened in a post-facto manner, even though this is costly. The verification of computational compliance needs to be carried out after the content compliance has been concluded for the edge leading into the node and the edge connecting this node to the succeeding node, i.e. $V^{N_a,N_b}$ and $V^{N_b,N_c}$ for computational compliance on node b according to (7). This is to ensure that the input and the output in the evidence are the actual data exchanged in the collaboration. Given the content compliance, the function of the service node can be inferred as
$$E^{N_b}_{procedure} \Rightarrow A^{N_b}_{log,t_y} \Rightarrow A^{N_b}_{in,t_2} \Rightarrow Comp^{N_a}_{procedure}$$
where the ‘⇒’ symbol refers to material implication. Procedural compliance can also be inferred indirectly: any service node that is invoked by node b, or by its succeeding nodes, can log evidence to prove that the process has passed through node b. That is, if $\exists V^{N_b,N_c} \in P$ then
$$E^{N_c} \Rightarrow A^{N_c}_{in} \Rightarrow A^{N_b}_{out} \Rightarrow A^{N_b}_{in} \Rightarrow Comp^{N_a}_{procedure}$$
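The confidence for procedural incompliance of node a is then
$$Conf^{N_a}_{\neg procedure} = 0 \quad \begin{cases} \text{if } \exists E^{N_b}_{procedure} \\ \text{if } \exists V^{N_b,N_c} \text{ and } \exists E^{N_c} \end{cases}$$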
$$E^{N_b,in}_{compute}, E^{N_b,out}_{compute} \Rightarrow D^{N_b}_{in,t_2}, S^{N_b}_{t_2}, D^{N_b}_{out,t_3}, S^{N_b}_{t_3} \Rightarrow F^{N_b}$$
and the confidence approaches 1 as the expected evidence from node b continues to be missing. In this case, it can be adjusted as
With content compliance, by assuming an approximate function, the compliant and incompliant states are expressed as
$$Conf^{N_a}_{\neg procedure} = f_{conf}\{t - \varepsilon(t_y) - \varepsilon(T_{N_b,AS})\}$$
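$$Conf^{N_b}_{\neg compute} = \begin{cases} 0 & \text{if } F^{N_b}\left(D^{N_b}_{in,t_2}, S^{N_b}_{t_2}\right) = \left(D^{N_b}_{out,t_3}, S^{N_b}_{t_3}\right) \\ 1 & \text{if } F^{N_b}\left(D^{N_b}_{in,t_2}, S^{N_b}_{t_2}\right) \neq \left(D^{N_b}_{out,t_3}, S^{N_b}_{t_3}\right) \end{cases}$$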
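The replay check for computational compliance can be sketched by applying an approximation function to the logged input and prior state and comparing the result with the logged output and post state. Everything concrete here is hypothetical: `toy_credit_function` and its scoring rule are invented purely to make the example runnable and do not describe the credit rating logic of the running example.

```python
def computational_incompliance(f_approx, d_in, s_before, d_out, s_after):
    """Return 0 if replaying f_approx reproduces the logged output and
    post state, otherwise 1."""
    return 0 if f_approx(d_in, s_before) == (d_out, s_after) else 1


def toy_credit_function(person, history):
    """Hypothetical stand-in for F: rating from a stored credit history."""
    score = 700 - 50 * history["defaults"]
    new_history = {**history, "queries": history["queries"] + 1}
    return score, new_history
```

For an input state `{"defaults": 1, "queries": 0}`, a logged output of 650 with post state `{"defaults": 1, "queries": 1}` replays consistently, while a logged output of 700 does not.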
way, the confidence of QoS incompliance of node b can be computed as b Conf N QoS = Conf ¬T
When any needed evidence is missing, this can be reflected in the confidence of the two content compliance (input and output edge), in this case the confidence of computational incompliance equals to the one of two content incompliance confidence whichever is bigger:
AS a Conf ¬E N QoS,t1 = fconf t1 − t 1 − ε (TNa,AS ) a Where ¬E N QoS,t1 represents the logical state of evidence a EN being incorrect, ε (TN a,AS ) is the expected QoS,t1 (average) transmission latency between node a and AS node.
Similarly, dishonest nodes can also exploit the possible transmission congestion at its input/output edge. For response time QoS for example, a dishonest node may log evidence a moment after it has started processing a job or log evidence before the output can be sent to the next node, in order to reduce its processing time observed by AS . So apart from the confidence of the time in the submitted evidence, the confidence of the transmission latency at service’s input/output edge also needs to be analyzed. For simplicity, let us assume the confidence estimation equations are linear, the confidence of dishonest transmission time at the input edge of node b is N a,N b t1,t2
N b,N c t3,t4
In order to visualize the capability of our approach in monitoring the status and compliance of underlying business process, in the past, we have implemented a compliance monitoring console, to show the information the AS node has collected and concluded. We now have extended the console according to the model we built. A screen-shot of the current implementation of the monitoring console is show in Fig. 4. The console consists of four panels. The Document Panel displays the documentations registered by the services participating in the business process, such as BPEL, WSDL and WSLA3 . The top right panel displays the overview of the business process, animation is used to show the interactions between the underlying entities or service nodes so it can be seen that at which stage the process is being conducted. Two views are provided (which can be switched between), the process view shows the horizontal structure of the collaboration and displays the interactions at the entity level. In the infrastructure view, the business process is represented in terms of service nodes deployed in different clouds. Those nodes are identified according to their IP address, should any incident occurs, the suspected computing instance will be located by its IP address for diagnosis. Note that, each service entity may have several service nodes deployed in the clouds. The infrastructure view can help to narrow down the fault domain to the instance level, that is, find out with instance owned by which entity is being incompliant. The bottom left panel shows the performance statistics of the services being monitored, as well as the incompliance confidence about their behaviors. Both performance and confidence are continuously updated according to the evidence received. The statistics shows the numerical performance according to the evidence while the confidence exposes the credibility of such values and the fulfillment of other nonnumerical compliance types. 
The bottom right panel displays the status and statistics of the whole business process. It shows the number of jobs have been done, the stage of current process and the health of the process concluded by the AS . Apart from the console, we have implemented an experimental business process in Amazon EC2 – a computing resource provisioning service that charges the user according to the CPU usage. Our implementation used Apache Tomcat 5.5 as our Servlet container, and Axis2 1.5 as our web service engine. We use BPEL to orchestrate the service nodes to form our loan application business process. Apache Orchestration
QoS compliance is more challenging to be verified compared to the other three, because it is determined by recording the time elapsed when a service node is conducting certain activity. As an external party, it is difficult for the AS node to find out the actual time elapsed from the evidence submitted by the logger, since the network congestion may arbitrarily affect the transmission latency, and this fact may be exploited by dishonest participators to hide incompliance. Therefore, AS node needs to record the time when the evidence is received and use it to determine the credibility of the claimed activity time enclosed in the evidence. For example, if t1 is provided in the evidence as the time the node a claims when activity happened, the evidence will reach at AS node at t AS 1 . QoS compliance requires the logger to submit evidence immediately after the occurrence of the a activity, so the confidence of evidence E N QoS,t1 being bogus can be computed as
¬T
+ Conf ¬T
VI. E VALUATIONS
D. QoS compliance
The response time of node b can be calculate by subtracting t3 and t2 (7) and this confidence reflects how reliable this result is, the higher the incompliance confidence, the less the this subtracting result can be trusted.
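\[ \mathrm{Conf}^{N_b}_{\neg compute} = \max\left( \mathrm{Conf}^{N_a,N_b}_{\neg content},\; \mathrm{Conf}^{N_b,N_c}_{\neg content} \right) \]

\[ \mathrm{Conf}\left( \neg T^{N_a,N_b}_{t_1,t_2} \right) = f_{conf}\left\{ t_2 - t_1 - \varepsilon\left( T_{N_a,N_b} \right) \right\} + \mathrm{Conf}\left( \neg E^{N_a}_{QoS,t_1} \right) + \mathrm{Conf}\left( \neg E^{N_b}_{QoS,t_2} \right) \]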
where $T^{N_a,N_b}_{t_1,t_2}$ is the transmission latency between node a and node b with duration $t_2 - t_1$, and $\neg T^{N_a,N_b}_{t_1,t_2}$ is the logical state that this transmission latency, as reflected in the evidence, is bogus. $\varepsilon(T_{N_a,N_b})$ is the expected transmission latency between node a and node b. In this
3 Web Service Level Agreement http://www.research.ibm.com/wsla/
Figure 4. Screen-shot of Compliance Monitoring Console
Director Engine (ODE)⁴ has been used to execute the business process defined in BPEL. We deployed 16 service nodes on 16 computing instances with computing power equivalent to a 1GHz CPU and 1.7GB of memory. One node is the accountability service node, and each of the five service entities in our example has 3 service nodes deployed in the cloud. The 15 nodes are orchestrated by BPEL scripts to simulate the business process in our running example. The service nodes that belong to the same service entity form a simple sequential composition (i.e. they are invoked sequentially). Logging actions are inserted into the BPEL scripts as ‘invoke’ actions to the AS node. Details of this can be found in our previous work [20]. With the 15 service nodes, we have tested the overhead introduced when services log for different types of compliance. The results are shown in Fig. 5. We tested the business process with request message sizes from 1KB (equivalent to a long sentence) to 50KB (equivalent to a medium-size document). Procedural compliance is the only type that incurs minor overhead regardless of the message size. The other three types all introduce substantial extra latency (up to 60%), which grows as the request message becomes larger. This result is as expected, however, since those three types of compliance simply require the same message to be re-transferred (more than once in some cases). One way to reduce this overhead is to log a hash of the data instead of the data itself; this would require the loggers to archive the data until it is no longer needed for compliance verification. In this way, the overhead would be reduced to close to the extra latency of procedural compliance logging. Nevertheless, the test results suggest a substantial cost when logging entire evidence for strong accountability; in practice, such stringent logging should only be applied to critical actions.
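The hash-based logging suggested above can be sketched as follows. This is an illustrative outline only: the `HashingLogger` class, its method names and the choice of SHA-256 are assumptions made here, and the signing of evidence required for strong accountability is omitted for brevity.

```python
import hashlib


class HashingLogger:
    """Hypothetical logger: submits a fixed-size digest as evidence and
    archives the original message until verification no longer needs it."""

    def __init__(self):
        self.archive = {}  # hex digest -> original message bytes

    def log(self, message: bytes) -> str:
        """Archive the message locally and return the digest to send to the AS node."""
        digest = hashlib.sha256(message).hexdigest()
        self.archive[digest] = message
        return digest

    def reveal(self, digest: str) -> bytes:
        """Hand over the archived message when the AS node asks for it."""
        return self.archive[digest]

    def discard(self, digest: str) -> None:
        """Drop the archived message once it is no longer needed."""
        self.archive.pop(digest, None)


def verify(digest: str, revealed: bytes) -> bool:
    """AS-side check: the revealed message must hash to the logged digest."""
    return hashlib.sha256(revealed).hexdigest() == digest
```

Under this scheme the evidence sent to the AS node has the same size whether the request is 1KB or 50KB, which is why the logging latency would be expected to approach that of procedural compliance logging; the trade-off is the archiving burden placed on each logger.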
Figure 5. Evaluation of the logging cost

4 Apache ODE http://ode.apache.org/
5 XML Process Definition Language (XPDL) http://www.wfmc.org/xpdl.html

VII. RELATED WORK

Compliance is an issue that has been extensively studied in recent years. Traditionally, monitoring techniques are applied for its verification. Approaches like [7][11] are heuristics to conveniently capture processing data at the service node during execution. Some other approaches like [10][13][16] let service nodes emit evidence to a central authority, which processes it in aggregate. The focus of these approaches is usually on ease of deployment and measurement accuracy. These approaches often assume that the collected evidence is not bogus and that the incompliant entity will admit the violation. Our work, on the other hand, considers a more hostile environment where all service entities are expected to behave in any possible manner and to deceive for their own benefit. Approaches like [22][5] share this view with us, in that cryptographic techniques are employed to achieve provability. There are many existing models of business processes, popular ones being BPEL and XPDL⁵. They are designed solely to describe or define the action sequence of the process, whereas our approach aims to capture the trust relationships and deployment details for compliance analysis and reasoning. In fact, it is common for approaches to compliance analysis to develop a model based on concepts similar to BPEL and XPDL. Approaches like [1][4] model the process as a sequence of event traces emitted by the service nodes. Compliance is verified by matching the patterns of the events or by mining the event traces [2]. Approaches like [8][3] model such
execution as a sequence of service state changes; with the assumption that states are all preserved, compliance is verified by examining the causality between the states. In a similar way, Petri nets [12] have been used to model the actions and the state changes in a process [6][14]. Our modeling method differs from those approaches in that we model the different domains and implementation infrastructures, and explicitly define the difference between the actions that happened and the actions observed by the AS node. Inference and reasoning involved in compliance assurance mostly focus on verifying the logical consistency and causality of the events. As in [18][9], the correctness of a certain action is proved by looking up previous actions to check whether the actor has been properly authorized. In our approach, the AS node first reasons about the credibility of the evidence, then analyzes the extent to which the compliance can be proved by it.

VIII. CONCLUSIONS

In this paper, we proposed a novel model of the collaborative business process, as an extension of our previous work to enforce strong accountability for compliance. This model captures both the horizontal structure (process level) and the vertical structure (infrastructure level) of the collaboration. With this model, we classified and defined four generic types of compliance, and we thoroughly analyzed the evidence and the logging required for their verification. We have implemented a monitoring console to demonstrate the utilization of our model. We evaluated the costs of the logging in an experimental business process deployed in Amazon EC2; the overhead observed is substantial yet still acceptable. Different logging schemes should be applied wisely to meet the specific compliance stringency required.

IX. REFERENCES

[1] W. M. P. v. d. Aalst, M. Dumas, C. Ouyang, A. Rozinat, and E. Verbeek, “Conformance checking of service behavior,” ACM Trans. Internet Technology, vol. 8, no. 3, pp. 1–30, 2008.
[2] T. Abraham, “Event sequence mining to develop profiles for computer forensic investigation purposes,” in Australasian workshops on Grid computing and e-research, 2006, pp. 145–153.
[3] J. Chase and A. Yumerefendi, “Trust but verify: accountability for network services,” in ACM SIGOPS European Workshop, 2004.
[4] F. Daniel, F. Casati, and V. D’Andrea, “Business compliance governance in service-oriented architectures,” in International Conference on Advanced Information Networking and Applications, 2009, pp. 113–120.
[5] P. Druschel, A. Haeberlen, and P. Kouznetsov, “PeerReview: practical accountability for distributed systems,” in ACM SIGOPS symposium on Operating systems principles, 2007, pp. 175–188.
[6] Y. Du, C. Jiang, and M. Zhou, “A petri net-based model for verification of obligations and accountability in cooperative systems,” IEEE Transactions on Systems, Man and Cybernetics, vol. 39, no. 2, pp. 299–308, 2009.
[7] C. Ghezzi, L. Baresi, and S. Guinea, “Smart monitors for composed services,” in International Conference on Service Oriented Computing, 2004, pp. 193–202.
[8] A. Haeberlen, P. Druschel, and P. Kouznetsov, “The case for byzantine fault detection,” in Conference on Hot Topics in System Dependability, 2006, pp. 5–10.
[9] J. Hartog, M. Dekker, R. Corin, S. Etalle, and J. Cederquist, “An audit logic for accountability,” in International Workshop on Policies for Distributed Systems and Networks, 2005, pp. 34–43.
[10] T. Holmes, U. Zdun, F. Daniel, and S. Dustdar, “Monitoring and analyzing service-based internet systems through a model-aware service environment,” in Advanced Information Systems Engineering, 2010, pp. 98–112.
[11] O. Moser, F. Rosenberg, and S. Dustdar, “Non-intrusive monitoring and service adaptation for WS-BPEL,” in International conference on World Wide Web. New York, NY, USA: ACM, 2008, pp. 815–824.
[12] T. Murata, “Petri nets: Properties, analysis and applications,” Proceedings of the IEEE, vol. 77, no. 4, pp. 541–580, Apr. 1989.
[13] PlanetFlow, “Planetlab’s traffic monitoring system,” 2008. [Online]. Available: http://planetflow.planet-lab.org/#planetlab
[14] E. Rik and D. Juliane, “Reactive petri nets for workflow modeling,” in International Conference on Applications and Theory of Petri Nets, 2003, pp. 296–315.
[15] R. Shirey, “Trustworthy system definition, at IETF RFC 4949,” August 2007. [Online]. Available: http://tools.ietf.org/html/rfc4949
[16] A. Squicciarini, W. Lee, B. Thuraisingham, and E. Bertino, “End-to-end accountability in grid computing systems for coalition information sharing,” in Workshop on Cyber Security and Information Intelligence Research, 2008.
[17] C. Wang, S. Chen, and J. Zic, “A contract-based accountability service model,” in IEEE International Conference on Web Services, 2009, pp. 639–646.
[18] Q. Xiong, H. Zhang, and B. Meng, “The practical detailed requirements of accountability and its application in the electronic payment protocols,” in IEEE International Conference on e-Technology, e-Commerce and e-Service, 2005, pp. 556–561.
[19] J. Yao, S. Chen, C. Wang, D. Levy, and J. Zic, “Accountability as a service for the cloud,” in IEEE International Conference on Services Computing, 2010, pp. 81–90.
[20] J. Yao, S. Chen, C. Wang, J. Zic, and D. Levy, “Accountability as a service for the cloud: From concept to implementation with BPEL,” in World Congress on Services (SERVICES), 2010, pp. 91–100.
[21] A. Yumerefendi and J. Chase, “The role of accountability in dependable distributed systems,” in First Workshop on Hot Topics in System Dependability, 2005.
[22] A. R. Yumerefendi and J. S. Chase, “Strong accountability for network storage,” ACM Trans. Storage, vol. 3, no. 3, p. 11, 2007.