Dynamic Workflow Instrumentation for Windows Workflow Foundation

Bart J.F. De Smet, Kristof Steurbaut, Sofie Van Hoecke, Filip De Turck, Bart Dhoedt
Ghent University, Department of Information Technology (INTEC), Gent, Belgium
{BartJ.DeSmet, Kristof.Steurbaut, Sofie.VanHoecke, Filip.DeTurck, Bart.Dhoedt}@UGent.be

Abstract

As the complexity of business processes grows, the shift towards workflow-based programming becomes more attractive. The typical long-running characteristic of workflows imposes new challenges such as dynamic adaptation of running workflow instances. Recently, Windows Workflow Foundation (in short WF) was released by Microsoft as their solution for workflow-driven application development. Although WF contains features that allow dynamic workflow adaptation, the framework lacks an instrumentation framework to make such adaptations more manageable. Therefore, we built an instrumentation framework that provides more flexibility for applying workflow adaptation batches to workflow instances, both at creation time and during an instance's lifecycle. In this paper, we present this workflow instrumentation framework and detail the performance implications caused by dynamic workflow adaptation.

1. Introduction

In our day-to-day life we're faced with the concept of workflow, for instance in decision making processes. Naturally, such processes are also reflected in business processes. Order processing, coordination of B2B interaction and document lifecycle management are just a few examples where workflow is an attractive alternative to classic programming paradigms. The main reasons to prefer the workflow paradigm over pure procedural and/or object-oriented development are:

• Business process visualization: flowcharts are a popular tool to visualize business processes; workflow brings those representations alive. This helps to close the gap between business people and the software they're using since business logic is no longer imprisoned in code.

• Transparency: workflows allow for human inspection, not only during development but even more importantly during execution, by means of tracking.

• Development model: building workflows consists of putting together basic blocks (activities) from a toolbox, much like the creation of UIs using IDEs.

• Runtime services free developers from the burden of putting together their own systems for persistence, tracking, communication, etc. With workflow, the focus can be moved to the business case itself.

Macroscopically, the concept of workflow can be divided into two categories. First, there's human workflow where machine-to-human interaction plays a central role. A typical example is the logistics sector, representing product delivery in workflow. The second category consists of so-called machine workflows, primarily used in B2B scenarios and inter-service communication.

On the technological side, we can draw the distinction between sequential workflows and state machine workflows. The difference between them can be characterized by the transitions between the activities. In the former category, activities are executed sequentially, whilst state machines have an event-driven nature causing transitions to happen between different states.

Many frameworks that implement the idea of workflow exist, including Microsoft's WF in the .NET Framework 3.0. The architecture of WF is depicted in Figure 1. A detailed overview of WF and its architecture can be found in [1]. The WF framework allows developers to create intra-application workflows, by hosting the runtime engine in a host process. Windows applications, console applications, Windows Services, web applications and web services can all act as hosts and façades for a workflow-enabled application. A workflow itself is composed of activities which are the atoms in a workflow definition. Examples of activities include communication with other systems, transactional scopes, if-else branches, various kinds of loops, etc. Much like controls in UI development, one can bring together a set of activities in a custom activity that encapsulates a typical unit of work. Finally, there's the workflow runtime that's responsible for the scheduling and execution of workflow instances, together with

Figure 1. The architecture of WF [2]

various runtime services. One of the most important runtime services is persistence, required to dehydrate workflow instances becoming idle. This reflects the typical long-running nature of workflow.

Our research focuses on building an instrumentation tool for WF, allowing activities to be injected flexibly into an existing workflow at the instance level.

This paper is structured as follows. In section 2, we'll cover the concept of dynamic adaptation in more detail. The constructed instrumentation framework is then presented in section 3 and a few of its sample uses in section 4. To justify this instrumentation technique, several performance tests were conducted, as explained in section 5.
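For illustration, a minimal host along these lines can be sketched as follows. This is a hedged sketch: the workflow type OrderWorkflow and the connection string are placeholders, and the persistence registration (WF's SqlWorkflowPersistenceService) is optional but typical for long-running workflows.

```csharp
using System;
using System.Workflow.Runtime;
using System.Workflow.Runtime.Hosting;

// Sketch: host the WF runtime in a console application, register a
// persistence service to dehydrate idle instances, and start one instance.
class Host
{
    static void Main()
    {
        using (WorkflowRuntime runtime = new WorkflowRuntime())
        {
            // Dehydrates idle instances to SQL Server
            // (the connection string is a placeholder).
            runtime.AddService(new SqlWorkflowPersistenceService(
                "connection-string-here", true,
                TimeSpan.FromMinutes(1), TimeSpan.FromSeconds(5)));

            runtime.StartRuntime();

            // OrderWorkflow is a hypothetical workflow definition type.
            WorkflowInstance instance =
                runtime.CreateWorkflow(typeof(OrderWorkflow));
            instance.Start();
        }
    }
}
```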

2. Dynamic adaptation

2.1. Static versus dynamic: workflow challenges

In lots of cases, workflow instances are long-lived: imagine workflows with human interaction to approve orders, or systems that have to wait for external service responses. Often, this conflicts with ever changing business policies, requiring the possibility to adapt workflow instances that are in flight.

Another application of dynamic workflow adaptation is the insertion of additional activities into a workflow. By adding logic for logging, authorization, time measurement, state inspection, etc. dynamically, one can keep the workflow's definition pure and therefore more readable.

2.2. WF's approach to dynamic adaptation

To make dynamic adaptation possible, WF has a feature called "dynamic updates". Using this technique it's relatively easy to adapt workflow instances either from the inside or the outside. We'll refer to these two approaches as internal modification and external modification, respectively. A code fragment illustrating a dynamic update is shown in Figure 2.

Internal modifications are executed from inside a running workflow instance. An advantage is having access to all internal state, while the main drawback is the need to consider internal modifications upfront as part of the workflow definition design process.

External modifications are performed at the host layer where one or more instances requiring an update are selected. Advantages are the availability of external conditions the workflow is operating in and the workflow definition not needing any awareness of the possibility for a change to happen. This flexibility comes at the cost of performance because a workflow instance needs to be suspended in order to apply an update. Also, there's no prior knowledge about the state a workflow instance is in at the time of the update.

WorkflowChanges changes =
    new WorkflowChanges(instance.GetWorkflowDefinition());

// Add, remove or modify activities
changes.TransientWorkflow.Activities.Add(...);

foreach (ValidationError error in changes.Validate())
{
    if (!error.IsWarning)
    {
        // Report the error or fix it.
    }
}

instance.ApplyWorkflowChanges(changes);

Figure 2. Apply a dynamic update to an instance

3. An instrumentation framework for WF

3.1. Design goals

In order to make the instrumentation of workflow instances in WF easier and more approachable, we decided to build an instrumentation framework based on WF's dynamic update feature. Our primary design goal is to realize a framework as generic as possible, allowing for all sorts of instrumentation. All instrumentation tasks are driven by logic added on the host layer where the workflow runtime engine is hosted; no changes whatsoever are required on the workflow definition level. This allows the instrumentation framework to be plugged into existing workflow applications with minimum code churn.

It's important to realize that instrumentations are performed on the instance level, not on the definition level. However, logic can be added to select a series of workflow instances to apply the same instrumentation to all of them. At a later stage, one can feed back frequently performed instrumentations to the workflow definition itself, especially when the instrumentation was driven by a request to apply functional workflow changes at runtime. This feedback step involves the modification and recompilation of the original workflow definition.

Another use of workflow instrumentation is to add aspects such as logging to workflow instances. It's highly uncommon to feed back such updates to the workflow definition itself because one wants to keep the definition aspect free. However, such aspects will typically be applied to every newly created instance. In order to make this possible, instrumentation tasks can be batched up for automatic execution upon creation of new workflow instances. The set of instrumentation tasks is loaded dynamically and can be changed at any time.

3.2. Implementation

Each instrumentation task definition consists of a unique name, a workflow type binding, a parameter evaluator, a list of injections and optionally a set of services. We'll discuss all of these in this section. The interface for instrumentation tasks is shown in Figure 3.

The instrumentation tool keeps a reference to the workflow runtime and intercepts all workflow instance creation requests. When it's asked to spawn a new instance of a given type with a set of parameters, all instrumentation tasks bound to the requested workflow type are retrieved. Next, the set of parameters is passed to the parameter evaluator that instructs the framework whether or not to instrument that particular instance. If instrumentation is requested, the instrumentation task is applied as described below.

The tool accepts requests to instrument running instances too; each such request consists of a workflow instance identifier together with the name of the desired instrumentation task. Filtering logic to select only a subset of workflow instances has to be provided manually. When asked to apply an

instrumentation task to a workflow instance, the instrumentation tool performs all injections sequentially in one dynamic update so that all injections either pass or fail. Every injection consists of an activity tree path pointing to the place where the injection has to happen, together with a before/after indicator. For example, consider the workflow in Figure 4. In order to inject the discount activity in the right branch, the path priceCheck/cheap/suspend1 (after) will be used. This mechanism allows instrumentation at every level of the tree and is required to deal with composite activities. Furthermore, each injection carries the activity that needs to be injected (the subject) at the place indicated, together with (optionally) a set of data bindings.

Finally, every instrumentation task has an optional set of services. These allow injected activities to take advantage of WF's Local Communication Services to pass data from the workflow instance to the host layer, for example to export logging messages. When the instrumentation task is loaded in the instrumentation tool, the required services are registered with the runtime.

interface IInstrumentationTask
{
    // Parameter evaluator
    bool ShouldProcess(Dictionary<string, object> args);

    // Instrumentation task name
    string Name { get; }

    // Type name of target workflow definition
    string Target { get; }

    // Local Communication Services list
    object[] Services { get; }

    // Set of desired injections
    Injection[] Injections { get; }
}

class Injection
{
    public string Path { get; set; }
    public InjectionType Type { get; set; }
    public Activity Subject { get; set; }
    public Dictionary<string, object> Bindings { get; set; }
}

enum InjectionType { Before, After }

Figure 3. Definition of an instrumentation task
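To make the interface of Figure 3 concrete, a minimal logging task might look as follows. This is a sketch under assumptions: the workflow type Orders.OrderWorkflow, the path approve, and the LogActivity and LoggingService types are all hypothetical, and the generic parameters of Dictionary are assumed to be <string, object>, matching WF's workflow creation parameter dictionaries.

```csharp
// Hypothetical logging task: injects a log activity after the "approve"
// step of high-value OrderWorkflow instances upon creation.
class ApprovalLoggingTask : IInstrumentationTask
{
    public string Name { get { return "ApprovalLogging"; } }

    // Binds the task to the (hypothetical) OrderWorkflow definition.
    public string Target { get { return "Orders.OrderWorkflow"; } }

    // Parameter evaluator: instrument only orders above $10,000.
    public bool ShouldProcess(Dictionary<string, object> args)
    {
        object amount;
        return args.TryGetValue("Amount", out amount)
            && (decimal)amount > 10000m;
    }

    // Local Communication Service receiving log messages from instances.
    public object[] Services
    {
        get { return new object[] { new LoggingService() }; }
    }

    public Injection[] Injections
    {
        get
        {
            return new Injection[]
            {
                new Injection
                {
                    Path = "approve",             // activity tree path
                    Type = InjectionType.After,   // inject after that activity
                    Subject = new LogActivity(),  // hypothetical custom activity
                    Bindings = new Dictionary<string, object>()
                }
            };
        }
    }
}
```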

3.3. Design details

In order to apply workflow instrumentation successfully, one should keep a few guidelines in mind, as outlined below:

• Instrumentations that participate in the data flow of the workflow can cause damage when applied without prior validation. Invalid bindings, type mismatches, etc. might cause a workflow to fail or, worse, produce invalid results. Therefore, a staging environment for instrumentations is highly recommended.

• Faults raised by injected activities should be caught by appropriate fault handlers; dynamic instrumentation of fault handlers should be considered.

• Dynamic loading of instrumentation tasks and their associated activities should be done with care since the CLR does not allow assembly unloading inside an app domain. Therefore, one might extend our framework to allow more intelligent resource utilization, certainly

when instrumentations are applied on an ad hoc basis. More granular unloading of (instrumented) workflow instances can be accomplished by creating partitions for (instrumented) workflow instances over multiple workflow engines and application domains.

• There's currently no support in our instrumentation framework to remove activities (injected or not) from workflow instances, which might be useful to "bypass" activities. A workaround is mentioned in section 4.2.

• When applying instrumentations on running workflow instances, it's unclear at which point the instance will be suspended prior to the instrumentation taking place. Unless combined with tracking services, there's no easy way to find out about an instance's state to decide whether or not applying certain instrumentations is valuable. For example, without knowing the place where a workflow instance is suspended, it doesn't make sense to add logging somewhere since execution might have crossed that point already. This problem can be overcome using suspension points as discussed in section 4.2.
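The activity tree paths carried by injections (such as priceCheck/cheap/suspend1) can be resolved with a straightforward walk over WF's composite activity model. The following is a minimal sketch; the Resolve method name is ours and error reporting is elided.

```csharp
using System.Workflow.ComponentModel;

// Resolve a path such as "priceCheck/cheap/suspend1" against a workflow's
// root activity by walking named children of composite activities.
static Activity Resolve(CompositeActivity root, string path)
{
    Activity current = root;
    foreach (string name in path.Split('/'))
    {
        CompositeActivity composite = current as CompositeActivity;
        if (composite == null)
            return null; // path descends below a leaf activity

        current = null;
        foreach (Activity child in composite.Activities)
        {
            if (child.Name == name) { current = child; break; }
        }
        if (current == null)
            return null; // no child with that name at this level
    }
    return current;
}
```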

4. Sample uses of instrumentation

4.1. Authorization and access control

To keep a workflow definition free of access control checks that distract from the key business process, authorization can be added dynamically. Two approaches have been implemented, the first of which adds "authorization barriers" to a workflow instance upon creation. Such a barrier is nothing more than an activity that throws an access denied fault. The instrumentation framework uses the parameter evaluator to decide on the injections that should be executed. For example, when a junior sales member is starting a workflow instance, a barrier could be injected in the if-else branch that processes sales contracts over $10,000. This approach is illustrated in Figure 4.

An alternative way is to add authorization checkpoints at various places in the workflow, all characterized by a unique name indicating their location in the activity tree. When such a checkpoint is hit, an authorization service is interrogated. If the outcome of this check is negative, an access denied fault is thrown.
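An authorization barrier of this kind can be sketched with WF's built-in ThrowActivity; the branch path and the fault type chosen below are illustrative.

```csharp
using System.Workflow.ComponentModel;

// Sketch of an authorization barrier injection: a ThrowActivity raising a
// SecurityException when the expensive contract branch is entered.
// "contractCheck/expensive" is a hypothetical activity tree path.
ThrowActivity barrier = new ThrowActivity();
barrier.Name = "accessBarrier1";
barrier.FaultType = typeof(System.Security.SecurityException);

Injection denyExpensiveContracts = new Injection
{
    Path = "contractCheck/expensive",
    Type = InjectionType.Before,   // fault fires before the branch body runs
    Subject = barrier
};
```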

4.2. Dynamic adaptation injection

Using the instrumentation tool, internal adaptation activities can be injected into a workflow instance. This opens up the full power of internal adaptations, i.e. the availability of internal state information and the fact that a workflow doesn't need to be suspended in order to apply an update. Furthermore, we don't need to think of internal adaptations during workflow design, since those can be injected at any point in time. This form of instrumentation has a Trojan horse characteristic, injecting

additional adaptation logic inside the workflow instance itself, allowing for more flexibility. However, internal modifications still suffer from the lack of contextual host layer information. This too can be solved, by exploiting the flexibility of dynamic service registration to build a gateway between the workflow instance and the host.

Still, scenarios are imaginable where external modifications are preferred over injected internal modifications, for example when much host layer state is required or when tighter control over security is desirable. In order to allow this to happen in a timely fashion, we can inject suspension points in the workflow instance at places where we might want to take action. When such a suspension point is hit during execution, the runtime will raise a WorkflowSuspended event that can be used to take further action. Such an action might be instrumentation, with the advantage of having exact timing information available.

As an example, suspend1 is shown in Figure 4. This suspension point could be used to change the discount percentage dynamically or even to remove the discount activity dynamically, as a workaround for the lack of activity removal in our instrumentation framework. Notice that the discount activity itself was also added dynamically, exploiting the flexibility of dynamic activity bindings in order to adapt the data in the surrounding workflow, in this case the price.
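On the host side, such a suspension point can be handled through the runtime's WorkflowSuspended event. The following is a sketch: the reason string "suspend1" is assumed to have been set on the injected SuspendActivity, and the actual update logic is elided.

```csharp
using System.Workflow.ComponentModel.Compiler;
using System.Workflow.Runtime;

// Sketch: react to a suspension point, apply a dynamic update, and resume.
runtime.WorkflowSuspended += delegate(object sender, WorkflowSuspendedEventArgs e)
{
    if (e.Error == "suspend1") // reason set on the injected SuspendActivity
    {
        WorkflowChanges changes =
            new WorkflowChanges(e.WorkflowInstance.GetWorkflowDefinition());

        // ... add, remove or modify activities here, e.g. the discount ...

        e.WorkflowInstance.ApplyWorkflowChanges(changes);
        e.WorkflowInstance.Resume();
    }
};
```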

4.3. Time measurement

Another instrumentation we created during our research was time measurement, making it possible to measure the time it takes to execute a section of a workflow instance's activity tree, marked by a "start timer" and "stop timer" pair of activities. Instrumentation for time measurement of an order processing system's approval step is depicted in Figure 4.

This particular case is interesting for several reasons. First of all, both timer activity injections have to be grouped and should either succeed or fail together. This requirement can be enforced using additional logic too, i.e. by performing a check for the presence of a "start timer" activity when a "stop timer" activity is added. However, matters are more complicated than that. Situations can arise where the execution flow doesn't reach the "stop timer" activity at all, for instance because of faults. Another problem occurs when the injections happen at a stage where the "start timer" activity location already belongs to the past; execution will reach the "stop timer" activity eventually, without accurate timing information being available.

Implementing this sample instrumentation also reveals a few more subtle issues. When workflow instances become suspended, persistence takes place. Therefore, non-serializable types like System.Diagnostics.Stopwatch can't be used. Also, rehydration of a persisted workflow could happen on another machine, causing clock skews to become relevant.
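The persistence issue can be addressed by storing timestamps in serializable state instead of a Stopwatch. Below is a minimal sketch of the "start timer" half; the activity name is ours, and cross-machine clock skew is left unaddressed.

```csharp
using System;
using System.Workflow.ComponentModel;

// Sketch of a persistence-friendly "start timer" activity: it records a
// UTC timestamp in a serializable field instead of using the
// non-serializable System.Diagnostics.Stopwatch.
[Serializable]
public class StartTimerActivity : Activity
{
    private DateTime startedAtUtc; // DateTime serializes with the instance

    public DateTime StartedAtUtc
    {
        get { return startedAtUtc; }
    }

    protected override ActivityExecutionStatus Execute(
        ActivityExecutionContext executionContext)
    {
        startedAtUtc = DateTime.UtcNow;
        return ActivityExecutionStatus.Closed;
    }
}
```

A matching "stop timer" activity would locate its partner in the activity tree and compute the elapsed time from the stored timestamp.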

Figure 4. An example instrumentation: before (left) and after (right)

5. Performance evaluation of instrumentation

5.1. Test methodology

To justify the use of workflow instrumentation, we conducted a set of performance tests. More specifically, we measured the costs imposed by the dynamic update feature of WF in various scenarios. In order to get a pure idea of the impact of a dynamic update itself, we didn't hook up persistence and tracking services to the runtime. This eliminates the influences caused by database performance and communication. In real workflow scenarios however, services like persistence and tracking will play a prominent role. For a complete overview of all factors that influence workflow applications' performance, see [3].

All tests were performed on a machine with a 2.16 GHz Intel Centrino Duo dual core processor and 2 GB of RAM, running Windows Vista. Tests were hosted in plain vanilla console applications written in C# and compiled in release mode with optimizations turned on. To perform timings, the System.Diagnostics.Stopwatch class was used and all tests were executed without a debugger attached.

For our tests, we considered a workflow definition as shown in Figure 4. One or more activities were inserted in corresponding workflow instances, at varying places to get a good picture of the overall impact across injection locations. The injected activity itself was an empty code activity in order to eliminate any processing cost introduced by the injected activity itself, also minimizing the possibility for the workflow scheduler to yield execution in favor of another instance. This way, we isolate the instrumentation impact as much as possible.
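The core of one timing run can be sketched as follows; the workflow instance setup is elided, and per the methodology above the injected activity is an empty code activity.

```csharp
using System;
using System.Diagnostics;
using System.Workflow.Activities;
using System.Workflow.ComponentModel;

// Sketch: apply a single-activity dynamic update to a running instance and
// measure its duration. No persistence or tracking services are registered,
// and no debugger is attached.
Stopwatch watch = new Stopwatch();

CodeActivity noop = new CodeActivity("noop1");
noop.ExecuteCode += delegate { }; // empty body: no processing cost of its own

WorkflowChanges changes =
    new WorkflowChanges(instance.GetWorkflowDefinition());
changes.TransientWorkflow.Activities.Add(noop);

watch.Start();
instance.ApplyWorkflowChanges(changes); // external modification path
watch.Stop();

Console.WriteLine("Update took {0} ms", watch.ElapsedMilliseconds);
```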

5.2. Internal versus external modification

First, we investigated the relative cost of internal versus external modification. Without persistence taking place, this gives a good indication of the impact introduced by suspending and resuming a workflow instance when applying ad hoc updates to a workflow instance from the outside. In our test, we simulated different workloads and applied the same dynamic update to a workflow both from the inside using a code activity and from the outside. Each time, the time required to perform one injection was measured and best case, worst case and average case figures were derived.

The results of this test are shown in Figure 5. The graphs show the time to apply the changes as a function of the number of concurrent workflow instances (referred to as N). One clearly observes the much bigger cost associated with external modification caused by the suspension-update-resumption cycle, even when not accounting for additional costs that would be caused by persistence possibly taking place. Also, these figures show that internal modification is hurt less by larger workloads in the average case, while external modification is much more impacted. These longer delays in the external modification case can be explained by workflow instance scheduling taking place in the runtime, causing suspended workflows to yield execution to other instances that are waiting to be serviced. In the internal modification case, no suspensions are required, so no direct scheduling impact exists.

However, one should keep in mind that the suspension caused by external modification is only relevant when applying updates to running workflow instances. When applying updates at workflow instance creation time, no suspension is required and figures overlap roughly with

the internal modification case. Because of this, instrumentation at creation time is much more attractive than applying ad hoc injections at runtime.

5.3. Impact of update batch sizes

Since instrumentation tasks consist of multiple injections that are all performed atomically using one dynamic update, we're interested in the relationship between the update batch size and the time it takes to apply the update. To conduct this test, we applied different numbers of injections to a series of workflow instances using external modification, mimicking the situation that arises when using the instrumentation framework. Under different workloads, we observed a linear correspondence between the number of activities added to the workflow instance (referred to as n) and the time it takes to complete the dynamic update. The result for a workload of 100 concurrent workflow instances is shown in Figure 6.

Based on these figures, we decided to investigate the impact of joining neighboring injections together using a SequenceActivity and concluded that such a join can impact the update performance positively. However, designing an efficient detection algorithm to perform these joins isn't trivial. Moreover, wrapping activities in a composite activity adds another level to the activity tree, which affects other tree traversal jobs but also subsequent updates.
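The join experiment can be sketched by wrapping a batch of neighboring subjects in one SequenceActivity before injection, turning n tree insertions into a single one; the batchSubjects variable below is illustrative.

```csharp
using System.Workflow.Activities;
using System.Workflow.ComponentModel;

// Sketch: join neighboring injection subjects targeting the same path into
// a single SequenceActivity, so the batch becomes one tree insertion.
// batchSubjects stands for the activities of such a batch (illustrative).
SequenceActivity joined = new SequenceActivity("joinedBatch1");
foreach (Activity subject in batchSubjects)
{
    joined.Activities.Add(subject);
}

changes.TransientWorkflow.Activities.Add(joined); // one insertion instead of n
```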

Figure 6. Impact of workflow update batch size on update duration

5.4. The cost of suspension points

Figure 5. Overhead of applying dynamic modifications to workflows: a) internal modification; b) external modification

In order to get a better indication of the cost introduced by the use of suspension points for exact external update timing, we measured the time it takes to suspend and resume a workflow instance without taking any update actions in between. This simulates the typical situation that arises when inserting suspension points that are rarely used for subsequent update actions. Recall that suspension points are just a way to give the host layer a chance to apply updates at a certain point during the workflow instance's execution.

The result of this test conducted for various workloads (referred to as N) is shown in Figure 7. We observe similar results to the external modification case, depicted in Figure 5. From this, we can conclude that the major contributing factor to external modification

duration is the suspend-resume cycle. Naturally, it's best to avoid unnecessary suspension points, especially with possible persistence in mind. However, identifying worthwhile suspension points is not an exact science.

Figure 7. Workflow instance suspend-resume cost for suspension points

5.5. Discussion

Based on these results, we conclude that ad hoc workflow instance instrumentation based on external modification has a significant performance impact, certainly under heavy load conditions. Instrumentation taking place at workflow instance creation time or adaptation from the inside is much more attractive in terms of performance. Injection of dynamic (internal) adaptations as explained in section 4.2 should be taken into consideration as a valuable performance booster when little contextual information from the host layer is required in the update logic itself. Overuse of suspension points, though allowing a great deal of flexibility, should be avoided if a workflow instance's overall execution time matters and load on the persistence database has to be reduced.

These results should be put in the perspective of typically long running workflows. Unless we're faced with time-critical systems that require a throughput as high as possible, having a few seconds delay over the course of an entire workflow's lifecycle shouldn't be the biggest concern. However, in terms of resource utilization and efficiency, the use of instrumentation should still be considered carefully.

6. Conclusions

Shifting the realization of business processes from pure procedural and object-oriented coding to workflow-driven systems is certainly an attractive idea. Nevertheless, this new paradigm confronts software engineers with new challenges such as the need for dynamic adaptation of workflows without recompilation ([4]), a need that arises from the long-running characteristic of workflows and the ever increasing pace of business process and policy changes.

To assist in effective and flexible application of dynamic updates of various kinds, we created a generic instrumentation framework capable of applying instrumentations upon workflow instance creation and

during workflow instance execution. The former scenario typically applies to weaving aspects into workflows, while the latter can assist in production debugging and in adapting a running workflow to reflect business process changes. The samples discussed in this paper reflect the flexibility of the proposed instrumentation framework. In particular, the concept of suspension points, which allows external modifications to take place in a time-precise manner, opens up a great deal of dynamism and flexibility.

From a performance point of view, instrumentations taking place at workflow instance creation time and internal modifications are preferred over ad hoc updates applied on running workflow instances. Also, applying ad hoc updates is risky because of the unknown workflow instance state upon suspension. This limitation can be overcome by use of suspension points, but these shouldn't be overused.

7. Future work

One of the next goals is to make the instrumentation framework easier and safer by means of designer-based workflow adaptation support and instrumentation correctness validation, respectively. Workflow designer rehosting in WF seems an attractive candidate to realize the former goal but will require closer investigation.

Furthermore, our research focuses on the creation of an activity library for patient treatment management using workflow, in a data-driven manner. Based on composition of generic building blocks, workflow definitions are established to drive various processes applied in medical practice. Different blocks for data gathering, filtering, calculations, etc. will be designed to allow the creation of data pipelines in WF. Our final goal is to combine this generic data-driven workflow approach with the power of dynamic updates and online instrumentation. The need for dynamic adaptation in the health sector was pointed out in [5]. This introduces new challenges to validate type safety with respect to the data flowing through a workflow, under the circumstances of activity injection that touches the data. Tools to assist in this validation process will be required.

References

[1] D. Shukla and B. Schmidt, Essential Windows Workflow Foundation. Addison-Wesley Pearson Education, 2007.
[2] Windows SDK Documentation, MS Corp., Nov. 2006.
[3] M. Mezquita, "Performance Characteristics of WF," Microsoft Developer Network (MSDN), 2006.
[4] P. Buhler and J.M. Vidal, "Towards Adaptive Workflow Enactment Using Multiagent Systems," Information Technology and Management Journal, 6(1):61-87, 2005.
[5] J. Dallien, W. MacCaull, A. Tien, "Dynamic Workflow Verification for Health Care," 14th Int. Symposium on Formal Methods, August 2006.
