Process Spaces Management Systems
Hamid R. Motahari Nezhad1, Boualem Benatallah1, and Fabio Casati2
1 The University of New South Wales, Australia, hamidm|[email protected]
2 The University of Trento, Italy, [email protected]
Abstract. We introduce process spaces as a new abstraction for defining a process metaphor over the various (heterogeneous) information systems in an enterprise, which are not necessarily process-aware, and propose the design and development of process spaces management systems (PSMSs), which support the discovery, analysis, monitoring, querying, and exploration of process spaces. We also present our current research progress towards a PSMS: the design and development of a process spaces discovery system, which discovers the process space of an enterprise.
1 Introduction
Business processes are at the heart of what public and private enterprises do. For most companies, success is closely tied to how efficiently and effectively their processes are executed. Buzzwords such as process cockpit [17], business activity monitoring (BAM) [3] and business process intelligence [6] are becoming commonplace. Recently, the problem of understanding the behavior of information systems, as well as the processes and services they support, has become a priority in medium and large enterprises. This is demonstrated by the proliferation of tools for the analysis of process executions, service interactions, and service dependencies [9, 10, 15], and by recent research work in process data warehousing and process discovery [16, 18, 1, 12]. Indeed, the adoption of business intelligence techniques for business process improvement is the primary concern for medium and large companies [7]. Typical questions that business analysts, IT managers, developers and, in some cases, even end users would like to have answered in this setting are: Where are the bottlenecks in the purchasing process? What is the actual process typically followed for invoice payment? What is the status of purchase order number 325, who processed it, and how? How can all the information related to a specific purchase of a customer (order, invoice, payment, shipping, etc.) be found in the enterprise? (The latter is also referred to as enterprise search [8].)
While successful, existing business process management tools enable monitoring and analysis only of operational business processes, i.e., those that are explicitly defined and managed by a process-aware system, such as a workflow management system (WfMS) [1, 19]. However, in reality, only a fraction of process executions is supported by a WfMS, and a business process is typically implemented across several heterogeneous IT systems. Indeed, data related to the same process execution is spread across various information systems, possibly in different document formats and data representation models. Such challenges are also ubiquitous, i.e., they arise in large and small enterprises and in different domains, e.g., health, insurance, supply chains, and loan approval. However, in each of these scenarios there are identifiable data sources, data and control flows over them, and users in the enterprise, which together specify a space of processes in the enterprise corresponding to the views of different users and systems. Indeed, we can no longer impose a single definition of the process execution on the whole enterprise and on all users; rather, different users may have, or would like to see, different perspectives on process execution corresponding to their responsibilities in the enterprise. This is also captured by Gartner as the phenomenon of the “process of me” [5]. According to other Gartner reports, process analysis and monitoring in such environments is a major challenge that will play a vital role in the survival and competitiveness of process management system vendors in the near future [3, 4].
In this paper, we introduce process spaces as a new abstraction for capturing and defining the various process perspectives in the enterprise, and process views as a notion that allows one to look at process execution in the enterprise from a specific perspective. We propose the design and development of Process Spaces Management Systems (PSMSs) for process definition, management and analysis (browsing, searching, querying, tracking, etc.) in a process space. 
A PSMS makes it possible to analyze and tag apparently uncorrelated information items scattered across various IT systems in the enterprise so that they can be seen “as if” they were captured by a single business process management system. More precisely, it allows defining a holistic view of the process executions over various information systems and services, so that information in the enterprise can be interpreted in the context of process executions. In the remainder of the paper, Section 2 defines process spaces, introduces PSMSs, and outlines the functionalities that a PSMS should offer. We present our current research progress towards the development of a PSMS in Section 3. Finally, related work is presented in Section 4.
2 Process Spaces
A process view is the representation of the process execution in the enterprise from the perspective of a given user or system. The notion of process space refers to the superimposition of various process views over heterogeneous information systems and services in an enterprise. A process space is defined on top of information related to the execution of processes in an enterprise, possibly captured by or stored in various information systems, e.g., event logging systems, document management systems, e-mail systems, WfMS, ERP systems, etc. The definition of the process space over information sources requires the ability to:
Fig. 1. The logical components of a process space. Process views are defined over information items in a process space
– Map information items (events and data) in the various IT systems of the enterprise to progressions of a process, i.e., to the start/execution/completion of process tasks (e.g., define that a data entry in a certain SAP table corresponds to the start of the supplier evaluation phase). We refer to an information item mapped to the execution of a process as a process item;
– Correlate process items in the different IT systems of the company to process instances, that is, understand which process items correspond to the same execution (instance) of a process (e.g., be able to detect that a data entry in SAP and a message sent over an enterprise service bus are two items related to the same purchase order no. 325);
– Define, or discover from a set of process instances, a model of the process to be analyzed, used as a reference for asking queries such as those mentioned in the introduction (e.g., model the purchase order process at the appropriate level of abstraction).
In a process space, different correlations can be defined over the same set of items, as different analysts may be interested in different views over such items. For example, the shipments of a set of goods may be related to the same process from the view of the warehouse manager, but if the goods are the results of different orders, they are unrelated from the view of the sales manager. Therefore, we model a process view as the set of “process items” which are grouped into “process instances” using a given correlation, together with the corresponding “process model” at the right level of abstraction (see Figure 1). A process view may be nested (e.g., the process view of the purchase order management system is nested within that of the whole enterprise, where it is considered a sub-process). This allows looking at the process space at various levels of abstraction and granularity, from the high level (system level) down to the details of process executions. A process space is then modeled as a set of process views defined on top of enterprise resources. Figure 1 shows the logical components of a process space, and in particular the process views defined starting from the information items.
We propose the development of process space management systems (PSMSs) for the definition and management of process spaces in an enterprise. We envision that a PSMS offers the following typical categories of functionalities: process space discovery, process space analysis and management, and end-user tools for process space exploration and visualization.
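The notion of a process view introduced in this section can be illustrated with a minimal Python sketch. It is not part of any PSMS implementation: the class names `ProcessItem` and `ProcessView`, the field names, and the sample data are all hypothetical, and the process model component of a view is omitted. The sketch only shows how two views apply different correlations to the same process items, as in the warehouse manager and sales manager example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Hashable


@dataclass(frozen=True)
class ProcessItem:
    """An information item mapped to a process task (all fields are illustrative)."""
    source: str       # IT system that holds the information item
    task: str         # process task the item is mapped to
    shipment_id: str
    order_id: str


@dataclass
class ProcessView:
    """A named view: process items grouped into instances by a correlation."""
    name: str
    correlate: Callable[[ProcessItem], Hashable]  # yields the instance identifier

    def instances(self, items):
        # Items that share a correlation value belong to the same process instance.
        groups = defaultdict(list)
        for item in items:
            groups[self.correlate(item)].append(item)
        return dict(groups)


# Two goods shipped together, but resulting from different orders.
items = [
    ProcessItem("warehouse_db", "ship_goods", shipment_id="s1", order_id="o1"),
    ProcessItem("warehouse_db", "ship_goods", shipment_id="s1", order_id="o2"),
]

warehouse = ProcessView("warehouse manager", lambda item: item.shipment_id)
sales = ProcessView("sales manager", lambda item: item.order_id)

print(len(warehouse.instances(items)))  # one instance: the items share a shipment
print(len(sales.instances(items)))      # two instances: the items stem from different orders
```

A process space would then be a collection of such views over the same underlying items; nesting of views and the per-view process model are left out of this sketch.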
3 Process Spaces Discovery System
The first step towards the development of a PSMS is to provide approaches and tools for defining the process space, that is, for identifying process-related data sources and the process views on top of them. Process views can be defined manually by human users or discovered automatically from the data sources. Tools such as HP Business Process Insight allow manual definition of process models by human users; however, they provide very limited support for mapping and correlating information in an enterprise. In the following, we focus on the development of a system for the automated discovery of process spaces, which we refer to as a process spaces discovery system. To this end, we have tackled the problem using a “diagonal” approach, focusing on the analysis of Web service interactions in the enterprise. In particular, we have proposed the Process Spaceship system for the discovery of process spaces [14]. We take the log of Web service interactions in the enterprise as the input. We propose an approach to correlate messages related to the same conversation between Web services (a conversation is a sequence of messages exchanged to fulfill a certain functionality, e.g., to order goods and pay for them). In Process Spaceship, we consider the arrival of the input message of an operation as the start of the process task corresponding to the operation invocation, and its output message as the completion of the task. The process spaces discovery approach consists of three steps [14]: message correlation, process model inference, and visual exploration and refinement, discussed in the following.
Correlation. The goal of this step is to find which messages belong to the same process execution. In the most common approach, two messages are correlated if they share the same value for a key attribute (called key-based correlation), or if an attribute, called a reference, of the second message in chronological order has the same value as an attribute of the first message (called reference-based correlation) [14]. For instance, if two messages Quote and PO carry the same value for attribute qID, then the key-based correlation condition is Quote.qID = PO.qID. Similar to the concept of composite keys in relational databases, more than one attribute of a given message may be used to correlate it to another message, e.g., both customerID and orderID. In some other cases, different message pairs in the same process instance may use different correlation conditions.
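The two kinds of correlation conditions described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation of [14]: messages are modeled as dictionaries, the timestamp field `ts` and the attribute names `oID` and `orderRef` are assumptions introduced for the example, and instances are formed as the connected groups of messages linked by the condition.

```python
def key_based(m1, m2, key):
    """Key-based correlation: both messages carry the same value for `key`."""
    return key in m1 and key in m2 and m1[key] == m2[key]


def reference_based(first, second, attr, ref):
    """Reference-based correlation: `ref` of the later message equals `attr` of the earlier one."""
    return (first["ts"] <= second["ts"]
            and attr in first and ref in second
            and first[attr] == second[ref])


def correlate(messages, condition):
    """Group messages into instances: messages linked by `condition` end up together."""
    ordered = sorted(messages, key=lambda m: m["ts"])
    parent = list(range(len(ordered)))  # union-find forest over message positions

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(ordered)):
        for j in range(i + 1, len(ordered)):
            if condition(ordered[i], ordered[j]):
                parent[find(j)] = find(i)

    instances = {}
    for i, message in enumerate(ordered):
        instances.setdefault(find(i), []).append(message["type"])
    return list(instances.values())


# Hypothetical log: Quote and PO share qID; Inv refers to the PO through orderRef.
log = [
    {"type": "Quote", "ts": 1, "qID": "q1"},
    {"type": "Quote", "ts": 2, "qID": "q2"},
    {"type": "PO", "ts": 3, "qID": "q1", "oID": "o1"},
    {"type": "PO", "ts": 4, "qID": "q2", "oID": "o2"},
    {"type": "Inv", "ts": 5, "orderRef": "o1"},
]


def condition(a, b):
    # Different message pairs use different conditions within the same instance.
    return key_based(a, b, "qID") or reference_based(a, b, "oID", "orderRef")


print(correlate(log, condition))
# [['Quote', 'PO', 'Inv'], ['Quote', 'PO']]
```

The pairwise comparison is quadratic in the number of messages, which is acceptable for an illustration but would not scale to a real service log.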
[Figure 2: SCM process map. Node labels in the map include SCM, Retailer, OS, PS and CRS; links between nodes are labeled Subsumed-by or Is-part-of. Legend: OS: PurchaseOrder System; PS: Payment System; CRS: Customer Relationship System. Messages and the operations they invoke: Quote → OS:getQuote; PO → OS:submitPO; Inv → PS:sendInv; Pay → PS:makePayment; NP → CRS:NewProductInfo; SR → CRS:SurveyResult.]
Fig. 2. Part of the process map of a supply chain management scenario
Discovering correlation conditions presents several challenges. A first challenge lies in the large space of possible correlation conditions that can be built from combinations of message attributes. Another challenge is assessing the interestingness of the views that result from using a given condition. In [14], we have proposed an efficient algorithm for discovering interesting conditions. We adopt a level-wise approach [11] to explore the space of possible conditions, and employ a set of heuristics (e.g., heuristics on the number of messages in a process instance, and the number of instances in the log) to identify the interestingness of process views.
Process model inference. Applying a given correlation condition to the messages in the log groups the messages into conversations. Once we have a set of instances, we can derive the process model that these instances obey. We use the techniques proposed in [13] for this purpose. However, other process model inference algorithms can be used as well.
Visual exploration. Each correlation condition, and consequently each grouping of messages into instances, corresponds to a process view. There are many possible views, as there are many possible correlation conditions. To facilitate exploration and refinement of results, we organize the discovered process views in a process map. Nodes in the map represent process views. The links between nodes represent the relationships between the process models of views. The map is arranged according to the level of granularity of process models, with views that have highly correlated instances (e.g., quote and purchase order in Figure 2) at the lower levels, and those with large and possibly more loosely coupled instances (e.g., including all messages related to a purchase order and to its payment, or all messages related to the interaction with the customer relationship service) at the higher levels. The process views provide an index of the data in the log, and allow one to focus the analysis on a given view's process model and data.
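To make the level-wise exploration of correlation conditions more concrete, the following Python sketch searches over sets of key attributes, level by level, and keeps those that yield "interesting" groupings. It is a simplified illustration under stated assumptions, not the algorithm of [14]: it covers only key-based conditions on attributes with the same name, the function names are hypothetical, and the two heuristics with their thresholds (`min_len`, `max_share`) are invented for the example. Using the interestingness test to prune candidates is itself a heuristic choice here.

```python
from collections import defaultdict
from itertools import combinations


def group_by_key(messages, attrs):
    """Group messages that agree on all attributes in `attrs` (composite key)."""
    groups = defaultdict(list)
    for m in messages:
        if all(a in m for a in attrs):
            groups[tuple(m[a] for a in attrs)].append(m)
    return list(groups.values())


def is_interesting(instances, n_messages, min_len=2, max_share=0.8):
    """Hypothetical heuristics on instance length and on the size of the largest instance."""
    if not any(len(inst) >= min_len for inst in instances):
        return False  # only single-message instances: the key looks like a unique identifier
    if max(len(inst) for inst in instances) > max_share * n_messages:
        return False  # one instance swallows the log: the key looks like a constant
    return True


def levelwise_search(messages, attributes, max_level=2):
    """Explore attribute sets of size 1, 2, ... and keep the interesting ones."""
    n = len(messages)
    frontier = [(a,) for a in attributes
                if is_interesting(group_by_key(messages, (a,)), n)]
    interesting, level = [], 1
    while frontier and level <= max_level:
        interesting.extend(frontier)
        survivors = set(frontier)
        candidates = set()
        for x, y in combinations(frontier, 2):
            merged = tuple(sorted(set(x) | set(y)))
            # Keep a candidate only if every subset one level down survived.
            if len(merged) == level + 1 and all(
                    sub in survivors for sub in combinations(merged, level)):
                candidates.add(merged)
        frontier = sorted(c for c in candidates
                          if is_interesting(group_by_key(messages, c), n))
        level += 1
    return interesting


# Hypothetical log: msgID is unique per message, qID identifies a quote, custID a customer.
messages = [
    {"type": "Quote", "msgID": 1, "qID": "q1", "custID": "c1"},
    {"type": "PO", "msgID": 2, "qID": "q1", "custID": "c1"},
    {"type": "Quote", "msgID": 3, "qID": "q2", "custID": "c1"},
    {"type": "PO", "msgID": 4, "qID": "q2", "custID": "c1"},
    {"type": "Quote", "msgID": 5, "qID": "q3", "custID": "c2"},
    {"type": "PO", "msgID": 6, "qID": "q3", "custID": "c2"},
]

print(levelwise_search(messages, ["msgID", "qID", "custID"]))
# [('qID',), ('custID',), ('custID', 'qID')]
```

Each attribute set returned corresponds to one candidate process view: grouping by `qID` gives fine-grained quote-and-order instances, while grouping by `custID` gives larger, more loosely coupled instances per customer.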
Fig. 3. A Screenshot of Process Spaceship
Figure 3 shows a screenshot of Process Spaceship illustrating the process map discovered for services in a supply chain management scenario. A process space administrator can use this tool to discover the process space from the log of interactions of systems and services in the enterprise. We have evaluated Process Spaceship on both synthetic and real-world service logs; the results demonstrate the viability (in terms of precision and recall) and efficiency (in terms of search space pruning and execution time) of the approach (see [14] for more details). We believe that integrating this tool with available process analysis and tracking tools (e.g., HP Business Process Insight) greatly simplifies the job of end users, since the map and the process-view-centric perspective make it easy to locate the desired views for subsequent querying, analysis, monitoring and tracking tasks.
4 Related Work and Discussion
Process spaces and process space management systems (PSMSs) open a new era of process definition, analysis and management for today's enterprises. Research in this area is just beginning, and this paper is the first to introduce these concepts and the need for such systems. Here, we briefly contrast PSMSs with their traditional counterparts, i.e., workflow management systems (WfMSs), business process management systems (BPMSs), and business activity monitoring (BAM).
Workflow and Business Process Management. WfMSs support the definition, development, execution, and maintenance of business processes. BPMSs, as an extension of classical WfMSs, focus on analysis, prediction, and tracking of business processes. WfMSs and BPMSs cover only operational business processes, i.e., the ones that are explicitly designed and modeled [19]. In contrast, process spaces aim at supporting a variety of processes, regardless of whether process models are defined or whether the interaction is supported by a WfMS. In fact, the process space aims to identify process-related information and to provide an opportunity to define processes explicitly. An additional important difference is that a WfMS assumes that the correlation of events into process instances is predefined, whereas in a process spaces discovery system we propose to discover these correlations from the information in the enterprise. Another key feature that distinguishes process definition and management in a process space from that in a BPMS or WfMS is that a BPMS (WfMS) assumes full control of the systems (data sources) that execute the underlying process. In process spaces, however, we recognize the independence of the systems executing the process, and we intend mainly to provide an understanding of the process execution carried out by the aggregate of existing systems and services.
Business Activity Monitoring (BAM) and Business Intelligence (BI) tools. BI tools [6] allow collecting information from various data sources in the enterprise, e.g., using ETL-based tools, analyzing it, and reporting the results in terms of KPIs (key performance indicators). BAM tools, on the other hand, play a similar role but process real-time events. These approaches take advantage of event processing systems, e.g., [2]. However, one limitation of such tools is that they mainly provide data-level rather than process-level business analysis. Another limitation, as also identified by Gartner [3], lies in enabling BAM on information generated by several sources in the enterprise, which requires correlating information into process instances to enable analysis in the context of process executions. The process spaces discovery system proposed in this paper complements BI/BAM tools: they can build on top of it to enable various analyses in the context of process executions in such environments.
References
1. Fabio Casati, Malú Castellanos, Umeshwar Dayal, and Norman Salazar. A generic solution for warehousing business process data. In VLDB, pages 1128–1137, 2007.
2. Alan J. Demers, Johannes Gehrke, Biswanath Panda, Mirek Riedewald, Varun Sharma, and Walker M. White. Cayuga: A general purpose event monitoring system. pages 412–422, 2007.
3. Gartner. Business Activity Monitoring: Calm Before the Storm. ID Number: LE15-9727, http://www.gartner.com/resources/105500/105562/105562.pdf, April 2002.
4. Gartner. Gartner's Application Development and Maintenance Research. Note M-16-8153, The BPA Market Catches Another Major Updraft. www.gartner.com, 2002.
5. Yvonne Genovese, Jeff Comport, and Simon Hayward. Person-to-process interaction emerges as the 'process of me'. www.gartner.com/DisplayDocument?ref=g_search&id=492389, 2006.
6. Daniela Grigori, Fabio Casati, Malú Castellanos, Umeshwar Dayal, Mehmet Sayal, and Ming-Chien Shan. Business process intelligence. Comput. Ind., 53(3):321–343, 2004.
7. Gartner Group. Gartner EXP report. www.gartner.com/press_releases/asset_143678_11.html, 2006.
8. Alon Y. Halevy et al. Enterprise information integration: successes, challenges and controversies. In SIGMOD Conference, 2005.
9. HP. HP OpenView solutions. www.managementsoftware.hp.com.
10. IBM. FileNet enterprise content management solutions. www.filenet.com.
11. Heikki Mannila and Hannu Toivonen. Levelwise search and borders of theories in knowledge discovery. Data Min. Knowl. Discov., 1(3):241–258, 1997.
12. Hamid Motahari et al. ServiceMosaic: Interactive analysis and manipulation of service conversations. In ICDE, 2007.
13. Hamid Motahari, Regis Saint-Paul, Boualem Benatallah, and Fabio Casati. Protocol discovery from web service interaction logs. In ICDE, 2007.
14. Hamid Motahari, Regis Saint-Paul, Boualem Benatallah, Fabio Casati, and Periklis Andritsos. Process spaceship: Discovering process views in process spaces. Technical Report UNSW-CSE-TR-0721, The University of New South Wales, Australia, 2007.
15. Oracle. Oracle BPEL Process Manager. www.oracle.com/technology/bpel.
16. W. Pauw et al. Web services navigator: Visualizing the execution of web services. IBM Syst. J., 44(4):821–845, 2005.
17. Mehmet Sayal, Fabio Casati, Umeshwar Dayal, and Ming-Chien Shan. Business process cockpit. In VLDB, 2002.
18. W. van der Aalst et al. Workflow mining: a survey of issues and approaches. DKE Journal, 47(2):237–267, 2003.
19. Wil M. P. van der Aalst, Arthur H. M. ter Hofstede, and Mathias Weske. Business process management: A survey. In Proc. Int'l Conf. Business Process Management (BPM 2003), pages 1–12, 2003.
Biography
Hamid R. Motahari Nezhad is a PhD student in the School of Computer Science and Engineering, the University of New South Wales, Australia. His research interests include business process management, service oriented computing, business intelligence, and process discovery from process and service logs. He is a student member of the IEEE and the Australian Computer Society.
Boualem Benatallah is a Professor at the University of New South Wales, Australia, where he is the founder and leader of the Service Oriented Computing Group. His research interests lie in the areas of Web service protocol analysis and management, enterprise services integration, process modeling, and service oriented architectures for pervasive computing.
Fabio Casati is a Professor of computer science at the University of Trento, Italy. Previously, he was with Hewlett-Packard in Palo Alto, California. His research follows three main directions. The first is middleware for integration and the analysis of middleware data to improve integration. The second is bringing and extending traditional integration technologies to all enterprise data and to the Web. The third is improving how scientists produce, disseminate, evaluate, and consume scientific knowledge.