Process Mining Functional and Structural Validation
Maria Laura SEBU, Horia CIOCÂRLIE
Computer and Software Engineering Department, Politehnica University of Timisoara, Timisoara, Romania
Abstract. The current study proposes solutions for functional and structural validation of business process models extracted by mining an event log dataset with several process mining algorithms. Structural validation (verification) assesses the quality of the business processes by using conformance analysis techniques and computed statistical results. Cross-validation is also presented as a methodology for the structural evaluation of business processes. Furthermore, we propose extending the verification of process models with functional validation, with the aim of aligning business processes with business objectives. Functional validation starts with the definition of process requirements, continues with splitting the requirements into clear use cases, and ends with generating event log data that captures the use case functionality. Functional validation is applied on real event log data generated during one software release in the automotive industry, in the tools development area. Structural and functional validation techniques are captured in a proposal for a framework.
Keywords: Process mining, Functional validation, Conformance analysis, Process discovery
1. Introduction
Business Process Management (BPM) includes a set of techniques and methodologies designed to produce better processes. Business processes, the foundation of any organization, comprise all the activities executed inside the organization with the aim of producing business results. BPM takes independent processes and transforms them into flexible, orchestrated business services that work together to create substantial business value.
Process analysis, the first important phase of BPM, includes the discovery of business processes. Once the business process is available, further analysis can be performed with the aim of redesigning it by correcting possible flaws and enhancing the process models. Identifying the most time-consuming activities, bottlenecks, rework and obsolete activities are possible actions for improving business processes and increasing the quality of the business results. After the improvements are put into practice, the organization can measure the progress of the proposed enhancements in the monitoring phase. Management of improvements and full process automation represent the last phases of the BPM lifecycle, as shown in Fig. 1.
Fig. 1. Business process management phases
As several process models could be available, such as the ones retrieved in the analysis phase or the ones resulting from corrections and enhancements, identifying the one most suitable for accomplishing business objectives becomes a priority. We propose an approach for validating the most suitable process model from a structural perspective (how compliant the model is with the event log dataset) and from a functional perspective (black-box validation of whether the business expectations are met).
A good BPM solution put in place inside an organization offers functionality to reduce costs by managing processes automatically. BPM allows the process analyst to correctly model and iteratively improve the business process. The resulting systems built on business process models are called process-aware information systems (PAIS). A PAIS can be defined as a software system that manages and executes operational processes involving people, applications and information sources based on process models [1]. A PAIS is based on abstractions of process definitions; execution is automated and directed by process models. If process models are not available as abstractions, process mining techniques can be used to extract them.
Process mining includes techniques and tools for discovering, monitoring and enhancing process models used inside an organization [2]. Conceptually, process mining is part of the process modeling and analysis area and uses data mining to obtain results. Data recorded during the execution of a process is used to extract knowledge: real process models, organizational structure and additional information about internal processes. Complex organizations managing complex business processes are ideal candidates for process mining techniques. Due to the complexity of the results, validation of the resulting process models is of critical importance. It increases confidence in the resulting process models and can help identify the areas that need correction or redesign.
This theoretical approach to validation in PAIS is proposed for implementation in a Functional and Structural Validation Framework described in Ch. 4. The approach is illustrated on a software development process, the Change Control Board process model used in several organizations and described and analyzed in [3]. The Change Control Board (CCB) is the component of Software Configuration Management (SCM) in charge of handling changes of different types required by users (change requests, feature requests, problem reports and information requests). For exemplification, process mining discovery algorithms are applied on a real event log dataset created during one release of a software product in the automotive industry.
Fig. 2. Change control board process
The current study uses implementations of process mining tools and methods available in the ProM Framework 6.3, an academic project of Eindhoven University of Technology. ProM provides a wide variety of algorithms and supports process mining in the broadest sense [4]. It can be used to discover processes, identify bottlenecks, analyze social networks, and verify business rules.
2. Process Mining
PAIS are built on process models. The process mining area addresses the problem of unavailable process models. Data available from process tracking systems is used to extract knowledge: real process models, organizational structure and additional information about internal processes. Process mining can be split into 3 major phases [2], summarized in Fig. 3 [7].
Fig. 3. Process mining phases
The first phase, data preparation, includes the actions performed to prepare the input for process mining algorithms. Input data could be provided by different sources in different formats. In this phase, the data must be transformed into the Extensible Event Stream (XES) format recognized by the available process mining implementations. XES is flexible, being able to capture event log data from any background; it provides a simple way of representing the information; and it is transparent, intuitive and extensible for specific domains. Due to these advantages, XES has become the standard for representing process mining event log data [6].
Once the input data is available, the next step is defining the objectives of the process mining analysis: discovery of the business process, analysis of the business process, performance improvement, identification of bottlenecks, identification of the most time-consuming activities. Once the objectives are stated, the dataset can be adapted to increase the visibility of the targets. Filtering on specific attribute values is one technique used for this purpose.
As soon as the event log dataset is prepared and the targets for the analysis are identified, the process mining algorithms are applied on the input data in the second phase, pattern discovery. Several mining algorithms for extracting process models are currently available in different process mining implementations. Assessing the quality of the obtained process models from a functional and structural perspective is the subject of the current study.
Once the most suitable process model is available, the results can be analyzed further. This includes checking the statistical results for calculating process performance indicators, verifying the conformance of the resulting process model, retrieving other relevant information internal to the organization, and identifying bottlenecks.
Process mining operates with the following concepts [7]:
- A process is an abstract representation of a set of activities logically ordered with the aim of accomplishing a business objective
- An event corresponds to an activity executed in the process and is composed of attributes with specific values
- Events are linked together in cases
The event log dataset represents the starting point for mining from different perspectives [2]:
- Process perspective, focused on identifying all possible paths an item could follow to reach a closed state
- Organizational perspective, focused on checking which resources are involved in the activities and how they interact
- Case perspective, focused on analyzing specific behavior due to specific values of the attributes
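The concepts above and the data preparation step can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the ProM import pipeline: the sample records, the function names and the choice of attributes are hypothetical, while the element names (log, trace, event) and the attribute keys concept:name, org:resource and time:timestamp follow the XES standard [6].

```python
from collections import defaultdict
from datetime import datetime
import xml.etree.ElementTree as ET

# Hypothetical flat event records: (case id, activity, resource, timestamp).
EVENTS = [
    ("item-1", "Create", "anna", "2014-01-06T09:00:00+00:00"),
    ("item-1", "Open", "bob", "2014-01-06T10:30:00+00:00"),
    ("item-2", "Create", "anna", "2014-01-07T08:15:00+00:00"),
    ("item-1", "Analyze", "bob", "2014-01-08T14:00:00+00:00"),
]


def group_into_cases(events):
    """Link events into cases and order each case by timestamp."""
    cases = defaultdict(list)
    for case_id, activity, resource, ts in events:
        cases[case_id].append((datetime.fromisoformat(ts), activity, resource))
    return {case_id: sorted(evts) for case_id, evts in cases.items()}


def filter_cases(cases, activity):
    """Keep only the cases in which the given activity occurs (attribute filtering)."""
    return {
        case_id: evts
        for case_id, evts in cases.items()
        if any(act == activity for _, act, _ in evts)
    }


def to_xes(cases):
    """Serialize cases as a minimal XES document (one trace per case)."""
    log = ET.Element("log", {"xes.version": "1.0"})
    for case_id, evts in cases.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", key="concept:name", value=case_id)
        for ts, activity, resource in evts:
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string", key="concept:name", value=activity)
            ET.SubElement(event, "string", key="org:resource", value=resource)
            ET.SubElement(event, "date", key="time:timestamp", value=ts.isoformat())
    return log


if __name__ == "__main__":
    xes_log = to_xes(group_into_cases(EVENTS))
    print(ET.tostring(xes_log, encoding="unicode"))
```

The activity name feeds the process perspective, the resource feeds the organizational perspective, and any further attributes added to an event would serve the case perspective.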
3. Business Process Structural Validation
Process mining analyzes event logs generated during the execution of a process and covering the process lifecycle. The aim is to discover real process models. Even if in some cases a reference process model is available, as is the case for the one used for illustration in this study, in most cases where process mining is applied the process model is the output of process discovery algorithms applied on event log data. A brief comparison of process mining algorithms is presented below.
The α-algorithm is the best-known process mining algorithm and one of the first process discovery algorithms able to deal with concurrency. However, it is not the most suitable choice for real-life data that contains noise, has varying granularity and comes from different sources. The Heuristic Miner algorithm was proposed after the α-algorithm and overcomes its limitations: it can also abstract from exceptional behavior and noise, and it is suitable for real log data. The Fuzzy Miner algorithm belongs to the new generation of process mining algorithms; as an improvement, it is able to deal with event log data of very high granularity.
As many process mining algorithms are available in academic and commercial implementations, assessing the quality of the obtained process model is an important step in identifying the most suitable business process model. For this purpose, the resulting process models are analyzed by applying conformance checking techniques on the event log dataset. The event log is replayed on the process model and several useful metrics are calculated [2]. Fitness gives the percentage of the log traces that can be replayed by the process model from beginning to end; it can be calculated at case level and at event level. Precision (Behavioral Appropriateness) describes how accurately the model describes the process (the model should not be too generic). Structure (Structural Appropriateness) captures the level of detail of the process model; the simplest model capturing all possible execution paths represents the best process model choice.
Another useful approach for assessing the structural quality of process models is a common technique in data mining: separating the data into a training dataset and a test dataset. The training set is used as input for the process discovery algorithms. The test set, together with the process model discovered from the training set, is then used as input for the conformance analysis. This approach can be extended to k-fold cross-validation: the data is partitioned into k subsets and, in each of k rounds, process discovery is applied on k-1 subsets and the resulting model is validated on the remaining subset. A description of this technique applied in process mining can be found in [5].
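The replay-based fitness metric and the k-fold procedure can be sketched in a few lines of Python. This is a deliberately simplified illustration: instead of the α-algorithm or the Heuristic Miner, the "model" here is a plain directly-follows relation (start activities, end activities and allowed successor pairs), and only case-level fitness is computed. The function names and the sample log are assumptions made for the example.

```python
def discover_dfg(traces):
    """Toy discovery: collect start activities, end activities and directly-follows pairs."""
    starts, ends, follows = set(), set(), set()
    for trace in traces:
        if not trace:
            continue
        starts.add(trace[0])
        ends.add(trace[-1])
        follows.update(zip(trace, trace[1:]))
    return starts, ends, follows


def replays(model, trace):
    """True if the trace can be replayed on the model from beginning to end."""
    starts, ends, follows = model
    if not trace:
        return False
    return (
        trace[0] in starts
        and trace[-1] in ends
        and all(pair in follows for pair in zip(trace, trace[1:]))
    )


def trace_fitness(model, traces):
    """Case-level fitness: fraction of traces that replay completely."""
    if not traces:
        return 0.0
    return sum(replays(model, t) for t in traces) / len(traces)


def k_fold_fitness(traces, k):
    """Discover on k-1 folds, measure fitness on the held-out fold, for each fold."""
    folds = [traces[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [t for j, fold in enumerate(folds) if j != i for t in fold]
        scores.append(trace_fitness(discover_dfg(train), test))
    return scores


# Hypothetical log; activity names are illustrative only.
LOG = [
    ("Create", "Open", "Analyze", "Implement", "Test", "Release"),
    ("Create", "Open", "Analyze", "Cancel"),
    ("Create", "Open", "Analyze", "Implement", "Test", "Release"),
    ("Create", "Open", "Analyze", "Cancel"),
    ("Create", "Open", "Implement", "Test", "Release"),
    ("Create", "Open", "Analyze", "Implement", "Test", "Release"),
]

if __name__ == "__main__":
    print(k_fold_fitness(LOG, k=3))
```

In the second round the only trace that skips the analysis step ends up in the test fold, so the model discovered from the other folds cannot replay it and the fold fitness drops. Precision and structure would require additional measures that this sketch does not attempt.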
4. Business Process Functional Validation
In software development, functional validation is the testing process in which software is tested to ensure that it conforms to all requirements. Functional testing of a business process implies checking whether the process model is compliant with the process requirements representing business objectives. Business process functional testing involves evaluating and comparing each process with the business requirements.
The first step is determining the business process expectations. Our proposal is to capture the expectations in use cases with step-by-step scenarios. The use cases can be defined from 3 different perspectives: process perspective, case perspective and organizational perspective. The second step is the creation of test data based on the use case definitions. Once the test dataset is available, the next step is checking the conformance of the obtained process models with the test data generated in the previous step. The output of the conformance analysis step is considered the result of the test. In the last phase, several testing metrics can be calculated for assessing the functional validity of the process models.
Fig. 4. Process mining functional validation framework.
5. Functional Validation on CCB process model
The theoretical approach described in the previous chapter is exemplified on a real event log dataset. The base business process is the one described in Ch. 1. The CCB business process is followed to implement a set of changes proposed by users and include them in deliveries. The recorded event logs are exported from IMS in csv format:
- users.csv, containing all users having permissions to access and update the items in IMS
- items.csv, containing all requirements (change requests, feature requests, problem reports, information requests)
- events.csv, containing all operations performed by the users to update the state of the items
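As a hedged illustration of how such exports could be combined before conversion to XES, the sketch below joins event rows with user rows so that every event carries both the process perspective (state change) and the organizational perspective (user group). The column names (item_id, user, new_state, timestamp, group) and the sample rows are hypothetical; the actual layout of the IMS export is not given in this study.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpts standing in for users.csv and events.csv.
USERS_CSV = """user,group
anna,Support
bob,Engineering Team
"""

EVENTS_CSV = """item_id,user,new_state,timestamp
42,anna,Created,2014-01-06T09:00:00
42,bob,Opened,2014-01-06T10:30:00
43,anna,Created,2014-01-07T08:15:00
42,bob,Analyzed,2014-01-08T14:00:00
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def build_traces(events_text, users_text):
    """Group events per item, ordered by time, and attach the group of each user."""
    group_of = {row["user"]: row["group"] for row in read_rows(users_text)}
    traces = defaultdict(list)
    # ISO 8601 timestamps in a uniform format sort correctly as plain strings.
    for row in sorted(read_rows(events_text), key=lambda r: r["timestamp"]):
        group = group_of.get(row["user"], "unknown")
        traces[row["item_id"]].append((row["new_state"], row["user"], group))
    return dict(traces)


if __name__ == "__main__":
    for item_id, trace in build_traces(EVENTS_CSV, USERS_CSV).items():
        print(item_id, trace)
```

With real files, the two string constants would be replaced by the contents of the exported files; items.csv could be joined in the same way to add case attributes such as the requirement type.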
Once the data is prepared in XES format, as the process mining algorithms require, the process models are extracted from the event log data. The algorithms chosen are the Heuristic Miner algorithm, as it is the most suitable for real-life data such as the data in the current study, and the α-algorithm, as it is the most widely known process mining algorithm.
Fig. 5. CCB process model extracted with Heuristic Miner algorithm.
Fig. 6. CCB process model extracted with α Miner algorithm.
The first step in the functional validation of the obtained process models is the definition of the business expectations from the process model, captured in use cases.
Table 1. Use case definition
Use Case No | Description
Use Case 1 | The items could be created by support team, opened, analyzed, implemented, tested and released by engineering team and accepted by support team
Use Case 2 | The items could be created by support team, opened, analyzed and cancelled by engineering team and accepted by support team
Use Case 3 | Items could be created by support team, opened, implemented, tested and released by engineering team
Use Case 4 | Items should always be tested after implementation
Use Case 5 | Items should always be cancelled only after an analysis is performed
Use Case 6 | Items should always be created by users within group “Support”
Use Case 7 | Items should always be implemented by users within Group “Engineering Team”
Use Case 8 | Items should always be analyzed after delivery
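Rule-like use cases such as Use Cases 4 to 7 can be made machine-checkable as predicates over traces, which may help when labelling generated test traces as expected to be accepted or rejected. The sketch below is one possible encoding under stated assumptions: a trace is a list of (activity, group) pairs, the activity labels (Create, Analyze, Implement, Test, Cancel) are illustrative, "tested after implementation" is read as "eventually followed by a test", and Use Case 8 is left out.

```python
def always_followed_by(trace, first, then):
    """Every occurrence of `first` is eventually followed by `then`."""
    acts = [activity for activity, _ in trace]
    return all(then in acts[i + 1:] for i, a in enumerate(acts) if a == first)


def only_preceded_by(trace, activity, required_before):
    """Every occurrence of `activity` has `required_before` somewhere earlier."""
    acts = [a for a, _ in trace]
    return all(required_before in acts[:i] for i, a in enumerate(acts) if a == activity)


def performed_only_by(trace, activity, group):
    """Every occurrence of `activity` is executed by a user of `group`."""
    return all(g == group for a, g in trace if a == activity)


# Hypothetical encoding of Use Cases 4 to 7 from Table 1.
RULES = {
    "Use Case 4": lambda t: always_followed_by(t, "Implement", "Test"),
    "Use Case 5": lambda t: only_preceded_by(t, "Cancel", "Analyze"),
    "Use Case 6": lambda t: performed_only_by(t, "Create", "Support"),
    "Use Case 7": lambda t: performed_only_by(t, "Implement", "Engineering Team"),
}


def violated_rules(trace):
    """Return the names of the use cases that the trace violates."""
    return [name for name, rule in RULES.items() if not rule(trace)]


if __name__ == "__main__":
    bad_trace = [
        ("Create", "Engineering Team"),
        ("Open", "Engineering Team"),
        ("Cancel", "Engineering Team"),
    ]
    print(violated_rules(bad_trace))
```

A rule holds vacuously when the activity it constrains never occurs in the trace, which is why the example trace violates only the cancellation and creation rules.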
Each of the use cases above can be split into one or more test scenarios. For each scenario, a dataset is created. The complete dataset and the process models obtained with different process mining algorithms are the input for a conformance checking phase. The compliance of each process model with the test dataset is the factor for deciding which of the business processes is the most suitable for accomplishing the business objectives and should be used in practice and for further analysis. The use case approach is an efficient and effective technique for collecting essential requirements from stakeholders, helping to focus on the real needs. It helps business analysts and project teams arrive at a common, shared vision of what the process should do. Fig. 7 captures a global view of the complete validation flow.
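The decision step described above can be sketched as a small comparison of candidate models against test scenarios. Everything in this example is hypothetical: the two models are hand-written directly-follows relations rather than the models of Fig. 5 and Fig. 6, the scenarios are one possible reading of Use Cases 1 to 4, and the pass rate is just one simple candidate for a functional testing metric.

```python
def pairs(trace):
    """Directly-follows pairs of a trace."""
    return set(zip(trace, trace[1:]))


def accepts(model, trace):
    """True if the trace can be replayed on the model from start to end."""
    starts, ends, follows = model
    return (
        bool(trace)
        and trace[0] in starts
        and trace[-1] in ends
        and pairs(trace) <= follows
    )


UC1 = ("Create", "Open", "Analyze", "Implement", "Test", "Release", "Accept")
UC2 = ("Create", "Open", "Analyze", "Cancel", "Accept")
UC3 = ("Create", "Open", "Implement", "Test", "Release")

# (scenario name, test trace, whether the model is expected to accept it)
SCENARIOS = [
    ("Use Case 1", UC1, True),
    ("Use Case 2", UC2, True),
    ("Use Case 3", UC3, True),
    ("Use Case 4 (negative)", ("Create", "Open", "Implement", "Release"), False),
]

# Two hypothetical candidate models: (start activities, end activities, follows pairs).
MODELS = {
    "model_a": ({"Create"}, {"Accept", "Release"}, pairs(UC1) | pairs(UC2) | pairs(UC3)),
    "model_b": ({"Create"}, {"Accept"}, pairs(UC1) | pairs(UC2)),
}


def pass_rate(model, scenarios):
    """Fraction of scenarios whose replay result matches the expectation."""
    passed = sum(accepts(model, trace) == expected for _, trace, expected in scenarios)
    return passed / len(scenarios)


def most_suitable(models, scenarios):
    """Name of the model with the highest functional pass rate."""
    return max(models, key=lambda name: pass_rate(models[name], scenarios))


if __name__ == "__main__":
    for name, model in MODELS.items():
        print(name, pass_rate(model, SCENARIOS))
    print("most suitable:", most_suitable(MODELS, SCENARIOS))
```

Here model_b fails the Use Case 3 scenario because it allows neither skipping the analysis step nor ending with a release, so model_a would be preferred. Including negative scenarios keeps an overly permissive model from scoring well simply by accepting everything.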
Fig. 7. Business process validation flow
We propose implementing structural and functional validation of process models in a framework. To evaluate the compliance of a process model, specific metrics are used for both structural and functional validation. These metrics represent the output of the proposed framework.
Fig. 8. Functional and structural validation framework.
Conclusions
The benefits of introducing BPM in any organization are undoubted. BPM enables organizations to align business objectives with internal processes and has the capacity to reduce costs, improve efficiency and minimize errors and risks. When a business process model is not available, process mining techniques allow the retrieval of process models from internal event log traces. Applying process mining on a dataset requires a thorough understanding of the data and considerable expertise in interpreting the results and finding solutions to enhance the process. In this context, validation of process models becomes an important step, as the process models are the basis for any further analysis. In this study we proposed validation from different perspectives. The structural approach assesses the correctness of process models by checking the compliance of the model with the full event log, the precision with which the model captures the scenarios in the event log, and the simplicity of the model. The functional approach, on the other hand, validates a business process against a set of business objectives defined as use case scenarios.
Validation is an important part of any business for ensuring good results, and it should be applied in all areas of an organization. Validation of business processes inside an organization that produces results by executing internal processes is even more important, as the quality of process models could impact all business areas. Validation of process models could also increase confidence in process mining results, as organizations will have the assurance that the process reflects the expectations and was extracted correctly from the event log dataset.
References
1. M. Dumas, W.M.P. van der Aalst, and A.H.M. ter Hofstede, Process-Aware Information Systems: Bridging People and Software through Process Technology, 2005
2. W.M.P. van der Aalst, Process Mining: Discovery, Conformance and Enhancement of Business Processes, Berlin-Heidelberg, 2011
3. E. Urena Hinojosa, Process Mining Applied to the Change Control Board Process. Discovering Real Processes in Software Development Process. Master's Thesis. Technische Universiteit Eindhoven, Eindhoven, The Netherlands, 2008
4. H.M.W. Verbeek, J.C.A.M. Buijs, B.F. van Dongen, and W.M.P. van der Aalst, ProM 6: The Process Mining Toolkit, BPM 2010 Demo, September 2010
5. A. Rozinat, A.K. Alves de Medeiros, C.W. Günther, A.J.M.M. Weijters, and W.M.P. van der Aalst, Towards an Evaluation Framework for Process Mining Algorithms (2011)
6. http://www.xes-standard.org/
7. Maria Laura Sebu, Horia Ciocarlie, Applied process mining in software development Case Study (2014)