IT SYSTEMS PERSPECTIVES
Auditing 2.0: Using Process Mining to Support Tomorrow’s Auditor
Wil M.P. van der Aalst, Eindhoven University of Technology and Queensland University of Technology
Kees M. van Hee and Jan Martijn van der Werf, Eindhoven University of Technology
Marc Verdonk, Deloitte Netherlands and Eindhoven University of Technology
Auditors can use process mining techniques to evaluate all events in a business process, and do so while it is still running.
Auditors validate information about organizations and their business processes. Reliable information is needed to determine whether these processes are executed within certain boundaries set by managers, governments, and other stakeholders. Violations of specific rules enforced by law or company policies may indicate fraud, malpractice, risks, or inefficiencies.
Traditionally, an audit can only provide reasonable assurance that business processes are executed within the given set of boundaries. Auditors assess the operating effectiveness of process controls, and when these controls are not in place or functioning as expected, they typically check samples of factual data. However, with detailed information about processes increasingly available in high-quality event logs, auditors no longer have to rely on a small set of samples offline. Instead, using process mining techniques, they can evaluate all events in a business process, and do so while it is still running.
The omnipresence of electronically recorded business events, coupled with process mining technology, enables a new form of auditing that will dramatically change the role of auditors: Auditing 2.0.
COMPUTER
PROCESS MINING
The goal of process mining is to discover, monitor, and improve real (not assumed) processes by extracting knowledge from event logs. Over the past decade, process mining techniques have matured and are being integrated into commercial software products (W.M.P. van der Aalst et al., “Business Process Mining: An Industrial Application,” Information Systems, vol. 32, no. 5, 2007, pp. 713-732).
Business provenance
Process mining starts with the event log: a sequentially recorded collection of events, each of which refers to an activity (well-defined step) and is related to a particular case (process instance). Some mining techniques use other information such as the person or resource executing or initiating the activity, the event’s time stamp, or data elements recorded with the event (for example, the size of an order).
The systematic, reliable, and trustworthy recording of events, known as business provenance, is essential to auditing. This term acknowledges the importance of traceability by ensuring that history cannot be rewritten or obscured.
Published by the IEEE Computer Society
Process discovery
By analyzing frequent patterns, process mining techniques can extract from event logs models that describe the processes at hand. For example, the Alpha process mining algorithm can automatically extract a Petri net that concisely models behavior in the event log. This gives the auditor an unbiased view of what has actually happened.
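To make the idea of discovery more concrete, the sketch below computes the ordering relations that discovery algorithms such as Alpha typically start from: which activities directly follow each other, and which pairs appear in both orders. It is only this first step, not the full Alpha algorithm, and it does not build a Petri net. The log layout (a list of traces, each a list of activity names) and the activity names are assumptions made for illustration.

```python
def directly_follows(log):
    """Return all pairs (a, b) where b directly follows a in at least one trace."""
    pairs = set()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            pairs.add((a, b))
    return pairs


def ordering_relations(log):
    """Label each observed pair as causal ('->') or parallel ('||').

    A pair seen in both orders is treated as parallel; a pair seen in
    one order only is treated as a causal dependency.
    """
    follows = directly_follows(log)
    relations = {}
    for a, b in follows:
        relations[(a, b)] = "||" if (b, a) in follows else "->"
    return relations


# Hypothetical event log: two cases of the same process.
log = [
    ["register", "check", "pay", "archive"],
    ["register", "pay", "check", "archive"],
]
relations = ordering_relations(log)
```

Here "check" and "pay" occur in both orders, so they are labeled as parallel, while "register" always precedes both. A full discovery algorithm would go on to derive places and transitions from these relations.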
0018-9162/10/$26.00 © 2010 IEEE
Conformance checking
An auditor can use an a priori process model to check if reality, as recorded in the event log, conforms to the model and vice versa. For example, a model may indicate that purchase orders exceeding one million euros require two checks. Auditors can use conformance checking to detect deviations, locate and explain them, and measure their severity (A. Rozinat and W.M.P. van der Aalst, “Conformance Checking of Processes Based on Monitoring Real Behavior,” Information Systems, vol. 33, no. 1, 2008, pp. 64-95).
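The purchase-order rule mentioned above can be checked against a whole event log in a few lines. The following is a minimal sketch, assuming each event is a dictionary with hypothetical `case`, `activity`, and `amount` fields and that a check is recorded as an activity named `"check"`. It tests one business rule only and is not the replay-based conformance technique of the cited paper.

```python
from collections import Counter


def find_violations(events, threshold=1_000_000, required_checks=2):
    """Return the cases whose amount exceeds the threshold but that
    have fewer than the required number of 'check' events."""
    amounts = {}
    checks = Counter()
    for event in events:
        amounts[event["case"]] = event["amount"]
        if event["activity"] == "check":
            checks[event["case"]] += 1
    return sorted(
        case
        for case, amount in amounts.items()
        if amount > threshold and checks[case] < required_checks
    )


# Hypothetical log: PO1 is large but checked once, PO2 is large and
# checked twice, PO3 is small and needs no second check.
events = [
    {"case": "PO1", "activity": "create", "amount": 2_500_000},
    {"case": "PO1", "activity": "check", "amount": 2_500_000},
    {"case": "PO2", "activity": "create", "amount": 3_000_000},
    {"case": "PO2", "activity": "check", "amount": 3_000_000},
    {"case": "PO2", "activity": "check", "amount": 3_000_000},
    {"case": "PO3", "activity": "create", "amount": 500},
]
violations = find_violations(events)
```

Because every case in the log is examined, the result is a complete list of violating cases rather than an estimate drawn from a sample.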
Model extension
Auditors can also extend an a priori model with a new aspect based on event log data. The goal is not to check conformance but to enrich the model. An example is extending a process model with performance data to find bottlenecks.
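As an illustration of extending a model with performance data, the sketch below annotates each step between two consecutive activities with its average waiting time and reports the slowest step. The trace layout (a dictionary from case id to a time-ordered list of activity and time pairs) and the use of plain hours as time stamps are assumptions chosen to keep the example short.

```python
from collections import defaultdict


def average_waits(traces):
    """Average time in hours between each pair of consecutive activities."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for events in traces.values():
        for (a, t1), (b, t2) in zip(events, events[1:]):
            totals[(a, b)] += t2 - t1
            counts[(a, b)] += 1
    return {step: totals[step] / counts[step] for step in totals}


def bottleneck(traces):
    """Return the step with the highest average waiting time."""
    waits = average_waits(traces)
    return max(waits, key=waits.get)


# Hypothetical traces: (activity, hours since the start of the log).
traces = {
    "c1": [("register", 0), ("check", 2), ("pay", 30)],
    "c2": [("register", 1), ("check", 4), ("pay", 24)],
}
waits = average_waits(traces)
slowest = bottleneck(traces)
```

In this data the step from "check" to "pay" takes 24 hours on average, against 2.5 hours from "register" to "check", so it is the natural place to look for a bottleneck.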
Toward operational support
Although process mining has traditionally focused on offline analysis and is seldom used for operational decision support, it can consider running process instances and compare them with models based on historic data or business rules (W.M.P. van der Aalst, M. Pesic, and M. Song, “Beyond Process Mining: From the Past to Present and Future,” tech. report BPM-09-18, BPMcenter.org, 2009). For example, an auditor can “replay” a running case on the process model in real time and check whether the observed behavior fits. The moment the case deviates, the auditor can alert an appropriate actor. Auditors can also use a process model based on historic data to make predictions about running cases (for example, to estimate the remaining processing time and a particular outcome’s probability) or provide recommendations, such as an activity that will minimize the expected costs and completion time.
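The replay idea can be sketched with a deliberately simplified model. Instead of a Petri net, the example below assumes the model is given as a set of allowed start activities plus a set of allowed "directly follows" steps; both structures and all activity names are hypothetical. The function walks through the events seen so far for a running case and reports the first one the model does not allow.

```python
def first_deviation(allowed_steps, start_activities, partial_trace):
    """Replay a running case and return (position, activity) of the
    first event that the model does not allow, or None if it fits."""
    previous = None
    for position, activity in enumerate(partial_trace):
        if previous is None:
            fits = activity in start_activities
        else:
            fits = (previous, activity) in allowed_steps
        if not fits:
            return position, activity
        previous = activity
    return None


# Hypothetical model: a strictly sequential process.
allowed_steps = {("register", "check"), ("check", "pay"), ("pay", "archive")}
start_activities = {"register"}

fitting_case = first_deviation(allowed_steps, start_activities, ["register", "check"])
deviating_case = first_deviation(allowed_steps, start_activities, ["register", "pay"])
```

Calling the function each time a new event arrives gives the "alert at the moment of deviation" behavior described above; here the second case is flagged at its second event because payment was recorded before the check.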
AUDITING FRAMEWORK
Event logs and process mining techniques enable new forms of auditing. Rather than sampling a small set of cases, auditors can consider the whole process and all of its instances. Moreover, they can do this continuously.
Figure 1 shows an Auditing 2.0 framework based on process mining. “Current data” comprises events of cases that are still running, while “historic data” comprises events of completed cases. The figure also shows two types of process models: De jure models describe a desired or required way of working, while de facto models aim to describe reality with potential violations of the boundaries defined in de jure models (W.M.P. van der Aalst et al., “Conceptual Model for On Line Auditing,” tech. report BPM-09-19, BPMcenter.org, 2009).
Figure 1. Auditing 2.0 framework based on process mining. Auditors analyze information in event logs using both de jure and de facto models.
(Figure labels: World, People, Machines, Organizations, Business processes, Documents, Information system, Current data, Historic data, Detect, Predict, Recommend, Filter/query log, Discover model, Check conformance, Extend model, Compare models, Promote model, Diagnose model, De jure models, De facto models, each with Control-flow, Data/rules, and Resources/organization, Violations, Prediction, Recommendation, Inconsistencies, Diagnostics.)
Auditing using historic data
Auditors can use historic data to filter out irrelevant situations or scope the event log, say, for a particular process or group of customers. They can remove entire cases (for example, all process instances related to gold customers in a particular region) or individual events (for example, all checking events done by people from a particular department), resulting in a smaller event log that can be used for further analysis. Querying the log for particular cases or events is especially useful for ad hoc auditing questions.
Historic data can also be used to discover de facto models with different perspectives, for example, control-flow (ordering of activities), data/rules, and resources/organization. Most process mining algorithms, including Alpha, focus on process discovery with an emphasis on control flow. However, other process mining algorithms discover organizational models, and classical data mining algorithms such as ID3, C4.5, and CART (classification and regression trees) extract decision trees based on data attributes.
MARCH 2010
Figure 2. Using ProM to analyze the conformance of a process inside a Dutch municipality. The event log contains 5,187 events related to 796 cases (applications for support by citizens). Analysis shows the overall conformance (99.5 percent) and highlights those parts of the process where deviations are most frequent.
In addition, auditors can analyze historic data against de jure models. Conformance-checking techniques pinpoint deviations and highlight those parts of the model where conformance is low. Auditors can use such techniques to see which rules are violated and where and when people do not execute processes as specified.
Finally, auditors can measure service levels and other performance indicators and project them onto a model; that is, they can use information extracted from event logs to extend existing models. This provides diagnostic information with which to spot possible problems.
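The two filtering operations described in this section, removing entire cases and removing individual events, can be sketched as follows. The event fields (`case`, `activity`, `customer`, `region`, `department`) and their values are hypothetical; a real log would carry whatever attributes the source system records.

```python
def remove_cases(events, should_remove):
    """Drop every event of any case that has at least one matching event."""
    excluded = {e["case"] for e in events if should_remove(e)}
    return [e for e in events if e["case"] not in excluded]


def remove_events(events, should_remove):
    """Drop only the individual events that match."""
    return [e for e in events if not should_remove(e)]


# Hypothetical log with two cases.
events = [
    {"case": 1, "activity": "register", "customer": "gold", "region": "north", "department": "sales"},
    {"case": 1, "activity": "check", "customer": "gold", "region": "north", "department": "audit"},
    {"case": 2, "activity": "register", "customer": "silver", "region": "north", "department": "sales"},
    {"case": 2, "activity": "check", "customer": "silver", "region": "north", "department": "audit"},
]

# Remove all cases of gold customers in one region.
without_gold = remove_cases(
    events, lambda e: e["customer"] == "gold" and e["region"] == "north"
)
# Remove all checking events done by one department.
without_audit_checks = remove_events(
    events, lambda e: e["activity"] == "check" and e["department"] == "audit"
)
```

Both functions return a smaller log in the same format, so the result can be passed directly to discovery or conformance checking.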
Auditing using models only
The bottom half of Figure 1 shows various types of analysis that do not directly involve any event data.
First, auditors can compare de jure and de facto models and analyze their differences and commonalities. For example, if a de facto process model obtained using process discovery shows paths that are not possible according to the de jure model, then this serves as a good starting point for in-depth analysis.
Second, it is possible to promote a de facto model to a de jure one. A comparison showing that the actual process execution is inconsistent with the standard preexisting model may motivate an update of the de jure model. This is relevant because people may find and adopt better ways to execute processes.
Finally, auditors can diagnose de facto models using conventional model-based analysis techniques. For example, they can check models for deadlocks and other anomalies. Auditors can merge process mining results to discover comprehensive simulation models incorporating the various perspectives for what-if analysis.
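A comparison of de jure and de facto models can be sketched by reducing each model to the set of activity-to-activity steps it allows or exhibits. This representation is an assumption made for brevity; comparing full Petri nets or other model types requires more elaborate techniques. The activity names are hypothetical.

```python
def compare_models(de_jure, de_facto):
    """Compare two models given as sets of (from_activity, to_activity) steps."""
    return {
        # Observed in practice but not permitted by the prescribed model.
        "only_de_facto": sorted(de_facto - de_jure),
        # Prescribed but never observed.
        "only_de_jure": sorted(de_jure - de_facto),
        # Present in both models.
        "shared": sorted(de_jure & de_facto),
    }


# Hypothetical models: in practice some cases skip the check.
de_jure = {("register", "check"), ("check", "pay"), ("pay", "archive")}
de_facto = {("register", "check"), ("check", "pay"), ("register", "pay"), ("pay", "archive")}

differences = compare_models(de_jure, de_facto)
```

The step from "register" straight to "pay" appears only in the de facto model, which is exactly the kind of path that would serve as a starting point for in-depth analysis.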
Auditing using current data
Auditors can use new mining techniques to influence the operational process before events complete. These techniques typically involve replaying process instances (for example, the well-known token game of Petri nets) to detect deviations, predict particular outcomes, and recommend appropriate actions. Mapping business rules onto Petri nets or temporal logic, such as LTL, enables efficient checks.
By comparing running cases with the de jure model, auditors can detect deviations as they occur and even predict whether any are likely to occur. For example, they can use various techniques to signal the likelihood of violating a legal deadline such as “claims must be handled within three weeks” (W.M.P. van der Aalst, M.H. Schonenberg, and M. Song, “Time Prediction Based on Process Mining,” BPM Center report BPM-09-04, BPMcenter.org, 2009). Similar techniques can be used to recommend particular actions, such as “Taking action X will minimize the risk of violating legal requirement Y.”
The ability to provide operational support creates a dilemma. On the one hand, it seems odd not to act on readily available information. On the other hand, auditors may lose their independence by interfering with the operational process. For example, an auditor who provides warnings while the process is still running becomes partially involved in the actual execution. Can the auditor still assess the process afterward?
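The deadline example in this section ("claims must be handled within three weeks") can be illustrated with a very simple predictor. The sketch below learns, from completed cases, the average remaining time after each activity and uses it to flag running cases whose predicted total duration exceeds 21 days. This averaging scheme is an assumption chosen for brevity and is not the method of the cited report; the trace layout (lists of activity and day pairs) and activity names are also hypothetical.

```python
from collections import defaultdict


def remaining_time_model(historic):
    """Average number of days from each activity to the end of its case."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for trace in historic:
        end_day = trace[-1][1]
        for activity, day in trace:
            totals[activity] += end_day - day
            counts[activity] += 1
    return {activity: totals[activity] / counts[activity] for activity in totals}


def deadline_at_risk(model, running, deadline_days=21):
    """True if elapsed time plus predicted remaining time exceeds the deadline.

    Raises KeyError if the last activity never occurred in the historic data.
    """
    start_day = running[0][1]
    last_activity, last_day = running[-1]
    predicted_total = (last_day - start_day) + model[last_activity]
    return predicted_total > deadline_days


# Hypothetical completed claims: (activity, day).
historic = [
    [("register", 0), ("assess", 5), ("decide", 20)],
    [("register", 0), ("assess", 10), ("decide", 18)],
]
model = remaining_time_model(historic)

slow_claim = deadline_at_risk(model, [("register", 0), ("assess", 12)])
fast_claim = deadline_at_risk(model, [("register", 0), ("assess", 4)])
```

A claim assessed on day 12 is predicted to finish after 23.5 days and is flagged, while one assessed on day 4 is predicted to finish after 15.5 days and is not.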
PROM: A PROCESS MINING TOOLSET
ProM is a generic, open source process mining toolset available for download at www.processmining.org. We ultimately aim to develop a customized version that operationalizes the framework shown in Figure 1 to meet auditors’ specific needs. The toolset has a pluggable architecture and supports a wide range of control-flow models including various types of Petri nets, event-driven process chains (EPCs), Business Process Modeling Notation (BPMN), and Business Process Execution Language (BPEL). ProM also supports models to represent rules (for example, LTL-based rules), social networks, and organizational structures. Multiple plug-ins are available for each of the activities shown in Figure 1; for example, there are dozens of plug-ins to discover and check the conformance of the process models supported by ProM. We recently developed plug-ins to support detect, predict, and recommend activities.
Figure 2 shows a screenshot of ProM while analyzing the conformance of a process inside a Dutch municipality.
CHALLENGES
The application of process mining to auditing depends on the availability of relevant data, which is primarily stored in enterprise resource planning (ERP) systems. Mining such systems is challenging because, despite having built-in workflow engines, they are not process-oriented. Because data related to a particular process is usually scattered over dozens of tables, extracting it for auditing is nontrivial.
A second challenge of Auditing 2.0 concerns the so-called “auditing materiality” principle that guides current auditing practices. According to this principle, auditors typically consider only a small subset of data and, if they see no deviations, take no further action. By looking at all the data, auditors will inevitably find more exceptions requiring follow-up, increasing quality but also the audit’s time and cost.
Finally, widespread adoption of process mining as an accepted auditing approach would require organizations such as the International Federation of Accountants to change their methodologies and issue new guidelines to companies that rely on them.
Major corporate and accounting scandals, including those affecting Enron, Tyco, Adelphia, Peregrine, and WorldCom, have fueled interest in more rigorous auditing practices. Legislation such as the Sarbanes-Oxley Act of 2002 and the Basel II Accord of 2004 was enacted in response to such scandals. The recent financial crisis also underscores the importance of verifying that organizations operate “within their boundaries.” Process mining techniques offer a means to more rigorously check compliance and ascertain the validity and reliability of information about an organization’s core processes.
Auditing 2.0, a more rigorous form of auditing that couples detailed event logs with process mining techniques, will dramatically change the auditing profession. Auditors will need better analytical and IT skills, and their role will shift as they work on the fly. Provenance data will make it possible to “replay” history reliably and accurately and to predict problems, thereby improving business processes.
Wil M.P. van der Aalst is a professor of information systems in the Mathematics and Computer Science Department at Eindhoven University of Technology, Netherlands, where he leads the Architecture of Information Systems (AIS) group. He is also adjunct professor at Queensland University of Technology, Brisbane, Australia. Contact him at [email protected].
Kees M. van Hee is a professor in the AIS group at Eindhoven University of Technology. Contact him at [email protected].
Jan Martijn van der Werf is a PhD candidate in the AIS group at Eindhoven University of Technology. Contact him at [email protected].
Marc Verdonk is a senior manager and IT auditor with Enterprise Risk Services at Deloitte Netherlands, as well as a PhD candidate in the AIS group at Eindhoven University of Technology. Contact him at mverdonk@deloitte.nl.
Editor: Richard G. Mathieu, Dept. of Computer Information Systems and Management Science, College of Business, James Madison Univ., Harrisonburg, VA; [email protected]
Selected CS articles and columns are available for free at http://ComputingNow.computer.org.