Proceedings of the IASTED International Conference Modelling, Identification, and Control (MIC 2011) February 14 - 16, 2011 Innsbruck, Austria
DISCRETE EVENT MODEL STRUCTURE IDENTIFICATION USING PROCESS MINING
Agnes Werner-Stark1, Miklós Gerzson1, Katalin M. Hangos1,2
1 Department of Electrical Engineering and Information Systems, University of Pannonia, Egyetem str. 10., Veszprém, Hungary, email: [email protected], [email protected]
2 Process Control Research Group, Computer and Automation Research Institute HAS, Budapest, Hungary, email: [email protected]
ABSTRACT A novel structure identification procedure for discrete event systems described by Petri nets is proposed in this paper for model-based diagnostic purposes; it utilizes the notions and tools of process mining. The identification of the structurally different discrete event system models describing a system in its normal and/or faulty modes is used for model-based isolation of the considered faulty modes. From the available process mining techniques that allow for the automatic construction of process models in Petri net form based on event logs, the genetic algorithm-based structure identification procedure has been found to be the most capable of identifying the characteristic structural elements of the faulty models. The proposed procedures are illustrated on a simple example of an operated parking gate automaton with two faulty modes.
KEY WORDS discrete event systems, structure identification, process mining, Petri nets, fault isolation
DOI: 10.2316/P.2011.718-069
1 Introduction
The idea of using model structure identification for discrete event system models [2] is not new, but the field has matured only recently, as shown by a review paper [3] that focuses on Petri net models used for model-based diagnosis. Model-based techniques [1] are widely used and very popular in control and diagnostic applications because of their efficiency and good performance for systems with both continuous and discrete range spaces. The appropriate models in the discrete range space case are built using the tools and techniques of discrete event systems [4], and these are mainly in the form of Petri nets. When used for model-based fault detection and isolation, one needs not only a model for the normal operation of the system, but also models describing the considered faulty modes. This makes it possible to isolate the actual faulty mode from measured data and the structurally different faulty models, which is the subject of the present paper. As an interesting novel approach, the methods and tools of process mining [10] are examined in this paper for model structure identification for diagnostic purposes. An example of an operated parking gate automaton with two faulty modes is used to illustrate the proposed tools and procedures.
2 Basic notions
2.1 Petri net model of a process
Petri nets allow both the mathematical and the graphical representation of a discrete event system, where the signals of the system have a discrete range space and time is also discrete [4]. Petri nets can be used for describing a controlled or open-loop system, for modeling the events occurring in it and for analyzing the resulting model. During the modeling and analysis process we can obtain information about the structure and dynamic behavior of the modeled system. One of the main advantages of modeling with Petri nets is the ability to describe sequences of events. In a real system the events can occur both serially and in parallel. In the case of parallelism we can distinguish two different situations. In the first case, two or more series of events can take place independently of each other. In the other case, only one of the sequences can take place because two or more events have the same precondition, and the occurrence of any of them makes this precondition invalid. This kind of parallelism is called a conflict situation. In a Petri net, a conflict can be recognized when the same place is the input of two or more transitions. While real parallelism can occur under normal operational circumstances, conflicts generally indicate fault situations. Although faults also have their preconditions, these are frequently invisible to the operator, so which event takes place appears to be random. Different modifications of the original Petri net were
carried out, and several types of nets have been introduced by researchers from all over the world since the first application of Petri nets by C. A. Petri. The aim of these modifications is to improve the modeling capabilities of the method. One of them is the class of workflow nets (WF-nets), which can be applied for modeling business processes. In our paper we use WF-nets for diagnostic purposes [10], i.e. for the determination of faulty operational modes of the investigated system.
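The firing rule and the structural notion of conflict described in sub-section 2.1 can be illustrated with a minimal sketch. It assumes an ordinary Petri net in which every arc has weight 1; the class name `PetriNet`, its methods, and the place and transition names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PetriNet:
    """Ordinary Petri net; all arc weights are assumed to be 1."""
    pre: dict   # transition -> set of input places
    post: dict  # transition -> set of output places

    def enabled(self, marking, t):
        # A transition is enabled if each of its input places holds a token.
        return all(marking.get(p, 0) >= 1 for p in self.pre[t])

    def fire(self, marking, t):
        # Firing consumes one token per input place and produces one per output place.
        if not self.enabled(marking, t):
            raise ValueError(f"transition {t!r} is not enabled")
        new = dict(marking)
        for p in self.pre[t]:
            new[p] -= 1
        for p in self.post[t]:
            new[p] = new.get(p, 0) + 1
        return new

    def conflict_places(self):
        # Structural conflict: a place that is the input of two or more transitions.
        consumers = {}
        for t, places in self.pre.items():
            for p in places:
                consumers.setdefault(p, set()).add(t)
        return {p: ts for p, ts in consumers.items() if len(ts) >= 2}


# Hypothetical fragment: a normal and a fault transition share one precondition.
net = PetriNet(
    pre={"print_ticket": {"requested"}, "print_double_ticket": {"requested"}},
    post={"print_ticket": {"one_ticket"}, "print_double_ticket": {"two_tickets"}},
)
m0 = {"requested": 1}
m1 = net.fire(m0, "print_ticket")
```

After `print_ticket` fires, the shared precondition is consumed, so `print_double_ticket` is no longer enabled in `m1`; this is the sense in which the occurrence of one event invalidates the precondition of the other.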
2.2 WF-nets
WF-nets form a special subclass of Petri nets. They were introduced for the description and analysis of business processes. The most important property of WF-nets is soundness, which can be interpreted as a correctness criterion of a WF-net. A WF-net is sound if all of its firing sequences are sound, and a firing sequence is sound if it terminates properly, i.e. when it terminates, only the terminal place holds a token, and it holds exactly one. The criteria of sound WF-nets are as follows: there is (i) a single Start place and (ii) a single End place in the net, with (iii) every node (place or transition) lying on a path directed from the Start place to the End place; (iv) the net should be live (i.e. have no dead transitions), and (v) every process (i.e. firing sequence) started by a single token at the Start place should finish leaving a single token at the End place. The above requirements do not allow cyclic behavior or further input places (for example for describing fault indicators) in the system. Consequently, each cycle should be implemented as a separate path by inserting artificial Start and End places, while the faulty modes can be described using a separate transition for each, in conflict with the transitions belonging to the other modes (resulting in a WF-net with conflict). Sound WF-nets are also used as reference models in process mining techniques.
2.3 Describing fault modes using Petri nets
Let us assume that we have the Petri net of the normal operational course in the form of a WF-net, the so-called normal reference model. Let us construct the WF-net of the actual operational course based on the observed data of the process. This list of data is called the event log (see sub-section 3.1), and it contains the measured values and the performed actions. The difference between the two nets indicates the deviation from the normal operational course. The comparison can be performed based on, e.g., the reachability trees of the two nets (see sub-section 3.3); there are well-known algorithms in the literature for determining the distance between trees [5]. Note that a complex system can have a very large reachability tree, but in the case of sound WF-nets this tree is relatively simple because there is only one terminal state and there are no cyclic processes. If we know the possible faulty cases, then we can construct the WF-nets describing them. Comparing these pre-constructed WF-nets with the WF-net recovered from the event log, we can establish the most likely operational course and, in the case of a fault, diagnose its most feasible cause.
3 Structure identification using process mining
Process mining techniques have emerged in the areas of business process modeling [10], theoretical computer science and artificial intelligence. Taking the data set of real process executions, the event logs, these techniques can be used for process discovery [9], i.e. to construct a discrete event model from them, or for conformance checking [8], to mention just the problems relevant for fault detection and diagnosis. It is very important for our purpose that process mining deals with the discovery of the structure of process models from event-based data. The goal is to construct a process model that reflects the behavior observed in some kind of event log.
3.1 Event logs
An event log is a set of finite event sequences, where each event sequence corresponds to a particular materialization of the process. We refer to an event sequence as a trace hereafter. We assume that it is possible to record events in such a way that each event refers to an activity (i.e. a well-defined step in the process) and each event belongs to a case (i.e. a process instance). In addition, each event can have a performer, also referred to as the originator (the person who executes or initiates the activity), and events have a time stamp and are totally ordered. An event log is used as the starting point for mining.
3.2 Process mining tools for reconstructing Petri net models from event logs
The developed tools and techniques of process mining are collected in a unified toolbox called ProM [9], which has been interfaced with other discrete event system modeling and simulation toolboxes using standard extended XML-type input and output files. ProM accepts and produces event logs (or traces) in the form of MXML (Mining XML) files, and Petri nets described by standard PNML (Petri Net Markup Language) files. It is important to emphasize that an event log contains a set of event sequences, each corresponding to a particular behavior, and the events recorded in a log may have "measurement errors", that is, some of them may be omitted or have a perturbed time stamp, for example. This variability allows for analyzing cases when either the behavior or the event sequences exhibit a random character.
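The notions of event, case, trace and omitted events can be made concrete with a small sketch. It is a plain in-memory stand-in, not the MXML schema and not ProM code; the names `Event`, `to_traces` and `drop_events`, as well as the sample records, are assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    case: str             # process instance the event belongs to
    activity: str         # well-defined step of the process
    timestamp: int        # time stamp that gives the total order
    originator: str = ""  # optional performer of the activity


def to_traces(events):
    """Group events by case and order each case by time stamp."""
    by_case = defaultdict(list)
    for ev in events:
        by_case[ev.case].append(ev)
    return {
        case: [e.activity for e in sorted(evs, key=lambda e: e.timestamp)]
        for case, evs in by_case.items()
    }


def drop_events(trace, p_omit, rng):
    """Emulate 'measurement errors': omit each event with probability p_omit."""
    return [a for a in trace if rng.random() >= p_omit]


# Hypothetical records, deliberately given out of order.
events = [
    Event("car1", "PUSH BUTTON", 2, "driver"),
    Event("car1", "STOP AT GATE", 1, "driver"),
    Event("car2", "STOP AT GATE", 3, "driver"),
    Event("car1", "PRINT TICKET", 4, "automaton"),
]
traces = to_traces(events)
noisy = drop_events(traces["car1"], p_omit=0.3, rng=random.Random(0))
```

Each value in `traces` is one trace in the sense of sub-section 3.1, and `noisy` is a copy of a trace in which some events may be missing, which is the kind of variability a mining algorithm has to tolerate.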
3.2.1 The reconstruction algorithms
The following algorithms are available in ProM for reconstructing process models from MXML files.
α-algorithm: The α-algorithm can extract a process model from an event log and represent it in terms of a Petri net. This algorithm has been proved to correctly mine sound WF-nets without short loops. It uses the fact that for many WF-nets two tasks are connected if their causality can be detected by inspecting the log. Note that the α-algorithm is rather sensitive to noise and exceptional behavior and has problems handling more advanced control flow patterns.
Genetic algorithm: This method uses genetic algorithms to mine process models from event logs. The algorithm starts with an initial population of individuals. A fitness measure is assigned to every individual to indicate its quality. In this case, an individual is a possible process model and the fitness is a function that evaluates how well the individual is able to reproduce the behavior in the log. Populations evolve by selecting the fittest individuals and generating new individuals using genetic operators such as crossover and mutation. Its output is a set of process models ordered by decreasing fitness value.
Heuristics Miner: The most important characteristic of the heuristics miner algorithm is its robustness to noise and exceptions. It is based on the frequency of patterns and makes it possible to focus on the main behavior in the event log. The algorithm has three steps: (1) the construction of the dependency graph, (2) for each activity, the construction of the input and output expressions and (3) the search for long-distance dependency relations. In practical situations (i.e. with an event log with thousands of traces, low-frequency behavior and some noise) the algorithm can focus on all behavior in the event log, or only on the main behavior.
DWS (Disjunctive Workflow Schema) mining: The algorithm is able to discover a set of workflow models that represent different subsets of the input log, and to arrange them in a browsable tree. The mining is carried out through a top-down hierarchical clustering process, where the log is recursively split into homogeneous clusters. All discovered clusters are then equipped with a specific workflow model. Any partitioning step hence produces a refinement of the workflow model being discovered. Specific behavioral patterns, named discriminant rules, are used as features for clustering log instances by means of the classical k-means algorithm.
Multi-phase miner: This is a macro plugin that successively calls the Partial Order Generator (POG), then the Partial Order Aggregator (POA) and finally the Aggregation Graph to EPC (Event-driven Process Chain) converter. The POG changes linear orders into partial ones inside a log. It requires the LogReader to allow for writing to log files. The POG adds data to the log so that ProM recognizes it as a log with partial orders. The POA takes a log where each instance represents a partial order on events. These instances are then aggregated into an aggregation graph.
3.2.2 Structure identification procedures using ProM
With the above reconstruction algorithms offered by ProM, one can easily assemble a Petri net structure identification procedure as follows.
1. Collect the relevant event sequences into a set, and encode them into MXML format.
2. Select a reconstruction algorithm and give its parameters. (The default values are given in brackets.)
α-algorithm: This method does not have any parameters; the model is constructed automatically. The traces in the log can indeed be reproduced by the Petri net.
Genetic algorithm: We have to give the population size (100), the initial population type (Possible Duplicates), the maximum number of generations (1000), the elitism rate (0.02), the fitness type (ExtraBehaviorPunishment), the selection method type (Tournament), the crossover type (Enhanced), the crossover rate (0.8), the mutation type (Enhanced) and the mutation rate (0.2).
Heuristics Miner: We have to give the positive observations threshold (5), the dependency threshold (0.8) and the relative-to-best threshold (0.2 on the event log with 5% noise).
DWS mining: We have to give sigma (the frequency in the log must be above this threshold; sigma = 0.05), gamma (the frequency must be below this threshold; gamma = 0.01), the maximum number of clusters per split (4), the maximum length of features (5), the maximum number of splits (1) and the maximum number of features (2).
Multi-phase miner: The following options can be chosen before running:
• Use partial order information to derive the succession relation.
• Enforce causal dependencies between events of the same activity.
• Enforce parallelism relations between all events of activities if they overlap in time in some instance, taking into account the given start and final events.
3. The identification result is available graphically in the user interface and can be extracted for further analysis as a PNML file.
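To make the causality detection mentioned for the α-algorithm in sub-section 3.2.1 more concrete, the following sketch derives the log-based ordering relations from a set of traces. It covers only this first step, not the construction of places, and it is not ProM's implementation; the function name `ordering_relations` and the sample log are hypothetical.

```python
from itertools import product


def ordering_relations(traces):
    """Derive log-based ordering relations from a list of traces."""
    activities = {a for t in traces for a in t}
    # a > b: a is directly followed by b in at least one trace
    follows = {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}
    # a -> b: a > b holds but b > a does not (detected causality)
    causal = {(a, b) for (a, b) in follows if (b, a) not in follows}
    # a || b: both orders were observed (potential parallelism)
    parallel = {(a, b) for (a, b) in follows if (b, a) in follows}
    # a # b: the two activities never follow each other directly
    unrelated = {
        (a, b)
        for a, b in product(activities, repeat=2)
        if (a, b) not in follows and (b, a) not in follows
    }
    return follows, causal, parallel, unrelated


# Hypothetical log: B and C may occur in either order between A and D.
log = [["A", "B", "C", "D"], ["A", "C", "B", "D"]]
follows, causal, parallel, unrelated = ordering_relations(log)
```

In this sample log, B and C are observed in both orders, so they are classified as parallel rather than causally related. A single noisy trace can add or remove a pair in `follows` and thereby change these relations, which illustrates why the α-algorithm is described above as sensitive to noise.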
3.3 Comparison of the reference model and the reconstructed models
Two discrete event system models in the same form (e.g. both in Petri net form) can be compared in two fundamentally different ways.
1. Comparison in the space of events
Here the comparison is performed by comparing the event log generated by the reconstructed model with the one generated by the reference model using some signal norm. As the event sequences (without the timing information but with their labels as symbols) can be seen as strings, efficient string comparison algorithms (see e.g. [6]) can be applied.
2. Comparison in the space of Petri net models
Here one compares the structure of the two models by using some general graph comparison method and a related graph distance [5]. Suppose we have a model of the process in the form of a WF-net (N1). This model is based on our original concepts about the system and on the experience gained from the logs of many executions of the process. Note that this model may describe both normal (i.e. non-faulty) operation courses and operation courses belonging to the known faulty modes (in the form of a WF-net with conflict). Based on the workflow log of the actual operational courses and using some mining algorithm, we construct another WF-net (N2). The question is whether N2 is a subgraph of N1. If it is, then we can determine whether the system works under normal operational conditions, or we can isolate the fault. If it is not, then a new fault has probably been detected.
If one looks at the tools available in ProM, one can verify the reconstructed models on the basis of the following properties: fitness, precision, generalization, structure, visibility of faults etc., which are offered by the conformance checking plugin in ProM [8].
4 Simple case study
The above structure identification procedures have been investigated and compared in a simple illustrative case study of a manually operated parking gate automaton.
4.1 The parking gate automaton and its operating procedure
The parking gate automaton is a simple sequential discrete event system that detects a car about to enter the parking place, accepts the request for a ticket and issues the ticket, opens the gate and lets the car into the parking place. The operating procedure of the automaton is a connected simple discrete event system that describes the actions of the driver of the car: he/she drives to the gate, pushes the appropriate button for the ticket, takes the ticket and drives into the parking place.
4.1.1 The normal operation
The normal operation of the operated parking gate automaton is an event sequence where the automaton and its operating procedure initiate the events in turn. The describing Petri net of the normal reference model as a sound WF-net is depicted in sub-figure (a) of Fig. 1.
4.1.2 The faulty modes
Two faults are considered.
Issuing double ticket: The first one is a fault of the parking gate automaton, which issues two tickets instead of one when someone pushes the button (called the double ticket faulty mode). The model contains both the normal and the fault-related transitions, where the "PRINT TICKET (normal)" transition is in conflict with the "PRINT DOUBLE TICKET (fault)" transition.
Unfair driver: The other one is a fault of the operating procedure (an unfair driver), in which the driver notices the wrongly issued ticket(s) and takes one without pushing the button for his/her own ticket (called the two faults faulty mode); the corresponding Petri sub-net is depicted in sub-figure (b) of Fig. 1 with the fault-related elements shown in red. It is important to notice that the two faults model contains both the normal and the double ticket fault reference models as its sub-models. Moreover, the two individual faults present here are related: if a double ticket printing had not taken place before, the driver could not cheat. This fact, together with the non-deterministic nature of the underlying Petri net model, makes it conceptually difficult to detect and isolate this faulty mode.
4.2 Event log and reference model generation
The composite process model in its Petri net form, implemented in CPNTools [7], was used to simulate possible event sequences in faulty and non-faulty situations under different circumstances. It is important to note that the net is non-deterministic because of conflicting and parallel transitions. The different simulated logs were exported with the MXML Logging Extensions of CPNTools to enable further analysis in ProM. Because of the non-deterministic nature of the events in the describing Petri net model, the generated logs were required to contain enough information about the possible event sequences in each faulty situation. Besides the "clean" cases, in which every log belonged to the same faulty mode, some realistic "mixed" logs were also created and investigated, where the faults occurred with a predetermined probability. The following test logs were generated and investigated: (1) a clean normal log, (2) a mixed log with normal and double ticket printing faulty cases, (3) a mixed log with normal, double ticket printing and cheating driver faulty cases. Three reference models have been generated and encoded in PNML format: (a) the normal reference model corresponding to the normal operation, (b) the double ticket faulty reference model, and (c) the two faults faulty reference model with a multiple fault consisting of an unfair driver and the issuing of two tickets.
[Figure 1: Petri net diagrams, not reproduced. Labels appearing in (a): PARKING, START, STOP AT GATE, PUSH BUTTON, TAKE TICKET, ENTRY, END, INIT MACHINE, CLOSE GATE, PRINT TICKET, OPEN GATE (all marked normal). Additional labels appearing in (b): PRINT DOUBLE TICKET (fault), RECOGNIZE WRONG TICKET (fault), TAKE WRONG TICKET (fault), Fault_double_ticket, Fault_cheat, Can_cheat.]
Figure 1. The reference Petri net models of the operated parking gate automaton: (a) normal model, (b) two faults model
4.3 Reconstructed models
The structure identification algorithms that use the model reconstruction methods available in ProM (see sub-section 3.2) were applied to different logs that correspond to different faulty situations (no fault, two faults).
4.3.1 Clean no fault case
First the no fault case was investigated using a clean normal log, for which we expected the normal reference model to be reconstructed. The reconstructed model obtained with the Heuristics miner is shown in sub-figure (a) of Fig. 2. This model is perfectly reconstructed, and the same result was obtained with all the other reconstruction algorithms. Note that the two numbers assigned to each arc denote the number of traces and their fraction in the log that match the particular arc.
4.3.2 Two faults case
A mixed log was used here with normal, double ticket printing and cheating driver faulty cases. Comparing the reconstructed models with the reference model showed that each algorithm could reconstruct the structure of the fault-related part exactly, thus no difference from the two faults reference model was observed.
4.4 Discussion and comparison
The following detailed observations and comparison of the operation of the different procedures can be made. (We ran the algorithms with different parameters, not only the default values.)
α-algorithm: The traces in the log can indeed be reproduced by the mined Petri net. At the same time, the α-algorithm was able to discover the process without any a priori information.
Genetic algorithm: From the viewpoint of the diagnostic application it is important that this algorithm can deal with noise and incompleteness. Moreover, it is more flexible than the other available methods. Sub-figure (b) in Fig. 2 shows the mined model.
Heuristics Miner: As the heuristics miner specializes in dealing with noise and exceptional situations, it seems to be too complex and difficult to use for fault isolation.
Multi-phase miner: This method always produces a model that can replay the log. It uses Event-driven Process Chains (EPCs) as its default representation. The EPCs can be converted into other formats such as various types of Petri nets, but this, again, is too complex and difficult to use for fault isolation.
DWS miner: This method uses the Heuristics Miner to mine a model for this log; therefore the models obtained by the Heuristics miner and by DWS mining are similar in this simple case.
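The structural comparison described in sub-section 3.3, i.e. the question of whether a mined net N2 is a subgraph of the reference net N1, can be sketched as follows. The sketch assumes that places and transitions carry the same labels in both nets, which mined nets do not guarantee in general, so a real comparison would first need to match nodes. The function names, the place names and the transition JAM are hypothetical.

```python
def arcs(net):
    """net: dict mapping a transition label to (input places, output places)."""
    result = set()
    for t, (inputs, outputs) in net.items():
        result |= {(p, t) for p in inputs}
        result |= {(t, p) for p in outputs}
    return result


def is_subnet(n2, n1):
    """True if every transition and every arc of n2 also occurs in n1."""
    return set(n2) <= set(n1) and arcs(n2) <= arcs(n1)


def extra_elements(n2, n1):
    """Transitions and arcs of n2 that the reference n1 does not explain."""
    return set(n2) - set(n1), arcs(n2) - arcs(n1)


# Hypothetical reference fragment: a normal transition in conflict with a known fault.
reference = {
    "PRINT TICKET": ({"requested"}, {"printed"}),
    "PRINT DOUBLE TICKET": ({"requested"}, {"printed", "extra"}),
    "TAKE TICKET": ({"printed"}, {"taken"}),
}
mined_normal = {
    "PRINT TICKET": ({"requested"}, {"printed"}),
    "TAKE TICKET": ({"printed"}, {"taken"}),
}
mined_unknown = {
    "PRINT TICKET": ({"requested"}, {"printed"}),
    "JAM": ({"requested"}, {"stuck"}),
}
```

Here `is_subnet(mined_normal, reference)` holds, so the observed behavior is explained by the reference model, whereas `mined_unknown` contains a transition the reference does not know; `extra_elements` lists what would have to be inspected as a possible new fault.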
[Figure 2: panels (a) and (b), not reproduced.]
Figure 2. (a) The normal model reconstructed by the Heuristics miner and (b) part of the model extracted by the Genetic algorithm
5 Conclusion
A novel structure identification procedure for discrete event systems described by Petri nets is proposed in this paper for model-based diagnostic purposes; it utilizes the notions and tools of process mining.
For fault isolation, structurally different discrete event system models describing a system in its normal and/or faulty modes were used as reference models that were compared to the ones reconstructed from observed event logs of real process executions. The reconstruction was performed by different process mining methods.
From the available process mining techniques that allow for the automatic construction of process models in Petri net form based on event logs, the genetic algorithm-based structure identification procedure has been found to be the most capable of identifying the characteristic structural elements of the faulty models.
The proposed procedures are illustrated on a simple example of an operated parking gate automaton with two faulty modes.
Acknowledgements
This work was supported by the project TAMOP 4.2.1/B "Mobility and environment: Researches in the fields of motor vehicle industry, energetics and environment in the Middle- and West-Transdanubian Region". Part of the work was also supported by the Hungarian National Research Fund through project K67625.
References
[1] Blanke, M., Kinnaert, M., Lunze, J., Staroswiecki, M.: Diagnosis and Fault-tolerant Control (Springer-Verlag, 2006).
[2] Meda, M.E., Ramirez, A., Malo, A.: Identification in discrete event systems, IEEE International Conference on Systems, Man, and Cybernetics, 1998, 740–745.
[3] Fanti, M.P., Seatzu, C.: Fault diagnosis and identification of discrete event systems using Petri nets, 9th International Workshop on Discrete Event Systems, WODES 2008, 2008, 432–435.
[4] Cassandras, C.G., Lafortune, S.: Introduction to Discrete Event Systems (Kluwer Academic Publishers, 1999).
[5] Bille, P.: A survey on tree edit distance and related problems, Theoretical Computer Science, 337, 2005, 217–239.
[6] Cohen, W., Ravikumar, P., Fienberg, S.: A comparison of string distance metrics for name-matching tasks, Proceedings of the IJCAI, 2003.
[7] CPN Group, University of Aarhus, Denmark: CPNTools 2.2.0. http://wiki.daimi.au.dk/cpntools/
[8] Rozinat, A., van der Aalst, W.M.P.: Conformance Checking of Processes Based on Monitoring Real Behavior, Information Systems, 33, 2008, 64–95.
van der Aalst, W.M.P., Weijters, A.J.M.M. (Eds.): Process Mining, Special Issue of Computers in Industry, Elsevier Science Publishers, Amsterdam, 53(3), 2004.
[9] van der Aalst, W.M.P., van Dongen, B.F., Günther, C.W., Mans, R.S., Alves de Medeiros, A.K., Rozinat, A., Rubin, V., Song, M., Verbeek, H.M.W., Weijters, A.J.M.M.: ProM 4.0: Comprehensive support for real process analysis. In J. Kleijn and A. Yakovlev, editors, Application and Theory of Petri Nets and Other Models of Concurrency (ICATPN 2007), volume 4546 of Lecture Notes in Computer Science, Springer-Verlag, Berlin, 2007, 484–494.
[10] van der Aalst, W.M.P., van Hee, K.M.: Workflow Management: Models, Methods and Systems (MIT Press, Cambridge, MA, 2002).