2013 Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises
Coupling case based reasoning and process mining for a web based crisis management decision support system

Julie Dugdale
Université de Grenoble 2, Laboratoire d’Informatique de Grenoble, LIG, Grenoble, France
e-mail: [email protected]

Sameh Triki (1), Narjès Bellamine Ben Saoud (2)
(1) Université de La Manouba, ENSI, Tunisia
(2) Université Tunis El Manar, ISI, Tunisia
IT01 Laboratoire RIADI, Tunis, Tunisia
(1) e-mail: [email protected]
(2) e-mail: [email protected]

Chihab Hanachi
Université de Toulouse 1, Institut de Recherche en Informatique de Toulouse, IRIT, Toulouse, France
e-mail: [email protected]
Abstract- This paper presents research in progress that aims to design and develop a web-based shared environment for stakeholders involved in disaster management. The goal of this environment is two-fold. Firstly, it will provide a reliable disaster information source to facilitate the exchange and analysis of previous crisis information. Secondly, it will assimilate best practices and provide recommendations based on experiences from previous disasters. One of the first steps towards such an environment is to elaborate a common and generic disaster model. This model also serves as a reference for defining a template for the case base of previous disasters. In order for our system to provide recommendations based on previous practices, we combine case based reasoning with process mining. This article presents the first step towards a disaster management decision support system, specifically providing guidance on how to integrate process mining in the case based reasoning cycle.

Keywords- Generic Disaster Model; Process Mining; Case Based Reasoning; Previous practices; Crisis Management Decision Support.

I. INTRODUCTION

Over the last decades, crisis management has become a key topic due to the increasing number of crises, especially natural disasters, occurring all over the world. In the aftermath of such disasters, emergency services, humanitarian organizations and volunteers try to respond to the consequences in the best possible way. What is of paramount importance is the development of robust and reliable computer systems for storing information about past disasters that may be used to help manage ongoing disasters in real time.

Nowadays, there are many national or international disaster databases that collect data concerning previous disasters. While the same disaster may be described in more than one database, its description may differ since there is a lack of standardization on how to describe a disaster, and each database has its own disaster model. Moreover, there is no database that records, in a formal way, previous disaster management practices. The successful or unsuccessful processes that have been used to manage previous disasters are usually recorded in an unstructured way in reports or articles.

Our approach is divided into two parts. Firstly, we develop a unifying disaster model. This model covers information on past disasters, their affected systems and their management life cycle, including preparedness, response, recovery and prevention. Secondly, in order to provide online advice on good practices for disaster management inspired by previous experiences, we have integrated process mining for disaster management processes with case-based reasoning. In our case, we focus only on natural disasters. We use case-based reasoning to retrieve previously successful disaster management processes, and process mining to discover a new process that merges past solutions of similar disasters.

The contribution of this work is three-fold. Firstly, we present a generic disaster model on top of which all the other contributions are built. Secondly, a web-based shared environment has been developed. Its goal is to facilitate the exchange and analysis of previous disaster information by providing online recommendations such as crisis solving processes. Our third contribution concerns the method by which we generate these online recommendations; here we use a novel combination of process mining and case based reasoning.
The paper is organized as follows: Section 2 presents an introduction to process mining and case based reasoning; Section 3 describes the state of the art of some existing crisis management models and related work in case based reasoning and process mining; Section 4 describes the proposed generic disaster model; Section 5 presents our approach to providing disaster management recommendations; Section 6 presents the implemented web based shared environment and gives the first results; Section 7 concludes the paper and discusses some future improvements.

978-0-7695-5002-2/13 $26.00 © 2013 IEEE DOI 10.1109/WETICE.2013.77

II. CASE BASED REASONING AND PROCESS MINING

A. Case based reasoning
Case-based reasoning (CBR) is a problem solving and learning approach. It relies on previous experiences, or cases, to solve new ones. A case is composed of two parts: a description of the situation that represents the problem, and a solution used to solve it. The case may also describe the consequences of the given solution (e.g. success or failure). Most importantly, by extracting situations similar to the given problem, CBR techniques can create new solutions. CBR does not rely solely on general domain knowledge, or on generalized associations between problem descriptions and conclusions. Rather, it relies on specific knowledge from previous experiences. To deal with a given case, CBR performs a process based on four steps known as the 4REs [1]:
1. RETRIEVE the most similar case(s) to the one under consideration.
2. REUSE the information and knowledge collected from previous cases to extract a possible solution for the new case.
3. REVISE the proposed solution and adapt it to the new case.
4. RETAIN the parts of this experience likely to be useful for future problem solving.
CBR efficiency relies on the speed and precision of its retrieval and reuse algorithms, which are considered as a combination of searching and matching. Improving retrieval performance through more effective approaches has been the focus of several research works, with some authors combining CBR with other approaches such as data mining [2] and genetic algorithms [3] to enhance diversity and provide full coverage of the available cases. Our work follows this trend by integrating process mining techniques into the CBR process to enhance the efficiency of the reuse step.

B. Process mining
Process mining aims at discovering, monitoring and improving real processes by extracting knowledge from event logs available in today's information systems.
Log files may contain data recording three perspectives: behavioral (tasks and their time of execution), informational (data used and produced by tasks) and organizational (actors who perform tasks and their relationships). As mentioned by Van der Aalst, process mining comprises three main tasks: (i) discovery, (ii) conformance and (iii) enhancement (or extension) [4]. Process discovery, which is the one used in our approach, deals with the detection of process models from event logs. The discovered models can be represented by formal languages such as Petri nets or notations such as Business Process Model and Notation (BPMN). Conformance focuses on comparing a predefined process model with its real execution as stored in the log in order to detect inconsistencies. Enhancement aims at improving an existing model, based on information about its execution in an event log.

III. RELATED WORKS

At present, the majority of crisis response organizations have their own ad hoc databases of statistical information on natural disasters. Information is therefore collected and analyzed according to their own needs and priorities. However, we believe that disaster managers and researchers who are interested in collecting and analyzing data, and in using past experiences, would benefit from a general tool to better manage future crisis situations. Disaster management is a difficult domain to model since it deals with complex and heterogeneous systems that are physically and socially interconnected. In order to overcome this lack of standardization, a unifying disaster model is required. Moreover, current disaster databases, such as EM-DAT, Sigma and NatCat (i), do not sufficiently cover past disaster management practices. Hence, previous good or bad practices are insufficiently exploited and learned from. Consequently, the development of a more sophisticated natural disaster database covering these gaps would be a valuable asset for both facilitating data comprehension and improving disaster management in order to minimize damages. To this end, we adopt an approach that combines case based reasoning with process mining.

CBR has mainly been applied in the medical domain for diagnostic and some therapeutic tasks [5]. In the domain of molecular biology, CBR systems have been developed to address problems such as the analysis of genomic sequences and determining the structure of proteins [6]. It has also been used in many other domains such as geographical information systems [7] and even to generate music [8]. Furthermore, it has previously been combined with data mining to improve case-based classification [9], and with genetic algorithms to select and weight features for personnel rostering [10]. Both [9] and [10] improve only the first step of the CBR process, whereas our work focuses on the reuse step.

Process mining has recently been used in various domains, such as business for internal transaction fraud mitigation [11], healthcare [12] and in various other areas for analyzing resource behavior [13] and time prediction [14]. To our knowledge, neither case based reasoning nor process mining has yet been applied in the crisis domain. Moreover, process mining has not yet been combined with case based reasoning to manage ongoing crises.
In conclusion, there are very few complete approaches that take past disaster information into account to better manage present disasters in real time. To overcome this insufficiency, we first propose a unifying disaster model and then an approach that combines case based reasoning with process mining. The following section describes our disaster model, which constitutes the backbone of our system.

(i) EM-DAT, Sigma and NatCat are international disaster databases that provide an objective basis for vulnerability assessment and rational decision-making in disaster situations. They are accessible through the following URLs: EM-DAT: http://www.emdat.be, Sigma: http://www.swissre.com/ and NatCat: http://www.mrnathan.munichre.com/

IV.
A UNIFYING DISASTER MODEL

Crisis response may be supported by formalizing collected knowledge so that it may be transformed into an operational model. We have proposed a unified disaster model to provide a set of common concepts and generic interactions that apply to various disasters. Our work is based on the meta-model of Benaben et al. [15] and that of Othman et al. [16]. It can be noted that all disaster models define their sub-systems according to their own specific needs. In our case we need to store previous disaster practices, as well as the descriptions of the disasters, in order to be able to select and extract similar disasters. As a result, we used the three sub-systems of the metamodel of Benaben and his colleagues: disaster description, affected system and treatment system. However, unlike Benaben’s metamodel, the treatment system is divided into sub-systems following the 4 phases of crisis management: Prevention, Response, Recovery, and Preparedness. This is because the disaster management recommendation depends on the disaster phase; we cannot recommend a recovery process when a response process is needed. While the model of Othman et al. is composed of 4 separate models, our model proposes a unifying view of all the concepts involved in a disaster. The bottom of Fig. 1 shows the conceptual model, as a UML package diagram.

The model is divided into three packages: Disaster Description, Studied System, and Treatment System. The latter is sub-divided into four packages that represent the four phases of crisis management. The Disaster Description package contains all the information about the disaster itself: start date, end date, location, its major trigger(s) (which could be of different types, such as the result of climatic change or ice melts due to global warming, or as a result of a previous disaster, e.g. an earthquake that triggers a tsunami), the damages caused, and the disaster risk. The Studied System package describes the system affected by the disaster. This can be either a human system or a material system. The Treatment System package describes the crisis management process. It is divided into four packages: Preparedness Phase, Response Phase, Recovery Phase and Prevention Phase.

Since our approach uses case based reasoning, the knowledge base should be designed with respect to both the case base structure and the proposed disaster model. Therefore we used a case design pattern. This pattern was derived from our disaster model and it matches the required structure of a case, which contains the problem description and the associated solution, as shown at the top of Fig. 1.

Figure 1: Disaster meta-model and case pattern
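As an illustration only, the case pattern of this section (a problem description paired with a solution organized by management phase) could be represented along the following lines in Python. This is a minimal sketch and not the schema of the implemented system: all class names, field names and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional

# The four phases of the Treatment System package.
PHASES = ("preparedness", "response", "recovery", "prevention")


@dataclass
class DisasterDescription:
    """Problem part of a case (hypothetical field names)."""
    disaster_type: str
    location: str
    start_date: date
    end_date: Optional[date] = None
    triggers: List[str] = field(default_factory=list)
    damages: Dict[str, float] = field(default_factory=dict)
    risk: Optional[str] = None


@dataclass
class StudiedSystem:
    """System affected by the disaster: human or material."""
    kind: str  # "human" or "material"
    name: str


@dataclass
class TreatmentSystem:
    """Solution part of a case: one ordered task list per phase."""
    processes: Dict[str, List[str]] = field(
        default_factory=lambda: {phase: [] for phase in PHASES}
    )


@dataclass
class Case:
    description: DisasterDescription
    affected: List[StudiedSystem]
    solution: TreatmentSystem

    def process_for(self, phase: str) -> List[str]:
        """Return the stored management process for one phase."""
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase}")
        return self.solution.processes[phase]


# Invented sample data, for illustration only.
case = Case(
    description=DisasterDescription(
        disaster_type="flood",
        location="coastal region",
        start_date=date(2010, 1, 1),
        triggers=["heavy rainfall"],
    ),
    affected=[StudiedSystem(kind="human", name="local population")],
    solution=TreatmentSystem(),
)
case.solution.processes["response"] = ["alert", "evacuate", "shelter"]
```

Keeping one process per phase mirrors the reason given for splitting the treatment system: a recommendation is only meaningful for the phase that is currently needed, so `case.process_for("response")` never returns recovery tasks.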
Therefore, we have created a case base of all past disasters as problem descriptions, together with the practices used to manage those disasters stored as solutions.

V. COMBINING CASE BASED REASONING AND PROCESS MINING

Our approach is shown in Fig. 2. The right-hand side of the figure describes the CBR cycle, while the left-hand side details the retrieve and reuse steps.

Figure 2: CBR cycle as used in our approach

Our idea is to provide, in real time, a crisis management process that responds to a current crisis description (specified by the users). This process is discovered by analyzing the management processes of similar past crises. Given the huge number of previous disasters and the diversity of applied management processes, the provided disaster management process model is non-deterministic. The potential solutions represent a culmination of past experiences and the user is free to choose the one that is most appropriate to the current situation. For each given disaster description, we start by calculating its similarity to stored past disasters before extracting their associated management processes. Using process mining, we aim to extract the processes and generate a non-deterministic process model to use. Moreover, to use CBR and to adapt the process mining technique to our problem, we make the following assumptions in order to reduce the scope of our investigation:
- The knowledge base used is relevant and up to date; it covers a considerable number of disasters and is filled automatically.
- The disaster management processes are extracted from reports and articles shared on the web, then stored in our knowledge base with respect to the proposed disaster model.
- Only “good” solutions, i.e. those judged efficient and considered to be good practices, are stored.

We start the CBR process using the current disaster as the new case, then we apply a retrieval algorithm to extract the most similar previous crises. The management processes of those extracted cases are the starting point of the process mining technique; thus the set of selected processes used on similar crises replaces the event log. Since process mining is applied in the reuse phase of the CBR cycle, the description of the management processes should contain all the important parts of the event log structure used by the discovery algorithm of process mining. The proposed approach depends on two major algorithms:
- The retrieval algorithm used is a nearest neighbor retrieval algorithm [17], which is a simple approach that calculates the similarity between previously stored cases and the currently described one, taking into account the selected features of the crises.
- The mining algorithm used is the alpha algorithm [18]. This is one of the well-known algorithms used in process mining; it creates Petri nets based on an event log. The principle of this algorithm is to analyze the succession and causality relations between tasks in the log files and to deduce processes (sets of coordinated tasks where the coordination patterns may be sequence, parallelism or iteration).

Cases are represented according to the meta-model and the log file is limited to the treatment system package.
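The succession and causality relations on which the alpha algorithm is built can be illustrated with a short Python sketch. It covers only the first step of the algorithm (deriving the directly-follows, causal and parallel relations from a set of traces) and does not construct a Petri net. The task names and the three traces, standing in for the management processes of retrieved similar disasters, are hypothetical.

```python
def footprint(traces):
    """Derive basic ordering relations from a list of traces.

    Returns three sets of task pairs:
      follows  - a is directly followed by b in at least one trace
      causal   - a is followed by b, but b is never followed by a
      parallel - a is followed by b and b is followed by a
    """
    follows = set()
    for trace in traces:
        # Consecutive pairs of tasks in one stored process.
        follows.update(zip(trace, trace[1:]))
    causal = {(a, b) for (a, b) in follows if (b, a) not in follows}
    parallel = {(a, b) for (a, b) in follows if (b, a) in follows}
    return follows, causal, parallel


# Hypothetical response processes of three similar past disasters.
traces = [
    ["alert", "evacuate", "shelter", "assess"],
    ["alert", "shelter", "evacuate", "assess"],
    ["alert", "evacuate", "shelter", "assess"],
]

follows, causal, parallel = footprint(traces)
```

With these traces, "evacuate" and "shelter" appear in both orders and are therefore classified as parallel, whereas "alert" causally precedes both of them. A merged model built from such relations leaves the ordering of the parallel tasks open, which is one way to read the non-deterministic process model described above.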
A. First phase: Similar Cases Retrieval
The retrieval of similar cases assumes that the knowledge base of previous cases is already filled. As previously mentioned, we assume the existence of a reliable and relevant knowledge base that is populated with many cases; this is needed to ensure the usefulness of our approach. The retrieval approach consists of extracting, from the stored past crises, the set of cases with the highest degree of similarity to the given crisis. The system receives the description of the current crisis and uses the nearest neighbor algorithm to calculate its similarity to previous crises based on features selected by the user. For each stored crisis, the nearest neighbor algorithm [17] sums the scores of the compared features and then extracts the crises that match the largest number of features. Evaluation of this phase will be based on the response time needed to produce the similar cases. We aim for a short response time in order to facilitate real-time crisis management. This is feasible since several research works already use, for example, data mining [9] or genetic algorithms [10] to enhance the speed of the retrieval algorithm.
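The retrieval step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implemented retrieval code: features are compared by exact match, each feature carries a user-chosen weight, and the feature names, weights and case base entries are all hypothetical.

```python
def similarity(current, stored, weights):
    """Weighted share of the selected features on which two
    disaster descriptions agree (0.0 = none, 1.0 = all)."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    matched = sum(
        weight
        for feature, weight in weights.items()
        if current.get(feature) == stored.get(feature)
    )
    return matched / total


def retrieve(current, case_base, weights, k=2):
    """Return the k stored cases most similar to the current crisis,
    as (score, case) pairs sorted by decreasing score."""
    scored = [
        (similarity(current, case["problem"], weights), case)
        for case in case_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical case base: each case pairs a problem description
# with the management process that was applied.
case_base = [
    {"name": "A",
     "problem": {"type": "flood", "subtype": "flash flood", "region": "urban"},
     "solution": ["alert", "evacuate", "shelter"]},
    {"name": "B",
     "problem": {"type": "flood", "subtype": "flash flood", "region": "coastal"},
     "solution": ["alert", "shelter", "evacuate"]},
    {"name": "C",
     "problem": {"type": "earthquake", "subtype": "ground shaking", "region": "coastal"},
     "solution": ["search", "rescue"]},
]

current = {"type": "flood", "subtype": "flash flood", "region": "coastal"}
weights = {"type": 1.0, "subtype": 1.0, "region": 1.0}  # features chosen by the user

top = retrieve(current, case_base, weights, k=2)
```

Here case B matches all three selected features and case A matches two of them, so these two cases are returned and their stored solutions would be handed to the process mining phase. Exact matching keeps the sketch short; a graded per-feature distance could be substituted without changing the overall structure.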
B. Second phase: Process Mining
This phase consists in discovering a non-deterministic crisis management process from the management processes extracted from similar previous practices. It is concerned with process discovery, which consists of combining the processes used for similar disasters after mining the similarities and differences between them. In order to use process mining, the knowledge base of previous management processes should contain the same information as the event logs that are used as input to the process mining algorithm. We assume that all the stored previous practices were successful. The discovery algorithm is inspired by the alpha algorithm of process mining [4]. This simple algorithm was chosen to test the applicability of our approach. The input to the process mining algorithm is the set of management processes of the similar crises, organized according to their order of use in each disaster. We thus have a log file including all the solutions of similar past experiences, and we deduce a single process model that synthesizes (i.e. can replay) all the solutions.

VI. IMPLEMENTATION AND FIRST RESULTS

To validate our approach, we have developed a web-based shared environment for previous disaster data analysis and disaster management support. More precisely, it is an environment that provides the following services:
- Record information in a structured way and facilitate the extraction of raw or processed data to help in the synchronization and management of a disaster.
- Display different views of this information according to the user’s role. This ensures better handling of data and therefore effective resource management.
- Conduct a graphical analysis of disaster data.
- Archive (model / store / capitalize) past disasters.
- Make recommendations based on experience from previous disasters to ensure that future abnormal or catastrophic ones do not get out of hand and can be managed with minimal damage.

A. Use case diagram of the system
As shown in Fig. 3, we have two types of user: the disaster managers, who have access to the environment, and the system’s administrator (referred to below as the expert). Both have access to the disaster repository and to the online disaster management recommendation based on previous practices. However, only the expert has the right to update or populate the repository.

Figure 3: General Use Case Diagram

B. Human Computer Interface
Our environment enables the manipulation of the disaster repository. This is shown in the expert’s interface (first tab) in Fig. 4, which is composed of two parts.
Part 1: Visualize database, which is itself decomposed into four options:
1. Disaster profile, which searches for disaster information by the type and subtype of disasters.
2. Country profile, which searches for disasters by country.
3. Disaster list, which is a generic search that allows more detailed search criteria to be specified.
4. Analysis, which allows the user to choose two variables to correlate in order to extract the data needed for the graph representation.
Part 2: Fulfill database, which is restricted to the expert and allows data to be inserted manually into the repository.
Figure 4: Disaster repository manipulation framework

Disaster management support is accessed via the second tab, as shown in Fig. 5. Our proposed approach was tested using a case study, the result of which is shown below.

Figure 5: Example of disaster management recommendation result

VII. CONCLUSION AND FUTURE WORK
This paper has defined a unifying disaster model and specified a new approach for supporting disaster management systems. A web based environment has been developed that demonstrates the feasibility of our solution. The originality of our approach lies in the fact that it supports crisis management by uncovering past practices through the integration of process mining and case based reasoning. Although there are difficulties in modeling crises and in collecting information about existing crises and models due to the restricted access of some databases, we believe that the approach has merit.

Nevertheless, some improvements are envisaged. Firstly, the nearest neighbor algorithm, used for retrieving similar cases in CBR, is simple and was chosen just to validate the applicability of our approach. The similarity function can be improved by taking into account the effects of the disaster and not just its description. Considering that a crisis solving process aims at reducing crisis effects, two different disasters producing the same consequences should use the same management process. Secondly, although the alpha algorithm for process mining shows that our approach is feasible, in future we intend to use ProM (ii), which is a generic open-source framework for implementing process mining tools in a standard environment. This would allow us to add additional services to our application by using the two other functions of process mining: conformance and enhancement. Conformance may help to identify the deviation of a given crisis process from established norms or rules. A deviation could be considered as a transgression or, in other cases, as a good practice that could help in enhancing predefined processes and building new versions of crisis processes.

VIII. ACKNOWLEDGEMENTS
Julie Dugdale would like to acknowledge the support of the University of Agder, to which she is affiliated.

(ii) All information concerning this extensible framework can be found at the URL: http://www.processmining.org.
REFERENCES
[1] A. Aamodt and E. Plaza 1994. Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Communications 7(1), 39-59.
[2] N. Gouttaya and A. Begdouri 2012. Integrating data mining with case based reasoning (CBR) to improve the proactivity of pervasive applications. Information Science and Technology (CIST), 136-141.
[3] J. Jarmulak, S. Craw and R. Rowe 2000. Genetic Algorithms to Optimise CBR Retrieval. Springer-Verlag Berlin Heidelberg, 136-147.
[4] W.M.P. van der Aalst 2011. Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer.
[5] R. Schmidt, S. Montani, R. Bellazzi, L. Portinale and L. Gierl 2001. Cased-Based Reasoning for medical knowledge based systems. International Journal of Medical Informatics (64), 355-367.
[6] I. Jurisica and J. Glasgow 2004. Applications of case-based reasoning in molecular biology. Artificial Intelligence Magazine, Special issue on Bioinformatics, 25(1), 85-95.
[7] A. Holt and G. L. Benwell 1999. Applying case-based reasoning techniques in GIS geographical information science, 9-25.
[8] J.L. Arcos, R. Lopez de Mantaras and X. Serra 1998. A case based reasoning system for generating expressive musical performance. Journal of New Music Research.
[9] N. Arshadi and I. Jurisica 2005. Data Mining for Case-Based Reasoning in High-Dimensional Biological Domains. IEEE Transactions on Knowledge and Data Engineering, 1127-1137.
[10] G.R. Beddoe and S. Petrovic 2006. Selecting and weighting features using a genetic algorithm in a case-based reasoning approach to personnel rostering. European Journal of Operational Research 175, 649-671.
[11] M. Jans, J.M.E.M. van der Werf, N. Lybaert and K. Vanhoof 2011. A business process mining application for internal transaction fraud mitigation. Expert Systems with Applications, 38(10), 13351-13359.
[12] R.S. Mans 2011. Workflow Support for the Healthcare Domain. PhD Thesis, Technische Universiteit Eindhoven, Eindhoven, The Netherlands.
[13] W.M.P. van der Aalst, J. Nakatumba, A. Rozinat and N. Russell 2008. Business Process Simulation: How to get it right? BPM Center Report BPM-08-07, BPMcenter.org.
[14] W.M.P. van der Aalst, M.H. Schonenberg and M. Song 2009. Time Prediction Based on Process Mining. BPM Center Report BPM-09-04, BPMcenter.org.
[15] F. Benaben, C. Hanachi, M. Lauras, P. Couget and V. Chapurlat 2008. A Metamodel and its Ontology to Guide Crisis Characterization and its Collaborative Management. Proceedings of the 5th International ISCRAM Conference, Washington, DC, USA.
[16] S.H. Othman and G. Beydoun 2010. Metamodelling Approach To Support Disaster Management Knowledge Sharing. Proceedings of the 21st Australasian Conference on Information Systems, ACIS.
[17] J. Kolodner 1993. Case-Based Reasoning. San Mateo, California: Morgan Kaufmann.
[18] W.M.P. van der Aalst, A.J.M.M. Weijters and L. Maruster 2004. Workflow Mining: Discovering Process Models from Event Logs. IEEE Transactions on Knowledge and Data Engineering 16(9), 1128-1142.