Comparison of the learning algorithms for ... - Semantic Scholar

Safety, Reliability and Risk Analysis: Beyond the Horizon – Steenbergen et al. (Eds) © 2014 Taylor & Francis Group, London, ISBN 978-1-138-00123-7

Comparison of the learning algorithms for evidence-based BBN modeling: A case study on ship grounding accidents

Mazaheri Arsham & Sormunen Otto-Ville Edvard

Aalto University, School of Engineering, Department of Applied Mechanics, Marine Technology, Finland

Hyttinen Noora

Aalto University, School of Science, Department of Mathematics and System Analysis, Finland

Montewka Jakub & Kujala Pentti

Aalto University, School of Engineering, Department of Applied Mechanics, Marine Technology, Finland

ABSTRACT: Ship grounding accidents account for more than 50 percent of all maritime accidents in the Gulf of Finland. In other sea areas this share may differ, but grounding remains a predominant accident type, and it therefore poses risks to maritime transportation systems all around the world. These risks can be evaluated with numerous models that estimate the probability and consequences of an accident. However, when it comes to the next step, namely risk management, these models are not suitable: in most cases they can handle only pre-defined solutions, so systematic and optimal decisions about the risk in the transportation system cannot be made. This paper proposes a methodology for evidence-based BBN risk modeling that attempts to fill this gap. The model gathers data from various sources on the grounding accidents that have happened in the Gulf of Finland, organized in the form of chains of events. To structure the model, the accident reports of ship grounding accidents have been used and expert knowledge elicitation sessions have been held. The paper then compares the available algorithms for learning the structure of a BBN from the dataset extracted from the accident reports. Although the “Bayesian Search” algorithm is shown to work better for the studied problem and with the utilized data, it is recommended not to rely on a single algorithm when making BBN risk models based on evidence extracted from accident reports.

1 INTRODUCTION
Those who have seen the documentary series “Seconds from Disaster” by National Geographic Channel will probably recall its famous opening sentence: “disasters don’t just happen; they are triggered by chains of critical events”. Maritime disasters, like ship grounding accidents, follow much the same logic, which is also presented in Heinrich’s Domino theory (Heinrich, 1950) and highlighted by Reason’s well-known Swiss cheese theory (Reason, 1990). Critical, unbroken chains of events follow one another sequentially and in the end cause the accident. In order to prevent the accident, these causal chains of events need to be broken by implementing barriers as Risk Control Options (RCO). However, what those barriers would be and where they should be placed in order to break the chains and prevent

the accident is the key question for decision makers and risk managers. Traditionally, risk managers rely on risk models that represent the systems of interest and mimic their behavior under different conditions, in order to define what kind of barriers or RCO are needed and where they should be placed to protect the system from the accident. Nevertheless, the effectiveness of the implemented RCO depends highly on the accuracy and reliability of the risk models used for analyzing the system. Therefore, in order to mitigate the risk effectively, the implemented risk model should firstly reflect the knowledge on the analyzed system with satisfactory accuracy (Aven, 2013), and secondly be able to suggest feasible and meaningful measures for lessening the involved risk (Montewka et al., 2013a). Bayesian Belief Network (BBN) models are considered proper tools that are able to suggest


the risk control measures by backward reasoning (Bobbio et al., 1999). However, in order to deliver reliable and effective results for the risk management process, Bayesian networks should be based on evidence, and not just on intuitive opinions (Stirling and Gee, 2002). Using accident reports as a rich and valuable source of data for evidence-based risk modeling and constructing BBNs has been shown to be a promising method for maritime risk modeling (Kristiansen, 2010); see Figure 1. However, quite many issues need to be addressed when using accident reports as evidence to build risk models, so as not to jeopardize the reliability of the resulting evidence-based risk model. The issues include, but are not limited to, the biases that may exist in the accident reports because of the accident investigators; the biases that may be added to the information extracted from the accident reports by the person who reviews the reports; the taxonomy and categorization of the causal factors extracted from the reports in order to capture statistically meaningful frequent events in each category (Kristiansen, 2010); the level and quality of human intervention as background knowledge when a network is constructed; and the learning algorithms that are used for constructing the network and learning the parameters (Montewka et al., 2013b). All of the above-mentioned concerns need to be studied and discussed in order to be addressed. In this study, we are trying to open up the discussion

Figure 1. Evidence-based BBN modeling.

about one of the many issues, namely the suitable algorithm for learning the structure of a BBN for studies related to maritime safety. This study is based on the learning algorithms implemented in the open source software entitled GeNIe™ (Graphical Network Interface), developed by the Decision Systems Laboratory of the University of Pittsburgh. The study focuses only on grounding as one of the maritime accident types, using the accident reports on ship grounding accidents.

2 DATA
The data that can be used for a study can be divided, depending on their content, into two groups: primary and secondary. Normally, secondary data are synthesized from primary sources; therefore they may carry some level of bias. Thus, in order to have more reliable data sources, the utilized data should preferably come from, or be accompanied by, primary sources. However, accessing primary sources is not always an option, especially for events that have happened in the past. For maritime safety analysis, accident reports normally present valuable information regarding why and how an accident happens. The most important weakness of most accident reports is that they focus mostly on factual information regarding the vessel, times and places, and then the consequences, and not so much on the causal factors (Kristiansen, 2010). Nevertheless, since obtaining primary data about accidents that have happened in the past is nearly impossible, using accident reports as a secondary source of data is unavoidable. Accident reports are categorized as a secondary source of data because they are prepared from the primary data that the investigator obtained first-hand by interviewing the crew and analyzing the evidence after the accident.
Therefore, two matters should be taken into account when using previously prepared accident reports: first, the possible bias of the accident investigator in the way the report is prepared; and second, the bias introduced by the person who reviews the reports as a secondary source of data and tries to extract the causal factors from them. However, these issues are out of the scope of this study and have not been addressed for the secondary data utilized here. The historical data utilized for this study are the accident reports of grounding accidents prepared by the Accident Investigation Board of Finland (Onnettomuustutkintakeskus). The Accident Investigation Board of Finland (AIB)


has the policy to investigate marine accidents that happen in Finnish territorial waters, no matter which flag the vessels involved are flying. It also investigates marine accidents that happen outside Finnish territorial waters if at least one Finnish-flagged vessel is involved. However, not all accidents meeting the above criteria are investigated by AIB, but only those from which the acquired knowledge can enhance safety awareness in marine traffic (AIB, 2012). The publicly available database of AIB contains 73 reports of grounding accidents that happened between the years 1995–2009, which are utilized in this study as the secondary source of data. Since the use of accident reports was unavoidable, another, accompanying source of data in the form of expert knowledge is used to mitigate the inherent weaknesses of accident reports as a secondary source. The expert knowledge for this study is acquired as a primary source via an expert workshop, unstructured interviews, and a questionnaire; the acquisition processes are explained in the next section.

3 METHODOLOGY
The methodology of this study consists of four main stages. The first stage is the context analysis of the accident reports, in order to extract the causal chains of critical events that have previously caused grounding accidents. In the second stage, much the same knowledge is acquired from domain experts using an expert workshop, an electronic questionnaire, and unstructured interviews. The third stage is building the BBNs based on the primary and secondary data extracted in the first two stages. The GeNIe™ software, which is a graphical editor for creating, modifying, and calculating Bayesian networks, is used to create the BBNs based on the extracted data.
In GeNIe™, the operator can create the networks and enter the probabilities or equations of the parameters manually, or preprocessed datasets can be fed to the software as inputs for learning the structure of the network, its parameters, or both. The final stage is the comparison of the constructed BBNs to find the most suitable algorithm for learning BBNs in similar studies.

3.1 Accident reports
All 73 AIB accident reports on grounding accidents are reviewed in order to extract the causal chains of events that previously led to a grounding accident.

The context analysis of the accident reports was done in four stages. First, the relevant parts of the reports were reviewed, and the causal chains of events that led the vessels to run aground were visualized as timeline stories using the words and sentences of the reports (see Fig. 2). The parts reviewed were mostly the Summary, the Accident, the Analysis, and the Conclusion; in some reports, however, other parts needed to be reviewed to get a better understanding of the event. Second, based on the extracted timeline stories, some general causal factors were defined and the extracted causal events were assigned to them (see Fig. 3). Since every single accident was caused by a unique chain of events, the detected causes were generalized into 35 comprehensive groups, shown in Table 1, in order to capture frequent events in each defined causal factor. Third, a frequency table for the defined causal factors was prepared in MS-Excel™ based on the extracted timeline story of each accident. The frequency table is later used as the input data for the GeNIe™ software when creating the BBN. Fourth, using the defined causal factors (Table 1) as the nodes and looking at the extracted timeline stories, a BBN was constructed by the researcher, as an expert in the maritime safety domain, such that it represents the causality in all the reviewed accident reports.
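The step of turning the per-accident timeline stories into a table that structure-learning software can read might be sketched as follows. This is only an illustration under stated assumptions: the table is assumed to hold one row per accident report and one Present/Absent column per causal factor (in line with the two node states used in this study), and the report identifiers and factor names are hypothetical placeholders, not entries from Table 1.

```python
import csv
import io
from typing import Dict, List, Set


def build_presence_table(
    accidents: Dict[str, Set[str]], factors: List[str]
) -> List[List[str]]:
    """Return a header row plus one Present/Absent row per accident report."""
    rows = [["report"] + factors]
    for report_id in sorted(accidents):
        observed = accidents[report_id]
        # Reject codes that are not part of the agreed factor taxonomy.
        unknown = observed - set(factors)
        if unknown:
            raise ValueError(f"{report_id}: unknown factors {sorted(unknown)}")
        rows.append(
            [report_id]
            + ["Present" if f in observed else "Absent" for f in factors]
        )
    return rows


def to_csv(rows: List[List[str]]) -> str:
    """Serialize the table as CSV text for import into other tools."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical taxonomy and coded reports (placeholders only).
FACTORS = ["poor_communication", "darkness", "pilot_on_board"]
ACCIDENTS = {
    "report_A": {"poor_communication", "darkness"},
    "report_B": {"pilot_on_board"},
}

table = build_presence_table(ACCIDENTS, FACTORS)
print(to_csv(table))
```

Keeping the factor list explicit and rejecting unknown codes is one way to make sure every report is coded against the same taxonomy before any learning is attempted.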

Figure 2. An example of the extracted timeline stories from the accident reports.


Table 1. Factors that cause grounding accidents based on the reviewed accident reports.

Table 2. Properties of the created BBNs in the study.

BBNs       Nodes   Arcs   Parameters
BBN_1      36      68     932
BBN_2      34      73     1220
BBN_3      41      114    98130
BBN_Exp    36      110    15980
BBN_GTT    36      146    17890
BBN_BS     36      109    13004

Figure 3. Assigning the keywords in the reports (see Fig. 1) to more general factors.

3.2 Expert knowledge
3.2.1 Interviews
The BBN constructed in the previous stage is revised through two unstructured interviews with two marine experts. Both interviewed experts have nautical education and experience, and

they both also work as researchers in the maritime safety field. The interviews are performed in an unstructured way, mainly as an open discussion. At the beginning of each interview session, the aim of the study is explained to the expert, followed by the context analysis process. Thereafter, the constructed BBN is shown to the expert, who is asked to comment on or modify the nodes and the edges based on his expertise. After the interviews, the BBN is modified and finalized by the researcher based on the results of the interviews. The result of this process is a BBN that is referred to here as BBN_1 (see Table 2).

3.2.2 Workshop
After defining the affecting factors based on the accident reports, a one-day expert workshop is


organized with six experts in the domain. Three of the experts had research experience in the maritime safety field, and three had nautical education and experience. At the beginning of the workshop, its aim was explained, and then the basics of BBNs were explained to the participants to make sure that all the experts had basic knowledge of BBNs. Thereafter, the participants were divided into two groups of three, where one group had two mariners and one researcher (G1) and the other had one mariner and two researchers (G2). The groups were given Post-It™ sheets on which the causal factors defined in the context analysis stage (Table 1) were written. The experts were asked to construct a BBN for grounding accidents using the given causal factors, and to remove or add factors if they found it necessary. The groups were asked to start the BBN from the given starting point shown in Figure 4. Each group had a supervisor who had previously participated in reviewing the accident reports. Nevertheless, the supervisors did not intervene in the construction process; their role was only to answer questions that arose about the causal factors and to manage the timing of the session. At the end of the workshop, two BBNs were constructed and delivered by the experts (G1 delivered BBN_2 and G2 delivered BBN_3; see Table 2).

3.2.3 Questionnaire
The two networks from the workshop (BBN_2 and BBN_3), together with BBN_1, were then integrated into one network, referred to here as BBN_Exp (see Table 2). During the integration process, the differences between the three networks (i.e. missing

arcs, extra arcs, incorrectly oriented arcs, and missing nodes) as well as the cycles detected in BBN_2 and BBN_3 were addressed via an electronic questionnaire. The questionnaire was prepared based on the differences and cycles detected in the networks, and the same experts who participated in the workshop were contacted again to answer it and clarify the detected problems. Thereafter, the answers were collected and ranked based on the background of the experts, such that the experts with nautical experience were ranked twice as high as the researchers. Finally, the structure of BBN_Exp was delivered based on the results of the workshop and the questionnaire.

3.3 GeNIe™ (Graphical Network Interface) software
The frequency table created in the context analysis stage is used as the input for the GeNIe™ software to learn the structure of a BBN for grounding accidents. GeNIe™ has six learning algorithms that can be used to learn the structure of a BBN. To learn more about the algorithms, readers are referred to e.g. Dash and Druzdzel (2003), Friedman (1998), and Tsamardinos et al. (2006). However, among the available algorithms, only Greedy Thick Thinning (GTT) and Bayesian Search (BS) can be used within the constraints of our study. The Naïve Bayes-based algorithms do not support input from experts as background knowledge; the PC algorithm requires a two-step intervention by human experts; and the Essential Graph Search algorithm does not converge within the constraints of our study and thus gives a runtime error. The software is thus fed the frequency table from the first stage (context analysis) as the input data, and then the integrated BBN_Exp from the second stage (expert knowledge) as the background knowledge. Using the GTT and BS algorithms, two BBNs are structured by the software. They are referred to here as BBN_GTT and BBN_BS (see Table 2).
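Table 2 reports a parameter count for every network. How such a count follows from a network's structure might be sketched as below, assuming every node is discrete with the same number of states and that a node's conditional probability table has one entry per combination of its own state and its parents' states. Whether the counts in Table 2 follow this convention or count only independent parameters is not stated in this study, so both are computed; the three-node network is a hypothetical example.

```python
from typing import Dict, Iterable, List, Tuple

Arc = Tuple[str, str]  # (parent, child)


def count_parameters(
    nodes: List[str], arcs: Iterable[Arc], states: int = 2
) -> Dict[str, int]:
    """Count CPT entries and independent parameters of a discrete network."""
    n_parents = {node: 0 for node in nodes}
    for _parent, child in arcs:
        n_parents[child] += 1
    # A node with k parents has states**k parent configurations,
    # each with `states` probabilities, of which states - 1 are free.
    cpt_entries = sum(states ** (k + 1) for k in n_parents.values())
    independent = sum((states - 1) * states ** k for k in n_parents.values())
    return {"cpt_entries": cpt_entries, "independent": independent}


# Hypothetical network: A -> C <- B, all nodes binary.
counts = count_parameters(["A", "B", "C"], [("A", "C"), ("B", "C")])
print(counts)
```

Because the count grows exponentially with the number of parents of a node, two networks with similar numbers of arcs can still differ widely in their number of parameters.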
It is worth mentioning that the parameters of the constructed BBNs in Table 2 are counted on the basis that all the nodes in the BBNs (the affecting factors in the frequency table) had only two states, Present and Absent.

3.4 Comparison of the BBNs

Figure 4. The given starting point to the group of experts for constructing BBNs.

The structure of any two BBNs can be compared with regard to different criteria in order to find how much the structure of the learned network differs from the network considered to be the original (Jongh and Druzdzel, 2009). The most used


criteria are: Missing and Extra edges, which show how many arcs, irrespective of direction, are absent from one network while present in the other (the lower these values, the better); Matching edges, which show how many arcs, irrespective of direction, are common to the two networks (the higher, the better); Matching oriented edges, which show how many common arcs also have the same direction in both networks (the higher, the better); Non-matching oriented edges, which show how many common arcs have opposite directions in the two networks (the lower, the better); and Hamming Distance (Acid and de Campos, 2003, Tsamardinos et al., 2006, Perrier et al., 2008), which measures the number of changes that need to be made to a network in order to turn it into the original one (the lower, the better). In this study, BBN_Exp is considered the original network, and BBN_GTT and BBN_BS are compared against it using the above-mentioned criteria. The results of the comparison are shown in Table 3.
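The criteria above can be sketched in a few lines. This is not the code used in the study; it is a minimal illustration that assumes both networks are given as sets of directed arcs over the same node names, that "missing" means present in the original but absent from the learned network, and that the Hamming distance is the sum of missing, extra, and incorrectly oriented arcs (definitions vary between the cited papers). The arcs are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Arc = Tuple[str, str]  # (parent, child)


def compare_structures(
    original: Iterable[Arc], learned: Iterable[Arc]
) -> Dict[str, int]:
    """Compare two DAG structures arc by arc."""
    orig_arcs = set(original)
    learn_arcs = set(learned)
    # Skeletons ignore direction: each arc becomes an unordered pair.
    orig_skel = {frozenset(arc) for arc in orig_arcs}
    learn_skel = {frozenset(arc) for arc in learn_arcs}

    missing = len(orig_skel - learn_skel)
    extra = len(learn_skel - orig_skel)
    matching = len(orig_skel & learn_skel)
    correctly_oriented = len(orig_arcs & learn_arcs)
    incorrectly_oriented = matching - correctly_oriented
    return {
        "missing": missing,
        "extra": extra,
        "matching": matching,
        "correctly_oriented": correctly_oriented,
        "incorrectly_oriented": incorrectly_oriented,
        "hamming": missing + extra + incorrectly_oriented,
    }


# Hypothetical original and learned networks.
ORIGINAL = [("A", "B"), ("B", "C"), ("C", "D")]
LEARNED = [("A", "B"), ("C", "B"), ("A", "D")]

result = compare_structures(ORIGINAL, LEARNED)
print(result)
```

In this toy case the learned network keeps A→B, reverses B→C, drops C–D, and adds A–D, so each kind of discrepancy appears exactly once.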

4 RESULTS
Table 3 shows that the GTT algorithm yields a better-structured BBN for this study than the BS algorithm on three tests: Extra Arcs, Matching Arcs, and Correctly Oriented Arcs. However, the differences are small: GTT outperforms BS by only one arc in each of those tests. In contrast, in the two tests where BS performs better than GTT, i.e. Missing Arcs (4 versus 40) and Hamming Distance (9 versus 44), the differences between the two algorithms are substantial. This leads us to conclude that, based on the performed tests, the BS algorithm yields a better structure than GTT within the criteria of our study. Moreover, Table 2 shows that the number of parameters of the BBN structured by the GTT algorithm is about 37% higher than that of the one structured by the BS algorithm. Since the conditional probabilities of the parameters need to be defined at the next stage in order to run the BBN, a lower number of parameters is an advantage. Thus, from this perspective too, BS is preferred over GTT. Nevertheless, although the BS algorithm shows better results from the structural point of view, the final decision on choosing one algorithm over another for learning a network from a dataset depends on the algorithm's performance in learning both the structure and the parameters. Moreover, the choice of algorithm is most probably problem dependent. Therefore, the conclusion of this study is valid only for grounding accidents and similar studies. What can be recommended based on the results of this study is therefore not to rely on only one algorithm when learning a BBN from a dataset, but rather to try different algorithms and choose the best one within the criteria of the problem by comparing the final results of the networks and validating them with different methods.

Table 3. Comparison between the BBNs constructed by GeNIe™ and BBN_Exp as the original network.

Method                      BBN_BS   BBN_GTT
Missing arcs                4        40
Extra arcs                  5        4
Matching arcs               105      106
Correctly oriented arcs     105      106
Incorrectly oriented arcs   0        0
Hamming distance            9        44

5 DISCUSSION AND CONCLUSION

There are many models available for assessing the risk of different accidents in the maritime field. However, not all of them are useful for risk managers seeking to mitigate risk, as they are not evidence-based. BBNs have been shown to be a useful and reliable tool for risk management, as they can suggest proper risk control options. Nonetheless, they cannot be reliable enough if they are not based on evidence. Using accident reports as one of the rich sources of evidence for risk modeling has been shown to be beneficial for maritime risk assessment. However, using accident reports as the data source for risk modeling raises its own concerns, one of which is the learning algorithm used for constructing the BBN. In this study, we show that among the learning algorithms implemented in the GeNIe™ software, the Bayesian Search algorithm works better for grounding risk modeling (see Table 3). This finding may be generalized to problems that have the same criteria, which in general means risk modeling in the maritime safety field. Notwithstanding, we do not recommend relying on only one learning algorithm when building BBN risk models based on accident reports; rather, we recommend using different algorithms, as each algorithm may yield a better model depending on how the problem is formulated, how the causal chains are extracted, and how the causal factors are coded into more general categories. Moreover, although the test methods for comparing the structure of the BBNs can show the


better algorithm for learning the network from the utilized dataset, the final decision on choosing one algorithm over another should be made based on the final result of the network, that is, once the parameters of the network have also been learned and the final results of the network have been calculated and validated. In this study we did not learn the parameters of the network from the utilized dataset, as the dataset was built only from accident reports. Accident reports show only the chains of events that ended in an actual accident. Thus, learning the parameters from such a dataset will result in optimistic probabilities with regard to the causal factors. To address this issue, one can either use expert knowledge to modify the learned parameters or also include incident reports. Incidents, or near-miss cases, are causal chains of events that did not end in an actual accident because one causal link was missing. Therefore, by using near-miss reports, a more accurate frequency of the events can be extracted, and thus the learned parameters will be more reliable. However, due to the limitations of the available databases, input from experts is still unavoidable. Nevertheless, by using the available evidence (accident and incident reports), one can decrease the uncertainty commonly present in expert knowledge. Therefore, it is recommended to perform similar studies using datasets that consist of both accident and incident reports, to see whether the results change. This study also found that the lack of proper communication and cooperation on the bridge between the pilot, the master, and the crew is the most common causal factor for grounding accidents (see Table 1).
Since the frequency table of the causal factors on grounding used in this study is based on the accident reports of AIB, and given AIB’s investigation policy, we can conclude that the detected trend in the frequencies of the causal factors on grounding is in general a universal trend, meaning that the factors at the top of Table 1 are most probably the most frequent causal factors in the chains of events that cause grounding accidents. The next step is to address the other mentioned issues regarding the use of accident reports for evidence-based BBN modeling. These issues are the biases that may exist in the accident reports because of the accident investigators, the biases that may be added to the information extracted from the accident reports by the person who reviews the reports, the taxonomy and categorization of the causal factors extracted from the reports in order to capture statistically meaningful frequent events in each category, and the level and quality of the human

intervention as background knowledge when a network is constructed.

ACKNOWLEDGEMENT
The distances between the different BBN structures have been measured with computer code written by Martijn de Jongh from the Decision Systems Laboratory of the University of Pittsburgh. This study was conducted as a part of the MIMIC (Minimizing risks of maritime oil transport by holistic safety strategies) and CHEMBALTIC (Risks of Maritime transportation of chemicals in the Baltic Sea) projects. The projects are partly funded by the European Union, and the financing comes from the European Regional Development Fund (ERDF), the Central Baltic INTERREG IV A Programme 2007–2013; the Finnish Funding Agency for Technology and Innovation (Tekes); the City of Kotka; Kotka-Hamina Regional Development Company (Cursor Oy); Port of HaminaKotka Ltd; Centre for Economic Development, Transport and the Environment of Southwest Finland (VARELY); Neste Oil Oyj; Vopak Chemicals Logistics Finland Oy; Crystal Pool Ltd; and Finnish Transport Safety Agency (Trafi). Other support also comes from Finnish Port Association and Finnish Ship-owners Association.

REFERENCES
Acid, S. & de Campos, L.M. (2003) Searching for Bayesian network structures in the space of restricted acyclic partially directed graphs. Journal of Artificial Intelligence Research, 18, 445–490.
AIB (2012) Accident Investigation Board of Finland.
Aven, T. (2013) A conceptual framework for linking risk and the elements of the data-information-knowledge-wisdom (DIKW) hierarchy. Reliability Engineering and System Safety, 111, 30–36.
Bobbio, A., Portinale, L., Minichino, M. & Ciancamerla, E. (1999) Comparing Fault Tree and Bayesian Networks for dependability analysis. In Kanoun, K. (Ed.) Computer Safety, Reliability and Security. Berlin/Heidelberg, Springer.
Dash, D. & Druzdzel, M.J. (2003) Robust independence testing for constraint-based learning of causal structure. In Kaufman, M. (Ed.) 19th Annual Conference on Uncertainty in Artificial Intelligence (UAI-03).
Friedman, N. (1998) The Bayesian structural EM algorithm. In Kaufman, M. (Ed.) 14th Annual Conference on Uncertainty in Artificial Intelligence (UAI-98).
Heinrich, H.W. (1950) Industrial Accident Prevention, McGraw Hill.
Jongh, M.D. & Druzdzel, M.J. (2009) A Comparison of Structural Distance Measures for Causal Bayesian Network Models. Recent Advances in Intelligent Information Systems, 443–456.


Kristiansen, S. (2010) A BBN approach for analysis of maritime accident scenarios. In Ale, B.J.M., Papazoglou, I.A. & Zio, E. (Eds.) ESREL. Rhodes, Greece, Taylor & Francis Group.
Montewka, J., Goerlandt, F. & Kujala, P. (2013a) On a risk perspective for maritime domain. Journal of Polish Safety and Reliability Association, (4) 1–2, 101–108.
Montewka, J., Sinclair, H., Kujala, P., Haapala, J. & Lensu, M. (2013b) Modelling ship performance in ice using Bayesian networks. In the Proceedings of the 22nd International Conference on Port and Ocean Engineering under Arctic Conditions (POAC), June 9–13, 2013, Espoo, Finland.

Perrier, E., Imoto, S. & Miyano, S. (2008) Finding Optimal Bayesian Network Given a Super-Structure. Journal of Machine Learning Research, 9, 2251–2286.
Reason, J. (1990) The contribution of latent human failures to the breakdown of complex systems. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 327, 475–484.
Stirling, A. & Gee, D. (2002) Science, precaution, and practice. Public Health Reports, 117, 521–533.
Tsamardinos, I., Brown, L.E. & Aliferis, C.F. (2006) The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65, 31–78.
