International Journal of Medical Informatics 97 (2017) 239–246

Contents lists available at ScienceDirect

International Journal of Medical Informatics journal homepage: www.ijmijournal.com

A clinical decision support system for prediction of pregnancy outcome in pregnant women with systemic lupus erythematosus

Khadijeh Paydar a, Sharareh R. Niakan Kalhori a, Mahmoud Akbarian b, Abbas Sheikhtaheri c,∗

a Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Islamic Republic of Iran
b Rheumatology Research Center, Tehran University of Medical Sciences, Tehran, Islamic Republic of Iran
c Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Islamic Republic of Iran

Article info

Article history:
Received 2 May 2015
Received in revised form 14 October 2016
Accepted 29 October 2016

Keywords: Artificial neural network, Clinical decision support system, Pregnancy outcomes, Pregnancy complications, Premature birth, Stillbirth, Systemic lupus erythematosus

Abstract

Objective: Pregnancy in women with systemic lupus erythematosus (SLE) is strongly associated with poor obstetric outcomes. Predicting the foetal outcome is essential for maximizing the success of pregnancy. This study aimed to develop a clinical decision support system (CDSS) to predict pregnancy outcomes among SLE-affected pregnant women.

Methods: We performed a retrospective analysis of 149 pregnant women with SLE who were followed at Shariati Hospital (104 pregnancies) and a specialized clinic (45 pregnancies) from 1982 to 2014. We selected significant features (p < 0.10) using a binary logistic regression model in IBM SPSS (version 20). We then trained several artificial neural networks (multi-layer perceptron [MLP] and radial basis function [RBF]) to predict the pregnancy outcome. To evaluate the networks and select the most effective one, we used the confusion matrix and the receiver operating characteristic (ROC) curve. Finally, we developed a CDSS based on the most accurate network. MATLAB 2013b was used to design the neural networks and develop the CDSS.

Results: Initially, 45 potential variables were analysed by binary logistic regression, and 16 effective features (P-value < 0.1) were selected as the inputs of the neural networks. On the test data, the MLP network achieved an accuracy of 90.9%, a sensitivity of 80.0%, and a specificity of 94.1%. The corresponding measures for the RBF network were 71.4%, 53.3%, and 79.4%, respectively. Under 10-fold cross-validation, accuracy was 75.16% for the RBF network and 90.6% for the MLP network. Therefore, the MLP network was selected as the most accurate network for prediction of pregnancy outcome.

Conclusion: The developed CDSS based on the MLP network can help physicians to predict pregnancy outcomes in women with SLE.

© 2016 Elsevier Ireland Ltd. All rights reserved.
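As a reading aid for the accuracy, sensitivity, and specificity figures reported in the abstract, the following minimal Python sketch shows how these three measures follow from a binary confusion matrix. The function name `confusion_metrics` and the counts passed to it are hypothetical: the counts are not taken from the paper and were chosen only because they happen to reproduce the percentages reported for the MLP network. The sketch also assumes that spontaneous abortion is treated as the positive class.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from binary confusion-matrix counts.

    tp/fn: positive cases (assumed here: spontaneous abortion) predicted correctly/incorrectly
    tn/fp: negative cases (assumed here: live birth) predicted correctly/incorrectly
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),   # share of positive cases that were detected
        "specificity": tn / (tn + fp),   # share of negative cases that were detected
    }


# Hypothetical counts for illustration only (not reported in the paper).
metrics = confusion_metrics(tp=8, fn=2, tn=32, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these illustrative counts the script prints 90.9% accuracy, 80.0% sensitivity, and 94.1% specificity. Because the classes are imbalanced, accuracy alone would hide the lower detection rate for the positive class, which is why sensitivity and specificity are reported separately.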

∗ Corresponding author at: Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Yasmi St., Valiasr Ave., Tehran, Islamic Republic of Iran.
E-mail addresses: [email protected] (K. Paydar), [email protected] (S.R. Niakan Kalhori), [email protected] (M. Akbarian), [email protected], [email protected] (A. Sheikhtaheri).
http://dx.doi.org/10.1016/j.ijmedinf.2016.10.018
1386-5056/© 2016 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

Systemic lupus erythematosus (SLE) is a chronic autoimmune disease with a worldwide distribution. SLE, as an inflammatory multi-system disease with unknown aetiology, has different clinical manifestations and symptoms, laboratory signs, and prognosis in patients [1,2]. According to studies, the prevalence of SLE varies worldwide. A meta-analysis of 16 studies carried out in North America and Europe indicated that the prevalence of SLE is 23.8 per 100,000 [3]. In Iran, SLE is relatively common, with a prevalence of 40 per 100,000 [1]. SLE is far more prevalent in women than in men and is most common during the reproductive years [4].

Pregnancy in SLE-affected women is a major challenge. Young women with SLE who want to become pregnant face a number of risks to their own health and that of their unborn baby [5]. Pregnancies in SLE-affected women may lead to foetal loss in the form of a spontaneous abortion (before 20 weeks of gestation) or intrauterine foetal death (stillbirth after 20 weeks of gestation) [3]. Foetal loss has decreased significantly from 43% in the 1960s to 17% in the early 2000s in SLE-affected women [6]. Although foetal survival has increased in recent years due to better understanding of the causes of abortion in SLE patients and better therapeutic approaches [4,7], foetal loss continues to be higher among SLE-affected women than in the normal population. Overall, about 20% of pregnancies in SLE-affected women end in foetal loss [6]. A study conducted in 2013 reported a pregnancy loss rate of 35.8% in women with organ-threatening SLE, compared with 8.9% in healthy women [8]. Generally, the incidence of spontaneous abortion is approximately 14–35% among SLE-affected women, while it is 7–12.5% in the normal population [3].

Special gynaecological and rheumatological care is necessary to increase the chance of pregnancy success. Given the importance of pregnancy outcomes, pregnancy in SLE patients should be planned in advance, and the patients should be monitored before and during the pregnancy [9]. However, decision-making and prediction of pregnancy outcomes in SLE patients are complex tasks for physicians. A variety of factors with different degrees of effectiveness may affect the outcome of pregnancy in SLE-affected women. In addition, non-linear relationships between these factors and the outcome of pregnancy add further complexity [2,4,5,7,10–13].

Clinical decision support systems (CDSS) are considered to be useful tools for solving sophisticated problems. A CDSS is designed to help healthcare providers make timely decisions about their patients [14]. These systems can be developed based on experts' knowledge, historical data analysis, or clinical evidence [15,16]. In this regard, a variety of algorithms such as neural networks, fuzzy logic, regression, Bayesian networks, or a combination of these algorithms have been successfully applied in different fields of clinical practice [16–20].
Prediction of pregnancy outcomes for SLE patients can contribute noticeably to providing effective health consulting and therapeutic services, and can help prevent undesirable outcomes and the physical and psychological complications resulting from an abortion [21,22]. Given the importance of the factors that influence the outcome of pregnancy (risk factors), many studies have been conducted worldwide to identify them. For example, factors such as high blood pressure, lupus nephritis, thrombosis, anti-phospholipid antibodies, lupus flare in the six months before pregnancy, anti-cardiolipin antibodies, lupus anti-coagulant, anaemia, leucopoenia, age at time of conception, history of abortion, the number of children, and many others have been reported in previous studies [2,5,10,11,13,23–26]. The numerous risk factors and the complexity of the relationships among them necessitate the development of a CDSS. Application of CDSS in maternal care in developing countries has been previously reported [27]; however, to the best of our knowledge, no CDSS has been developed to support physicians in predicting the outcome of pregnancy in SLE-affected women. Therefore, in this study, we developed and evaluated a CDSS based on an artificial neural network (ANN).

2. Material and methods

2.1. Data source and data collection

To identify the potential variables, we first conducted a literature review and interviewed a rheumatologist. We searched PubMed, Scopus, and Google Scholar using the following strategy: (systemic lupus erythematosus OR SLE) AND (predictor* OR risk factor*) AND (pregnancy outcome OR foetal loss). We selected related articles published in English from 2005 onward. In addition, we considered local evidence and rheumatologist consultations. We identified 50 potential variables in this way; however, five of these 50 variables (anti-Ro/SS-A and anti-La/SS-B antibodies, anti-β2-glycoprotein I antibody [anti-β2GPI], serum albumin level, and antihypertensive use) were mostly not recorded in the patients' records. Therefore, we collected data for the other 45 variables.

The required data were collected retrospectively from the medical records of SLE-affected pregnant women who had visited a specialized hospital (Shariati Hospital) and a specialized clinic in Tehran, Iran from 1982 to 2014. We also used the database of these cases in the Rheumatology Research Center to collect some required data elements that were not available in the medical records. A trained researcher (the first author) extracted the data from the medical records and the database. Of 400 registered pregnancies, medical records for only 149 were available and eligible for analysis (104 from the hospital and 45 from the specialized clinic). The final dataset included the 45 possible influential factors and the pregnancy outcomes of the cases.

2.2. Pre-processing and feature selection

The outcome of pregnancy had four categories: first-trimester spontaneous abortion, second-trimester spontaneous abortion, preterm live birth, and term live birth. We re-classified these categories into two classes: spontaneous abortion and live birth. Data were missing for some variables. One approach to resolving this problem is to use the mean or mode of the variable over all samples belonging to the same class. This approach has been applied in previous studies [18,28]. Therefore, we also replaced missing values with the mean of each class for numeric values and the mode of each class for non-numeric values.

Multiplicity of the variables (features) may result in overtraining a model.
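The class-wise replacement of missing values described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation (the study itself used SPSS and MATLAB): the function `impute_by_class`, the field names `outcome`, `platelets`, and `hypertension`, and all values in `rows` are hypothetical, and missing values are assumed to be encoded as `None`.

```python
from collections import Counter
from statistics import mean


def impute_by_class(rows, label_key, numeric_keys, categorical_keys):
    """Fill None values with the per-class mean (numeric) or per-class mode (categorical)."""
    filled = [dict(r) for r in rows]  # copy so the input records stay untouched
    for cls in {r[label_key] for r in filled}:
        members = [r for r in filled if r[label_key] == cls]
        for key in numeric_keys:
            observed = [r[key] for r in members if r[key] is not None]
            if observed:  # skip variables with no observed value in this class
                fill = mean(observed)
                for r in members:
                    if r[key] is None:
                        r[key] = fill
        for key in categorical_keys:
            observed = [r[key] for r in members if r[key] is not None]
            if observed:
                fill = Counter(observed).most_common(1)[0][0]  # most frequent value
                for r in members:
                    if r[key] is None:
                        r[key] = fill
    return filled


# Hypothetical records for illustration only (not data from the study).
rows = [
    {"outcome": "live_birth", "platelets": 250.0, "hypertension": "no"},
    {"outcome": "live_birth", "platelets": None, "hypertension": "no"},
    {"outcome": "live_birth", "platelets": 230.0, "hypertension": None},
    {"outcome": "abortion", "platelets": 180.0, "hypertension": "yes"},
    {"outcome": "abortion", "platelets": None, "hypertension": "yes"},
    {"outcome": "abortion", "platelets": 200.0, "hypertension": None},
]
completed = impute_by_class(rows, "outcome", ["platelets"], ["hypertension"])
print(completed[1]["platelets"], completed[4]["platelets"])
```

Because the fill values are computed per outcome class, this kind of imputation can only be applied to records whose outcome is already known, such as the retrospective records used for training.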
Therefore, dimension reduction, or feature selection, is one of the most commonly used pre-processing techniques: irrelevant, weakly relevant, or less important features are removed [28]. Feature selection may improve the accuracy of the resulting model, and researchers have applied a variety of techniques for this purpose [17,29,30]. To this end, we used binary logistic regression in IBM SPSS (version 20). Sixteen important features (P-value