ICACSIS 2010
ISSN: 2086-1796
Feature and Model Selection on Automatic Sleep Apnea Detection using ECG
Sani M. Isa, Mohamad Ivan Fanany, Wisnu Jatmiko, and Aniati Murni
Faculty of Computer Science, University of Indonesia
Email:
[email protected],
[email protected],
[email protected],
[email protected]
Abstract— The purpose of this study is to find the optimal features and classifier model selection for sleep apnea detection using ECG signals. We want to determine whether a set of unknown ECG signals (test data) belongs to the heavy apnea, mild apnea, or healthy category. We examine two recent approaches to feature selection: an approach proposed by Chazal et al. (2004), which is based on the RR-interval mean and time-series analysis; and an approach proposed by Yilmaz et al. (2010), which is based on the RR-interval median. We also examine cross-validation and random sampling methods for the classifier's probability model selection. We evaluate the approaches using three classifiers: k-Nearest Neighbor (kNN), Naive Bayes, and Support Vector Machine (SVM). In addition, we use self-organizing map (SOM) clustering as a preprocessing step to provide a better sample that can represent the entire training data. Our experiment using ECG data from PhysioNet shows that classification using only the 3 features proposed by Yilmaz et al. (2010) gives a gain of about 3.59% in overall classification accuracy (CA) and 7.5% in area under the ROC curve (AUC) over classification using the 8 features proposed by Chazal et al. (2004).
I. INTRODUCTION
Humans spend approximately one-third of their lives sleeping. Sleep is one of the basic needs of human beings, just as important as air, food, and beverages. Sleep is vital because regeneration of damaged body cells, growth hormone production, formation of the immune system, refreshing of unused memory in the brain, and resting of the parts of the brain that control emotions, decision-making, and social interaction all occur during sleep. Someone who experiences a sleep disorder cannot make optimal use of these functions. As a result, people with sleep disorders experience a weakened immune system, cannot concentrate properly, are unable to perform physical activity well, and face an increased risk of stress [1].
Manuscript received October 10, 2010. This work was supported by the Competitive Research Grant University of Indonesia 2010.
Many studies have shown that sleep disorders can be fatal. According to the National Institute of Neurological Disorders and Stroke, people experiencing sleep disorders showed poor driving performance when tested on a simulator. Driver fatigue is one of the main causes of traffic accidents. Data from the National Highway Traffic Safety Administration, USA, estimate that it accounts for more than 100,000 traffic accidents and 1,500 deaths annually. According to the International Labour Organization (ILO), in 2004, 4% of world GDP was spent on handling work accidents. In 2001, Indonesia recorded 100 thousand work accidents, 40% of which were caused by sleep disorders. To date, 84 kinds of sleep disorders have been identified, of which insomnia, sleep apnea, narcolepsy, and restless leg syndrome have the highest prevalence.
Sleep apnea is a sleep disorder characterized by reduction (hypopnea) or cessation (apnea) of breathing during sleep. Although sleep apnea can be treated, it can cause several side effects and complications such as heart attacks, strokes, high blood pressure, reduced productivity, and, in the chronic stage, sudden death [1], [2].
Polysomnography (PSG) has become the standard for diagnosing sleep disorders, including sleep apnea. Polysomnography includes recording of breath airflow, respiratory movement, oxygen saturation, electroencephalogram (EEG), electromyogram (EMG), electrooculogram (EOG), electrocardiogram (ECG) for heart activity, and body position. PSG is performed in a laboratory for a full night of sleep under the supervision of doctors and nurses [3]. Although PSG is recognized as the gold standard for diagnosing sleep apnea, it has drawn criticism from some researchers.
This is due to several reasons. First, the patient feels uncomfortable because of the large number of probes (more than about 16) that must be attached to the body during sleep. Second, the cost is high: a number of expensive sensors must be provided, and there are expenses for supervision by doctors and
nurses during the patient's sleep. Third, laboratories with PSG facilities are still limited, so a patient may have to wait 6 months or more for a single diagnosis [4]. There is therefore a need for a simpler technology that is as reliable as PSG but does not require a special laboratory.
ECG recording is one of the simpler and more efficient technologies for detecting sleep disorders. In 1984, Guilleminault proposed that a test on the RR-interval derived from the ECG signal could be used to screen for sleep apnea [5]. In 2000, PhysioNet held a contest to detect and quantify sleep apnea based on the ECG (Detecting and quantifying sleep apnea based on the ECG: A challenge from PhysioNet and Computers in Cardiology 2000). Since then, this issue has become one of the major topics in sleep research. In 2004, Chazal et al. proposed obstructive sleep apnea detection using the ECG signal. The features used in that study are statistical measures of variables derived from RR-intervals and the ECG-derived respiratory signal (EDR). Classification by linear discriminant analysis gives a 90% accuracy rate [4]. In 2010, Yilmaz et al. conducted research on sleep stage classification and apnea detection using a single-lead ECG. They used three features derived from the RR-interval: the median, inter-quartile range (IQR), and mean absolute deviation (MAD) values. Classification is done by kNN, Quadratic Discriminant Analysis (QDA), and Support Vector Machines (SVM). Classification based on QDA and SVM gives the best accuracy rate, 94.5% [5].
The aim of this study is to examine the two approaches in order to find the optimal features and the classifier's probability model estimation (k-fold cross-validation and random sampling, compared to a test on the training data) for validating classifier training and testing. We use features extracted from the RR-interval based on Chazal et al., and also the features suggested by Yilmaz et al. 
Performance assessment of each feature set and model selection method is done by measuring the classification performance (CA and AUC) in determining the presence of apnea on both the training and test data sets.
II. METHODOLOGY
A schematic diagram of the system used in this study is shown in Fig. 1. To obtain the RR-intervals, the ECG signal is processed by QRS detection using the PhysioToolkit library [10]. The output of the QRS detection, in the form of beat-to-beat annotations, indicates when each heartbeat occurs. RR-intervals are calculated from these annotations, but not all of them are processed: RR-intervals whose values are beyond the limits of human physiology (smaller than 0.5 or greater than 1.5) are eliminated [6].
Fig. 1. Schematic diagram of the system
The next stage is feature extraction. In this study, we examine two approaches to selecting the optimal features: first, the features suggested by Chazal et al. [4], and second, the features suggested by Yilmaz et al. [6]. Feature extraction is performed for each epoch of 1 minute duration. This value was chosen to synchronize each epoch with the reference data, i.e. the apnea annotation at the end of every minute of the ECG signal. To validate on the training data, we use cross-validation with 15 folds, and random sampling with 10 train/test repetitions, where the ratio of validation data to training data is 2 to 8. Next, we apply self-organizing map (SOM), or Kohonen network [11], clustering to select the most representative data from the training set. This is done by choosing the five clusters with the largest number of members. Classification is done for each epoch by kNN, Naive Bayes, and SVM to determine whether the epoch is in the apnea (A) or non-apnea (N) category. The number of epochs in each record that belong to the apnea class is used as the basis for determining whether the subject has sleep apnea, has borderline/mild apnea, or is normal. Classification assessment is done by calculating Classification Accuracy (CA) and Area Under the ROC Curve (AUC).
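The RR-interval filtering and per-epoch extraction of the three Yilmaz-style features (median, IQR, MAD) described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names are hypothetical, and several details are assumptions of the sketch: beat times and the 0.5/1.5 limits are taken to be in seconds, each interval is assigned to the epoch containing its first beat, MAD is computed as the mean absolute deviation about the mean, and quartiles follow the default convention of Python's statistics.quantiles.

```python
from statistics import mean, median, quantiles

RR_MIN, RR_MAX = 0.5, 1.5   # physiological limits; seconds is an assumption
EPOCH_SECONDS = 60.0        # 1-minute epochs, matching the per-minute annotations


def rr_intervals(beat_times):
    """Turn beat annotation times into (start_time, rr) pairs."""
    return [(t0, t1 - t0) for t0, t1 in zip(beat_times, beat_times[1:])]


def filter_rr(pairs, lo=RR_MIN, hi=RR_MAX):
    """Drop RR-intervals outside the physiological limits."""
    return [(t, rr) for t, rr in pairs if lo <= rr <= hi]


def epoch_features(pairs, epoch_seconds=EPOCH_SECONDS):
    """Compute median, IQR and MAD of the RR-intervals in each epoch.

    Returns {epoch_index: {"median": ..., "iqr": ..., "mad": ...}}.
    Epochs with fewer than two intervals are skipped, because the
    quartiles are not defined for them in this sketch.
    """
    epochs = {}
    for t, rr in pairs:
        epochs.setdefault(int(t // epoch_seconds), []).append(rr)

    features = {}
    for index, values in sorted(epochs.items()):
        if len(values) < 2:
            continue
        q1, _, q3 = quantiles(values, n=4)
        centre = mean(values)
        mad = mean([abs(v - centre) for v in values])
        features[index] = {"median": median(values), "iqr": q3 - q1, "mad": mad}
    return features
```

A record would then be processed as epoch_features(filter_rr(rr_intervals(beat_times))), where beat_times is the list of beat annotation times produced by the QRS detector. The resulting per-epoch feature vectors are what a classifier such as kNN, Naive Bayes, or SVM would take as input.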
A. Subjects
The database of ECG signals used in the 2000 Computers in Cardiology Conference Challenge was used in this study. It consists of 70 recordings, each containing a single ECG signal digitized at 100 Hz with 12-bit resolution, recorded continuously for approximately 8 hours (individual recordings vary in length from less than 7 hours to nearly 10 hours). Each recording includes a set of reference annotations, one for each minute of the recording, indicating the presence or absence of apnea during that minute. These reference annotations were made by human experts on the basis of simultaneously recorded respiration signals. The subjects of these recordings are men and women between 27 and 63 years of age, with weights between 53 and 135 kg (BMI between 20.3 and 42.1); AHI ranges from 0 to 93.5. For classification purposes we divide the 70 records into two groups, training data (A01-A35) and testing data (X01-X35), each consisting of 35 records: 20 records with AHI > 15 (category A: patients with sleep apnea), 5 records with 5 ≤ AHI ≤ 15 (category B: patients with borderline/mild apnea), and 10 records with AHI