
Physiological Measurement


PAPER

Multiclass classification of obstructive sleep apnea/hypopnea based on a convolutional neural network from a single-lead electrocardiogram

To cite this article: Erdenebayar Urtnasan et al 2018 Physiol. Meas. 39 065003


Physiol. Meas. 39 (2018) 065003 (9pp)

https://doi.org/10.1088/1361-6579/aac7b7


RECEIVED 13 February 2018

REVISED 18 May 2018


ACCEPTED FOR PUBLICATION 24 May 2018

PUBLISHED 20 June 2018

Erdenebayar Urtnasan, Jong-Uk Park and Kyoung-Joung Lee

Department of Biomedical Engineering, College of Health Science, Yonsei University, Wonju-si, Gangwon-do 26493, Republic of Korea

E-mail: [email protected]

Keywords: obstructive sleep apnea and hypopnea (OSAH), deep learning, single-lead electrocardiogram (ECG), convolutional neural network (CNN)

Abstract

Objective: In this paper, we propose a convolutional neural network (CNN)-based deep learning architecture for multiclass classification of obstructive sleep apnea and hypopnea (OSAH) using single-lead electrocardiogram (ECG) recordings. OSAH is the most common sleep-related breathing disorder. Many subjects who suffer from OSAH remain undiagnosed; thus, early detection of OSAH is important.

Approach: In this study, automatic classification of three classes (normal, hypopnea, and apnea) based on a CNN is performed. An optimal six-layer CNN model is trained on a training dataset (45 096 events) and evaluated on a test dataset (11 274 events). The training set (69 subjects) and test set (17 subjects) were drawn from recordings of 86 subjects, each approximately 6 h long and segmented into 10 s durations.

Main results: The proposed CNN model reaches a mean F1-score of 93.0 for the training dataset and 87.0 for the test dataset.

Significance: Thus, the proposed deep learning architecture achieved high performance for multiclass classification of OSAH using single-lead ECG recordings. The proposed method can be employed in screening patients suspected of having OSAH.

1. Introduction

Obstructive sleep apnea and hypopnea (OSAH) is the most common form of sleep-disordered breathing (SDB) and is prevalent in the general population (Young et al 1997). OSAH is defined as breathing cessation (apnea) and airflow decrease (hypopnea) with oxygen desaturation or arousal during sleep for 10 s or longer (Berry et al 2012). It can cause repetitive shortness of breath and sleep fragmentation, which decrease the quantity of sleep time and degrade the sleep quality (Engleman et al 2004). Risks associated with OSAH include excessive fatigue, daytime sleepiness, and even drowsy driving, which can result in traffic accidents (Barbé et al 1998, Vgontzas et al 2000). Furthermore, serious consequences, such as heart attacks and sudden death, may occur (Lopez-Jimenez et al 2008, Bhattacharjee et al 2009).

Polysomnography (PSG) is a standard tool for objectively evaluating OSAH. PSG can be used to measure various biosignals during sleep. In this approach, an electroencephalogram, electrooculogram, electrocardiogram (ECG), chin-leg electromyogram, oxygen saturation (SpO2), airflow, thoracic and abdominal breathing, and snoring are used to determine sleep parameters, such as the patient's sleep efficiency and duration, apnea-hypopnea index, and amount of snoring. However, PSG has several drawbacks, including the requirements of expensive diagnostic equipment, the attachment of multiple sensors, and manual reading by experts. Manual reading by sleep specialists, in particular, is time consuming, labor intensive, and prone to errors. Various methods have been proposed over the past few decades to replace PSG and minimize the number of biosignals needed to detect sleep apnea.
Many studies on sleep apnea detection used a single-lead ECG signal (Penzel et al 2002), as well as several types of classifiers (Urtnasan et al 2017), including a support vector machine (Al-Angari and Sahakian 2012, Khandoker et al 2009b, Jafari 2013, Chen et al 2015), neural network (Khandoker et al 2009a), fuzzy logic (Álvarez-Estévez and Moret-Bonillo 2009, Lee et al 2017), k-nearest neighbor (Mendez et al 2010, Xie et al 2012), and AdaBoost (Xie et al 2012). Although they achieved a high level of performance in terms of sleep apnea detection, most of these studies did not independently consider hypopnea events. Hypopnea occurs at a rate similar to or higher than that of apnea in SDB patients. Even in mild and moderate groups of SDB patients, its occurrence showed a wide distribution range (Park et al 2015). In addition, the apnea-hypopnea index counts not only the number of apnea events, but also the number of hypopnea events. Therefore, accurate detection of hypopnea is as important as that of apnea for sleep monitoring and diagnostic systems. Moreover, in Álvarez-Estévez and Moret-Bonillo (2009) and Khandoker et al (2009a), automatic detection algorithms were proposed for multiclass classification of OSAH based on machine learning by fuzzy logic and neural networks. Those studies relied on handcrafted feature extraction from ECG, respiratory, and SpO2 signals, as well as complex classification consisting of two-stage classifiers.

Deep learning is one of the most active fields in machine learning. It has been shown to improve performance in image and speech recognition (Hinton et al 2012, Krizhevsky et al 2012), natural language processing (Sutskever et al 2014), and biomedical engineering (Angermueller et al 2016, Ravi et al 2016). Recently, deep learning approaches, such as the deep autoencoder (Kaguara et al 2015) and convolutional neural network (CNN) (Dey et al 2017), were used in studies for SDB detection. However, these studies performed only binary classification for SDB detection.

Therefore, the objective of this study was to design a deep learning architecture for multiclass classification of OSAH using single-lead ECG recordings. The proposed deep learning architecture is based on the CNN model for automatic classification of three classes: normal, hypopnea, and apnea. The proposed CNN model was trained and evaluated on OSAH datasets of SDB patients.

© 2018 Institute of Physics and Engineering in Medicine

2. Methods

2.1. Populations and data acquisition

For this study, we analyzed the standard full-night PSG data of 86 subjects (65 male, 21 female) who were diagnosed as having OSAH. All subjects were randomly divided into training and test groups. The training group comprised three categories: mild OSAH (21 subjects), moderate OSAH (24 subjects), and severe OSAH (24 subjects). The test group also comprised three categories: mild OSAH (five subjects), moderate OSAH (six subjects), and severe OSAH (six subjects). These groups are outlined in table 1.

The PSG data were acquired using an Embla N7000 amplifier system (Embla System Inc., USA) at the Sleep Center of Samsung Medical Center (Seoul, Korea). The ECG signals were measured by a single-lead transducer at lead II during the nocturnal PSG. The average length of the single-lead ECG was 7.4 ± 0.72 h, and it was stored at 200 samples s−1 with 16-bit resolution. The PSG data were annotated by a sleep specialist in accordance with the American Academy of Sleep Medicine (AASM) guidelines (Berry et al 2012). Patients with central sleep apnea, mixed sleep apnea, or cardiovascular disorders were excluded from this study. All subjects provided written informed consent, and the study protocol was approved by the institutional review board (No. 2012-01-063) of Samsung Medical Center.

2.2. Data preprocessing and datasets

We analyzed single-lead ECG recordings during the total sleep time (approximately 6 h) of the subjects. The single-lead ECG recordings were preprocessed through a bandpass filter (5–11 Hz) to remove undesired noise. They were segmented into 10 s intervals with no overlap; a segment was treated as a single event when more than 50% of its annotations were the same. The distribution of total segments was 50 090 for normal, 18 790 for apnea, and 31 300 for hypopnea. Among them, normal and hypopnea events were randomly selected to match the number of apnea events.
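The preprocessing steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the filter design (the text gives only the 5–11 Hz passband, so a simple FFT mask is swapped in here), the per-sample annotation array, the class coding (0 = normal, 1 = hypopnea, 2 = apnea), and the function names. Segments without a label covering more than 50% of their samples are dropped, which is one possible reading of the majority rule.

```python
import numpy as np

FS = 200        # sampling rate in samples/s (from the text)
WIN_S = 10      # segment length in seconds (from the text)
N_CLASSES = 3   # assumed coding: 0 = normal, 1 = hypopnea, 2 = apnea


def bandpass_fft(x, fs=FS, lo=5.0, hi=11.0):
    """Zero every FFT bin outside [lo, hi] Hz (assumed filter design)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


def segment_and_label(ecg, sample_labels, fs=FS, win_s=WIN_S):
    """Cut into non-overlapping windows and keep those with a >50% majority label."""
    win = fs * win_s
    n_seg = len(ecg) // win
    segs = ecg[: n_seg * win].reshape(n_seg, win)
    labs = sample_labels[: n_seg * win].reshape(n_seg, win)
    # Per-window count of samples carrying each class label.
    counts = np.stack([(labs == c).sum(axis=1) for c in range(N_CLASSES)], axis=1)
    majority = counts.argmax(axis=1)
    keep = counts.max(axis=1) > 0.5 * win
    return segs[keep], majority[keep]


def balance_classes(segs, labels, seed=0):
    """Randomly subsample every class down to the size of the rarest class."""
    rng = np.random.default_rng(seed)
    n_min = min(int((labels == c).sum()) for c in range(N_CLASSES))
    idx = np.concatenate([
        rng.choice(np.flatnonzero(labels == c), size=n_min, replace=False)
        for c in range(N_CLASSES)
    ])
    idx.sort()
    return segs[idx], labels[idx]


# Synthetic 60 s example: an in-band 8 Hz tone plus 50 Hz interference.
t = np.arange(60 * FS) / FS
raw = np.sin(2 * np.pi * 8 * t) + np.sin(2 * np.pi * 50 * t)
annotations = np.repeat([0, 0, 1, 1, 2, 2], WIN_S * FS)

filtered = bandpass_fft(raw)
segments, events = segment_and_label(filtered, annotations)
segments, events = balance_classes(segments, events)
print(segments.shape)  # (6, 2000): six 10 s segments of 2000 samples each
```

Each retained row has 2000 samples, which matches 10 s at 200 samples s−1. Subsampling to the rarest class mirrors the statement that normal and hypopnea events were selected to match the number of apnea events, although the exact selection procedure used in the study is not specified here.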
Figure 2 depicts the difference between normal breathing, hypopnea, and apnea events of the single-lead ECG recordings. The OSAH datasets contain training and test set events from the subjects of the respective training and test groups. The training set was composed of a balanced number of extracted OSAH events, specifically 15 038 normal breathing, 15 039 hypopneas, and 15 019 apneas, from the training subject group. We used a test set of OSAH events (3752 normal, 3751 hypopneas, and 3771 apneas) from the test subject group to evaluate the performance of the CNN model (table 2).

2.3. CNN-based deep learning architecture

To design the deep learning architecture for OSAH multiclass classification using the single-lead ECG, we employ convolution, pooling, and activation layers (figure 1). The whole deep learning model is divided into three sections: the input (figure 1(a)), CNN (figure 1(b)), and classification (figure 1(c)). Detailed explanations of each section are provided below.

The input section consists of the input signal and batch-normalization (figure 1(a)). The input signal of the CNN model is a one-dimensional (1D) time series from a single-lead ECG signal. The input signal is segmented into epochs, each 10 s long with a shape of 1 × 2000 (10 s at 200 samples s−1). The input signal is normalized by batch-normalization before training the CNN model. Batch-normalization is defined by equation (1):


Table 1. Demographic and anthropometric characteristics of the subject groups. All subjects

Total

Mild (5 ⩽ AHI