IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 53, NO. 1, JANUARY 2006


Comparison of Entropy-Based Regularity Estimators: Application to the Fetal Heart Rate Signal for the Identification of Fetal Distress

Manuela Ferrario, Maria G. Signorini*, Giovanni Magenes, and Sergio Cerutti, Fellow, IEEE

Abstract—This paper considers the multiscale entropy (MSE) approach for estimating the regularity of time series at different scales. Sample entropy (SampEn) and approximate entropy (ApEn) are evaluated in MSE analysis on simulated data to highlight the main features of both estimators. We applied the approximate entropy and the sample entropy estimators to fetal heart rate signals on both single and multiple scales for an early antepartum identification of fetal sufferance. Our results show that the ApEn index significantly distinguishes suffering from normal fetuses between the 30th and the 35th week of gestation. Furthermore, our data show that the MSE entropy values are reliable indicators of the fetal distress associated with the presence of a pathological condition at birth.

Index Terms—Cardiotocography, fetal heart rate, fetal pathologies, nonlinear parameters, spectral analysis.

I. INTRODUCTION

In recent years, many studies have tried to demonstrate the nonlinear nature of the heart rate variability (HRV) signal. These attempts encountered many practical problems, such as the limited length of the time series, which forces approximations of the theoretical methods and consequently limits what they can establish. An example is the approximate entropy (ApEn) introduced by Pincus [1], [2], which provides a measure of signal regularity. Unlike the Kolmogorov entropy, whose original goal was to investigate the dynamics of the generating system, possibly to the point of confirming its chaotic nature, the ApEn index was conceived to characterize the time series without any hypothesis on the nature of the generating system. This approach has proved useful in many experimental conditions, especially when the identification of a precise model structure was impossible or inappropriate. Recently, Costa et al. [3], [4] made a further step toward characterizing the signal structure itself by introducing the multiscale entropy (MSE) measurement, which is an indicator of the regularity of the signal at different time scales.

Manuscript received October 12, 2004, revised May 30, 2005. Asterisk indicates corresponding author. M. Ferrario and S. Cerutti are with the Dipartimento di Bioingegneria, University Politecnico di Milano, 20133 Milano, Italy. *M. G. Signorini is with the Dipartimento di Bioingegneria, piazza Leonardo da Vinci 32, University Politecnico di Milano, 20133 Milano, Italy (e-mail: [email protected]). G. Magenes is with the Dipartimento di Informatica e Sistemistica, University of Pavia, 27100 Pavia, Italy (e-mail: [email protected]). Digital Object Identifier 10.1109/TBME.2005.859809

A first goal of this paper is to examine the behavior of MSE for simulated signals, in order to understand the information content that can be obtained from the signal itself. Moreover, we investigated the sensitivity of MSE values with respect to the time series length. A second goal is to confirm the usefulness of entropy estimators on a real diagnostic problem, consisting of the dichotomous classification of normal and suffering fetuses during pregnancy. In order to achieve this second goal, we applied these regularity estimators to fetal heart rate (FHR) time series recorded during antepartum cardiotocographic (CTG) monitoring. Fetal distress (FD) can occur for several reasons and it can appear during various stages of pregnancy with different degrees of severity and different symptoms. In general, FD is characterized by the reduction of the maternal-fetal respiratory exchanges, which mainly causes a reduction of fetal blood oxygenation and an accumulation of carbon dioxide. It induces a state of anaerobic metabolism, which is associated with an increase of the acid metabolite concentration [5]. The early identification of a distress status is among the fundamental goals of antepartum fetal monitoring, and it becomes more and more important as advances in neonatal pathology allow newborns to survive at earlier stages of premature delivery. As a matter of fact, the early detection of a life-threatening condition, such as severe distress, helps the obstetrician to decide on immediate action (e.g., to induce delivery by cesarean section). Furthermore, a correct identification of the fetal state helps avoid both false positives (unnecessary cesarean sections) and false negatives (continuation of pregnancy in case of severe sufferance). Nowadays, in most clinical situations, the obstetrician must rely on his/her ability to evaluate the CTG tracings by visual inspection or with the help of some classical morphological indexes of proven diagnostic weakness [6].
Thus, reliable quantitative clinical tools are required for the early detection of fetal sufferance. Our final goal is then to investigate whether some regularity estimators, applied to the FHR signal, can provide quantitative indexes revealing the presence of FD antepartum.

II. METHODS

The development of chaos theory in the last decades supplied the framework to study nonlinear dynamics through a new approach. Many efforts were made to explain the complex behavior of deterministic systems with the presence of nonlinear


features instead of the usual stochastic models. The most direct link between chaos theory and the real world is the analysis of time series from real systems in terms of nonlinear dynamics [7]. In some situations, when the a priori knowledge about the generating system is deep enough, it is possible to precisely characterize the structure of the system itself from the observed time series. In these cases, the study involves a mathematical characterization of the phase space or the estimation of some descriptors of the attractor. This approach leads to the reconstruction of the attractor in an embedding space and to the estimation of some invariants such as fractal dimension and Lyapunov exponents [8]–[11]. Nevertheless, it is often impossible to obtain sufficient insight into biological systems from signals recorded by noninvasive techniques. The knowledge we have of their behavior is limited by their intrinsic complexity, which is the result of numerous interacting mechanisms contributing to the physiological performance. The concept of complexity does not correspond to a unique and rigorous definition. Sometimes it is related to the difficulties we have in describing or understanding a signal. Pincus [1] observed that the classic Kolmogorov-Sinai entropy assumes infinite value for processes with superimposed stochastic noise. Thus, it is unable to distinguish processes that differ in complexity. In this case, the word complexity refers to the unpredictability of the system state location, once the initial conditions are known. The less predictable the states are, the more complex the system is. The association of unpredictability with complexity leads to the conclusion that Gaussian noise is very complex, because all points in the phase space are probable states of the system and the state location is unpredictable. In contrast, a periodic dynamical system with period 2 has an extremely low complexity level: in fact, all trajectories meet two points with probability 1.
Nevertheless, most biological systems lie between the two extremes of this complexity scale. Starting from these considerations, Pincus proposed a family of statistics, called Approximate Entropy [2], aimed at measuring the signal regularity, i.e., the presence of similar patterns in the time series. Given N data points $\{u(i)\}$, $i = 1, \ldots, N$, the algorithm constructs sequences $x_m(i) = [u(i), \ldots, u(i+m-1)]$ and it computes, for each $i \le N-m+1$, the quantity

$$C_i^m(r) = \frac{\text{number of } j \le N-m+1 \text{ such that } d[x_m(i), x_m(j)] \le r}{N-m+1} \qquad (1)$$

$C_i^m(r)$ measures, with a tolerance $r$, the regularity of patterns by comparing them to a given pattern of length $m$ ($m$ and $r$ are fixed values: $m$ is the detail level at which the signal is analyzed, $r$ is a threshold, which filters out irregularities). The regularity parameter is defined as $\mathrm{ApEn}(m, r) = \lim_{N \to \infty} [\Phi^m(r) - \Phi^{m+1}(r)]$, where $\Phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r)$. $\mathrm{ApEn}(m, r, N)$ is the estimator of this parameter for an experimental time series of fixed length N.
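As an illustration of definition (1), the following minimal Python sketch computes the ApEn estimator for a finite series. It is not the authors' implementation: the function name `approximate_entropy` is hypothetical, the distance between two sequences is assumed to be the maximum absolute difference of their components (as in Pincus's formulation), and the tolerance `r` is passed as an absolute value, so a caller who wants a fraction of the standard deviation has to scale it beforehand. The pairwise distance matrix makes the memory cost grow with the square of N, which is acceptable only for short series.

```python
import numpy as np

def approximate_entropy(u, m, r):
    """ApEn(m, r, N) of a 1-D series; self-matches are counted, as in (1)."""
    u = np.asarray(u, dtype=float)
    n = u.size

    def phi(k):
        # All N - k + 1 overlapping sequences of length k.
        templates = np.array([u[i:i + k] for i in range(n - k + 1)])
        # Assumed distance: maximum absolute component-wise difference.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # C_i^k(r): fraction of sequences within tolerance r of sequence i.
        c = np.mean(dist <= r, axis=1)
        # Because of the self-match, c is never zero and the log is defined.
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(300)
    periodic = np.tile([0.0, 1.0], 150)
    print("ApEn, white noise:", approximate_entropy(noise, 2, 0.2 * noise.std()))
    print("ApEn, period-2 signal:", approximate_entropy(periodic, 2, 0.1))
```

A period-2 signal gives a value close to zero while white noise gives a clearly larger one, in line with the complexity scale discussed above.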

As Pincus himself noticed [1], [2], [12], approximate entropy is strongly affected by a bias effect. In addition, ApEn depends on the record length. The correction of the bias effect is not trivial, since the straightforward removal of self-counting could cause a high sensitivity to outliers. Furthermore, Richman and Moorman [13] demonstrated that in some cases ApEn lacks consistency. Its computation in irregular time series is affected by a bias, which may cause an overestimate of the entropy value, mostly when $r$ values are very small. However, the application of ApEn statistics to real data often gives good results, providing general information about the regularity and the persistence of the signal. This fact has led to an extensive use of ApEn in clinical applications, e.g., the analysis of inter-beat RR signals in the study of cardiovascular diseases [14]–[16]. Entropy methods exploit a symbolic representation of the time series. Despite the severe reduction of information, they are able to enhance relevant features of the signal. ApEn, and more generally coarse-grained entropies, can be useful to track qualitative changes in time series patterns, without precisely characterizing the generating system. Richman and Moorman [13], [17] developed a modification of the ApEn algorithm, named sample entropy (SampEn), in order to remove some of the deficiencies reported here. The differences with respect to ApEn are: 1) self-matches are not counted; 2) only the first $N-m$ vectors of length $m$ are considered; and 3) the conditional probabilities are not estimated in a template manner. Thus, the probability measure is computed directly as the logarithm of conditional probability instead of the ratio of the logarithmic sums. SampEn shows relative consistency in cases where ApEn does not. Furthermore, the values of SampEn agree with the theoretical values expected for a uniform random noise time series much better than the ApEn values do, even for very short time series [13].
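The three differences listed above can be made concrete with a short Python sketch of SampEn. It is an illustrative reading of the description, not the reference code of Richman and Moorman: the name `sample_entropy` is hypothetical, the maximum-norm distance and the `<=` comparison are assumptions, and the tolerance `r` is again an absolute value.

```python
import numpy as np

def sample_entropy(u, m, r):
    """SampEn(m, r, N): -ln(A / B), with self-matches excluded."""
    u = np.asarray(u, dtype=float)
    n = u.size

    def count_matches(k):
        # Only the first N - m vectors are used, for both lengths m and m + 1.
        templates = np.array([u[i:i + k] for i in range(n - m)])
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Subtract the N - m diagonal entries, i.e., the self-matches.
        return int(np.sum(dist <= r)) - (n - m)

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        # No matches at this tolerance: the estimate is undefined.
        return float("nan")
    return float(-np.log(a / b))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(300)
    print("SampEn, white noise:", sample_entropy(noise, 2, 0.2 * noise.std()))
```

Unlike ApEn, the logarithm is taken once, on the ratio of the total match counts, so no per-template logarithm needs the self-match to stay finite.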
Both ApEn and SampEn supply a single index concerning the general behavior of the time series, but they cannot reveal the underlying dynamics of the generating system. Thus, if the signal X has a lower entropy value than the signal Y, we can only say that X is more regular than Y. If the original purpose of entropy was to identify chaotic dynamics, the statistics ApEn and SampEn have changed the perspective by providing a figure related to the regularity of the time series at the original time scale. Several methods have been proposed in the literature for achieving a thorough analysis at different scales. Among them, the wavelet-transform modulus-maxima method (WTMM) appears to be a very promising tool [18]. Other methods, based on the Hurst exponent, estimate the long-range dependence scaling exponent and statistical self-similarity, as well as the presence of long-range correlation and "memory" in the process. Costa et al. [3], [4] introduced the so-called MSE method. The procedure considers a time series of N points $\{x_1, \ldots, x_N\}$ and constructs consecutive coarse-grained time series $\{y^{(\tau)}\}$, as a function of the scale factor $\tau$

$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x_i, \qquad 1 \le j \le N/\tau \qquad (2)$$


$y^{(1)}$ is the original time series, and $N/\tau$ is the length of each coarse-grained time series. An entropy measure is calculated for each sequence and it is plotted as a function of the scale factor $\tau$. The rationale behind this procedure is an enhancement of the repetitive patterns of the time series as a function of different scales. Differences in the index at different time scales could help in understanding the time series in terms of regularity and structure (i.e., short versus long range).

III. SIMULATIONS

To investigate the information provided by the MSE method regarding signal characterization, we evaluated the behavior of MSE by simulating a set of known signals. We selected an example of a stochastic and completely irregular signal (white noise), a correlated and multiscale signal (1/f noise), a corrupted deterministic signal (MIX process), and a chaotic deterministic signal (logistic map). We chose this standard set of signals to allow a comparison with the original works by Costa et al. [3], [4], [19]. First, we considered which measure of entropy is suitable for a multiscale analysis. We used both the sample entropy and the ApEn as the entropy measure for the multiscale approach, comparing their performance. The adopted parameters were $m = 2$, $r = 0.15$ SD (SD: standard deviation of the original time series), as reported in the original works by Costa et al. [3], [4]. However, the MSE analysis reported in the cited works considers series collected from 24-h Holter recordings, noticeably longer than our 1-h FHR tracings. For this reason we applied the MSE to signals of different length N (N = 1000, 5000, 7000, 10 000, 15 000, and 30 000 samples) and we studied the dependence of the index on the time series length. Fig. 1 shows the results of the simulations in time series from a 1/f noise process. Series were built by generating a 1/f power spectral density, with random phases (uniformly distributed on $[-2\pi, 2\pi]$) and by applying the inverse fast Fourier transform.
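The coarse-graining step of the MSE method and the construction of the 1/f series described above can be sketched in Python as follows. This is a hedged illustration rather than the code used for the simulations: the names `coarse_grain`, `multiscale_entropy` and `one_over_f_noise` are hypothetical, the default range of scale factors (1 to 8) is an assumption, and the entropy estimator is passed in as a callable so that either ApEn or SampEn can be plugged in, with the tolerance fixed from the SD of the original series by the caller.

```python
import numpy as np

def coarse_grain(x, tau):
    """Average non-overlapping blocks of tau samples (floor(N / tau) outputs)."""
    x = np.asarray(x, dtype=float)
    n_out = x.size // tau
    # Samples beyond the last complete block are discarded.
    return x[:n_out * tau].reshape(n_out, tau).mean(axis=1)

def multiscale_entropy(x, entropy, scales=range(1, 9)):
    """Apply an entropy estimator (a callable) to each coarse-grained series."""
    return [entropy(coarse_grain(x, tau)) for tau in scales]

def one_over_f_noise(n, rng):
    """Series with a 1/f power spectral density and random phases."""
    freqs = np.fft.rfftfreq(n)
    amplitude = np.zeros_like(freqs)
    # PSD proportional to 1/f means amplitude proportional to f**-0.5;
    # the zero-frequency term is left at 0, so the mean is zero.
    amplitude[1:] = freqs[1:] ** -0.5
    # Phases are only defined modulo 2*pi, so any uniform interval spanning
    # whole multiples of 2*pi gives the same phase distribution.
    phases = rng.uniform(-np.pi, np.pi, size=freqs.size)
    signal = np.fft.irfft(amplitude * np.exp(1j * phases), n=n)
    # Normalize so that all generated signals share the same mean and std.
    return (signal - signal.mean()) / signal.std()


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    series = one_over_f_noise(7000, rng)
    # np.std stands in for an entropy estimator, to keep the demo self-contained.
    print(multiscale_entropy(series, np.std))
```

Passing the estimator as a callable keeps the coarse-graining independent of the choice between ApEn and SampEn, which is the comparison made in this section.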
Entropy measures, plotted for 1/f noise, show persistent values for all scale factors, with the exception of the shortest time series (N = 1000). The persistence is coherent with the nature of 1/f noise, which shows strong long-term correlations. Furthermore, we obtained similar MSE values by using both ApEn and SampEn estimators. Fig. 2 reports the results of MSE analysis performed on signals generated by completely different dynamical systems (white noise and logistic map), confirming that the multiscale analysis is strongly related to different signal structures. For instance, random noise has the highest value at scale factor 1 (original time series), but it shows a decreasing trend for other scale factors. On the contrary, the time series generated by the logistic map has an increasing trend, presenting a maximum at scale factor 4. Furthermore, as the figure shows, the ApEn and SampEn values are very similar. Fig. 3 reports the results on time series obtained from the MIX process (the definition is reported in the figure caption). The analysis shows that the multiscale approach highlights periodic behaviors in the signal. In fact, Fig. 3 shows an abrupt decrease of entropy occurring

Fig. 1. Results of the MSE analysis (avg. ± std.) for simulated signals of 1/f noise. Each symbol refers to a number of samples N of the analyzed time series. All generated signals have the same mean and std.

Fig. 2. Results of the MSE analysis (avg. ± std.) for simulated signals, number of points N = 7000. The higher values refer to random noise, the lower ones refer to sequences generated by the logistic map $x_{n+1} = 4x_n(1 - x_n)$.

at scale factors 6 and 8, where the scale factor corresponds to a submultiple of the signal period.

IV. APPLICATION TO THE FETAL HEART RATE SIGNAL

Our database includes more than 800 FHR traces, which were collected from antepartum CTG recordings in 572 pregnant women. The data for each subject consist of the FHR trace, complete anamnestic information, and antepartum prognostic indications. Data collection involved university departments and a private company in Italy, in a joint research effort. The gestational age of the women ranged from the 26th to the 40th week. Recordings came from four identical devices (HP M1350A), located in various university clinics in Rome. Each recording lasted at least 60 min and contains both the FHR series and the toco trace. The HP M1350A was set to provide two FHR samples/s, resulting in sequences of more than 7200 points. For a subset of CTG recordings, belonging to women


TABLE I CTG RECORDINGS SELECTED FOR THE ANALYSIS

All signals present a high quality of recording. Complete information both in the antepartum period and at birth was available. (g.w.=gestational week). The subset selected for the MSE analysis is reported in the last row.

Fig. 3. Results of the MSE analysis (avg. ± std.) for simulated signals, number of points N = 7000. The MIX(p, N) process is defined as $(1 - z_j)x_j + z_j y_j$, where $z_j$ is a random variable that assumes value 1 with probability $p$ and 0 with probability $1 - p$, $x_j$ is a sequence of period T = 12 generated by the equation $x_j = \sqrt{2}\sin(2\pi j/12)$, and $y_j$ is a uniformly distributed variable on $[-\sqrt{3}, \sqrt{3}]$. The lower p is chosen, the more periodic and regular the signal is.
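For readers who wish to reproduce the deterministic test signals, the MIX process of the caption above and the logistic map of Fig. 2 can be generated with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the amplitude $\sqrt{2}$ and the interval $[-\sqrt{3}, \sqrt{3}]$ follow Pincus's standard definition of MIX, and the initial condition `x0 = 0.1` of the logistic map is an arbitrary choice that the text does not specify.

```python
import numpy as np

def mix_process(p, n, rng):
    """MIX(p, N): sinusoid of period 12, replaced by uniform noise with probability p."""
    j = np.arange(1, n + 1)
    x = np.sqrt(2.0) * np.sin(2.0 * np.pi * j / 12.0)       # periodic component
    y = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=n)    # uniform noise component
    z = rng.random(n) < p                                   # True with probability p
    # (1 - z) * x + z * y: take the noise sample where z is 1, the sinusoid elsewhere.
    return np.where(z, y, x)

def logistic_map(n, x0=0.1):
    """Iterate x_{k+1} = 4 x_k (1 - x_k); x0 is an assumed initial condition."""
    out = np.empty(n)
    x = x0
    for k in range(n):
        x = 4.0 * x * (1.0 - x)
        out[k] = x
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    for p in (0.0, 0.3, 1.0):
        print(p, mix_process(p, 7000, rng)[:4])
    print(logistic_map(5))
```

With p = 0 the output is purely periodic and with p = 1 it is purely random, matching the statement that lower p gives a more regular signal.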

who gave birth in those clinics, we were able to acquire outcome data at delivery (weight of the newborn, type of delivery, Apgar score, and final diagnosis). In this paper, we considered two groups of fetuses: a control group of Normal fetuses (absence of pathologies at birth, no antepartum problems, and spontaneous delivery) and a group of Suffering fetuses, presenting different conditions of FD at birth, often associated with pathologies. Within the Suffering group we made a further subdivision by separating the recognized suffering fetuses (RecSuf) from the not recognized suffering fetuses (NotRecSuf). The former group showed some risk indications in the antenatal period, at the moment of the CTG recording; the latter contained fetuses that were suffering during labor or at delivery, but were judged as absolutely normal at the moment of the CTG recording. This particular subdivision allows separating chronically distressed fetuses, which are commonly identified in the antepartum period, from fetuses affected by acute or subacute suffering conditions arising during labor. Each group (RecSuf, NotRecSuf, and Normal) was further subdivided by gestational age as shown in Table I. The standard analysis procedure was carried out through the identification of the baseline [20], the detection of accelerations and decelerations, and the computation of short-term variability and long-term irregularity (LTI). A further power spectral density (PSD) analysis was performed through software procedures expressly developed within our research project [21], [22]. During CTG monitoring, the fetus can move and breathe, and contractions can occur. These events dramatically change the FHR variability signal, demonstrating its nonstationary nature. Therefore, we calculated the parameters over short sequences, as suggested in a recent study [22]. This makes it possible to capture the development of signal modifications and, eventually, to understand their physiological implications.
In this paper, we computed both ApEn(1,0.1) and SampEn(1,0.1) over sequences of 360 samples, corresponding to 3 min of recording. Then we considered, for each recording, the mean value of the entropy sequences, for both the ApEn and SampEn measures. The resulting ApEn and SampEn indexes were compared for the selected FHR signals referring to the defined groups. The application of the multiscale method considered both ApEn and SampEn with parameters $m = 2$, $r = 0.15$. The analysis was limited to scale factor 8, as the length of the FHR time series could be only 7200 points (1 h). The MSE approach required a further selection of the available recordings by introducing a threshold on the FHR signal quality. As a matter of fact, sometimes the FHR signal is not detected (or is very badly detected) by the ultrasound Doppler probe, and the HP monitor attributes a zero value to signal loss. Thus, we required that the FHR signal contain fewer than 250 zeros. The reason is that the MSE was computed on the whole FHR signal; chunks of successive zeros might then be interpreted as repetitive patterns, leading to inconsistent results. The introduction of the quality threshold reduced the number of recordings, so we did not maintain the subdivision on the basis of gestational age. Details are reported in Table I.

Almost all time domain and morphological parameters are not significantly different among the different classes. Only the LTI index satisfies the imposed threshold ($p < 0.05$) for the Student t-test, the rank sum test, and the Kolmogorov–Smirnov test in the last class of gestational age ( 39 weeks), as reported in Table II. The PSD analysis does not show any significant difference in the three considered frequency bands. However, we can observe that ApEn and SampEn differ significantly among the groups ($p < 0.01$) for the 30th–35th gestational weeks period (ANOVA test and Kruskal–Wallis test performed within the three groups; post-hoc comparisons by Scheffé test), and both provide similar results, confirming their potential as discriminating parameters.
Both ApEn and SampEn are higher in suffering fetuses than in normal ones (see details in Table II). Furthermore, the early class of suffering fetuses presents a high percentage of preterm labor (most of them did not reach the 35th week). The obtained result would therefore confirm that, during this gestational period, episodes of serious distress often induce dangerous situations for the fetus, since it is not yet completely developed and its regulation system fails to respond to severe hypoxia.
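The single-scale procedure used in this section (an index computed over 360-sample sequences and then averaged per recording) and the quality threshold on signal loss can be sketched in Python as below. The sketch rests on assumptions that the text does not settle: the windows are taken as non-overlapping, an incomplete final window is dropped, and the names `signal_quality_ok` and `mean_windowed_index` are hypothetical. The index is passed as a callable, so any ApEn or SampEn implementation can be supplied.

```python
import numpy as np

SAMPLES_PER_SECOND = 2                    # FHR sampling rate stated for the monitor
WINDOW = 3 * 60 * SAMPLES_PER_SECOND      # 3 min of recording = 360 samples

def signal_quality_ok(fhr, max_zeros=250):
    """Accept a recording only if it contains fewer than max_zeros lost samples."""
    # The monitor codes signal loss as a zero value.
    return int(np.count_nonzero(np.asarray(fhr) == 0)) < max_zeros

def mean_windowed_index(fhr, index, window=WINDOW):
    """Mean of an index computed over consecutive windows of the recording."""
    fhr = np.asarray(fhr, dtype=float)
    n_windows = fhr.size // window        # assumed: trailing partial window is dropped
    values = [index(fhr[k * window:(k + 1) * window]) for k in range(n_windows)]
    return float(np.mean(values))


if __name__ == "__main__":
    rng = np.random.default_rng(3)
    fhr = 140.0 + 5.0 * rng.standard_normal(7200)   # synthetic 1-h trace, not real data
    print("quality ok:", signal_quality_ok(fhr))
    # np.std stands in for an entropy estimator in this demo.
    print("mean index:", mean_windowed_index(fhr, np.std))
    # Length of the coarse-grained series at the largest scale factor used.
    print("points at scale 8:", fhr.size // 8)
```

A 1-h trace of 7200 samples yields 20 such windows, and at scale factor 8 the coarse-grained series is reduced to 900 points, which illustrates why the multiscale analysis was not pushed to larger scale factors.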


TABLE II STANDARD PARAMETERS FOR THE FETAL SURVEILLANCE IN NORMAL AND SUFFERING FETUSES
