CHAOS 22, 043105 (2012)
Evaluation of physiologic complexity in time series using generalized sample entropy and surrogate data analysis
Luiz Eduardo Virgilio Silva and Luiz Otavio Murta Jr.a)
Department of Computing and Mathematics, FFCLRP, University of São Paulo, Ribeirão Preto, SP, Brazil
(Received 23 May 2012; accepted 28 September 2012; published online 17 October 2012)
Complexity in time series is an intriguing feature of living dynamical systems, with potential use for identification of system state. Although various methods have been proposed for measuring physiologic complexity, uncorrelated time series are often assigned high values of complexity, erroneously classifying them as complex physiological signals. Here, we propose and discuss a method for complex system analysis based on the generalized statistical formalism and surrogate time series. Sample entropy (SampEn) was rewritten, inspired by the Tsallis generalized entropy, as a function of the q parameter (qSampEn). qSDiff curves were calculated, which consist of the differences between the qSampEn of the original and surrogate series. We evaluated qSDiff for 125 real heart rate variability (HRV) dynamics, divided into groups of 70 healthy, 44 congestive heart failure (CHF), and 11 atrial fibrillation (AF) subjects, and for simulated series of stochastic and chaotic processes. The evaluations showed that, for nonperiodic signals, qSDiff curves have a maximum point (qSDiff_max) for q ≠ 1. The values of q where the maximum point occurs and where qSDiff is zero were also evaluated. Only qSDiff_max values were capable of distinguishing the HRV groups (p-values of 5.10 × 10⁻³, 1.11 × 10⁻⁷, and 5.50 × 10⁻⁷ for healthy vs. CHF, healthy vs. AF, and CHF vs. AF, respectively), consistently with the concept of physiologic complexity, which suggests a potential use for chaotic system analysis. © 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.4758815]
Human physiological subsystems interact with each other in a nonlinear manner, e.g., in regulatory mechanisms, aiming to supply the body's needs. These interactions can produce fast responses under certain circumstances, revealing emergent behaviour that is typical of complex systems. It is known that physiologic complexity decreases with aging and illness, and several methods are used to estimate complexity from time series. One of the most used, sample entropy (SampEn), is known to be an irregularity metric, and an increase in its value is not always associated with an increase in complexity. Here, we describe a generalization of SampEn and introduce a new measurement, qSDiff_max, which combines the generalized SampEn with the technique of surrogate data. Results with heart rate variability (HRV) series showed that qSDiff_max is a metric more consistent with the concept of physiologic complexity and has potential to be used with other sources of biomedical signals.
I. INTRODUCTION
Nonlinear time series analysis has been the subject of increasing interest in recent decades. This is due to the inability of linear models to explain many phenomena of daily life. Furthermore, it is known that complex and chaotic behaviour frequently emerges from nonlinear processes, which requires adequate methods for the analysis of time series obtained from such systems.
a) Electronic mail: [email protected].
Chaotic dynamics is characterized by sensitivity to initial conditions: such dynamical systems depend strongly on their initial conditions, and small changes in these conditions result in large differences in the future of their trajectories. This sensitivity to initial conditions is expressed by the Lyapunov exponent. In the case of one-dimensional systems, there is one exponent that summarizes the exponential growth of trajectory differences in phase space.
There is no globally accepted definition of time series complexity. However, it is possible to name properties that complex systems show. For example, such systems are composed of many interdependent subsystems, which interact nonlinearly. Complex time series reveal multifractal structures, i.e., complex structures at different scales, and they are capable of producing non-trivial emergent dynamics.1 Emergent behaviour can be considered the most distinguishing feature of complex systems.2 Emergence arises when the global dynamics of a system cannot be predicted or understood from the dynamics of its parts alone.
Nonadditive statistics3 is a novel theoretical formulation that is more suitable for complex systems, where the classical Boltzmann-Gibbs statistics is not quite adequate. This formulation generalizes the Boltzmann-Gibbs entropy, introducing a nonadditive parameter (q). In the nonadditive formalism, the entropy of a composed system is not the simple sum of the entropies of its subsystems, which is in agreement with complex systems, since the subsystems interact among themselves.
Physiologic systems are complex in nature, and physiologic complexity is strongly related to health. It is known that aging and illness cause complexity loss.
This is intuitively explained because healthy systems present long-term correlations and are more able to adapt to adverse conditions than systems in illness conditions.4 The cardiovascular system is a good example: the HRV of healthy subjects is different from, for instance, that of subjects with atrial fibrillation (AF). Figure 1 shows two examples of HRV series, obtained from a healthy subject and a subject with atrial fibrillation. The HRV series represent the values of consecutive interbeat intervals, which can be calculated from the electrocardiogram (ECG).
Many methods have been used to evaluate HRV series, using both linear and nonlinear modelling. Methods based on linear modelling are widely known and standardized.6 However, as a complex system, the cardiovascular system might be better assessed by nonlinear modelling. There are various methods derived from nonlinear dynamical modelling, and no method alone can adequately describe complex physiological systems.7
SampEn is a nonlinear metric used to measure the irregularity of a time series.8 Increases in SampEn are often associated with increases in complexity. However, some authors have shown that higher SampEn values are not always associated with high complexity,9,10 which is a strong limitation of SampEn as a complexity metric. For example, the HRV series of healthy subjects are often assigned lower values of SampEn than those of atrial fibrillation subjects. Moreover, a shuffled version of an HRV series is often assigned higher values of SampEn than the original HRV series, quite the opposite of what is expected from a complexity measurement.
Surrogate data generation is a useful technique for nonlinearity hypothesis testing in time series analysis.11 In surrogate data tests, a simple null hypothesis is that the series is described by an independent and identically distributed (IID) random variable. Here, the surrogate data are generated by simply shuffling the original time series, yielding a time series with exactly the same value distribution but with no temporal correlations. Then, a discriminating statistic is chosen to test whether the null hypothesis is rejected or not.
In this paper, we introduce a generalized form of sample entropy (qSampEn) as a parametric statistic consistent with the nonadditive formalism.
Its consistency as a complexity measurement was investigated and tested for both real HRV and simulated signals. Surrogate data were used to compute entropy differences between the original dynamics and surrogate series. Therefore, qSampEn plays the role of the discriminating statistic in the surrogate analysis, being used to discriminate original from uncorrelated dynamics.

II. GENERALIZED SAMPLE ENTROPY (qSampEn)
Tsallis proposed a generalization of the Boltzmann-Gibbs-Shannon (BGS) entropy.3 This generalized form of entropy is suitable for complex and multifractal systems, which exceed the domain of applicability of the classical BGS entropy.12 In such complex systems, emergent behaviours arise and the global system properties cannot be analyzed from the subsystems separately because of their interactions. The discrete form of the Tsallis entropy (S_q) is given by

$$S_q = \frac{1 - \sum_{i=1}^{W} p_i^q}{q - 1}, \qquad (1)$$

where p_i is the probability that the system is in state i, W is the number of possible states of the system, and q is the entropic index. In the limit q → 1, S_q recovers the classical BGS entropy. One can also write Eq. (1) as

$$S_q = \sum_i p_i \log_q(1/p_i), \qquad (2)$$

and derive the general form of the logarithm function, namely the q-logarithm, defined as

$$\log_q(x) = \frac{x^{1-q} - 1}{1 - q}, \qquad (x \in \mathbb{R}^+,\ q \in \mathbb{R},\ \log_1(x) = \log(x)). \qquad (3)$$
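As a concrete illustration of Eqs. (1)-(3), the following minimal Python/NumPy sketch (not part of the original paper; the function names are ours) implements the q-logarithm and the discrete Tsallis entropy, and checks numerically that Eqs. (1) and (2) agree.

```python
import numpy as np

def log_q(x, q):
    """q-logarithm of Eq. (3): (x**(1-q) - 1)/(1-q); recovers ln(x) as q -> 1."""
    x = np.asarray(x, dtype=float)
    if np.isclose(q, 1.0):
        return np.log(x)
    return (x**(1.0 - q) - 1.0) / (1.0 - q)

def tsallis_entropy(p, q):
    """Discrete Tsallis entropy of Eq. (1): S_q = (1 - sum_i p_i**q)/(q - 1)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                            # ignore zero-probability states
    if np.isclose(q, 1.0):
        return -np.sum(p * np.log(p))       # Boltzmann-Gibbs-Shannon limit
    return (1.0 - np.sum(p**q)) / (q - 1.0)

# Numerical check of the equivalence between Eq. (1) and Eq. (2)
p = np.array([0.5, 0.3, 0.2])
assert np.isclose(tsallis_entropy(p, 0.7), np.sum(p * log_q(1.0 / p, 0.7)))
```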
In dynamical systems, it is useful to calculate the mean rate at which the entropy grows. Consider a system with a D-dimensional phase space. The system probability distribution is estimated by partitioning the phase space into boxes, cubes, or hypercubes (in the case of D > 3) of edge ε and taking the system state at intervals of time s. The Kolmogorov-Sinai (KS) entropy can be expressed as

$$KS = \lim_{n \to \infty} \lim_{\epsilon \to 0} \lim_{s \to 0} \left( S_{n+1} - S_n \right), \qquad (4)$$
FIG. 1. Example of HRV series from (a) healthy subject and (b) subject with atrial fibrillation, obtained from the Physionet database.5
where s is the interval of time at which the system is measured and n is the number of system measurements. The KS entropy measures the mean rate at which information is produced by the system. In this case, the p_i values represent the joint probabilities for the system to be in state k_1 at time t = s, in k_2 at time t = 2s, ..., and in k_n at time t = ns. The Pesin equality13 relates the KS entropy to the sum of the positive Lyapunov exponents. Therefore, a positive constant value of KS entropy indicates chaos.
The definition of KS entropy cannot be used for finite-length time series analysis because of the infinite limits in its equation.
Some approximations to it have been proposed. Grassberger and Procaccia14 proposed an approximation based on the K2 Renyi entropy, arguing that it is a lower bound of the KS entropy. Eckmann and Ruelle15 also proposed a KS estimation, with a basic principle similar to the Grassberger and Procaccia formulation. A more appropriate entropy rate measurement for noisy and short time series was introduced by Pincus,16 called the approximate entropy (ApEn) family. ApEn is a biased estimate, taking into account self-occurrences of patterns. Furthermore, it is inconsistent for different parameter values and time series lengths.9,17 SampEn eliminates the self-count bias, being more consistent and less dependent on series length.8 Similar to the definition of KS entropy, SampEn evaluates the rate of increase in entropy. The higher the SampEn value, the more irregular and more unpredictable the series is.
SampEn is fundamentally a regularity statistic and is not always associated with physiologic complexity.10 In a shuffled version of an HRV signal, i.e., with no time correlations, the time series complexity is lost. However, higher values of SampEn are assigned to the shuffled time series than to the original one.18 For example, Figs. 2(a) and 2(b) show that qSDiff is negative at q = 1, indicating that the classical SampEn of the signal is lower than the mean SampEn of its surrogates. Moreover, time series obtained from some pathological states, such as atrial fibrillation, are also assigned higher values of SampEn when compared to the healthy group,9 as can be seen in Table III. However, healthy systems are more capable of adapting to adverse situations, revealing long-term correlations and a more complex structure than ill systems.4
Therefore, combining nonadditive statistics and SampEn, we propose qSampEn, a generalized form of the latter. Consider a time series u(1), u(2), ..., u(n). Let x_m(i) be the set of points u from i to i + m − 1, i.e., x_m(i) = [u(i), u(i+1), u(i+2), ..., u(i+m−1)]. Recalling the definition of log_q in Eq. (3), qSampEn could be defined in two forms:

$$qSampEn'(m, r, N) = \log_q\!\left[ U^m(r) \oslash_q U^{m+1}(r) \right] = \log_q U^m(r) - \log_q U^{m+1}(r), \qquad (5)$$
$$qSampEn''(m, r, N) = \log_q\!\left[ U^m(r) / U^{m+1}(r) \right] = \log_q U^m(r) \ominus_q \log_q U^{m+1}(r), \qquad (6)$$
where

$$U^m(r) = \frac{1}{N - m} \sum_{i=1}^{N-m} U_i^m(r), \qquad (7)$$

$$U_i^m(r) = \frac{\#\{ x_m(j) : d[x_m(i), x_m(j)] \le r \}}{N - m - 1}, \qquad (8)$$
and

$$U^{m+1}(r) = \frac{1}{N - m} \sum_{i=1}^{N-m} U_i^{m+1}(r), \qquad (9)$$

$$U_i^{m+1}(r) = \frac{\#\{ x_{m+1}(j) : d[x_{m+1}(i), x_{m+1}(j)] \le r \}}{N - m - 1}. \qquad (10)$$
The generalized algebra and its properties can be found in a previous study.19 Although both Eqs. (5) and (6) could be used to define qSampEn, the latter does not bring new information about the signal, since the argument of the q-logarithm is a ratio that does not depend on q; changes over q values are then due only to the q-logarithm itself and not to the signal dynamics. Therefore, qSampEn is defined by Eq. (5). In both Eqs. (8) and (10), we must have j ≠ i to exclude self-matches. The distance function d is given by

$$d[x_m(i), x_m(j)] = \max_{k=1,\dots,m} \left( |u(i+k-1) - u(j+k-1)| \right). \qquad (11)$$
The parameter r defines the tolerance for time series pattern matching and is usually defined as a fraction of the time series standard deviation. N is the time series length and m is the pattern length to be considered. Just as for the KS entropy, SampEn measures the rate of information growth, but here for specific values of m and r. Equation (7) estimates the probability of pattern i, of length m, matching other patterns given the tolerance r. The logarithmic average over all time series patterns is then calculated, and the procedure is repeated for pattern length m + 1 [Eq. (9)]. Although there is no established method to determine optimal values of m and r, there are several studies proposing optimum values for them.20,21 Generally, for HRV time series of 100 to 5000 points, values of m = 1 or m = 2 and 0.1 < r < 0.25 are successfully used.16,17 SampEn was thus generalized through the nonadditive paradigm, replacing the classical logarithm with the q-logarithm function [see Eq. (5)].
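To make the definition concrete, the following is a minimal Python/NumPy sketch of qSampEn as given by Eq. (5) together with Eqs. (7)-(11). It is not the authors' code: the function names, the default tolerance of 0.15 times the standard deviation, and the template bookkeeping are illustrative choices.

```python
import numpy as np

def q_samp_en(u, m=2, r=None, q=1.0):
    """Sketch of qSampEn, Eq. (5): log_q U^m(r) - log_q U^(m+1)(r)."""
    u = np.asarray(u, dtype=float)
    N = len(u)
    if r is None:
        r = 0.15 * np.std(u)                  # tolerance as a fraction of the SD

    def avg_match_fraction(length):
        # U^m(r) or U^(m+1)(r), Eqs. (7)-(10): for each template x_length(i),
        # count templates x_length(j), j != i, within Chebyshev distance r, Eq. (11).
        temps = np.array([u[i:i + length] for i in range(N - m)])
        fracs = np.empty(N - m)
        for i in range(N - m):
            d = np.max(np.abs(temps - temps[i]), axis=1)         # Eq. (11)
            fracs[i] = (np.sum(d <= r) - 1.0) / (N - m - 1.0)    # exclude self-match
        return fracs.mean()

    def log_q(x):
        return np.log(x) if np.isclose(q, 1.0) else (x**(1.0 - q) - 1.0) / (1.0 - q)

    return log_q(avg_match_fraction(m)) - log_q(avg_match_fraction(m + 1))
```

In the limit q → 1, the q-logarithm reduces to the natural logarithm and the function returns the classical SampEn.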
III. ANALYSIS METHODS AND DATA

In all analyses, the qSampEn parameters were set to m = 2 and r = 0.15 times the signal standard deviation. To evaluate the influence of the entropic index on qSampEn, we defined qSDiff, calculated as the difference between the qSampEn of a time series and the average qSampEn of its surrogate series:
1. For each given time series, 100 surrogate series are generated. Each surrogate series is generated by shuffling the original signal. An alternative surrogate data analysis22 was also evaluated and presented very similar results.
2. qSampEn is calculated for each surrogate series and their mean qSampEn is computed (qSampEn_surr).
3. qSampEn is calculated for the original time series (qSampEn_orig).
4. qSDiff is defined as qSDiff = qSampEn_orig − qSampEn_surr.
qSDiff was obtained as a function of the entropic index q, so that we can assess the contribution of this index to SampEn (a minimal code sketch of this procedure is given after the list of simulated signals below). Experiments with qSDiff were performed with both simulated and real time series. The former were generated from the discrete chaotic maps and noise generators described below.
• Gaussian white noise: a series of uncorrelated (random) values with a Gaussian distribution.
• 1/f noise (pink noise): a series whose power spectrum follows the function 1/f.
• Logistic map:23 x_{n+1} = a x_n (1 − x_n)
• Henon map:24 x_{n+1} = y_n + 1 − a x_n², y_{n+1} = b x_n
• Cubic map:25 x_{n+1} = a x_n (1 − x_n²)
• Spence map:25 x_{n+1} = |ln(x_n)|
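The sketch below illustrates steps 1-4 of the procedure above under the same assumptions as the previous sketch (shuffled surrogates, tolerance of 0.15 times the SD); names and the usage values are illustrative, not the authors' code. Since the match fractions U^m(r) and U^{m+1}(r) do not depend on q, they are computed once per series and only the q-logarithm is re-evaluated over the q grid.

```python
import numpy as np

def q_sdiff(u, q_values, m=2, n_surrogates=100, seed=0):
    """qSDiff(q) = qSampEn of the original series minus the mean qSampEn
    of its shuffled surrogates, evaluated over a grid of q values."""
    rng = np.random.default_rng(seed)
    u = np.asarray(u, dtype=float)
    r = 0.15 * np.std(u)                  # shared tolerance (shuffling keeps the SD)

    def log_q(x, q):
        return np.log(x) if np.isclose(q, 1.0) else (x**(1.0 - q) - 1.0) / (1.0 - q)

    def match_fractions(series):
        # Returns (U^m(r), U^(m+1)(r)) as in Eqs. (7)-(10).
        N = len(series)
        out = []
        for length in (m, m + 1):
            temps = np.array([series[i:i + length] for i in range(N - m)])
            fr = [(np.sum(np.max(np.abs(temps - t), axis=1) <= r) - 1.0) / (N - m - 1.0)
                  for t in temps]
            out.append(np.mean(fr))
        return out

    um_orig, um1_orig = match_fractions(u)                                     # step 3
    surr = [match_fractions(rng.permutation(u)) for _ in range(n_surrogates)]  # steps 1-2
    curve = []
    for q in q_values:
        orig = log_q(um_orig, q) - log_q(um1_orig, q)
        surr_mean = np.mean([log_q(a, q) - log_q(b, q) for a, b in surr])
        curve.append(orig - surr_mean)                                         # step 4
    return np.array(curve)

# Illustrative usage: a chaotic logistic-map series (a = 4.0) on a coarse q grid
x = np.empty(3000); x[0] = 0.4
for n in range(len(x) - 1):
    x[n + 1] = 4.0 * x[n] * (1.0 - x[n])
curve = q_sdiff(x, np.arange(-2.0, 2.01, 0.1), n_surrogates=10)
```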
Stochastic processes are included in this procedure for method performance evaluation, as part of the simulated dataset, to demonstrate that the proposed method is able to differentiate low-dimensional deterministic chaos from stochastic processes. We believe that the ability to distinguish these two situations is due to the use of surrogate series in the method's formulation.
All simulated signals were generated with 16 384 samples (2¹⁴). The discrete map parameters were set to the chaotic regime: a = 1.4, b = 0.3 for the Henon map, a = 3.0 for the cubic map, and a = 4.0 for the logistic map. In addition, the logistic map was also used to create a period-2 series (a = 3.2), resulting in a total of seven simulated signals.
Real data consist of HRV signals from three groups, all obtained from the Physionet database:5 70 healthy subjects (34 men, aged 54.85 ± 16.15, and 36 women, aged 53.72 ± 16.35), of whom 18 were obtained from the MIT-BIH Normal Sinus Rhythm Database and 52 from the Normal Sinus Rhythm RR Interval Database, both with data sampled at 128 Hz; 44 congestive heart failure (CHF) patients (19 men, aged 55.72 ± 11.94, 6 women, aged 55.67 ± 9.16, and 19 subjects of unknown gender, aged 55.26 ± 12.13), 15 of them obtained from the BIDMC Congestive Heart Failure Database, where the signal was sampled at 250 Hz, and the remaining 29 subjects from the Congestive Heart Failure RR Interval Database, with the signal sampled at 128 Hz; and 11 patients with AF (with unreported age and gender) obtained from the MIT-BIH Atrial Fibrillation Database, acquired with a sampling rate of 250 Hz.
All time series were obtained considering normal beats only, according to the annotation files accompanying the signals. Time series included in this study were truncated to 2 × 10⁴ beats, considering only awake time. In addition, the time series were processed to exclude artefacts and missed beat detections. For each signal, its baseline was calculated with a sliding windowed average of 2000 points, and values greater or lower than a percentage of this baseline were removed. At this processing stage, 22 signals had points removed, and the maximum number of removals in a signal was 20.
We extracted three parameters from the qSDiff curves and compared them to four standard HRV measurements: SDNN (standard deviation of RR intervals), triangular index (a geometric measurement based on the time series distribution), RMSSD (square root of the mean squared differences of successive RR intervals), and pNN50 (percentage of pairs of successive RR intervals that differ by more than 50 ms). Detailed definitions of each one can be found in a previous study.6 The Kruskal-Wallis nonparametric one-way analysis of variance was used to test the significance among HRV groups. Values of p < 0.01 were considered significant.
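The sketch below illustrates the preprocessing and the standard time-domain measures described above, assuming RR intervals in milliseconds. The text does not report the rejection percentage applied around the 2000-point baseline, so the tolerance used here is a hypothetical placeholder; the triangular index is omitted for brevity, and the group comparison uses scipy.stats.kruskal.

```python
import numpy as np
from scipy.stats import kruskal

def remove_artefacts(rr, window=2000, tol=0.3):
    """Drop beats deviating from a sliding 2000-point baseline by more than a
    fraction `tol`; tol = 0.3 is an illustrative value, not the authors' choice.
    Edge effects of the zero-padded moving average are ignored in this sketch."""
    rr = np.asarray(rr, dtype=float)
    baseline = np.convolve(rr, np.ones(window) / window, mode="same")
    return rr[np.abs(rr - baseline) <= tol * baseline]

def time_domain_measures(rr_ms):
    """Standard time-domain HRV measures used for comparison in the text."""
    rr_ms = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr_ms)
    return {
        "SDNN": np.std(rr_ms, ddof=1),                  # SD of RR intervals (ms)
        "RMSSD": np.sqrt(np.mean(diff**2)),             # RMS of successive differences (ms)
        "pNN50": 100.0 * np.mean(np.abs(diff) > 50.0),  # % of successive diffs > 50 ms
    }

# Group comparison with the Kruskal-Wallis test (p < 0.01 taken as significant);
# healthy, chf, af would hold one value per subject (e.g., qSDiff_max):
# h_stat, p_value = kruskal(healthy, chf, af)
```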
IV. RESULTS AND DISCUSSION

Figure 2(a) shows the qSDiff curves for the simulated signals and Fig. 2(b) for the HRV time series in the q range [−2, 2],
which was empirically determined to be the optimal interval. For the HRV signals, the curves represent the mean qSDiff ± standard error of each group.
The results show a very interesting profile. For negative q values, qSDiff is close to zero. As q increases, qSDiff reaches a maximum for 0 < q < 1 and then decreases abruptly. This pattern is not present for random and periodic time series, and therefore it can be a marker of complexity. The maximum value of qSDiff versus q is a good descriptor of the signal dynamics. In the classical limit (q → 1), qSDiff is negative, indicating a higher qSampEn for the surrogate series than for the original series. As stated above, this is a limitation of SampEn as a complexity measurement, because an uncorrelated version of the signal cannot be more complex than the original one. However, for some q values lower than 1, we note that the opposite occurs: higher qSampEn is found for the original series than for its surrogates. In addition, qSDiff showed that AF series are closer to uncorrelated dynamics than healthy and CHF ones.
It is worth noting that white noise presented two curve profiles. In some simulations, after reaching a maximum, its qSDiff curves decreased monotonically with q, similar to the graphics in Fig. 2(a); in others, its qSDiff curves reached a negative minimum and increased afterwards. In both cases, the maximum and minimum absolute values are very low compared to the results for the other signals (see Table I). There is no statistical difference between white noise and its surrogate series, so the expected result for its qSDiff is zero. However, since we deal with finite series, qSDiff might detect small differences between them, indicating that qSampEn interchangeably detects white noise and its surrogates as slightly more complex than one another at different instances.
To compare and also quantitatively distinguish the results, we extract three parameters from the qSDiff curves: the maximum qSDiff value (qSDiff_max), the q value at the maximum qSDiff (q_max), and the q value at which qSDiff = 0 (q_zero), i.e., where qSampEn is equal for the original and surrogate series. These parameters for the simulated signals are presented in Table I. As the period-2 logistic map qSDiff curve decreases monotonically, the three parameters do not apply to this signal. Results for white noise, as mentioned above, presented two qSDiff profiles; for illustration, Table I shows values for the case in which its qSDiff curve had a maximum instead of a minimum. Disregarding white noise, the Henon map obtained the lowest values for all three evaluated parameters, the cubic map presented the highest qSDiff_max, and 1/f noise the highest q_max and q_zero values.
The parameters were also calculated for the HRV groups, and mean values are shown in Table II. It is important to note that these values would be different if taken directly from the mean qSDiff curves. For example, the maximum entropy difference (qSDiff_max) of healthy subjects occurs at different q values (q_max) and not always at q = 0.30 [see Fig. 2(b)]: q_max fluctuates around 0.33 for the healthy group, whereas qSDiff_max occurs at q = 0.30 for the mean qSDiff curve. The same is observed for the CHF and AF groups.
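A minimal sketch of how the three descriptors can be read off a sampled qSDiff curve follows. The paper does not state how q_zero is located on the q grid, so the linear interpolation between the bracketing grid points below is our assumption.

```python
import numpy as np

def qsdiff_parameters(q_values, qsdiff):
    """Return (qSDiff_max, q_max, q_zero) from a qSDiff curve sampled on q_values."""
    q_values = np.asarray(q_values, dtype=float)
    qsdiff = np.asarray(qsdiff, dtype=float)
    i_max = int(np.argmax(qsdiff))
    qsdiff_max, q_max = qsdiff[i_max], q_values[i_max]

    q_zero = np.nan                    # undefined when the curve never crosses zero
    for i in range(i_max, len(qsdiff) - 1):
        if qsdiff[i] > 0.0 >= qsdiff[i + 1]:
            frac = qsdiff[i] / (qsdiff[i] - qsdiff[i + 1])   # linear interpolation
            q_zero = q_values[i] + frac * (q_values[i + 1] - q_values[i])
            break
    return qsdiff_max, q_max, q_zero
```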
FIG. 2. qSDiff plotted versus the q parameter for (a) the simulated series and (b) the HRV signals. qSDiff is the difference between the qSampEn of the original time series and the average qSampEn of the surrogate series estimated from the original time series. For each signal, 100 surrogate series were generated. All simulated signals were generated with 16 384 points. The curves of the HRV signals represent the mean qSDiff ± standard error of each group (SE = SD/√n, where n is the number of subjects).
Healthy and CHF groups presented very close results. Comparing healthy and CHF to the AF group, the major difference concerns the qSDiff_max parameter. The AF group presented lower values for this parameter, which indicates that, even for q = q_max, AF series are closer to uncorrelated dynamics than the other groups, pointing to a loss of complexity. This reinforces the idea that the statistical properties of AF resemble uncorrelated noise,9 although its three parameters are quite different from those of white noise, which suggests that, despite losing complexity, AF series still exhibit some regulatory mechanism.
The significance of the qSDiff_max, q_max, and q_zero values for the HRV groups was tested with the Kruskal-Wallis analysis. For q_max and q_zero, there was no significant difference between groups. For qSDiff_max, we obtained p = 3.02 × 10⁻⁸ for the test with the three HRV groups at once. Comparing in pairs healthy to CHF, healthy to AF, and CHF to AF, we obtained p = 5.10 × 10⁻³, p = 1.11 × 10⁻⁷, and p = 5.50 × 10⁻⁶, respectively. Therefore, in all situations, the differences in qSDiff_max values are significant among the HRV groups (p < 0.01).
To compare qSDiff_max to standardized HRV measurements, we calculated four time-domain variables for the three evaluated groups: SDNN, triangular index, RMSSD, and pNN50. For 24 h (long-term) signals, these variables are correlated with the frequency-domain components: the first two are related to total power and the last two to high frequency.6

TABLE I. Values of the three parameters extracted from the qSDiff curves of the simulated signals: the maximum qSDiff value (qSDiff_max), the q value at the maximum qSDiff (q_max), and the q value at which qSDiff = 0 (q_zero).

Simulated signal                qSDiff_max (×10⁻²)   q_max   q_zero
Gaussian white noise(a)         0.090                1.05    1.20
1/f noise                       1.576                0.50    0.67
Logistic map (a = 3.2)          …                    …       …
Logistic map (a = 4.0)          1.718                0.20    0.42
Henon map (a = 1.4, b = 0.3)    0.611                0.05    0.27
Cubic map (a = 3.0)             2.273                0.30    0.53
Spence map                      1.849                0.15    0.40

(a) These values were taken for the case in which the qSDiff curve had a maximum.
In addition to these variables, the classical SampEn was also calculated. The results are shown in Table III. For all the variables shown in Table III, we found significant differences between groups. However, one can note that for all of them but qSDiff_max, the AF group presented the highest values. More than distinguishing the groups, qSDiff_max was able to rank them according to the level of physiologic complexity they exhibit in the signal.
Results from the simulated signals showed that a simple periodic time series does not show the same profile observed for chaotic series. To verify whether these parameters can be used to distinguish chaotic from periodic dynamics, the three parameters were calculated for the logistic map within the range 2.8 < a < 4.0 with step 5 × 10⁻⁴, where there are different dynamical regimes and also different levels of chaos. For each value of a, a 2¹⁴-point time series was generated and qSDiff_max, q_max, and q_zero were calculated. The parameters were plotted and compared to the logistic map Lyapunov exponents in the same range. These graphics are shown in Figure 3. When a logistic map qSDiff curve did not have a peak, e.g., when a = 3.2, which can be seen in Fig. 2(a), the value −2.0 was assigned to q_max; in such cases qSDiff_max is negative and there is no associated q_zero value.
As we know, chaos occurs when the system's Lyapunov exponent is positive (λ > 0), and the greater its value, the greater the sensitivity to initial conditions in that system. Although no clear rule to identify chaos from the parameters extracted from qSDiff was observed, they showed consistency with the increasing sensitivity to initial conditions.

TABLE II. Mean values ± standard deviation of the three parameters extracted from the qSDiff curves of the HRV series: the maximum qSDiff value (qSDiff_max), the q value at the maximum qSDiff (q_max), and the q value at which qSDiff = 0 (q_zero). The differences in qSDiff_max are significant (p < 0.01).

Group     qSDiff_max ± SD (×10⁻²)   q_max         q_zero
Healthy   2.016 ± 0.371             0.33 ± 0.11   0.53 ± 0.08
CHF       1.757 ± 0.506             0.33 ± 0.12   0.53 ± 0.09
AF        0.290 ± 0.197             0.32 ± 0.11   0.51 ± 0.10
TABLE III. Mean values ± standard deviation of the HRV measurements SDNN, triangular index, RMSSD, pNN50, SampEn, and qSDiff_max, calculated for the three HRV groups. All variables presented significant differences between groups (p < 0.01).

Group     SDNN             Triang. index    RMSSD            pNN50           SampEn        qSDiff_max (×10⁻²)
Healthy   72.95 ± 19.93    18.86 ± 6.67     24.63 ± 11.06    5.38 ± 6.09     1.08 ± 0.35   2.016 ± 0.371
CHF       42.57 ± 22.20    11.32 ± 6.55     18.12 ± 16.88    2.75 ± 8.50     1.19 ± 0.39   1.757 ± 0.506
AF        148.59 ± 37.63   33.86 ± 14.26    191.99 ± 56.48   69.29 ± 10.89   1.90 ± 0.43   0.290 ± 0.197
FIG. 3. Values of (a) qSDiff_max, (b) q_max, and (c) q_zero for the logistic map, plotted versus its parameter a, ranging from 3.5 to 4.0 with step 5 × 10⁻⁴. For each value of a, a 2¹⁴-point series was generated and the three parameters were calculated. For comparison, the logistic map Lyapunov exponents are shown in (d) for the same range. Plot (d) is an adapted version originally obtained with the Chaos for JAVA software (CFJChaos).26
Moreover, at the islands of stability, where for a certain range of the control parameter the dynamics jumps back from the chaotic to the periodic regime, the qSDiff curves did not present a peak. The three parameter plots, although different, showed a profile very similar to that of the Lyapunov exponents, indicating that they can be related to chaos. Qualifying a signal as chaotic or not still generates controversy in nonlinear science, which led to a special issue of Chaos discussing the presence of chaos in heart rate variability signals.27
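For reference, the logistic-map Lyapunov exponents used for comparison in Fig. 3(d) can be estimated as the orbit average of ln|a(1 − 2x_n)|. The sketch below reproduces such a sweep; the grid, orbit length, and initial condition are illustrative.

```python
import numpy as np

def logistic_lyapunov(a, n=5000, burn=500, x0=0.4):
    """Lyapunov exponent of x_{n+1} = a*x_n*(1 - x_n) as the mean of ln|f'(x_n)|."""
    x = x0
    for _ in range(burn):                        # discard the transient
        x = a * x * (1.0 - x)
    acc = 0.0
    for _ in range(n):
        acc += np.log(abs(a * (1.0 - 2.0 * x)))  # ln|f'(x)| = ln|a(1 - 2x)|
        x = a * x * (1.0 - x)
    return acc / n

# Sweep of the control parameter, analogous to Fig. 3(d)
a_grid = np.arange(3.5, 4.0, 5e-4)
lyap = np.array([logistic_lyapunov(a) for a in a_grid])
chaotic = a_grid[lyap > 0.0]                     # values of a with a positive exponent
```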
V. CONCLUSIONS

The generalized form proposed here seems to better extract the physiologic complexity from time series. In the classical approach (q = 1), the qSampEn values associated with the surrogate series are greater than those of the original time series, pointing to the limitation of SampEn as a complexity measurement. However, for approximately 0 < q < 1, qSDiff is positive, indicating that greater values of qSampEn are associated with the original signal than with its surrogate series. Furthermore, within this range there is a specific q value (q_max) where the entropy difference qSDiff is maximal, being, on average, 0.32 for the AF group and 0.33 for the healthy and CHF groups. This value varies for the simulated signals.
In addition, qSampEn makes it possible to identify the AF group as the group with the lowest complexity dynamics. The AF group showed the lowest qSDiff values among the HRV groups, indicating that its time series are less complex and more similar to uncorrelated noise. Despite that, the three parameters calculated for this group are quite different from those obtained for white noise. These findings are consistent with the concept of loss of physiologic complexity in AF, indicating that HRV regulation in this group is more affected by random components. The qSDiff_max evaluated here confirms this characteristic. Although cardiovascular regulation is degraded in AF, there still exist remaining regulatory mechanisms, measured by qSDiff_max, which are different from purely white noise.
In summary, qSampEn is a metric more consistent with the concept of physiologic complexity for 0 < q < 1. qSDiff_max can be a useful measurement to assess physiologic complexity; it distinguished the different groups and levels of complexity. In addition, the three parameters extracted from qSDiff (qSDiff_max, q_max, and q_zero) might also be associated with the presence of chaos. However, this link with chaos, as well as the physiological interpretation of these parameters, must be investigated in further studies.
We evaluated this method retrospectively, using a reliable public-access database over whose acquisition we have no control. Several records have no information about patient age and gender, which can significantly change HRV. As a future study, we suggest a prospective controlled acquisition to address this limitation.
The concept of complexity is ubiquitous in nonlinear dynamics and especially in physiological systems. This study showed that an irregularity statistic consistent with the nonadditive formalism can provide additional information about complex systems. The complexity level of a time series is an important measurement to assess the state of the system, and the state characterization of physiological systems, e.g., the cardiovascular system, can be supported by the method proposed here. Although HRV is a representative example of physiologic complexity, qSDiff still deserves validation for other sources of biological signals.

ACKNOWLEDGMENTS
The authors would like to thank Professor Alexandre S. Martinez and colleagues Antonio C. S. S. Filho and Isaías J. A. Soares for valuable reviewing and discussions. We also thank the CAPES and FAPESP agencies for their financial support.
1. M. Baranger, "Chaos, complexity, and entropy: A physics talk for non-physicists," New England Complex Systems Institute, 2001, see http://necsi.edu/projects/baranger/cce.pdf.
2. N. Boccara, Modeling Complex Systems (Springer-Verlag, New York, 2004).
3. C. Tsallis, "Possible generalization of Boltzmann-Gibbs statistics," J. Stat. Phys. 52, 479-487 (1988).
4. A. L. Goldberger, C.-K. Peng, and L. A. Lipsitz, "What is physiologic complexity and how does it change with aging and disease?," Neurobiol. Aging 23, 23-26 (2002).
5. A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley, "PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals," Circulation 101, e215-e220 (2000).
6. M. Malik, J. T. Bigger, A. J. Camm, R. E. Kleiger, A. Malliani, A. J. Moss, and P. J. Schwartz, "Heart rate variability: Standards of measurement, physiological interpretation, and clinical use," Circulation 93, 1043-1065 (1996).
7. A. Voss, S. Schulz, R. Schroeder, M. Baumert, and P. Caminal, "Methods derived from nonlinear dynamics for analysing heart rate variability," Philos. Trans. R. Soc. London, Ser. A 367, 277-296 (2009).
8. J. S. Richman and J. R. Moorman, "Physiological time-series analysis using approximate entropy and sample entropy," Am. J. Physiol. Heart Circ. Physiol. 278, H2039-H2049 (2000).
9. M. Costa, A. L. Goldberger, and C.-K. Peng, "Multiscale entropy analysis of biological signals," Phys. Rev. E 71, 021906 (2005).
10. S. M. Pincus and A. L. Goldberger, "Physiological time-series analysis: What does regularity quantify?," Am. J. Physiol. Heart Circ. Physiol. 266, H1643-H1656 (1994).
11. J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and J. Doyne Farmer, "Testing for nonlinearity in time series: The method of surrogate data," Physica D 58, 77-94 (1992).
12. C. Tsallis, Introduction to Nonextensive Statistical Mechanics (Springer, 2009).
13. Y. B. Pesin, "Characteristic Lyapunov exponents and smooth ergodic theory," Russ. Math. Surveys 32, 55 (1977).
14. P. Grassberger and I. Procaccia, "Estimation of the Kolmogorov entropy from a chaotic signal," Phys. Rev. A 28, 2591-2593 (1983).
15. J. P. Eckmann and D. Ruelle, "Ergodic theory of chaos and strange attractors," Rev. Mod. Phys. 57, 617-656 (1985).
16. S. M. Pincus, "Approximate entropy as a measure of system complexity," Proc. Natl. Acad. Sci. U.S.A. 88, 2297-2301 (1991).
17. D. E. Lake, J. S. Richman, M. P. Griffin, and J. R. Moorman, "Sample entropy analysis of neonatal heart rate variability," Am. J. Physiol. Regulatory Integrative Comp. Physiol. 283, R789-R797 (2002).
18. M. D. Costa, C.-K. Peng, and A. L. Goldberger, "Multiscale analysis of heart rate dynamics: Entropy and time irreversibility measures," Cardiovasc. Eng. 8, 88-93 (2008).
19. E. P. Borges, "A possible deformed algebra and calculus inspired in nonextensive thermostatistics," Physica A 340, 95-101 (2004).
20. C. Y. Liu, C. C. Liu, P. Shao, L. P. Li, X. Sun, X. P. Wang, and F. Liu, "Comparison of different threshold values r for approximate entropy: Application to investigate the heart rate variability between heart failure and healthy control groups," Physiol. Meas. 32, 167-180 (2011).
21. S. Ramdani, B. Seigle, J. Lagarde, F. Bouchara, and P. L. Bernard, "On the use of sample entropy to analyze human postural sway data," Med. Eng. Phys. 31, 1023-1031 (2009).
22. T. Schreiber and A. Schmitz, "Improved surrogate data for nonlinearity tests," Phys. Rev. Lett. 77, 635-638 (1996).
23. R. M. May, "Simple mathematical models with very complicated dynamics," Nature 261, 459-467 (1976).
24. M. Henon, "A two-dimensional mapping with a strange attractor," Commun. Math. Phys. 50, 69-77 (1976).
25. J. C. Sprott, Chaos and Time-Series Analysis (Oxford University Press, 2003).
26. B. Davies, Exploring Chaos: Theory and Experiment (Westview, 2004).
27. L. Glass, "Introduction to controversial topics in nonlinear science: Is the normal heart rate chaotic?," Chaos 19, 028501 (2009).