Quantification of hormone pulsatility via an approximate entropy ...

Quantification of hormone pulsatility via an approximate entropy algorithm

STEVEN M. PINCUS AND DAVID L. KEEFE

Department of Obstetrics and Gynecology, Yale University School of Medicine, New Haven, Connecticut 06510

Pincus, Steven M., and David L. Keefe. Quantification of hormone pulsatility via an approximate entropy algorithm. Am. J. Physiol. 262 (Endocrinol. Metab. 25): E741-E754, 1992.-Approximate entropy (ApEn) is a recently developed formula to quantify the amount of regularity in data. We examine the potential applicability of ApEn to clinical endocrinology to quantify pulsatility in hormone secretion data. We evaluate the role of ApEn as a complementary statistic to widely employed pulse-detection algorithms, represented herein by ULTRA, via the analysis of two different classes of models that generate episodic data. We conclude that ApEn is able to discern subtle system changes and to provide insights separate from those given by ULTRA. ApEn evaluates subordinate as well as peak behavior and often provides a direct measure of feedback between subsystems. ApEn generally can distinguish systems given 180 data points and an intra-assay coefficient of variation of 8%. This suggests ApEn as applicable to clinical hormone secretion data within the foreseeable future. Additionally, the models analyzed and extant clinical data are both consistent with episodic, not periodic, normative physiology.

formula; regularity; endocrine systems

IN THE PAST 15 YEARS, endocrinologists have determined that episodic hormone secretion is a widespread phenomenon (2, 7-9, 11, 16, 19, 31, 32, 39). The discovery of the link between abnormal pulsatility and certain hormonal disorders (15, 36) has prompted the recognition that a greater understanding of hormone secretion patterns, statistics to analyze hormone secretion data, and underlying system models could be of keen importance. To date, a number of pulse-identification algorithms have been developed to analyze hormone level data (3, 17, 20, 24, 30, 34, 35). These methods have been useful in detecting abnormal secretory patterns in some instances, and the expectation is that refined versions of these algorithms, applied to increasingly accurate and numerous data, will detect further abnormalities in hormonal secretion earlier in the course of disease.

In an entirely different setting, the entropy of a transformation was defined to quantify a measure of apparent randomness or unpredictability given by the transformation (14). Formulas have recently been developed to compute this entropy for time-series data (12, 27) for deterministic, chaotic systems. We ask whether these formulas, or some variant, can be applied to more general time series to quantify the regularity of the series, and with what data requirements. If such a formula were applicable to hormone secretion data, it also could potentially detect abnormal hormonal secretion. It appears most likely that the formulas to compute entropy provide little hope for applicability to endocrine data for reasons discussed below. However, a related measure, approximate entropy (ApEn) (21, 22), holds significant promise as a statistic potentially applicable to hormone secretion data. In this paper, we examine the applicability of ApEn to endocrine data and the role of ApEn as a complementary statistic to the current pulse-identification algorithms.

Given the presence of a nontrivial amount of noise, there are two steps in performing pulse analysis. The first is separating the "true" secretion time series from the noise. The second is evaluating the resulting true time series. Although these two steps are typically commingled in each algorithm, we expect a complementarity between ApEn and the pulse-identification algorithms due to their different approaches to the second step. ApEn summarizes the time series by a single number, whereas the pulse-identification algorithms identify peak occurrences and amplitudes. ApEn will discern changes in underlying episodic behavior that are not reflected in changes in peak occurrences or amplitudes, whereas the pulse-identification algorithms ignore such information.

Implicit to current models of hormone release is a periodicity assumption, with deviations attributed to "noise." In this paper, we present two models capable of generating by themselves episodic, but not periodic, data. In each model, we have several parameters that we vary to generate a variety of data sets. For each model, we evaluate the ability of ApEn and a widely used pulse-identification algorithm, ULTRA (30), to distinguish among the data sets generated by these models. We do not suggest that these models represent known physiological systems but rather offer them as representative of alternative hypotheses to be considered when explaining observed episodic hormonal secretion.
The present focus is not to propose a model that best mimics physiological reality but rather to propose a new use of a statistic that gives different insights than pulse-counting algorithms do.

Glossary

chaos: Aperiodic, seemingly random behavior in a bounded, deterministic system that exhibits sensitive dependence on initial conditions

entropy, K-S entropy: Concept that addresses system randomness and predictability, with greater entropy often associated with more randomness and less system order. There are many different entropy formulations in physics, information theory, and other branches of mathematics, both of a metric (distance) nature and of a probabilistic nature. Most entropy formulas contain a logarithmic weighting, and confusingly, not all entropy definitions can be related to each other. K-S entropy is a particular entropy measure, developed by Kolmogorov and Sinai and applicable to processes, that allows one to classify deterministic systems by rates of information generation

episodic: Behavior of a variable that qualitatively exhibits apparently cyclical behavior with time. We distinguish this from periodic, where there is a fixed time interval (period) over which the variable repeats itself exactly

phase space plot: Graph of state space that has as its axes either different system variables or various combinations of states of the system

Poincare section: Sequence of points in phase space, typically generated stroboscopically as trajectories cross some region of interest. This often allows closer scrutiny of relation of trajectories to one another

reconstructed dynamics: Procedure by which an output scalar time series u(1), u(2), u(3), . . . , is converted into a sequence of vectors x(1), x(2), x(3), . . . , in R^m, real m-dimensional space, defined by x(i) = [u(i), . . . , u(i + m - 1)]. This allows examination of structure and correlation of {x(i), i > 0}, often by dimension and entropy algorithms

regularity: Tendency that patterns within data recur in exactly same manner throughout the data

stochastic: Nondeterministic; random from a probabilistic point of view

0193-1849/92 $2.00 Copyright © 1992 the American Physiological Society

EPISODIC HORMONE SECRETION

Episodic or pulsatile secretion of hormones is an increasingly general finding in endocrinology. With the availability of sensitive radioimmunoassays, which require only small sample volumes, protocols employing frequent sampling became possible. Furthermore, methods that help distinguish assay noise from biological variability make pulse detection a more rigorous endeavor. Studies employing such techniques in humans and diverse animal species have characterized pulsatile secretion of a large number of hormones, including luteinizing hormone (LH) (39), insulin (16), progesterone (7), glucagon (11), growth hormone (8), adrenocorticotropic hormone (32), cortisol (31), prolactin (2), aldosterone (9), and human chorionic gonadotropin (19).

Elucidating the secretory patterns of hormone release has not only shed light on endocrine physiology but also clarified the pathophysiology and improved the treatment of some diseases. For example, derangement in the episodic secretion of LH underlies some common disorders in humans, such as polycystic ovary syndrome (36) and hypogonadotropic hypogonadism (4). Administration of LH-releasing hormone (LHRH) in a periodic fashion, designed to produce a normal LH secretory pattern, improved the pharmacological therapy of these disorders (4). Similarly, elucidation of pulsatile insulin secretion in normal subjects (15) laid the groundwork for the discovery of abnormal insulin secretory patterns in diabetics (15) and improved the efficacy of insulin replacement therapy by administration of the hormone in a periodic fashion (4).

CURRENT PULSE-IDENTIFICATION ALGORITHMS

The tools currently employed by endocrinologists to analyze the pulsatility of hormone secretion data fall under the aegis of peak-identification algorithms. The philosophy of these methods is to identify the "true" peaks in the data, distinct from apparent peaks generated by the random variations due to assay imprecision. Once these true peaks are identified, one can then determine normal and abnormal ranges of pulse frequency, amplitude, and duration and hence potentially identify abnormal secretion. There are considerable differences among the algorithms due to a variety of approaches in handling the intra-assay noise. This intra-assay variation typically has a coefficient of variation (CV) of between 6 and 14% (e.g., see Ref. 10), an amount of noise that can in some instances make true peak detection very difficult. Nonetheless, it appears that for all of these algorithms in the absence of noise, 1) one achieves identical peak detection, and 2) changes in subordinate patterns that do not result in new or altered peaks are apparently ignored.

The following eight pulse-detection programs are among those most widely available and extensively employed: Santen and Bardin (24), modified Santen and Bardin, ULTRA, PULSAR (17), cycle detector (3), regional dual threshold (35), cluster (34), and detect (20) [see Urban et al. (29) for descriptions and comparisons of these programs]. The similarity of the pulse-identification algorithms in the presence of negligible noise, the apparent relative robustness to nontrivial CVs, the usefulness with 50-200 data points, and the philosophy of peak analysis as the means to evaluate pulsatility bond this class of algorithms together. We have chosen ULTRA as representative of these algorithms in performing the comparisons with ApEn below. Dr. E. Van Cauter, developer of ULTRA, provided us with the computer code for this algorithm. We expect that another choice of pulse-detection algorithm, for the purpose of comparison with ApEn, would give quite similar conclusions.

On the basis of published time series of hormonal concentration levels, it appears that there is the need for an added dimension in the analysis of episodic hormone release beyond monitoring the pulse count and related statistics. Lang et al. (15) conclude that brief irregular oscillations in plasma insulin levels in maturity-onset diabetics are superimposed on longer term oscillatory fluctuations commonly observed in the nondiabetic. A quantification of the regularity of these data, which ApEn provides, seems relevant to distinguishing the diabetic's insulin secretion patterns from those of the nondiabetic. Furthermore, episodic variation in hormones often has revealed complex patterns, challenging existing programs to characterize and then differentiate the "diseased" pattern from the healthy one. Finally, frequency distributions of discrete LH pulse properties, given by Urban et al. (29) and based on nearly 200 pulses, show significantly non-Gaussian distributions for both pulse frequencies and amplitudes. The asymmetry of these distributions is not consonant with the typical assumption of periodic pulses in the presence of symmetrically distributed noise. One thus either concludes a lack of periodicity in these LH pulses or at least must entertain the possibility of such aperiodicity in constructing algorithms to analyze such series.

APPROXIMATE ENTROPY

ApEn is a statistic that assigns a nonnegative number to a time series, quantifying a notion of regularity of the data (21, 22). We contrast two time series to illustrate what we are trying to measure. Series 1 is given as 90, 70, 90, 70, 90, 70, 90, 70, 90, 70, 90, 70, 90, 70, 90, 70, . . . . This long-term series alternates 90 and 70 in sequence. Series 2 is given as 90, 70, 70, 90, 90, 90, 70, 70, 90, 90, 70, 90, 70, 70, 90, 70, . . . . Each term in this series has a value of 90 or 70, randomly chosen with probability 1/2 of either. Moment statistics, such as mean and variance, will not distinguish between these two series, nor will rank order statistics, such as the median. However, series 1 is "perfectly regular"; knowing that one term has the value 90 allows one to predict with certainty that the next term will have value 70. Series 2 is randomly valued; knowing that one term has the value 90 gives no insight into whether the next term will have value 90 or 70. Both K-S entropy and ApEn distinguish between these series, with entropy and ApEn having a value of 0 for series 1, and a value of ln 2 for series 2.

The algorithm that defines ApEn (APPENDIX) not only allows one to distinguish between such obviously different series but also to determine subtler differences in regularity. It is this property that promises the potential significance examined in this paper. Heuristically, ApEn measures the (logarithmic) likelihood that runs of patterns that are close remain close on next incremental comparisons (21). This quantification is performed over all subpatterns of the data, both those that contain peaks and those that are subordinate.

The historical development of mathematics to quantify regularity has centered around various types of entropy measures (see Ref. 22 for a brief discussion). There has been keen interest in the development of related formulas (12, 27) in the last 10 years, since entropy has been shown to be a parameter that characterizes chaotic behavior (5). These Grassberger-Procaccia and Takens formulas, however, were developed with chaos applications in mind and cannot be sensibly applied to arbitrary, possibly stochastic, medium-sized time-series data sets (21). The crucial difficulty is that hormone secretion data are relatively few in number (at most, several hundred data points), whereas an accurate entropy calculation for an underlying system of dimension d typically requires from $10^d$ to $30^d$ data points (37). This is key, since there is no reason to anticipate, and no evidence to show, that data typically encountered from such complex interacting systems of glands and hormones that form endocrine systems be low dimensional. Furthermore, we can hardly assume that hormonal secretion is correctly modeled by deterministic chaos, as opposed to a stochastic model.

ApEn was constructed along thematically similar lines to the K-S entropy, although with a different focus: to provide a widely applicable formula for the data analyst that will distinguish data sets by a measure of regularity.[1] The intuition motivating ApEn is that if joint probability measures for reconstructed dynamics that describe each of two systems are different, then their marginal distributions on a fixed partition are likely different. In contrast, the K-S entropy, which the Grassberger and Takens formulas are based on, was developed by Kolmogorov (14) to resolve the theoretical mathematical question of whether two Bernoulli shifts are isomorphic and is primarily applied by ergodic theorists to well-defined transformations with no noise and an infinite amount of "data" available.

The ApEn formula requires that two input parameters, m and r, be set; m is the "length" of compared runs, and r is effectively a filter. It must be emphasized that m and r, as given in Eq. A3, are fixed for a given application of ApEn. ApEn values can vary significantly with m and r for a given system. A valuable property of ApEn is that it is finite for stochastic processes, whereas K-S entropy is usually infinite; thus ApEn can potentially distinguish versions of stochastic processes from each other, whereas entropy would be unable to do so.

The input data for ApEn is a scalar time series, with typically between 100 and 5,000 numbers. Fewer than 100 numbers will likely yield a less meaningful computation, especially for m = 2 or m = 3. We generally choose m = 1, 2, or 3 (see Ref. 22 for guidelines). Values of r between 10 and 25% of the standard deviation of the data usually are effective to distinguish data sets, as seen both in theoretical analysis (21) and in clinical heart rate applications (13, 22). Noise in the data much smaller than r is effectively filtered out in the ensuing calculation.

ApEn appears to have many of the characteristics deemed important for effective characterization of episodic hormone release as described by Urban et al. (29). It is objective, simple to use via existing FORTRAN and C-language computer programs, and it produces a single number. It has minimal dependence on the specific type of signal or noise present in the underlying data. The program is versatile; it can be used for any time-series data analysis to compute a measure of regularity. For hormone pulse detection, it is readily adaptable to differences in sampling frequency and duration, assay performance, and signal-to-noise ratios. ApEn has been demonstrated to be very stable to small changes in noise characteristics, infrequent and significant data artifacts, and changes in sampling frequency. ApEn is concordant with visual inspection. It accounts for a variety of dominant and subordinate patterns in data; notably, it will be affected by changes in underlying episodic behavior that are not reflected in changes in peak occurrences or amplitudes. Additionally, ApEn provides a direct barometer of feedback system change in some coupled systems. Thus ApEn might shed insight into interactions among hormones, indicating a source of underlying physiological deviations, such as a breakdown in the normal system feedback process.

[1] As an important linguistic aside, we note that the Ap (approximate) in ApEn is due to the similarity between the entropy and ApEn algorithms, not necessarily the corresponding entropy and ApEn values.

MODEL COMPARISON FRAMEWORK
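As orientation for the comparisons in this section, the ApEn computation can be sketched in a few lines. This is a minimal illustration, not the FORTRAN or C programs mentioned in the text. It assumes the commonly stated definition ApEn(m, r) = Φ^m(r) - Φ^(m+1)(r), where Φ^k(r) is the average logarithm of the fraction of length-k vectors lying within r of each vector under the maximum-component distance; that this matches Eq. A3 of the APPENDIX term for term is an assumption. The names `apen` and `phi`, the series lengths, and the random seed are hypothetical choices made for the example.

```python
import math
import random


def apen(u, m, r):
    """Approximate entropy ApEn(m, r) of a scalar sequence u (illustrative sketch)."""
    n = len(u)

    def phi(k):
        # Reconstructed vectors x(i) = [u(i), ..., u(i + k - 1)].
        x = [u[i:i + k] for i in range(n - k + 1)]
        total = 0.0
        for xi in x:
            # Count vectors within r of xi under the max-component distance.
            # The self-match guarantees count >= 1, so the logarithm is defined.
            count = sum(
                1 for xj in x
                if max(abs(a - b) for a, b in zip(xi, xj)) <= r
            )
            total += math.log(count / len(x))
        return total / len(x)

    return phi(m) - phi(m + 1)


# Series 1: strict alternation of 90 and 70.
series1 = [90, 70] * 100

# Series 2: each term is 90 or 70 with probability 1/2 (seed is arbitrary).
rng = random.Random(0)
series2 = [rng.choice([90, 70]) for _ in range(400)]

# Both series have SD near 10, so r = 2.0 is roughly 20% of the SD.
print("ApEn(series 1):", apen(series1, 2, 2.0))
print("ApEn(series 2):", apen(series2, 2, 2.0))
print("ln 2          :", math.log(2))
```

With m = 2 and r near 20% of the standard deviation, the alternating series should give a value close to 0 and the random series a value close to ln 2, in line with the values quoted for series 1 and series 2. Small departures are expected for finite series lengths. The quadratic-time loop is chosen for readability, not speed.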

Below, we discuss results from ApEn and ULTRA calculations for test data from two models. To calculate ApEn and ULTRA for these data, we must specify certain inputs in each algorithm. For ApEn, we set m = 2 throughout the models and choose r, fixed for each model, to equal ~20% of the standard deviation of a typical data set. This results in choices of r = 0.4 for the Ueda model and r = 0.1 for the Rossler model, consistent with guidelines given by Pincus et al. (22). For ULTRA, we choose three CVs as the threshold for the Ueda model and two CVs as the threshold for the Rossler model. This is consistent with Van Cauter's (30) guidelines (applied to the "predominant" pattern in each instance). To determine concentration ranges and CV values for each range, we work backwards from the noise standard deviation data given in each version of each model. In each version, we are given a model in which Gaussian noise of a fixed standard deviation (SD) is superimposed on all the data to model the inaccuracy of assay measurements. We divide the output concentration ranges for each time series into eight pieces, each with a mean m. For each range, we set the CV to be SD/m.

The output of ApEn is a number, whereas the output of ULTRA is an identified set of peaks of pulses in the data. From each ULTRA output, we calculate the number of pulses, the average and standard deviation of pulse lengths, and the average and standard deviation of pulse amplitudes. For each model, we summarize the runs in a table. Each line in the tables lists the run number, number of data points in the time series, and input model characteristics: parameter choices and standard deviation of superimposed Gaussian noise. The output data include the mean and standard deviation of the time series, ApEn value, number of pulses, and mean and standard deviation for both pulse frequency and pulse amplitude.

UEDA DIFFERENTIAL EQUATION MODEL

Description and rationale. The following formula

$$\ddot{x} + A\dot{x} + x^{3} = B\cos t \qquad (1)$$

is a differential equation that has received considerable attention in recent years, due in great part to studies by Ueda (28) showing that the long-term dynamics of the solution represent steady-state chaotic behavior for parameter values A = 0.05 and B = 7.5. This equation, where the overdots denote differentiation with respect to time t, describes the behavior of the variable x over time; for each time, one can (via numerical methods) calculate the corresponding value of x to deduce a time plot of x as illustrated in Fig. 1A. Equation 1 might be used in mechanical engineering, e.g., to model the motion of a sinusoidally forced structure undergoing large elastic deflections. The solution is bounded, episodic, yet nonperiodic.

Fig. 1. Ueda differential equation model time-series output for three pairs of parameter values: A = 0.05, B = 7.5 (A); A = 0.05, B = 8.5 (B); A = 0.09, B = 7.5 (C).

Here we analyze Eq. 1 for five (A, B) pairs: (0.05, 7.5), (0.05, 8.5), (0.05, 12.0), (0.09, 7.5), and (0.21, 7.5). For each pair, we solve Eq. 1 as a function in time by an explicit time step method, Δt = 0.002. We extract a time series from the solution by sampling every 0.5 t unit. This sampling rate was chosen to yield ~12 data points per episode and generates the baseline series. This is consistent with Yates (38), where samples of at least six times the expected frequency are seen as necessary to deduce periodicities, and with Veldhuis et al. (33), who note the clinical need for intensified sampling rates. We then "postprocess" the solution time series by converting x to x + 6.0 for all data values. This is done to ensure positive values in the range 3.0-9.0 to mimic endocrine data. We add uniform white noise to each baseline value to deduce the final series. For each pair, we analyze two different

lengths of series, 180 and 900 points. For (0.05, 7.5) and (0.05, 8.5), we also analyze the series with 2,000 points.

We analyze this model for two primary reasons. First, it exemplifies a simple system that gives rise to highly nontrivial, putatively pulsatile behavior. Second, it forces us to carefully examine what we mean by pulsatility to ensure that the quantitative tools that we use reasonably correspond to our intuition. The crucial property of the solution to the Ueda equation is that it is episodic but truly nonperiodic. Its recurrent nature is evidenced by the fact that certain patterns in the waveform repeat themselves at irregular intervals, but there is never exact repetition. There is an apparent baseline frequency per episode (pulses), although there is temporal variation of a nonperiodic nature. Furthermore, there are second-order, irregularly varying wiggles in the episodes that are generated by the model itself. Suppose this system were an appropriate model for hormone secretion, with normal secretion given by a model with A and B as stated above fixed, with A = 0.5, and B between 8.5 and 12.0. Could we detect, on the basis of time-series data alone, that certain data came from an "abnormal" system for which A = 0.5 and B = 15.0?

Results. A first conclusion, verified by Table 1 (runs 1-10), is that for noiseless systems, the pulse count is given as one-half the number of sign changes. This property is likely common to many of the current pulse-identification algorithms in which a pulse is flagged as a measured rise and fall, with both the rise and fall indicated by some percentage rise and fall times the noise level. In the absence of noise, any rise is considered a pulse ascent and any fall is considered a pulse descent. So to evaluate ULTRA as a pulse-counting algorithm, it suffices to examine the statistical properties of the algorithm that counts the number of sign changes. This statistic has been extensively examined (25) and provides useful information. It does not, however, utilize any information contained in the magnitudes associated with the sign changes so that a tiny wiggle counts as much as a large wave. An instance of an improved measure is given by the Wilcoxon signed-rank statistic (25), a standard nonparametric statistical test. In this context, ranks would be given to the sign changes, with the largest rank to the greatest sign change. Hence big pulses would "count" more than little pulses, quite possibly a desired characteristic in the goal to distinguish normal from abnormal behavior.

Table 1. Summary of Ueda model runs

Run  Points  K     B     Noise SD  Mean   SD     ApEn   Sign changes  Pulses  Avg freq  SD freq  Avg ampl  SD ampl
1    180     0.05  7.5   0.0       5.808  1.593  0.677  56            28      6.185     2.450    7.368     1.834
2    180     0.05  8.5   0.0       6.401  1.612  0.543  50            25      6.792     3.400    7.771     1.276
3    180     0.09  7.5   0.0       6.074  1.650  0.574  66            33      5.406     1.292    7.363     1.596
4    180     0.05  12.0  0.0       5.992  1.781  0.762  81            40      4.410     1.044    7.406     1.692
5    180     0.21  7.5   0.0       6.174  1.549  0.676  46            23      7.818     3.390    7.624     1.338
6    900     0.05  7.5   0.0       5.973  1.597  0.894  275           137     6.463     2.839    7.476     1.576
7    900     0.05  8.5   0.0       6.533  1.567  0.466  213           105     8.452     4.031    8.028     0.769
8    900     0.09  7.5   0.0       6.034  1.637  0.590  333           166     5.388     1.447    7.335     1.570
9    900     0.05  12.0  0.0       6.068  1.758  1.153  401           200     4.462     1.131    7.531     1.540
10   900     0.21  7.5   0.0       5.908  1.550  0.666  227           112     7.991     3.361    7.635     1.550
11   900     0.05  7.5   0.05      5.971  1.597  0.904  281           117     7.578     3.492    7.731     1.486
12   900     0.05  7.5   0.1       5.968  1.599  0.953  287           104     8.534     3.694    7.925     1.393
13   900     0.05  7.5   0.2       5.963  1.609  1.091  299           97      9.156     3.675    8.098     1.250
14   900     0.05  7.5   0.4       5.953  1.647  1.336  367           84      10.59     3.425    8.417     0.975
15   900     0.05  8.5   0.05      6.530  1.567  0.473  213           77      11.57     2.568    8.311     0.611
16   900     0.05  8.5   0.1       6.528  1.569  0.510  253           76      11.72     2.408    8.320     0.603
17   900     0.05  8.5   0.2       6.523  1.577  0.742  289           78      11.41     2.769    8.305     0.613
18   900     0.05  8.5   0.4       6.513  1.614  1.196  379           77      11.57     2.806    8.479     0.400
19   900     0.09  7.5   0.05      6.031  1.638  0.602  333           165     5.421     1.486    7.332     1.575
20   900     0.09  7.5   0.1       6.029  1.640  0.634  335           155     5.773     1.881    7.371     1.582
21   900     0.09  7.5   0.2       6.024  1.650  0.909  339           121     7.408     3.245    7.789     1.390
22   900     0.09  7.5   0.4       6.014  1.687  1.292  365           92      9.769     3.715    8.340     1.029
23   2,000   0.05  7.5   0.0       6.075  1.597  0.871  588           294     6.782     3.116    7.544     1.469
24   2,000   0.05  8.5   0.0       6.559  1.556  0.443  460           229     8.715     4.077    8.069     0.639

K and B are the Ueda parameters; Noise SD is the input noise SD; Mean and SD are those of the time series. Sign changes through SD ampl are the ULTRA statistics: no. of sign changes, no. of pulses, average and SD of frequency, and average and SD of amplitude. Model is of the form $\ddot{x} + K\dot{x} + x^{3} = B\cos t$. ApEn, approximate entropy.

A central issue for this model is apparent on examination of Fig. 1. Time-series output is shown for three pairs of parameter values: K = 0.05, B = 7.5; K = 0.05, B = 8.5; and K = 0.09, B = 7.5 (Fig. 1, A-C, respectively). These series are apparently different, but quantitative tools to distinguish them are not a priori apparent. These series have a mean of ~6 and a standard deviation of ~1.6. Each series has a "period" of 2π, but no two periods are identical; there are different peak amplitudes, shapes, and subordinate wiggles throughout. Both ApEn and ULTRA distinguish versions of this model, but the results require scrutiny, as they apparently are in disagreement.

We commence by considering runs 1-10, summarized in Table 1. They represent runs for the five K, B pairs specified above for two series lengths, 180 and 900 points. According to ApEn, these versions rank (from most random to least random, in descending order) as (0.05, 12.0), (0.05, 7.5), (0.21, 7.5), (0.09, 7.5), and (0.05, 8.5). This order is maintained for both 180 and 900 points, although several distinctions are sharper for 900 than for 180 points. For this model, 900 points yields good convergence for ApEn; comparing run 6 to run 23 (900 vs. 2,000 points, K = 0.05, B = 7.5), ApEn changes from 0.894 to

0.871. Similarly, comparing run 7 to run 24 (900 vs. 2,000 points, K = 0.05, B = 8.5), ApEn changes from 0.466 to 0.443.

According to ULTRA, these versions rank (from most random to least random) as (0.05, 12.0), (0.09, 7.5), (0.05, 7.5), (0.21, 7.5), and (0.05, 8.5). This order is nearly maintained for both 180 and 900 points, although the last two versions reverse order in the 180- and 900-point cases. Furthermore, with the exception of the (0.05, 8.5) case, a fivefold increase in point count corresponds to virtually a fivefold increase in pulse number. This ratio of pulses to points is maintained in the two 2,000-point runs, hence the 900-point runs are sufficiently long to extract the salient pulse information here.

However, there is an apparent conflict: is (0.05, 7.5) more random (unpatterned) than (0.09, 7.5) or conversely? The Poincare section (18) is a tool to resolve this impasse. First, a phase space plot is generated (for each series), plotting the trajectory of x versus its time derivative, dx/dt. To ensure a sequence of strictly comparable points, the trajectory is marked stroboscopically at times that are an integer multiple of the forcing period 2π. The resulting plot, in the x-dx/dt plane, shows only the

strobed points as the Poincare section. If the motion of the system were strictly periodic with the frequency of the forcing, the strobed points would all be the same point, repeating indefinitely. If the true motion were multiply periodic, one would then see a sequence of n dots, repeated indefinitely. More complicated dynamics are represented by more filled out Poincare section portraits, which correspond to greater ApEn.

Fig. 2. Phase portraits (x, dx/dt plane) for Ueda differential equation model: A = 0.05, B = 7.5 (A); A = 0.05, B = 8.5 (B); A = 0.09, B = 7.5 (C).

We can now resolve the above question by examining Fig. 2. The most complicated dynamics appear to be given by Fig. 2A, next by Fig. 2C, and least by Fig. 2B. This corresponds to a greatest randomness for (0.05, 7.5), then (0.09, 7.5), followed by (0.05, 8.5), the order given by ApEn. Furthermore, the respective ApEn values, 0.894, 0.590, and 0.466, seem to correspond to the intuition that the (0.09, 7.5) case is closer to the (0.05, 8.5) case in randomness than to the (0.05, 7.5) case. The apparent inconsistency in ULTRA is explained by its equal weighting of each of many tiny wiggles and the larger sign changes. The (0.09, 7.5) case has the greatest number of sign changes of the three cases examined in Fig. 1, but these sign changes, particularly the small wiggles, tend to occur near similar locations in each major

"pulse." This is visually expressed by the three areas of darker clustering in Fig. 2C. Greater randomness would be marked by a greater spread of these dark clusters, as in Fig. 2A. This last point reemphasizes the foibles of the sign change algorithm, as opposed to a weighted sign change algorithm.

Runs 11-22 further illustrate the difficulties that these small wiggles pose for ULTRA. For each of the versions (0.05, 7.5), (0.05, 8.5), and (0.09, 7.5), we looked at four different noise levels, standard deviations of 0.05, 0.1, 0.2, and 0.4, corresponding to CVs of ~1, 2, 4, and 8%. In the (0.05, 7.5) case, ULTRA noted 137 pulses at 0 noise compared with 117 pulses at 0.05 noise and 104 pulses at 0.1 noise. This represents a computational loss of ~15% of pulses at 1% CV. In the (0.05, 8.5) case, ULTRA noted 105 pulses at 0 noise compared with 77 pulses in the presence of at least 0.05 noise. These 77 pulses represent, almost solely, the large pulses of approximate duration 2π. Virtually all the small wiggles were effectively ignored in the presence of the noise levels noted above. This represents a computational loss of ~27% of pulses at 1% CV. In the (0.09, 7.5) case, ULTRA behaved more robustly at low noise levels, with 166 pulses at 0 noise, 165 pulses at 0.05 noise, and 155 pulses at 0.1 noise.

ApEn performs more robustly at low noise levels. In the (0.05, 7.5) case, ApEn is 0.894 at 0 noise, 0.904 at 0.05 noise, and 0.953 at 0.1 noise. In the (0.05, 8.5) case, ApEn is 0.466 at 0 noise, 0.473 at 0.05 noise, and 0.510 at 0.1 noise. In the (0.09, 7.5) case, ApEn is 0.590 at 0 noise, 0.602 at 0.05 noise, and 0.634 at 0.1 noise. These all represent about a 1-2% change at 1% CV and a 7-9% change at 2% CV. At each noise level ApEn maintains the order of randomness of these versions, although system distinction is much less marked at 0.4 noise level (Fig. 3) at which ApEn values are 1.336 for the (0.05, 7.5) case, 1.196 for the (0.05, 8.5) case, and 1.292 for the (0.09, 7.5) case. ULTRA also maintains its order of ranking these versions, with pulse counts of 84, 77, and 92 in the same three cases at 0.4 noise standard deviation. It is not surprising that the distinctions among the versions are muddied at this noise level; some of the small wiggles in the base physiological cases are accentuated, some are eliminated, and some new small wiggles emerge with 0.4 level noise.

From analysis of this model, one deduces that in the presence of noise, ULTRA tends to smooth out the time series data, in effect eliminating some small wiggles in the process. In some contexts that may be desirable, but in instances such as this model, in which numerous small, subordinate pulses are present, ULTRA is discarding physiological information.

Fig. 3. Ueda differential equation model time series output, Gaussian noise superimposed, mean intensity 0.4: A = 0.05, B = 7.5 (A); A = 0.05, B = 8.5 (B); A = 0.09, B = 7.5 (C).

ROSSLER FEEDBACK MODEL

Description and rationale. This is a coupled system of three variables, represented by three ordinary differential equations. We consider this as a putative model for the male reproductive endocrine system, with variables the pituitary portal concentration of LHRH and the serum concentrations of luteinizing hormone (LH) and testosterone (T). These concentrations are modeled by a coupled feedback system: the LHRH secretion rate is

given as a function of the local concentrations of LH and serum T. The LH secretion rate is given as a function of the concentration of LHRH plus a rate proportional to its own concentration. The T secretion rate is given as a rate proportional to its own concentration plus a term proportional to the product of the LHRH and T levels. We represent this feedback system as follows, with K to be specified

\[
\begin{aligned}
\frac{d\,\mathrm{LHRH}}{dt} &= -(\mathrm{LH} + \mathrm{T}) \\
\frac{d\,\mathrm{LH}}{dt} &= \mathrm{LHRH} + 0.2\,\mathrm{LH} \\
\frac{d\,\mathrm{T}}{dt} &= 0.2 + K[(\mathrm{LHRH})\,\mathrm{T} - 5\,\mathrm{T}]
\end{aligned}
\tag{2}
\]

For each time and each value of K, we calculate the corresponding concentration levels by an explicit time step method, Δt = 0.005. We extract a time series from the solution by sampling every 0.5 t unit. For suitable choices of K, the solutions have many of the qualitative features seen in clinical endocrine data. Here, each version is defined by a choice for K. Changes in K can be thought to mirror the intensity of interaction between T and LHRH levels. We analyze this system for coupling levels K = 0.4, 0.7, 0.8, 0.9, and 1.0. All this is done in a posttransient setting in which we omit the first 90 t units from consideration. We then postprocess the solution time series as follows to ensure positive values: we convert LHRH to 0.1 LHRH + 3.0, LH to 0.1 LH + 3.0, and T to 0.1 T + 3.0. We add white noise to each baseline value for each of LHRH, LH, and T to deduce the time-series solution to our coupled system. For each noiseless version, we analyze two different lengths of series, 180 and 900 points. For the 900-point series, we analyze three different versions of the model, K = 0.7, 0.8, and 0.9, each under four different noise levels, noise standard deviations of 0.02, 0.05, 0.1, and 0.2. For K = 0.8 and K = 1.0 we also analyze the series with 2,000 points.

This model is similar to one examined by Rossler (23) as an example of a system that produced chaotic behavior for certain parameter values of K. It is also thematically similar to models by Smith and co-workers (1, 26), which are meant to plausibly model the male endocrine system and are shown to capture some of the essential physiological dynamics of the true reproductive system. We analyze the Rossler model, rather than the Smith model, for pedagogic reasons: distinctions among versions are sharper for the Rossler model than for the Smith model, although qualitatively quite similar. In any case, we analyze this model for some of the reasons given by Smith. Relatively simple versions of this system can explain a number of possible qualitative modes of hormonal dynamics: serum concentrations that are constant in time, periodic in time, or chaotic in time. Most importantly, different behavioral modes can result solely from changes in defining system parameters, or internal interactions among the system subcomponents, and need not be produced by an external driving force. For example, the onset of puberty, in one version of Smith’s model, is seen to be generated simply by an appropriate change in certain system parameters, without an external switch or component entering into the fray.

Furthermore, the Rossler model is substantially different from the Ueda model: it is a function of several variables and is an explicit feedback system. Thus we can query if either ApEn or ULTRA can detect changes in the feedback (coupling) rate, as seen by varying K. The model that we analyze here was chosen to give either periodic, multiply periodic, or chaotic output for the behavior of LH with time, depending on K.
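The construction of a model time series described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used for the runs reported here: the text specifies an explicit time step method with a step of 0.005 but not which method, so forward Euler is assumed; the initial state (1, 1, 1), the function names, and the seeded random generator are likewise assumptions introduced for the example.

```python
import math
import random

def simulate_rossler(K, n_points=180, dt=0.005, sample_every=0.5,
                     transient=90.0, state=(1.0, 1.0, 1.0)):
    """Integrate Eq. 2 by forward Euler (assumed method) and return
    n_points rescaled (LHRH, LH, T) samples taken after the transient."""
    lhrh, lh, t = state                      # assumed initial condition
    steps_per_sample = round(sample_every / dt)
    transient_steps = round(transient / dt)
    total_steps = transient_steps + n_points * steps_per_sample
    samples = []
    for step in range(1, total_steps + 1):
        # Right-hand sides of Eq. 2, all evaluated at the current state.
        d_lhrh = -(lh + t)
        d_lh = lhrh + 0.2 * lh
        d_t = 0.2 + K * (lhrh * t - 5.0 * t)
        lhrh += dt * d_lhrh
        lh += dt * d_lh
        t += dt * d_t
        past_transient = step - transient_steps
        if past_transient > 0 and past_transient % steps_per_sample == 0:
            # Rescaling from the text: x -> 0.1 x + 3.0 for each variable.
            samples.append((0.1 * lhrh + 3.0, 0.1 * lh + 3.0, 0.1 * t + 3.0))
    return samples

def add_noise(series, sd, seed=0):
    """Superimpose Gaussian white noise of standard deviation sd."""
    rng = random.Random(seed)                # seed is an assumption
    return [v + rng.gauss(0.0, sd) for v in series]

if __name__ == "__main__":
    run = simulate_rossler(K=0.8)
    lh_series = [lh for _, lh, _ in run]
    noisy_lh = add_noise(lh_series, sd=0.05)
    print(len(noisy_lh), min(lh_series), max(lh_series))
```

A higher-order integrator could be substituted for the Euler step without changing the rest of the sketch; for a chaotic version of the model the individual sample values will then differ, although the qualitative behavior should not.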
In general, with increasing K, we see increasing system complexity: the LH behavior evolves from periodic to multiply periodic to chaotic.

Results. Figure 4 illustrates time-series output for five coupling parameters, K = 0.4, 0.7, 0.8, 0.9, and 1.0, in a noiseless environment. We see virtually identical pulse counts in each of these systems. For K = 0.4, the system is strictly periodic, whereas for K = 0.7, the system is “twice periodic,” with a higher pulse always followed by a smaller pulse. The system is “four-times periodic” for K = 0.8 (high-low-highest-lowest) and chaotic for K = 0.9 and K = 1.0. In these last two instances, no pattern of multiple pulses forms a fundamental period of its own. The increase in system complexity with increasing K is further confirmed by phase space plots, shown in Fig. 5, A-E, the parts of which correspond to their counterparts in Fig. 4. Phase space plots serve a purpose similar to that of Poincare sections: to capture complexity geometrically via an appropriate perspective on the data. In Fig. 5, we plot the trajectory of LHRH vs. LH so that each point represents a single LHRH-LH pair of values at a fixed instant. Increased complexity manifests itself in more complicated phase space portraits, seen here with increasing K.
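The distinction drawn above between singly, twice, and four-times periodic output concerns the pattern of successive pulse heights rather than the pulse count. A minimal Python sketch of that idea follows. It is not ULTRA and not the authors' procedure: it is a plain sign-change peak picker on first differences, and the function names, the tolerance `tol`, and the bound `max_period` are assumptions chosen for illustration.

```python
def peak_amplitudes(series):
    """Values at local maxima, i.e., where the first difference
    changes sign from positive to nonpositive."""
    peaks = []
    for i in range(1, len(series) - 1):
        rising = series[i] - series[i - 1] > 0
        falling = series[i + 1] - series[i] <= 0
        if rising and falling:
            peaks.append(series[i])
    return peaks

def peak_pattern_period(peaks, tol=1e-3, max_period=8):
    """Smallest p such that every peak matches the peak p pulses later
    (within tol), or None if no p up to max_period fits."""
    for p in range(1, max_period + 1):
        if len(peaks) <= p:
            break
        if all(abs(peaks[i] - peaks[i + p]) <= tol
               for i in range(len(peaks) - p)):
            return p
    return None

if __name__ == "__main__":
    # Synthetic high-low alternation, analogous to the "twice periodic" case.
    twice = [0, 2, 0, 1, 0, 2, 0, 1, 0, 2, 0, 1, 0]
    print(peak_pattern_period(peak_amplitudes(twice)))
```

Such a check is fragile in the presence of noise, since every small wiggle produces a sign change; that fragility is one reason a regularity statistic that uses all the data is of interest.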

Fig. 4. Rossler coupled differential equation model, luteinizing hormone (LH) time-series output for different parameter values: K = 0.4 (A); K = 0.7 (B); K = 0.8 (C); K = 0.9 (D); K = 1.0 (E). Horizontal axis: time.

Fig. 5. Rossler coupled differential equation model, LH-releasing hormone (LHRH)-LH phase portraits for different parameter values: K = 0.4 (A); K = 0.7 (B); K = 0.8 (C); K = 0.9 (D); K = 1.0 (E). Horizontal axis: LHRH.

If the motion of the LH system were singly periodic, the portrait would be a simple closed curve, as in Fig. 5A. Multiple periodicity is seen by multiple loops in a closed curve. Figure 5, B and C, illustrates this, with two- and four-times periodic behavior, respectively. Figure 5, D and E, does not represent closed curves and illustrates chaotic behavior. Fine system structure in these versions is apparent with phase space portraits produced from much longer time-series input than we have considered.

We consider ULTRA’s evaluation of the respective noiseless model versions in runs 1-10 (Table 2). Runs 1-5 and 6-10 are 180 and 900 points long, respectively, with each set of five runs arranged in increasing K. Runs 1-5 give either 15 or 16 pulses for each series, and runs 6-10 give between 77 and 82 pulses for each series, indicating little version distinction based on pulse count. For the other statistics there is a distinct difference between the K = 0.4 case and the other four versions, all of which produce quite similar values. In runs 7-10, the average frequency ranges from 11.28 to 11.63, the standard deviation of the frequency ranges from 0.636 to 0.650, the average amplitude ranges from 3.50 to 3.56, and the standard deviation of the amplitude ranges from 0.130 to 0.175. Only the last of these statistics, the amplitude standard deviation, shows any spread among the four versions, and even for this statistic, the lowest value is achieved for K = 0.9, with both K = 0.8 and K = 1.0 versions being slightly higher. This conflicts with intuition, which suggests that a lowest value for each of these statistics should be for either the least or the most complex system.

For runs 1-5, in increasing K, ApEn values are 0.165, 0.266, 0.442, 0.472, and 0.489, monotonically increasing with K. For the 900-point runs (runs 6-10), corresponding ApEn values are 0.165, 0.262, 0.431, 0.495, and 0.510,

again steadily increasing with K. With these longer runs, distinction is sharper between the K = 0.8 case and the K = 0.9 and K = 1.0 cases. Furthermore, ApEn is (slightly) larger for the K = 1.0 case than for the K = 0.9 case, establishing system distinction despite the presence of chaos in both instances. It also appears that the ApEn values remain nearly constant for run lengths greater than 900 points, as indicated by runs 23 and 24. For K = 0.8, no noise, ApEn = 0.431 with 900 points, whereas ApEn = 0.430 with 2,000 points; for K = 1.0, no noise, ApEn = 0.510 with 900 points, whereas ApEn = 0.505 with 2,000 points. Hence for this model, 1) ApEn distinguishes all the versions from each other; 2) ApEn, via monotonic increase, directly verifies the growing complexity and increased feedback with increased K; and 3) this model establishes 1 and 2 with no more than 180 points necessary.

Runs 11-22 indicate the effects of noise on the ULTRA and ApEn computations. For each of the three versions, K = 0.7, K = 0.8, and K = 0.9, we looked at four different noise levels, standard deviations of 0.02, 0.05, 0.1, and 0.2, corresponding to CVs of ~1, 2, 4, and 8%, respectively, for 900-point time series. ULTRA maintained its pulse count of ~80 total pulses throughout these runs, increasing slightly with noise level of 0.2 to 83, 83, and 84 pulses for the K = 0.7, 0.8, and 0.9 cases, respectively. As above, this provided little distinction among these three systems. At 0.02 noise, ApEn maintained increasing order with complexity (0.323 vs. 0.503 vs. 0.565); it was similar for the 0.05 and 0.1 noise levels (0.633 vs. 0.761 vs. 0.822, 1.112 vs. 1.167 vs. 1.187). With the 0.1 noise level, the system distinctions were becoming blurred, and with 0.2 noise, the system distinctions were obliterated, especially in the K = 0.8 vs. K = 0.9 cases (1.453 vs. 1.544 vs. 1.503), in which complexity is slightly reversed (due to “realization” and finite sample size issues). This blurring is evident in Fig. 6, which compares the K = 0.8 and K = 0.9 cases at 0.2 noise. Note the comparisons to the analogous Fig. 5, C and D; the four-times periodicity for K = 0.8 is not even hinted at in Fig. 6A, and the central region in phase space previously empty is partially covered by scattered points.

Figure 7 compares the LHRH, LH, and T time series from the K = 1.0 version of this model and raises an important issue. The LHRH and LH time series are visually similar; both have 16 pulses, similar amplitudes, and general pulse characteristics. One could expect that these hormones belonged to a single autonomous system. The behavior of T, however, is visually discordant with the behavior of LHRH and LH; there are 12 pulses, long stretches of flat tracings, spiked pulses, and three pulses that are much greater in amplitude than the others. We thus conclude that dissimilar pulsatile characteristics of hormonal plasma concentrations do not eliminate the possibility that the hormones may be derived from a single system, with no external influences.

Table 2. Summary of Rossler model runs

Run   No. of   K     Input      Mean    SD      ApEn    No. of    No. of   Average     SD          Average     SD
No.   points         noise SD                           sign      pulses   frequency   frequency   amplitude   amplitude
                                                        changes
 1      180    0.4   0.0        2.774   0.692   0.165     32        16     10.93       0.258       3.764       0.015
 2      180    0.7   0.0        2.871   0.529   0.266     32        16     11.26       0.775       3.540       0.227
 3      180    0.8   0.0        2.882   0.499   0.442     31        15     11.29       0.727       3.512       0.187
 4      180    0.9   0.0        2.902   0.476   0.472     30        15     11.50       0.650       3.513       0.142
 5      180    1.0   0.0        2.918   0.453   0.489     30        15     11.64       0.497       3.498       0.138
 6      900    0.4   0.0        2.772   0.689   0.165    164        82     10.91       0.283       3.765       0.014
 7      900    0.7   0.0        2.872   0.527   0.262    159        79     11.28       0.643       3.560       0.175
 8      900    0.8   0.0        2.887   0.495   0.431    157        78     11.42       0.636       3.535       0.160
 9      900    0.9   0.0        2.897   0.475   0.495    155        77     11.55       0.641       3.528       0.130
10      900    1.0   0.0        2.910   0.454   0.510    154        77     11.63       0.650       3.500       0.139
11      900    0.7   0.02       2.872   0.529   0.323    159        79     11.28       0.662       3.560       0.178
12      900    0.7   0.05       2.872   0.532   0.633    161        79     11.28       0.804       3.563       0.184
13      900    0.7   0.2        2.874   0.571   1.453    354        83     10.72       2.251       3.650       0.242
14      900    0.7   0.1        2.873   0.540   1.112    227        79     11.27       1.101       3.587       0.196
15      900    0.8   0.02       2.887   0.495   0.503    157        78     11.42       0.695       3.537       0.162
16      900    0.8   0.05       2.888   0.497   0.761    161        78     11.42       0.950       3.548       0.168
17      900    0.8   0.2        2.890   0.535   1.544    365        83     10.72       2.229       3.636       0.269
18      900    0.8   0.1        2.888   0.505   1.167    213        79     11.27       1.429       3.583       0.178
19      900    0.9   0.02       2.897   0.474   0.565    155        77     11.57       0.680       3.527       0.132
20      900    0.9   0.05       2.898   0.475   0.822    167        77     11.57       0.854       3.533       0.141
21      900    0.9   0.2        2.900   0.505   1.503    395        84     10.59       2.833       3.606       0.273
22      900    0.9   0.1        2.898   0.478   1.187    247        80     11.13       1.957       3.544       0.191
23    2,000    0.8   0.0        2.888   0.495   0.430    349       174     11.43       0.639       3.537       0.156
24    2,000    1.0   0.0        2.908   0.453   0.505    343       171     11.62       0.652       3.498       0.143

Mean and SD refer to the time series; number of pulses and the frequency and amplitude statistics are ULTRA statistics. Model is of the form dx/dt = -(y + z), dy/dt = x + 0.2y, dz/dt = 0.2 + K(zx - 5z).

DISCUSSION AND GENERAL CONCLUSIONS

We infer some general conclusions from the above runs. ApEn and ULTRA provide different and complementary information from the data. ULTRA gives a first-order measure of the pulsatility of the system via the pulse count and related statistics. It can be applied to data with 10% CV with 180 data points, typical values for current studies. For some systems, such as those defined by the Ueda and Rossler models above, ULTRA will be relatively ineffective at distinguishing distinct versions of the systems and may possibly give counterintuitive results. Subordinate pulses create difficulties for ULTRA, as do models in which pulse timing is reasonably constant, where the variation is in the patterned versus random behavior of the respective pulse amplitudes. The first-order, as opposed to finely tuned, behavior of ULTRA is further evidenced by the observation that in noiseless systems, ULTRA is statistically equivalent to a sign change identifying algorithm. This algorithm was noted earlier to be useful but to lack the greater versatility that appropriately weighted versions maintain.

In contrast, at low intra-assay noise levels, with the stated input parameters, ApEn effectively distinguished all the distinct versions of each model from one another. In directly assessing the regularity of the data, ApEn distinguished between versions of episodic behavior, as well as between episodic versus more random behavior. By considering all the time-series data, not just the data that make up the pulse acmes, ApEn evaluates subordinate behavior. There is a significant increase in ApEn with increasing CV, although it is still possible to compare systems with identical intra-assay CVs, even as high as 8%, via ApEn to discern system distinction. Such analyses produce ApEn values that are much larger than the corresponding values in noiseless systems; in a few cases, systems that are distinguished by ApEn at low CV are no longer distinguished at 8% CV.

From the Ueda model, we see that there may be important regularity information in time-series data that can be effectively extracted only in the presence of a small intra-assay CV. For such purposes, ApEn is well suited, with a finer focus than that of the pulse-detection algorithms currently employed. The required decrease in intra-assay CV from current levels is consistent with the direction in which endocrinologists are actively moving.

To validate the above claim of effective distinction of model versions by ApEn, we need an estimate of ApEn standard deviation. We now provide this (Monte Carlo estimates, 100 replications per computed standard deviation) for two quite different processes: the MIX(P) model introduced in Ref. 21 and a paradigm for chaos, the parametrized logistic map, $f_a(x) = ax(1 - x)$, 3.5 < a < 4.0. We first define MIX(P): fix 0 ≤ P ≤ 1. Define $X_j = \sqrt{2}\,\sin(2\pi j/12)$ for all j; $Y_j$ = IID (independent, identically distributed) uniform random variables on $[-\sqrt{3}, \sqrt{3}]$; and $Z_j$ = IID random variables, $Z_j = 1$ with probability P, $Z_j = 0$ with probability 1 - P. Then define $\mathrm{MIX}(P)_j = (1 - Z_j)X_j + Z_j Y_j$. This is a family of stochastic processes that samples a sine wave for P = 0, is IID uniform for P = 1, and intuitively becomes more “random” as P increases. For m = 2, r = 20% of the process standard deviation, and 900 points, the standard deviation of ApEn [MIX(P)], calculated for each of 40

values of P equally spaced between 0 and 1, is <0.055 for all P. For 180 points, ApEn (same m and r) standard deviation is <0.07 for all P. For the logistic map, the “randomization” needed to make this deterministic map fit a Monte Carlo scenario is given by different choices for the initial condition. For m = 2, r = 20% of the process standard deviation, and 900 points, the standard deviation of ApEn [$f_a(x)$], calculated for each of 50 values of a equally spaced between 3.5 and 4.0, is <0.015 for all a. For 180 points, ApEn (same m and r) standard deviation is <0.035 for all a. Thus ApEn values of a = 1.1 and b = 0.9 would have very high probability of coming from differing processes for either of these two model classes. The MIX process computation is appealing in that the process is nearly IID (uncorrelated iterates) for P near 1. Because a larger ApEn standard deviation generally corresponds to more uncorrelated processes, we can expect that the standard deviation bounds for ApEn for MIX(P) will provide bounds for a large class of deterministic and stochastic processes.

Fig. 6. Rossler coupled differential equation model, LHRH-LH phase portraits, Gaussian noise superimposed, mean intensity 0.2: K = 0.8 (A); K = 0.9 (B). Horizontal axis: LHRH.

Given the ApEn sensitivity to intra-assay CV, we must note several caveats to ensure appropriate application of this algorithm. If the same process is analyzed in two different laboratories, one with 2% CV and the other with 8% CV, the ApEn values can be significantly different.

Also, if the same process is analyzed under two very different sampling regimens (e.g., samplings every 5 vs. every 20 min), ApEn values can be quite different; in effect, the relative noise levels can be dissimilar. Thus until CVs and other noise levels that vary from system to system are markedly reduced from present values, we recommend that the comparison of ApEn values be restricted to data sets produced from similar settings (e.g., same laboratory and sampling frequency), which should ensure a relatively constant CV across samples. The comparisons done for the two models above, at a fixed CV level, model such a “homogeneously noisy” environment and, as already noted, show valid ApEn distinction, given CVs at presently observed levels. Along the same lines, it is critical to distinguish the comparison of ApEn (with fixed m and r) values for two data sets, given N data points, from the question of convergence of ApEn for a specific system. The results from the two models analyzed above indicate that ApEn typically needed on the order of 900 points for convergence. In comparing systems with 180 data samples, ApEn distinguished most systems that it distinguished with 900 points, occasionally less sharply. Thus we suggest that a fixed sample length be used for all data sets under study.

Fig. 7. Rossler coupled differential equation model, comparison of time series for K = 1.0: LHRH data (A); LH data (B); testosterone data (C). Horizontal axis: time.

The models analyzed above were chosen to illustrate different types of physiologically plausible behavior, and although there was no substantial effort to model a particular endocrine system, it would seem likely that a true endocrine system would be at least as mathematically complex as either of these models. Thus it is imperative that statistics meant to evaluate pulses generated by true endocrine system hormones be capable of effective discrimination of versions of the above models.

A key observation from these models is that nonlinear systems can produce highly nontrivial, episodic, yet nonperiodic output behavior from equations that are simple in appearance. Output that appears as a sequence of identical sine wave-like pulses is usually associated with uncoupled linear systems. Such linear systems have been extensively studied because they readily yield exact analytic mathematical solutions. There is no a priori reason to anticipate that true endocrine systems be either linear or devoid of feedback. Hence we must consider the likelihood that episodicity (no exactly repeating patterns) is physiologically normative.

In addition to those considered above, we could have analyzed stochastic models such as Markov processes and networks of queues. We would anticipate similar qualitative conclusions to those realized herein, and it may be worthwhile to ascertain ApEn and ULTRA performance for such models in a separate context.

In complex systems of glands and hormones, a direct barometer of feedback or interaction between systems would likely be insightful. Either a breakdown in or an excessive amount of feedback may mark the onset of disease, and an algorithm that could directly mark such a change in feedback has added value. For the Rossler model, as we increased the coupling parameter K, ApEn steadily increased, thus providing a direct measure of increasing system complexity. In general, ApEn appears to increase with greater system coupling and greater attendant complexity. Although coupled systems currently must be individually analyzed to ensure this increase of ApEn with feedback parameter, this property holds significant potential utility in practical applications.

Above, we indicated potential near-term applicability by observing that with 180 points, or with 8% CV, ApEn still was useful in drawing distinctions between most model versions. On the basis of a preliminary mathematical development, it also appears that a randomized version of ApEn could be applied to hormone level data. This randomized version of ApEn has the advantage that it can be coupled with bootstrapping methods (6) to yield a statistic that distinguishes data sets of 100 points with high probability (via a small variance) in the presence of nontrivial noise. Hence greater applicability of ApEn to hormone level data can be achieved both by more accurate and numerous clinical data and by statistical advances outside the clinical setting.

We have not investigated the possibility of smoothing raw endocrine data prior to ApEn or ULTRA analysis. As Yates (38) has noted, prefiltering techniques such as the Butterworth digital filter and the linear mean-squared estimator are underutilized in endocrine research. Investigations should be made of the effects that smoothers have on ApEn calculations.

In summary, we have examined the potential use of a recently developed statistic, ApEn, to quantify regularity in endocrine hormone data.
We have found substantial promise for new insights in the detection of abnormal behavior, especially given modest increases in the number of data samples and in the accuracy of the serum concentration level of each sampling.

APPENDIX: ApEn DEFINITION

Step 1. Form a time series of data $u(1), u(2), \ldots, u(N)$, given N raw data values from measurements equally spaced in time.

Step 2. Fix m, an integer, and r, a positive real number. The value of m represents the length of compared runs (a window), and r effectively represents a filter.

Step 3. Form a sequence of vectors $x(1), x(2), \ldots, x(N - m + 1)$ in $\mathbb{R}^m$, real m-dimensional space, by $x(i) = [u(i), \ldots, u(i + m - 1)]$.

Step 4. Use the sequence $x(1), x(2), \ldots, x(N - m + 1)$ to construct, for each i, $1 \le i \le N - m + 1$,

\[
C_i^m(r) = \frac{\text{number of } x(j) \text{ such that } d[x(i), x(j)] \le r}{N - m + 1}.
\]

We must define $d[x(i), x(j)]$ for vectors x(i) and x(j). We follow Ref. 27 (the Takens modification of the formula as given in Ref. 12) by defining

\[
d[x(i), x(j)] = \max_{k = 1, 2, \ldots, m} \left| u(i + k - 1) - u(j + k - 1) \right|.
\]

d represents the distance between the vectors x(i) and x(j), given by the maximum difference in their respective scalar components.

Step 5. Next, define

\[
\Phi^m(r) = (N - m + 1)^{-1} \sum_{i=1}^{N - m + 1} \ln C_i^m(r).
\]

Through Step 5, the K-S entropy and ApEn algorithms are identical. As an important observation that provides insight into these two algorithms, on unravelling definitions, we deduce that

\[
\Phi^{m+1}(r) - \Phi^m(r) = \text{average over } i \text{ of } \ln\bigl[\text{probability that } |u(j + m) - u(i + m)| \le r \text{ given that } |u(j + k) - u(i + k)| \le r \text{ for } k = 0, 1, \ldots, m - 1\bigr]
\tag{A1}
\]

The next step distinguishes between K-S entropy and ApEn.

Step 6 (K-S). For K-S entropy

\[
\lim_{r \to 0} \, \lim_{m \to \infty} \, \lim_{N \to \infty} \left[ \Phi^m(r) - \Phi^{m+1}(r) \right] = \Delta t \, h(p)
\tag{A2}
\]

where h(p) is the K-S entropy. In the absence of a natural system time unit, choose Δt = 1. This is a formula, rather than an algorithm, since we have not indicated an explicit procedure to take the limits, nor guaranteed the existence of a converged limit. For ApEn, we can explicitly complete the algorithm.

Step 6 (ApEn). We define approximate entropy by

\[
\mathrm{ApEn} = \Phi^m(r) - \Phi^{m+1}(r)
\tag{A3}
\]

for m and r fixed as in step 2.

Address for reprint requests: S. M. Pincus, 990 Moose Hill Rd., Guilford, CT 06437.

Received 23 April 1990; accepted in final form 8 November 1991.
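Steps 1-6 of the ApEn definition translate directly into code. The Python sketch below is a straightforward, unoptimized reading of those steps and is offered as an illustration under stated assumptions, not as the implementation used for the runs in this paper. The function names, the default of r as 20% of the series standard deviation computed with the population formula, and the seeded generator are assumptions introduced here; the MIX(P) generator follows the definition given in the Discussion.

```python
import math
import random
import statistics

def phi(u, m, r):
    """Step 5: average over i of ln C_i^m(r)."""
    n = len(u) - m + 1                       # number of vectors x(i)
    total = 0.0
    for i in range(n):
        count = 0
        for j in range(n):
            # Step 4 distance: maximum componentwise difference.
            dist = max(abs(u[i + k] - u[j + k]) for k in range(m))
            if dist <= r:
                count += 1                   # includes the self-match j = i
        total += math.log(count / n)
    return total / n

def apen(u, m=2, r=None):
    """Step 6 (ApEn): Phi^m(r) - Phi^(m+1)(r) for fixed m and r."""
    if r is None:
        r = 0.2 * statistics.pstdev(u)       # assumed convention for r
    return phi(u, m, r) - phi(u, m + 1, r)

def mix(P, n, rng):
    """MIX(P): sine sample with probability 1 - P, uniform noise with probability P."""
    out = []
    for j in range(1, n + 1):
        x = math.sqrt(2.0) * math.sin(2.0 * math.pi * j / 12.0)
        y = rng.uniform(-math.sqrt(3.0), math.sqrt(3.0))
        z = 1 if rng.random() < P else 0
        out.append((1 - z) * x + z * y)
    return out

if __name__ == "__main__":
    rng = random.Random(0)                   # seed is an assumption
    for P in (0.0, 0.5, 1.0):
        print(P, round(apen(mix(P, 180, rng)), 3))
```

The double loop costs on the order of N² comparisons per call, which is acceptable for series of 180 to 900 points; repeating the call over independently generated series gives a Monte Carlo estimate of the ApEn standard deviation of the kind quoted in the Discussion.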

1983.

Kaplan, D. T., M. I. Furman, S. M. Pincus, S. M. Ryan, L. A. Lipsitz, and A. L. Goldberger. Aging and the complexity of cardiovascular dynamics. Biophys. J. 59: 945-949, 1991. 14. Kolmogorov, A. N. A new metric invariant of transient dynamical systems and automorphisms in Lebesgue spaces. Dokl. Akad.

13.

-- Nuuk 15*Lang,

SSSR

23.

335-345,

1991.

0. E. An equation for continuous chaos. Phys. Lett. A

1. Abraham, R. H., H. Kocak, and W. R. Smith. Chaos and intermittency in an endocrine system model. In: Chaos, Frczctals, and Dynamics, edited by P. Fischer and W. R. Smith. New York: Dekker, 1985, p. 33-70. 2. Cetek, N. S., and S. S. C. Yen. Concomitant pulsatile release of prolactin and luteinizing hormone in hypogonadal women. J. Endocrinol.

Metab.

56: 1313-1315,

1983.

Clifton, D. K., and R. A. Steiner. Cycle detection: a technique for estimating the frequency and amplitude of episodic fluctuations in blood hormone and substrate concentrations. Endocrinology 112: 1057-1064,

1983.

4.

Crowley, W. F., M. Filicori, D. Spratt, and N. Santoro. The physiology of gonadotropin-releasing hormone (GnRH) secretion in men and women. Recent Prog. Horm. Res. 41:

5.

Crutchfield, J. P., and N. H. Packard. Symbolic dynamics of one-dimensional maps: entropies, finite precursor, and noise. Int.

471-531,

J. Theor.

1976.

Santen, R. J., and C. W. Bardin. Episodic luteinizing hormone secretion in man. Pulse analysis, clinical interpretation, physiologic mechanisms. J. Clin. Inuest. 52: 2617-2628, 1973. 25 Sen, P. K. Signed-rank statistics. In: Encyclopedia of Statistical ’ Sciences, edited by S. Kotz and N. L. Johnson. New York: Wiley, 1988, vol. 8, p. 461-466. 26 Smith, W. R. Qualitative mathematical models of endocrine systems. Am. J. Physiol. 245 (Regulatory Integrative Comp. Physiol. 14): R473-R477, 1983. 27 Takens, F. Invariants related to dimension and entropy. In: Atas do 13. Col. brasiliero de Matematicas. Rio de Janerio: 1983. 28. Ueda, Y. Steady motions exhibited by Duffing’s equation: a picture book of regular and chaotic motions. In: New Approaches to Nonlinear Problems in Dynamics, edited by P. J. Holmes. Philadelphia, PA: SIAM, 1980, p. 311-322. 29. Urban, R. J., W. S. Evans, A. D. Rogol, D. L. Kaiser, M. L. Johnson, and J. D. Veldhuis. Contemporary aspects of discrete peak-detection algorithms. I. The paradigm of the luteinizing hormone pulse signal in men. Endocr. Rev. 9: 3-37, 1988. 30. Van Cauter, E. Quantitative methods for the analysis of circadian and episodic hormone fluctuations. In: Human Pituitary Hormones: Circadian and Episodic Variations, edited by E. Van Cauter and G. Copinschi. The Hague: Nijhoff, 1981, p. l-25. E. W., R. Leclercq, L. Vanhaelst, and J. 31. Van Cauter, Golstein. Simultaneous study of cortisol and TSH daily variations in normal subjects and patients with hyperadrenalcorticism. 24.

REFERENCES

3.

1958.

Rossler, 57: 397-398,

Clin.

119: 861-864,

D. A., D. R. Matthews, and R. C. Turner. Brief, irregular oscillations of basal plasma insulin and glucose concentrations in diabetic men. Diabetes 30: 435-439, 1981. D. R., B. A. Naylor, R. G. Jones, G. H. Ward, 16. Matthews, and R. C. Turner. Pulsatile insulin has greater hypoglycemic effect than continuous delivery. Diabetes 32: 617-621, 1983. G. R., and K. W. Wachter. Algorithms for the 17. Merriam, study of episodic hormone secretion. Am. J. Physiol. 243 (Endocrinol. Metab. 6): E310-E318, 1982. New York: Wiley, 1987, p. 47-56. 18. Moon, F. C. Chaotic Vibrations. Pulsatile secretion of human 19. Odell, W. D., and J. Griffen. chorionic gonadotropin in normal adults. N. EngZ. J. Med. 317: 1688-1691, 1987. 20. Oerter, K. E., V. Guardabasso, and D. Rodbard. Detection and characterization of peaks and estimation of instantaneous secretory rate for episodic pulsatile hormone secretion. Comput. Biomed. Res. 19: 170-191, 1986. 21. Pincus, S. M. Approximate entropy as a measure of system complexity. Proc. Natl. Acad. Sci. USA 88: 2297-2301, 1991. A 22. Pincus, S. M., I. M. Gladstone, and R. A. Ehrenkranz. regularity statistic for medical data analysis. J. C&n. Monit. 7:

1985.

Phys.

21: 433-465,

1982.

6. Efron, B. The Jackknife, the Bootstrap, and Other Resampling Plans. Philadelphia, PA: SIAM, 1982, p. 27-36.
7. Filicori, M., J. P. Butler, and W. F. Crowley. Neuroendocrine regulation of the corpus luteum in the human: evidence for pulsatile progesterone secretion. J. Clin. Invest. 73: 1638-1647, 1984.
8. Finkelstein, J. W., H. P. Roffwarg, R. M. Boyar, J. Kream, and L. Hellman. Age-related changes in the 24-hour spontaneous growth hormone. J. Clin. Endocrinol. Metab. 35: 665-670, 1972.
9. Fraser, R., J. J. Brown, and A. F. Lever. Control of aldosterone secretion. Clin. Sci. Lond. 56: 389-399, 1979.
10. Fuchs, A. R., K. Goeschen, and P. Husslein. Oxytocin and the initiation of human parturition. III. Plasma concentration of oxytocin and 13,14-dihydro-15 keto-prostaglandin F2α in spontaneous and oxytocin-induced labor at term. Am. J. Obstet. Gynecol. 147: 497-502, 1983.

J. Clin. Endocrinol. Metab. 39: 645-652, 1974.
32. Van Cauter, E. W., E. Virasoro, R. Leclercq, and G. Copinschi. Seasonal, circadian, and episodic variations of human immunoreactive beta-MSH, ACTH, and cortisol. Int. J. Pept. Protein Res. 17: 3-13, 1981.

33. Veldhuis, J. D., W. S. Evans, A. D. Rogol, C. R. Drake, M. O. Thorner, G. R. Merriam, and M. L. Johnson. Intensified rates of venous sampling unmask the presence of spontaneous, high frequency pulsation of LH in men. J. Clin. Endocrinol. Metab. 59: 96-102, 1984.

34. Veldhuis, J. D., and M. L. Johnson. Cluster analysis: a simple, versatile, and robust algorithm for endocrine pulse detection. Am. J. Physiol. 250 (Endocrinol. Metab. 13): E486-E493, 1986.
35. Veldhuis, J. D., J. Weiss, N. Mauras, A. D. Rogol, W. S. Evans, and M. L. Johnson. Appraising endocrine pulse signals at low circulating hormone concentrations: use of regional coefficients of variation in the experimental series to analyze pulsatile luteinizing hormone release. Pediatr. Res. 20: 632-637, 1986.
36. Waldstreicher, J., N. F. Santoro, J. E. Hall, M. Filicori, and W. F. Crowley. Hyperfunction of the hypothalamic-pituitary axis in women with polycystic ovary disease: indirect evidence for partial gonadotroph desensitization. J. Clin. Endocrinol. Metab. 66: 165-172, 1988.
37. Wolf, A., J. B. Swift, H. L. Swinney, and J. A. Vastano. Determining Lyapunov exponents from a time-series. Physica D 16: 285-317, 1985.
38. Yates, F. E. Analysis of endocrine signals: the engineering and physics of biochemical communication systems. Biol. Reprod. 24: 73-94, 1981.
39. Yen, S. S. C., C. C. Tsai, F. Naftolin, G. Vandenburg, and L. Ajabor. Pulsatile patterns of gonadotropin release in subjects with and without ovarian function. J. Clin. Endocrinol. Metab. 34: 671-675, 1972.
