Combining local and global models to capture fast and slow dynamics in time series data

Michael Small

Hong Kong Polytechnic University, Hong Kong
[email protected], WWW home page: http://small.eie.polyu.edu.hk/

Abstract. Many time series exhibit dynamics over vastly different time scales. The standard way to capture this behavior is to assume that the slow dynamics are a “trend”, to de-trend the data, and then to model the fast dynamics. However, for nonlinear dynamical systems this is generally insufficient. In this paper we describe a new method, utilizing two distinct nonlinear modeling architectures to capture both fast and slow dynamics. Slow dynamics are modeled with the method of analogues, and fast dynamics with a deterministic radial basis function network. When combined, the resulting model out-performs either individual system.

1 Fast and slow dynamics

Scalar time series often exhibit deterministic dynamics on very different time scales (see Fig. 1). For example, sound waves exhibit fast intra-cycle variation and slow inter-cycle fluctuations. It is difficult for a single model to describe both behaviors simultaneously. A standard method for treating such data is to first apply some statistical de-trending and to then model the residuals. In financial data analysis, one sees long term (cyclic) fluctuations and inter- or even intraday variability. Analysts will usually focus on either the long term trend or the rapid fluctuations, but not both.

In this paper we describe an alternative approach that is capable of capturing both fast and slow dynamics simultaneously. To model the slow dynamics we embed the scalar time series [1] and use the method of analogues [2]. The method of analogues essentially predicts the future by using the temporal successors of the preceding observations that are most like the current state. Such techniques have found wide application and have been seen to capture long term dynamics well [3]. The short term dynamics are now nothing more than the model prediction errors of the method of analogues prediction. We model the short time dynamics with a deterministic and parametric model structure. The choice of model structure is arbitrary, but we choose minimum description length radial basis function networks [4] because this is what we are familiar with [5, 6].

We find that the result of this technique outperforms either of the standard methods for both experimental and artificial time series data. Moreover, this combined methodology allows us to produce realistic simulations of experimental time series data. In the next section, we describe the model structure. Following this, we present results for experimental and simulated time series data.

[Figure 1 plots. Top panel: "One step predictions (RMS(final)=2.4267 & RMS(linear)=2.5437)", showing true data (blue), linear (green) and final (black) model, and error (red). Middle panel: true data (blue), linear (green) and final (black) model. Bottom panel: true data (blue) and noisy model simulation (red).]

Fig. 1. Out-of-sample model predictions and simulations for the “Lorenz+Ikeda” system. The top panel shows the original data (solid line), out-of-sample one step model prediction (dot-dashed lines) and prediction errors (dotted). Both predictions via method of analogues (Eq. (1)) and the combined approach (Eq. (3)) are shown, but the difference is very slight (root-mean-square of 2.54 and 2.43 respectively). The middle panel shows iterated simulations from both (1) (light dot-dashed lines) and (3) (dark dot-dashed lines). The new combined model approach performs qualitatively better. This is also apparent from the bottom panel showing a noisy simulation (iterated model predictions with Gaussian random variates added after each prediction time step) of (3) (dot-dashed) and the original data (solid). We used $v^{(A)} = (0, 1, 2, 3, 4, 5, 6, 7, 8, 66, 132)$ and $v^{(B)} = (0, 1, 2, 3)$.

2 The model

Let $x_t$ denote the observed state at time $t$. We build two predictive models $A$ and $B$ using the method of analogues and some parametric model structure (we choose to use radial basis functions). Let $v = (\ell_1, \ell_2, \ldots, \ell_n)$ denote an $n$ dimensional embedding and $y_t = x_{t-v} = (x_{t-\ell_1}, x_{t-\ell_2}, \ldots, x_{t-\ell_n})$ the corresponding embedded point. According to various embedding theorems [1], the correct choice of $v$ will ensure that the dynamic evolution of $y_t$ will be equivalent to that of the underlying dynamical system that generated $x_t$. Of course, the “correct” choice of $v$ is problematic, and has been the subject of considerable study [7]. In general one finds that there is no genuinely “correct” choice for systems with both fast and slow dynamics. Instead one must choose the $v$ which captures the dynamics of interest best.

Let $v^{(A)}$ be an embedding suitable for modeling the long term dynamics (in general this means that the corresponding lags $\ell_i$ will be large), and $v^{(B)}$ be an embedding suitable for modeling the short term dynamics (and probably containing relatively small lags). Let $y_t = x_{t-v^{(A)}}$ be the long term embedding and $z_t = x_{t-v^{(B)}}$ be the short term embedding.

Our implementation of the method of analogues is the following. For the current state $x_t$ the prediction of the next state $x_{t+1}$ is given by

$$A(y_t) = \frac{1}{k} \sum_{i \in N_k} x_{i+1} \tag{1}$$

where $N_k = \{a_1, \ldots, a_k\}$ is the set of $k$ nearest (Euclidean norm) neighbors of $y_t$ among $\{y_1, \ldots, y_{t-1}\}$. There are some computational issues which we will gloss over at this point. Primarily, one needs to choose $k$ and often one takes a weighted mean instead of (1). Moreover, it is common practice to exclude the temporal neighbors from the sum of spatial neighbors (i.e. $|t - a_i|$ must exceed some threshold). For the current study we simply employ (1) and exclude all points $a_i$ if $|t - a_i|$ is less than one half the pseudo-period of the time series. Regardless, schemes such as Eq. (1) are excellent at capturing dynamics, but only on one time scale: usually related to the pseudo-period of the time series.

Denote $\hat{x}_{t+1} = A(y_t)$ and let $e_{t+1} = \hat{x}_{t+1} - x_{t+1}$ denote the model prediction error from the method of analogues. We now build a model $B$ that predicts the model prediction error of $A$ from the prediction made by $A$ and the current model state according to a second embedding $v^{(B)}$. That is,

$$B(\hat{x}_{t+1}, z_t) = \sum_{j=1}^{m} \lambda_j \phi_j\!\left( \frac{\| (\hat{x}_{t+1}, z_t) - c_j \|}{r_j} \right) = \hat{e}_{t+1} \tag{2}$$

where $m$ is selected according to the minimum description length principle [4]; $\lambda_j, r_j \in \mathbb{R}$; and, $\phi_j$ are the radial basis functions (in our case, these are Gaussian). An obvious and useful extension of (2) is to incorporate linear and constant terms into the model, which is what we do. For $m$ fixed by minimum description length, the remaining parameters are selected to minimize $\sum_t \| \hat{e}_{t+1} - e_{t+1} \|$.
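As an illustration of Eq. (1), the following Python sketch builds the embedded point and averages the successors of its nearest past neighbours, with temporal neighbours excluded. It is a minimal sketch under stated assumptions, not the implementation used for the figures: the function name `analogue_predict`, its parameter names, the value k = 5, the lags, and the sinusoidal test series are all chosen here for the example.

```python
import numpy as np


def analogue_predict(x, lags, t, k=5, exclude=1):
    """One-step prediction of x[t + 1] by the method of analogues, Eq. (1).

    x       : 1-D array holding the scalar time series
    lags    : embedding lags (l_1, ..., l_n), so y_t = (x[t - l_1], ..., x[t - l_n])
    k       : number of nearest neighbours (hypothetical default)
    exclude : candidates a with t - a < exclude are dropped (temporal neighbours)
    """
    lags = np.asarray(lags)
    y_t = x[t - lags]
    # Candidate times a < t whose successor x[a + 1] has already been observed.
    cand = np.arange(lags.max(), t - exclude + 1)
    Y = x[cand[:, None] - lags[None, :]]      # embedded points y_a, one per row
    dist = np.linalg.norm(Y - y_t, axis=1)    # Euclidean distance to y_t
    nearest = cand[np.argsort(dist)[:k]]      # the index set N_k
    return x[nearest + 1].mean()              # mean of the successors


# Assumed test signal: a noise-free sinusoid with a period of 50 samples.
period = 50
x = np.sin(2 * np.pi * np.arange(500) / period)
t = 400
# Exclusion window of one half the (pseudo-)period, as described in the text.
pred = analogue_predict(x, lags=(0, 5, 10), t=t, k=5, exclude=period // 2)
err = pred - x[t + 1]                         # e_{t+1} as defined in the text
print(f"prediction {pred:.6f}, true value {x[t + 1]:.6f}, error {err:.2e}")
```

For this strictly periodic signal the nearest neighbours are exact analogues one or more periods in the past, so the prediction error is at the level of rounding error. For real data the choice of k, and whether to use a weighted mean, would need more care.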

[Figure 2 plots. Top panel: "long term prediction", showing true (black), predicted (red), and error (blue). Bottom panel: "noisy simulation, std = 1.5518".]

Fig. 2. Out-of-sample model prediction and simulation for the “Lorenz+Ikeda” system. The top plot is the model simulation using only the parametric modeling approach (2) (i.e. the method of analogues has k = 0). The bottom plot is noisy iterated simulations. Data is shown as a solid line, model simulations are dot-dashed and the model prediction error is dotted. The performance is inferior to that in Fig. 1. We used $v^{(B)} = (0, 1, 2, 3, 4, 5, 6, 7, 8, 66, 132)$.

Hence the final prediction $\hat{\hat{x}}_{t+1}$ of the next model state from the current model state $x_t$ is given by

$$\hat{\hat{x}}_{t+1} = A(x_{t-v^{(A)}}) + B\big(A(x_{t-v^{(A)}}),\, x_{t-v^{(B)}}\big). \tag{3}$$

3 Applications and examples
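Before turning to the data, the two-stage scheme of Section 2 (Eqs. (1) to (3)) can be sketched in Python. This is a simplified illustration under several assumptions that are not taken from the paper: the number of basis functions is fixed at 20 rather than selected by minimum description length, the centres are drawn at random from the fitted points, all basis functions share one width, and the weights are found by ordinary least squares. The toy series, the lags, and all function names are hypothetical. The residual is taken as true value minus prediction, so that adding the output of B to the output of A yields a corrected prediction.

```python
import numpy as np


def analogue(x, lags, t, k=5, exclude=25):
    """Model A, Eq. (1): mean successor of the k nearest past neighbours of y_t."""
    cand = np.arange(lags.max(), t - exclude + 1)
    dist = np.linalg.norm(x[cand[:, None] - lags] - x[t - lags], axis=1)
    return x[cand[np.argsort(dist)[:k]] + 1].mean()


def design(U, centres, width):
    """Design matrix with constant, linear and Gaussian radial basis columns."""
    d = np.linalg.norm(U[:, None, :] - centres[None, :, :], axis=2)
    return np.hstack([np.ones((len(U), 1)), U, np.exp(-0.5 * (d / width) ** 2)])


def rms(v):
    return float(np.sqrt(np.mean(v ** 2)))


# Assumed toy data: slow sinusoid (period 50) plus a fast logistic-map component.
n = 1200
fast = np.empty(n)
fast[0] = 0.3
for i in range(n - 1):
    fast[i + 1] = 3.9 * fast[i] * (1.0 - fast[i])
x = 10.0 * np.sin(2 * np.pi * np.arange(n) / 50) + 2.0 * fast

lags_a = np.array([0, 5, 10, 15])   # v^(A): larger lags, slow dynamics (assumed)
lags_b = np.array([0, 1, 2])        # v^(B): small lags, fast dynamics (assumed)

times = np.arange(600, n - 1)
a_pred = np.array([analogue(x, lags_a, t) for t in times])   # A(y_t)
target = x[times + 1]
resid = target - a_pred             # assumed sign: true value minus prediction
U = np.column_stack([a_pred, x[times[:, None] - lags_b]])    # rows (x_hat_{t+1}, z_t)

# Fit B on the first 400 points; centres and width are picked heuristically.
rng = np.random.default_rng(0)
fit = slice(0, 400)
centres = U[fit][rng.choice(400, size=20, replace=False)]
width = float(U[fit].std())
w, *_ = np.linalg.lstsq(design(U[fit], centres, width), resid[fit], rcond=None)

b_pred = design(U, centres, width) @ w      # B(x_hat_{t+1}, z_t), Eq. (2)
combined = a_pred + b_pred                  # Eq. (3)

rms_a_fit = rms(resid[fit])
rms_c_fit = rms(target[fit] - combined[fit])
rms_a_new = rms(resid[400:])
rms_c_new = rms(target[400:] - combined[400:])
print(f"fitted points:   A alone {rms_a_fit:.3f}, A + B {rms_c_fit:.3f}")
print(f"held-out points: A alone {rms_a_new:.3f}, A + B {rms_c_new:.3f}")
```

Because the least-squares fit contains the zero correction as a special case, the error of the combined model on the fitted points cannot exceed that of A alone. The held-out figures carry no such guarantee and depend on the heuristic choices above.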

[Figure 3 plots. Top panel: "One step predictions (RMS(final)=0.46237 & RMS(linear)=3.8527)", showing true data (blue), linear (green) and final (black) model, and error (red). Middle panel: true data (blue), linear (green) and final (black) model. Bottom panel: true data (blue) and noisy model simulation (red).]

Fig. 3. Out-of-sample model prediction and simulation for pulse pressure waveform data (8000 data points sampled at 250 Hz). We adopt the same convention as in Fig. 1. Again, we see that the new technique performs best. We used $v^{(A)} = (0, 18, 36, \ldots, 252)$ and $v^{(B)} = (0, 1, 2, 3, 4, 9)$.

We first apply the modeling procedure to an artificial system, the sum of a chaotic flow and a chaotic map. We integrate the Lorenz equations

$$\dot{x} = s(y - x), \qquad \dot{y} = rx - y - xz, \qquad \dot{z} = xy - bz, \tag{4}$$

($s = 10$, $r = 28$, and $b = 8/3$) with an integration time step of 0.003, and add the resultant time series $z_n = z(0.003n)$ to the $x$ component of iterates of the Ikeda map

$$u_{t+1} = 1 + \mu(u_t \cos\theta_t - v_t \sin\theta_t), \qquad v_{t+1} = \mu(u_t \sin\theta_t + v_t \cos\theta_t), \qquad \theta_t = 0.4 - \frac{6}{1 + u_t^2 + v_t^2}, \tag{5}$$

where $\mu = 0.7$. Hence the time series under study, $z_n + u_n$, exhibits both fast dynamics (thanks to the Ikeda map) and slow dynamics (from the Lorenz system). The data together with nonlinear model predictions using the scheme described in the previous section are shown in Fig. 1.
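A series of the kind described by Eqs. (4) and (5) could be generated along the following lines. The parameter values and the time step come from the text. The fourth-order Runge-Kutta integrator, the initial conditions, the length of the discarded transient, and all function names are assumptions made for this sketch, since the text does not specify them.

```python
import numpy as np


def lorenz(state, s=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations, Eq. (4)."""
    x, y, z = state
    return np.array([s * (y - x), r * x - y - x * z, x * y - b * z])


def rk4_step(f, state, h):
    """One fourth-order Runge-Kutta step (the integrator is an assumption)."""
    k1 = f(state)
    k2 = f(state + 0.5 * h * k1)
    k3 = f(state + 0.5 * h * k2)
    k4 = f(state + h * k3)
    return state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def ikeda_step(u, v, mu=0.7):
    """One iterate of the Ikeda map, Eq. (5)."""
    theta = 0.4 - 6.0 / (1.0 + u * u + v * v)
    return (1.0 + mu * (u * np.cos(theta) - v * np.sin(theta)),
            mu * (u * np.sin(theta) + v * np.cos(theta)))


def lorenz_plus_ikeda(n, h=0.003, transient=1000):
    """Return n samples of z_n + u_n after discarding an initial transient."""
    state = np.array([1.0, 1.0, 20.0])   # assumed initial condition
    u, v = 0.1, 0.1                      # assumed initial condition
    series = np.empty(n)
    for i in range(-transient, n):
        state = rk4_step(lorenz, state, h)
        u, v = ikeda_step(u, v)
        if i >= 0:
            series[i] = state[2] + u     # Lorenz z plus Ikeda u
    return series


data = lorenz_plus_ikeda(2000)
print(data[:5])
```

One map iterate is taken per integration step, so the Ikeda component changes from sample to sample while the Lorenz component varies slowly over many samples.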

In Fig. 3 we illustrate results for a time series of the author's finger tip pulse pressure wave. These data represent the pulse pressure waveform measured at the finger tip and are therefore closely related both to ECG waveform data and to the pulse measurements made in traditional Chinese medicine.

4 Conclusions

Provided the modeling algorithm to fit (2) is well designed, this new method will fit the data better than (1) alone. In our examples, we found that the long term behavior of the combined model (3) was also better. Moreover, we observed that applying only the nonlinear modeling routine (2) with the same range of embedding lags did not perform well. This is primarily due to algorithmic problems in the nonlinear fitting procedure: there are more nonlinear parameters than can be practically selected. Nonetheless, this new method appears to avoid these problems and offers a new approach to modeling long term deterministic dynamics on various time scales.

In addition to the data presented in this paper we have repeated our analysis with human EEG time series and daily temperature records (over two decades). All experimental systems exhibited both quantitatively and qualitatively better results with the new modeling method.

Although we have used specific modeling procedures (1) and (2), there is nothing special about this choice. Quite probably the method of analogues could be used for both the fast and slow dynamics, thereby removing the dependence on parametric modeling. We will consider this problem in future work.

Acknowledgments

This work was supported by the Hong Kong University Grants Council’s Competitive Earmarked Research Grant (No. PolyU 5235/03E) and the Hong Kong Polytechnic University Direct Allocation (No. A-PE46).

References

1. Floris Takens. Detecting strange attractors in turbulence. Lecture Notes in Mathematics, 898:366–381, 1981.
2. G. Sugihara and R.M. May. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature, 344:734–741, 1990.
3. Michael Small, Dejin Yu, and Robert G. Harrison. A surrogate test for pseudoperiodic time series data. Physical Review Letters, 87:188101, 2001.
4. Kevin Judd and Alistair Mees. On selecting models for nonlinear time series. Physica D, 82:426–444, 1995.
5. Michael Small, Kevin Judd, and Alistair Mees. Modeling continuous processes from data. Physical Review E, 65:046704, 2002.
6. Kevin Judd and Michael Small. Towards long-term prediction. Physica D, 136:31–44, 2000.
7. Michael Small and C.K. Tse. Optimal embedding parameters: A modelling paradigm. Physica D, 2003. To appear.