Detection of Nonlinearity in a Time-Series by the Synthesis of Surrogate Data Using a Kolmogorov-Smirnoff Tested, Hidden Markov Model

C.P. Unsworth, M. Cowper, S. McLaughlin, B. Mulgrew
Charles.Unsworth@ee.ed.ac.uk
Signals & Systems Group, Electrical Engineering Dept, Edinburgh University, Scotland, UK.
Abstract

Conventional methods of hypothesis testing for nonlinearity in a time-series employ the method of surrogate data which makes use of the Fourier transform (FT). As various authors have shown, this can lead to artifacts in the surrogates, and spurious detection of nonlinearity can result. This paper documents a new method to synthesize surrogate data using a 1st order hidden Markov model (HMM) combined with a Kolmogorov-Smirnoff test (KS-test), to determine the required resolution of the HMM. The method provides a way to retain the dynamics of a time-series and impart the null hypothesis (H0) onto the synthesized surrogate, which avoids the FT and its associated artifacts. Significance test results for a sinewave, Henon map and Gaussian noise time-series are presented. It is demonstrated through 'significance testing' that KS-tested, HMM surrogates can be successfully used to distinguish between a deterministic and stochastic time-series. Then, by applying a simple test for linearity using linear and nonlinear predictors, it is possible to determine the nature of the deterministic class and hence conclude whether the system is linear deterministic or nonlinear deterministic. Furthermore, it is demonstrated that the method works for periodic functions too, where FT surrogates break down.
1. Introduction

Conventional methods to generate surrogate data for nonlinear hypothesis testing make use of the Fourier transform (FT) [7]. Various authors [1] have demonstrated that this can lead to artifacts in the surrogates which result in the spurious detection of nonlinearity. In addition, it has been shown that the FT surrogate particularly breaks down for systems with strong periodic components [3]. What is required is:
(i) a method to capture the underlying dynamics and hence the structure of the observed dataset;
(ii) a method that can synthesize a new dataset using the captured dynamics;
(iii) a method that avoids generating artifacts.
It will be demonstrated that the above criteria can be realised by using a 1st order hidden Markov model (HMM), coupled with a Kolmogorov-Smirnoff test (KS-test) to determine the resolution of the HMM, and that this can be successfully applied to a time-series. The paper is structured as follows.
A brief introduction to hypothesis testing, the terminology and its application to nonlinear surrogate testing. An overview of the HMM and how such a model can be generated. An overview of the KS-test and how it can be used to identify the optimum resolution of the HMM. HMM surrogates and 'significance test' results for sinewave, Henon map and Gaussian noise time-series. Linear & nonlinear predictor results to test for linearity. A summary & conclusion of the work presented.
2. Surrogate data & nonlinear hypothesis testing

Hypothesis testing [10] is a method used in statistical analysis. Essentially, one makes an assumption about the data, known as the null hypothesis (H0), which one is in general hoping to contradict. The observed dynamical system could fall into one of four categories: (A) linear deterministic (e.g. Newtonian, undamped pendulum, sinewave); (B) nonlinear deterministic (e.g. Lorenz, Henon map etc.); (C) linear stochastic (e.g. a linear Markov model); (D) nonlinear stochastic (e.g. a nonlinear Markov model).
0-7803-5700-0/99/$10.00 © 1999 IEEE
Our hypothesis is H0 = data is of the stochastic class, of which (C) & (D) are subsets. The surrogates developed in this paper are generated by a 1st order, nonlinear HMM in a nonlinear stochastic manner, described later. A linear Markov model, which could be used to generate linear stochastic surrogates, is a subset of a nonlinear HMM. A nonlinear HMM will model a linear Markov model if the data is linear stochastic. Hence, the nonlinear HMM covers both cases (C) and (D) and thus the stochastic class. Therefore, in rejecting H0 we accept an alternative hypothesis (H1) such that H1 = data is of the deterministic class, of which (A) and (B) are subsets.
2.1. The method of surrogate data

In nonlinear time-series analysis, the hypothesis is tested by the method of surrogate data [7], which will now be described. The process involves measuring a statistic (Q0) and then generating a set of synthetic time-series, known as surrogates, based around this statistic. In this paper, a noise robust invariant documented by Schouten [6] is used, which is known as the maximum likelihood of the correlation dimension (DML) value. However, any higher order statistic could be used in a conventional hypothesis test. As mentioned earlier, there are many methods to generate surrogates. Regardless of the method employed, each method aims to achieve the same two effects: (i) to retain the dynamics of the observed data; (ii) to impart the null hypothesis (H0) onto the surrogate.
Hence, the surrogate that is generated is the same as the original, in the sense that it retains the dynamics, but is different since a stochastic element has been written into it from the null hypothesis. Thus, N surrogates of the observed dataset are generated. Next, the same statistic is measured for all the N surrogates, which gives one a set of surrogate statistics (Q1, Q2, Q3, Q4, ..., QN). After this, the surrogate statistics together with the observed statistic (Q0) are ranked in order of numerical size. What is important now is the position of the observed statistic (Q0) in this ranking structure. If H0 is accepted, Q0 will appear in the main body of the ranked statistics. For H0 to be rejected, the observed statistic (Q0) will fall at either end tail of the ranked statistics. Hence, a 'two tailed' test [9] will reject H0 at a particular significance level given by:
Pr(H0 rejected) = Rank(Q0) / (N + 1)
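As a concrete illustration, the ranking and significance level above can be sketched as follows (a minimal sketch; the function and variable names are ours, not the paper's):

```python
import numpy as np

def surrogate_significance(q0, surrogate_stats):
    """Rank the observed statistic Q0 among N surrogate statistics
    and return its 1-based rank together with the significance
    level Pr(H0 rejected) = Rank(Q0) / (N + 1)."""
    stats = np.sort(np.append(surrogate_stats, q0))
    rank = int(np.searchsorted(stats, q0)) + 1   # 1-based rank of Q0
    n = len(surrogate_stats)
    return rank, rank / (n + 1)

# Q0 well below 99 surrogate statistics lands in the left tail.
rank, p = surrogate_significance(0.5, np.linspace(1.0, 2.0, 99))
print(rank, p)   # → 1 0.01
```

A rank of 1 (or N + 1) places Q0 in a tail of the ranked statistics and rejects H0, exactly as used in the significance tests of Section 6.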
3. A 1st order HMM surrogate

A schematic diagram of how to set up a HMM [8] is depicted in Figure 1. The amplitude range of the set of data is 'binned' into N smaller, equal sized ranges, known as states (S0, S1, S2, ..., SN).

Figure 1. Schematic of a '1st order hidden Markov model'.

Each state is then analysed in turn and the probability of its transition to the other states is evaluated. This is done by creating a transition state histogram (H) for each particular state. The transition state histogram is built up as follows. For each point that exists in a particular state (SN), we take its corresponding future point, namely the point to its immediate right in the time-series. We determine what state the future point exists in and bin this future state in the transition state histogram for that particular state (SN). This is repeated for every point that exists in the state SN, and the transition state histogram for SN is built up in this way. A transition state histogram (H) is then generated for every state that exists. Thus, each state (S0, S1, S2, ..., SN) will have its own transition state histogram (H0, H1, H2, ..., HN). It is then necessary to rank the transition states in order of numerical size within each of the transition state histograms. In doing this, the most probable transition states occur first in the histogram. The set of transition state histograms is then normalised to unity, which gives rise to a set of transition state pdf's (P0, P1, P2, ..., PN). From these, a set of cumulative transition state probability functions (C0, C1, C2, ..., CN) can be formed. It is then very simple to impart the stochastic element of the null hypothesis (H0) onto the surrogate. This is achieved as follows. Randomise some initial state (Si), where S0 ≤ Si ≤ SN, using a uniform deviate random number generator. Then, using the same uniform random number generator, randomise a transition probability value (PR), where 0 ≤ PR ≤ 1. Wherever the PR value falls in the cumulative transition state probability function locates the new state, and so on. Hence, the surrogate is generated by essentially randomising a guided walk within the dynamics of the HMM, for a particular HMM resolution. From this, it is clear to see that the 1st order HMM is responsible for the nonlinear part of the hypothesis (H0) and the presence of the random number generator is responsible for the stochastic part of H0. The next question is: how many states are sufficient? This is where the power of the 'Kolmogorov-Smirnoff test' (KS-test) is exploited.
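The construction just described can be sketched in a few lines (our own minimal implementation with hypothetical names; the paper gives no code). The paper's within-histogram ranking of transition states is omitted, since the ordering of categories does not change the distribution sampled through the cumulative functions:

```python
import numpy as np

def hmm_surrogate(x, n_states, rng=None):
    """Sketch of a 1st order HMM surrogate: bin the amplitude range
    into n_states states, build a transition histogram per state,
    normalise to transition pdfs, form cumulative functions, then
    generate by a randomised guided walk through them."""
    rng = np.random.default_rng(rng)

    # Bin the amplitude range into equal sized states S0..S(n-1).
    edges = np.linspace(x.min(), x.max(), n_states + 1)
    states = np.clip(np.digitize(x, edges) - 1, 0, n_states - 1)

    # Transition state histogram H_i: for every point in state i,
    # count the state of the point to its immediate right.
    H = np.zeros((n_states, n_states))
    for s, s_next in zip(states[:-1], states[1:]):
        H[s, s_next] += 1

    # Unvisited states fall back to a uniform pdf (our choice).
    H[H.sum(axis=1) == 0] = 1.0

    # Normalise rows to pdfs P_i, then cumulative functions C_i.
    C = np.cumsum(H / H.sum(axis=1, keepdims=True), axis=1)

    # Guided walk: randomise a start state, then repeatedly draw
    # PR in [0, 1) and locate it in the current state's C_i.
    centres = (edges[:-1] + edges[1:]) / 2
    s = int(rng.integers(n_states))
    out = np.empty(len(x))
    for t in range(len(x)):
        out[t] = centres[s]
        s = min(int(np.searchsorted(C[s], rng.random())), n_states - 1)
    return out

x = np.sin(2 * np.pi * np.arange(2000) / 50)   # example input
surr = hmm_surrogate(x, n_states=100, rng=0)
```

Each output sample is the centre of the visited state, so the surrogate's amplitude range and one-step transition statistics match those of the observed data at the chosen resolution.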
4. The use of the 'KS-test' to determine the resolution of the HMM

The KS-test compares the normalised cumulative distribution functions (cdf's) for two sets of data. The KS-test returns a probabilistic measure of how significantly similar the two cdf's are. Albano [2] identified that the KS-test can be used to distinguish between different attractors which have the same correlation dimension. The key is to apply the test to obtain a probabilistic measure of how similar the dynamics of a HMM surrogate are to the original observed data. This is done by generating a series of HMM surrogates of increasing resolution. Then, by comparing the cdf of the observed data to the cdf's of its surrogates, one will obtain a probabilistic measure of the similarity of the dynamics. It is clear that the lowest resolution surrogate which returns a KS-tested probability of unity will capture the dynamics of the observed data and will maximise the stochastic element which is required by the null hypothesis (H0). Hence, the resolution of the HMM can be determined by a KS-test.

5. Results

The KS-tested, HMM was applied to three different time-series. These were a sinewave, Henon map, and Gaussian noise time-series. What follows is an examination of each of the above mentioned time-series in turn.

5.1. A sinewave

A sinewave was generated with 2000 points. In addition, 6 surrogates were synthesized using the HMM for increasing resolutions (shown in Figure 2). It is evident that as the number of states of the HMM increases, the synthesized surrogate tends to the original observed dataset.

Figure 2. KS-tested, HMM sinewave surrogates.

The 1000 bin surrogate is the lowest resolution HMM to exist with a KS probability of unity. Hence, 1000 binned surrogates were used for 'significance testing' of the null hypothesis, since at this resolution the dynamics of the original observed data are preserved and the 'random, stochastic' nature that we require for the null hypothesis (H0) is maximised.
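The KS comparison itself can be sketched directly from the empirical cdf's. This is our own implementation of the standard two-sample KS statistic and its asymptotic probability (accurate for moderate sample sizes); the paper's exact procedure may differ:

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample KS distance: largest gap between the two
    normalised empirical cdf's."""
    grid = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.abs(cdf_a - cdf_b).max())

def ks_probability(a, b):
    """Asymptotic probability that the two samples share one
    distribution (standard Kolmogorov asymptotic series)."""
    d = ks_statistic(a, b)
    if d == 0.0:
        return 1.0
    n = len(a) * len(b) / (len(a) + len(b))
    lam = (np.sqrt(n) + 0.12 + 0.11 / np.sqrt(n)) * d
    j = np.arange(1, 101)
    q = 2 * np.sum((-1.0) ** (j - 1) * np.exp(-2.0 * (j * lam) ** 2))
    return float(np.clip(q, 0.0, 1.0))

rng = np.random.default_rng(0)
a = rng.normal(size=2000)
identical = ks_probability(a, a)       # same data: probability 1
shifted = ks_probability(a, a + 1.0)   # shifted data: probability ~0
```

Applied to the observed data and a family of surrogates of increasing resolution, the lowest resolution returning a probability of unity is the one selected.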
5.2. The Henon map

A nonlinear Henon map consisting of 2000 points was generated using:

x(n+1) = 1 - 1.4 x(n)^2 + 0.3 x(n-1)

where the initial value is x0 = 0.3. Six surrogates were generated using a HMM (see Figure 3).
As one can see, there are very few states that exist within the original observed Henon data, so one would expect a low resolution to be determined by the KS-test. The KS-test predicts that 100 bins are required to fulfill the criteria necessary to generate a surrogate. However, a 10 bin surrogate was used since it had a KS probability very close to unity (≈ 0.97).
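For reference, a Henon time-series of this kind can be generated as follows (the standard parameters a = 1.4 and b = 0.3 are assumed here, since the paper states only the initial value):

```python
import numpy as np

def henon_series(n, a=1.4, b=0.3, x0=0.3):
    """Henon map written as a scalar recursion,
    x[n+1] = 1 - a*x[n]**2 + b*x[n-1]."""
    x = np.empty(n)
    x[0] = x[1] = x0
    for i in range(2, n):
        x[i] = 1.0 - a * x[i - 1] ** 2 + b * x[i - 2]
    return x

x = henon_series(2000)
```

With these parameters the iterates settle onto the familiar Henon attractor, whose amplitudes occupy only a narrow band of values, consistent with the small number of occupied states noted above.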
6. Significance test results to determine the appropriate class type

A total of 100 HMM surrogates, each consisting of 2000 points, were synthesized at the KS-tested resolution for a sinewave (1000 bins), Henon map (10 bins) and Gaussian noise (1000 bins). The results are shown in Figure 4.
Figure 3. KS-tested, HMM Henon map surrogates.

The DML value was measured for the observed statistic (Q0) and the surrogates (Q1, Q2, Q3, Q4, ..., QN). The observed statistic (Q0) is represented as the tall vertical line in the diagram and the HMM synthesized surrogates' statistics by small vertical lines. All the statistics were rank ordered. For the sinewave and Henon map, it is quite clear that the observed statistic (Q0) has rank = 1 and lies in the left tail of the distributed statistics. Since it has rank = 1, we can reject the null hypothesis (H0) and accept the 'alternative hypothesis' (H1) that the data is of the deterministic class, which is indeed the case. For Gaussian noise, the observed statistic (Q0) falls almost in the middle of the distribution with rank = 42. From inspection, the distribution is approximately normal and (Q0) is within the body of the distribution and almost central within it. Thus, it is highly probable that the null hypothesis (H0) is true and the data is of the stochastic class. This is again indeed the case.
7. The use of linear & nonlinear predictors to determine the nature of the class

Since the significance tests have strongly rejected H0 for the sinewave and the Henon map, one can say with confidence that the observed datasets must be either 'linear deterministic' or 'nonlinear deterministic', which we know to be true. By applying both linear and nonlinear predictors to each of the observed datasets, the nature can be inferred.
Figure 4. Significance test results for a) sinewave (rank 1), b) Henon map (rank 1) & c) Gaussian noise (rank 42) time-series.

7.1. Methodology & predictor criterion for linearity and nonlinearity
Both linear and nonlinear predictors were applied in turn to the observed datasets, each consisting of 60,000 data points. The logic is that if the deterministic system has a 'linear' nature, then the linear predictor would predict with very little error. Furthermore, the nonlinear predictor would perform just as well. This is because it can be thought of as a linear predictor with additional nonlinear terms. Hence, the nonlinear terms would go to zero and it would predict, in theory, as a linear predictor. This is the result we expect to obtain with the sinewave dataset. If the deterministic system has a 'nonlinear' nature, then the linear predictor would perform badly. However, the nonlinear predictor would model the nonlinearities in the time-series and would predict better. This is the result we expect to obtain with the Henon map dataset. Hence, one should now be in the position to determine the true nature of the sinewave and Henon map datasets from this simple test.
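This test can be sketched with generic stand-ins for the two predictors: a least-squares 10-tap linear predictor and a Gaussian-kernel RBF network fitted by least squares. The paper's exact implementations [4, 5] differ, and all names below are ours:

```python
import numpy as np

def embed(x, dim=10, delay=1):
    """Delay-embed x into dim-tap inputs and one-step-ahead targets."""
    rows = len(x) - dim * delay
    X = np.column_stack([x[i * delay : i * delay + rows] for i in range(dim)])
    y = x[dim * delay : dim * delay + rows]
    return X, y

def nmse_db(y, y_hat):
    """10*log10(normalised mean square prediction error) in dB."""
    return float(10 * np.log10(np.mean((y - y_hat) ** 2) / np.var(y)))

def linear_predictor_db(train, test, dim=10):
    X, y = embed(train, dim)
    w = np.linalg.lstsq(X, y, rcond=None)[0]      # 10-tap LS fit
    Xt, yt = embed(test, dim)
    return nmse_db(yt, Xt @ w)

def rbf_predictor_db(train, test, dim=10, n_centres=100, width=1.0):
    X, y = embed(train, dim)
    centres = X[np.linspace(0, len(X) - 1, n_centres, dtype=int)]
    def design(Z):
        d2 = ((Z[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        G = np.exp(-d2 / (2.0 * width ** 2))
        return G / G.sum(axis=1, keepdims=True)   # normalised kernels
    w = np.linalg.lstsq(design(X), y, rcond=None)[0]
    Xt, yt = embed(test, dim)
    return nmse_db(yt, design(Xt) @ w)

sine = np.sin(2 * np.pi * np.arange(4000) / 50)
lin_db = linear_predictor_db(sine[:2000], sine[2000:])
rbf_db = rbf_predictor_db(sine[:2000], sine[2000:])
```

On the sinewave both errors are very far below 0 dB, since a sinusoid obeys an exact linear recursion; on a Henon series the same pair behaves oppositely, with the linear fit stalling near 0 dB while the RBF improves markedly, which is the criterion for a 'nonlinear' nature.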
7.2. Predictor specification and experimental procedure

The linear predictor used was a 10 tap linear predictor, described in [5]. The nonlinear predictor [4] used was a radial basis function (RBF) network with an embedding dimension of 10. It had 100 centres and used normalised Gaussian kernels with an embedding delay of unity. Results from both predictors, for variable training lengths of N = 2000 to 20,000 data points, were obtained in steps of 2000 data points. The following plots show training length in data points vs. 10log10(normalised mean square prediction error) in dB. For clarity, the predictor validation results are presented for each of the observed datasets. These are shown in Figure 5.

Figure 5. Linear & nonlinear predictor results for the observed a) sinewave & b) Henon time-series.

The sinewave predictor results of Figure 5 show that both the linear and nonlinear predictors predict equally well, with prediction error between -90 dB and -120 dB. This suggests that it is highly likely that the sinewave time-series has a 'linear' nature. Hence, we can now conclude that the sinewave dataset is a 'linear deterministic' system. The Henon map predictor results show that the linear predictor predicts extremely poorly, at the -1 dB level. The nonlinear predictor, however, performs better, between -6 dB and -8 dB. This suggests that it is highly likely that the Henon time-series has a 'nonlinear' nature. Hence, we can now conclude that the Henon time-series is a 'nonlinear deterministic' system.

8. Summary & conclusions

In conclusion, it has been demonstrated that a 1st order HMM, coupled with a KS-test to determine the HMM resolution, provides a simple method to capture the underlying dynamics of a time-series and impart the null hypothesis (H0) onto a synthesized surrogate. The method also avoids the artifacts associated with the FT surrogate. It has also been demonstrated that significance tests using KS-tested, HMM surrogates can successfully distinguish the deterministic from the stochastic class of time-series. Furthermore, a simple test for linearity, using linear and nonlinear predictors, will successfully identify the nature of the deterministic class. In addition, it has been shown that the KS-tested, HMM surrogates work for periodic functions too, where FT surrogates break down.

9. Further work & acknowledgements

We would like to acknowledge the EPSRC, GMAV, DERA Malvern and the Royal Society in connection with this work. Additional thanks go to Ian Band, Marcus Alphey and John Thompson.

References

[1] A. M. Albano and P. E. Rapp. Phase-randomised surrogates can produce spurious identification of non-random structure. Phys. Lett. A, 192, 1994.
[2] A. M. Albano and P. E. Rapp. Kolmogorov-Smirnoff test distinguishes attractors of similar dimensions. Phys. Rev. E, 52, No. 1:196-206, 1995.
[3] C. J. Stam, J. P. M. Pijn, and W. S. Pritchard. Reliable detection of nonlinearity in experimental time series with strong periodic components. Physica D, 112:361-380, 1998.
[4] S. Haykin. Neural Networks: A Comprehensive Foundation, 2nd Edition. Macmillan Press, 1994.
[5] S. Haykin. Adaptive Filter Theory, 3rd Edition. Macmillan Press, 1996.
[6] J. C. Schouten, F. Takens, and C. van den Bleek. Estimation of the dimension of a noisy attractor. Phys. Rev. E, Vol. 50, No. 3, 1994.
[7] J. Theiler and S. Eubank. Testing for nonlinearity in a time series: the method of surrogate data. Physica D, 58:77-94, 1992.
[8] I. L. MacDonald and W. Zucchini. Hidden Markov Models & Other Models for Discrete-Valued Time Series. Chapman & Hall Publishers, 1997.
[9] D. S. Moore and G. P. McCabe. Introduction to the Practice of Statistics, 3rd Edition, pgs. 463-466.
[10] A. Papoulis. Probability, Random Variables & Stochastic Processes, 3rd Edition. McGraw-Hill Electrical & Electronic Eng. Series, 1991.