INTERNATIONAL JOURNAL OF CLIMATOLOGY Int. J. Climatol. 32: 1604–1614 (2012) Published online 31 May 2011 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/joc.2367
A new method for abrupt dynamic change detection of correlated time series
Wenping He,a Guolin Feng,a* Qiong Wu,b Tao He,c Shiquan Wan d and Jifan Chou e
a National Climate Center, China Meteorological Administration, Beijing, China
b National Satellite Meteorological Center, China Meteorological Administration, Beijing, China
c Jinan Environmental Monitoring Center, Jinan, China
d Yangzhou Meteorological Office, Yangzhou, China
e Department of Atmospheric Sciences, Lanzhou University, Lanzhou, China
ABSTRACT: On the basis of detrended fluctuation analysis (DFA), a new method, moving cut data-DFA (MC-DFA), is presented to detect abrupt dynamic change in correlated time series. Numerical tests show that the method can detect the time-instants of abrupt change in model time series generated by the Logistic map. Moving DFA (MDFA) and approximate entropy (ApEn) can provide some information, such as a single time-instant of abrupt dynamic change, but neither can exactly detect all of the abrupt change regions. Some traditional methods, such as the moving t-test, the Cramer method, the Mann–Kendall test and the Yamamoto method, cannot provide any information on abrupt dynamic change in these model time series. The results also show that window size and strong noise have little effect on the MC-DFA results. In summary, MC-DFA provides a reliable measure for detecting abrupt dynamic change in correlated time series, and makes up for the deficiencies of MDFA and ApEn. Applications to daily surface air pressure records further verify the validity of the method. Copyright 2011 Royal Meteorological Society
KEY WORDS abrupt dynamic change detection; detrended fluctuation analysis; moving cut data-detrended fluctuation; approximate entropy
Received 17 January 2011; Revised 13 April 2011; Accepted 26 April 2011
1. Introduction
The detection results of traditional methods for detecting abrupt change, such as the moving t-test, the Cramer method, the Mann–Kendall test (Mann, 1945; Kendall and Charles, 1975) and the Yamamoto method (Yamamoto et al., 1985), all depend strongly on the length of the subseries, namely the analysed timescale (He et al., 2008). However, the time-instant of an abrupt dynamic change has no fixed relation to any specific timescale. Therefore, traditional methods cannot effectively detect abrupt dynamic changes. To deal with this problem, moving detrended fluctuation analysis (MDFA) was proposed (He et al., 2008). Tests on different model time series indicate that MDFA can be used to detect abrupt dynamic changes effectively, but MDFA is not accurate for time series displaying positive correlations or anti-correlations. Many observational data, such as precipitation records, daily temperature fluctuations, return times of extreme events, heartbeat intervals, and so on (Cao, 1993; Liu et al., 2000; Bunde et al., 2000; Peter and Rudolf, 2000; Maraun et al., 2004; Bunde et al., 2005; Livina et al., 2005; Abaimov et al., 2007; Cao, 2007; Jan et al., 2007; Blender et al., 2008), exhibit complex behaviour characterised by long-range power-law correlations. Besides, anti-persistence in the global temperature anomaly field has been discovered (Carvalho et al., 2007). Therefore, it is very important to study methods that can detect dynamic changes in correlated series.

To deal with the problem mentioned above, a new method is proposed in this paper: moving cut data-detrended fluctuation analysis (MC-DFA). It is based on the observation that cutting a segment from a correlated signal generated by a dynamic equation does not affect the scaling behaviour of the original signal, even when up to 50% of the signal is cut. The tests indicate that MC-DFA can be used to detect abrupt dynamic changes in correlated time series, and can make up for the deficiencies of the traditional methods.

This paper is organised as follows: Section 2 briefly outlines the DFA algorithm and the MDFA method, and presents the MC-DFA method in detail. In Section 3, the influences of 'cutting data' on the DFA method are investigated. In Section 4, the performance of the MC-DFA method is tested, including comparisons with the MDFA method, approximate entropy (ApEn) and other traditional methods. Moreover, the effects of strong noise on MC-DFA are investigated. Section 5 summarises the main results and conclusions of this paper with a discussion.

* Correspondence to: Guolin Feng, National Climate Center, China Meteorological Administration, Beijing, China. E-mail: feng [email protected]
2. Definition of the MC-DFA

To illustrate the new method MC-DFA, we first review the DFA algorithm (Peng et al., 1993, 1994). Consider a time series x(i) (i = 1, 2, ..., N). Firstly, we integrate the time series x(i),

$$y(k) = \sum_{i=1}^{k} \left[ x(i) - \langle x \rangle \right], \quad k = 1, 2, \ldots, N \tag{1}$$

where

$$\langle x \rangle = \frac{1}{N} \sum_{i=1}^{N} x(i) \tag{2}$$

stands for the average of x(i). Next, the integrated time series is divided into non-overlapping boxes of equal length n. In each box of length n, we fit the integrated time series with a polynomial function y_n(k), which is called the local trend. For DFA of order l (DFA1 if l = 1, DFA2 if l = 2, etc.), a polynomial of order l is used for the fit. In the third step, we detrend the integrated time series y(k) by subtracting the local trend y_n(k) in each box, and the root-mean-square fluctuation of this integrated and detrended time series is calculated by

$$F(n) = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left[ y(k) - y_n(k) \right]^2}. \tag{3}$$

This computation is repeated over all time scales (box sizes n) to characterise the relationship between F(n), the average fluctuation, and the box size n. Typically, F(n) increases with box size. A linear relationship on a log–log plot indicates the presence of a power law. Under such conditions, the fluctuations can be characterised by a scaling exponent p, the slope of the line relating log F(n) to log n. If p = 0.5, there is no correlation and the time series behaves as a random series (white noise); 0 < p < 0.5 indicates anti-correlations, and 0.5 < p < 1.0 indicates long-range correlations.

DFA was developed as a scaling analysis method to investigate the long-range fluctuation correlation in a given time interval, where it is typically assumed that the type of correlation is unknown. The DFA method is not an appropriate tool for detecting abrupt changes occurring at a specific time, although the crossover behaviour can provide some information on the types of correlation contributing to a signal composed of segments with different statistical properties.
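The three DFA steps in Equations (1)–(3) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names `dfa_fluctuation` and `scaling_exponent`, the choice of box sizes, and the decision to drop the tail points that do not fill a complete box are all assumptions.

```python
import numpy as np

def dfa_fluctuation(x, box_sizes, order=1):
    """Return F(n) for each box size n, following Equations (1)-(3)."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())              # Eq. (1): integrated series
    fluct = []
    for n in box_sizes:
        n_boxes = len(y) // n                # assumption: incomplete tail box is dropped
        k = np.arange(n)
        sq_sum = 0.0
        for b in range(n_boxes):
            seg = y[b * n:(b + 1) * n]
            coeffs = np.polyfit(k, seg, order)   # local trend y_n(k) of the given order
            sq_sum += np.sum((seg - np.polyval(coeffs, k)) ** 2)
        fluct.append(np.sqrt(sq_sum / (n_boxes * n)))   # Eq. (3)
    return np.array(fluct)

def scaling_exponent(x, box_sizes, order=1):
    """Slope p of log10 F(n) against log10 n."""
    f = dfa_fluctuation(x, box_sizes, order)
    slope, _intercept = np.polyfit(np.log10(box_sizes), np.log10(f), 1)
    return slope
```

With `order=1` this corresponds to DFA1 and with `order=2` to DFA2. For an uncorrelated random series the returned slope should come out near 0.5, in line with the interpretation of p given above.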
However, as all points of the complete data set (from the beginning to the end of the record) are taken into account in every calculation step, there is no progressive time axis and, consequently, it is impossible to localise abrupt changes accurately in time (Chen et al., 2002; Staudacher et al., 2005).

The scaling characteristic of a correlated time series generated by one dynamic system does not change in a statistically significant way when a segment is randomly cut from the signal. But if there is an abrupt change of the dynamic equation at a specific time, the scaling exponents of the correlated time series generated by the equation will change sharply. On the basis of these characteristics, we present a novel measure, MC-DFA, for the detection of abrupt dynamic change in a correlated time series. The MC-DFA algorithm is as follows.

Step 1. Choose a window size M.
Step 2. Cut the M data from the ith to the (i + M − 1)th (i = 1, 1 + M, ..., 1 + (n − 1)M, where n = [N/M], N is the total number of data and [ ] denotes taking the integer part; for example, [1000/30] = 33), and stitch the remaining parts together to obtain a new time series.
Step 3. Calculate the scaling exponent p_i of the new time series (containing N − M data) by DFA.
Step 4. Move the window of fixed size M along the original series, and repeat steps 2 and 3 until the end of the original series.
Step 5. Obtain the scaling exponent series {p_i, i = 1, 2, ..., n}.
Step 6. Calculate the variance contributions of the scaling exponent series from step 5, and then obtain the time-instant of abrupt dynamic change.

For comparison with the MC-DFA method, let us briefly outline the MDFA method. For a time series, first select a subseries and calculate its scaling exponent by DFA. Next, move the subseries progressively while keeping its length unchanged, and repeat the computation until the end of the analysed time series. If there is no abrupt dynamic change in the time series, the scaling exponent from MDFA changes only slightly, mainly because of the limited sample size. If there is an abrupt dynamic change, the scaling exponent changes very sharply in the vicinity of the abrupt change points. On this basis, abrupt change in a time series can be detected by MDFA.
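Steps 1 to 5 of the MC-DFA procedure can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: `dfa_exponent` is a compact DFA helper written for this example, the default box sizes are hypothetical, and Step 6 (the variance analysis) is not included here.

```python
import numpy as np

def dfa_exponent(x, box_sizes=(6, 10, 16, 25, 40, 63, 100), order=1):
    """Compact DFA: slope of log10 F(n) versus log10 n (box sizes are an assumption)."""
    y = np.cumsum(x - np.mean(x))                      # integrated series
    log_f = []
    for n in box_sizes:
        m = len(y) // n                                # number of complete boxes
        boxes = y[:m * n].reshape(m, n).T              # shape (n, m): one column per box
        k = np.arange(n)
        coeffs = np.polyfit(k, boxes, order)           # fit every box in one call
        trend = np.vander(k, order + 1) @ coeffs       # local trends, shape (n, m)
        log_f.append(0.5 * np.log10(np.mean((boxes - trend) ** 2)))
    return np.polyfit(np.log10(box_sizes), log_f, 1)[0]

def mc_dfa(x, window, box_sizes=(6, 10, 16, 25, 40, 63, 100), order=1):
    """Return the scaling exponent series {p_i}, one value per removed window."""
    x = np.asarray(x, dtype=float)
    n_windows = len(x) // window                       # n = [N / M]
    exponents = np.empty(n_windows)
    for j in range(n_windows):
        start = j * window                             # 0-based start of the cut window
        remaining = np.concatenate((x[:start], x[start + window:]))  # cut and stitch
        exponents[j] = dfa_exponent(remaining, box_sizes, order)
    return exponents
```

For a series of length 1000 and `window=30`, the call `mc_dfa(series, 30)` returns 33 exponents, matching the example [1000/30] = 33 in Step 2; the jth value (counting from zero) belongs to the window covering data 30j + 1 to 30(j + 1).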
In this paper, the correlated model time series are generated by the Logistic map, and each series has a length of 1000.

3. The effects of cutting data from correlated series on DFA

In this section, we study the effects of the nonstationarity caused by removing segments of a given length from a correlated time series and stitching together the remaining parts (a 'cutting' procedure). To address this question, a stationary correlated time series x1(i) (i = 1, 2, ..., 1000) was generated by the Logistic map (May, 1976). Mathematically, the Logistic map is written as

$$x_{n+1} = u x_n (1 - v x_n), \quad x \in [0, 1], \tag{4}$$

where x_n is a number between 0 and 1 representing the population at year n, and x_0 represents the initial population (at year 0). The parameter u is a positive number varying from 0 to 4, and represents a combined rate for reproduction and starvation. The onset of chaos is at approximately u = 3.57. Slight variations in the initial population yield dramatically different results over time, a prime characteristic of chaos. Here x_0 = 0.8, u = 3.8 and v = 1.0. Unless otherwise stated, the parameters of Equation (4) remain unchanged in the text below.

The time series {x1(i), i = 1, 2, ..., 1000} generated by Equation (4) has a scaling exponent of about 0.16 by DFA2, which means that the model time series possesses an anti-correlation characteristic. Next, we consecutively removed some of the data from n = 101 to n = 600; the removed parts are indicated by the marked segment (Figure 1(a)). Finally, we stitched together the remaining parts of the time series, thus obtaining a new time series. With 10% of the points removed, Figure 1(b) shows that the scaling exponent of the anti-correlated time series is almost unaffected by the cutting procedure. Surprisingly, this conclusion holds independently of the length of the cut data, even when up to 50% of the points are removed from the time series. Moreover, we consider the case of a positively correlated time series {x2(i), i = 1, 2, ..., 1000} with a scaling exponent p = 0.75, generated by the modified Fourier filtering method (Makse et al., 1996), and find that its scaling exponent is not affected by the cutting procedure when some of the data from n = 201 to n = 700 (Figure 1(c)) are removed consecutively. Tests on other correlated series give almost identical conclusions to those for the correlated signals {x1(i)} and {x2(i)}, which indicates that cutting procedures do not affect the DFA results of correlated time series generated by a single dynamic equation. These results are consistent with previous studies (Chen et al., 2002).

Figure 1. Effects of the 'cutting' procedure on the scaling behaviour of correlated time series. (a) An anti-correlated signal with the scaling exponent p = 0.16. The removed parts are indicated by the marked segment and the remaining parts are stitched together. (b) Scaling behaviours of the remaining parts after the cutting procedure in the signal with p = 0.16 (0%, 10%, 20%, 30% and 50% removed). (c) A positively correlated signal with p = 0.75. (d) Same as (b), but for p = 0.75.

To investigate the performance of the DFA method in detecting abrupt dynamic change in correlated time series, we first generate two model time series without abrupt dynamic change (Figure 2(a) and (d)) by Equation (4), and then assume two cases. In the first case, the evolution of the population no longer follows Equation (4) at a specific time but displays random behaviour because of an abrupt disaster; after a period of recovery, it gradually follows Equation (4) again. In the other case, the dynamic equation governing the population changes abruptly from Equation (5) to Equation (6) over a specific time period. Equations (5) and (6) can be written as follows:

$$x_{n+1} = 3.7 x_n (1 - x_n), \quad x \in [0, 1], \tag{5}$$

$$x_{n+1} = 3.7 x_n (1 - 0.85 x_n), \quad x \in [0, 1]. \tag{6}$$

The only difference between Equations (5) and (6) is that the parameter v changes from 1.0 to 0.85 for some unknown reason. On the basis of these hypotheses, we obtain two model time series with abrupt dynamic change as follows. The data from n = 601 to 630 shown in Figure 2(a), generated by Equation (4), are substituted by a random series (Figure 2(b)); thus, the new model time series IS1 is formed (Figure 2(c)). Similarly, we obtain the model time series IS2 (Figure 2(f)) by replacing the data generated by Equation (5) from n = 601 to 630 in Figure 2(d) with the time series generated by Equation (6) (Figure 2(e)).

Figure 2. Model time series for abrupt dynamic change. (a) Signal generated by the Logistic model for u = 3.8 with p = 0.19, namely Equation (4); (b) random Gaussian noise signal of 30 data; (c) a new time series IS1, in which the data from n = 601 to n = 630 in (a) are substituted by the random series in (b). (d) Signal produced by the Logistic model for u = 3.7 with p = 0.15, namely Equation (5); (e) signal of 30 data generated by Equation (6); (f) model time series IS2, in which the data from n = 601 to n = 630 in (d) are substituted by the signal in (e).

The DFA results for the two signals without abrupt dynamic change and the two signals with abrupt dynamic change are shown in Figure 3. Notice that the time series without abrupt dynamic change show almost perfect power-law scaling over the analysed time scales. However, at the larger time scales there is an apparent crossover in the scaling behaviour of the data sets with abrupt dynamic change (arrowed in Figure 3). This is caused by the abrupt dynamic change in the signals. The results are consistent with the previous finding that the crossover behaviour in DFA can provide some information on the types of correlation contributing to a signal composed of segments with different statistical properties (Chen et al., 2002). As mentioned in Section 2, as all points of the complete data set are taken into account in every calculation step, there is no progressive time axis and, consequently, it is impossible to localise abrupt changes accurately in time by the DFA method.
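The construction of the model series IS1 and IS2 can be sketched as below. Several details are assumptions, because the text does not specify them: the series is taken to start at x_0 itself, the Gaussian noise for IS1 is given a hypothetical mean of 0.5 and standard deviation of 0.2, and the Equation (6) segment for IS2 is assumed to continue from the state at n = 600. The helper names are illustrative only.

```python
import numpy as np

def iterate_map(x_start, steps, u, v):
    """Apply x_{n+1} = u * x_n * (1 - v * x_n) `steps` times, returning the iterates."""
    out = np.empty(steps)
    x = x_start
    for i in range(steps):
        x = u * x * (1.0 - v * x)
        out[i] = x
    return out

def logistic_series(length, u, v=1.0, x0=0.8):
    """Series of the given length; assumption: the first value is x0 itself."""
    return np.concatenate(([x0], iterate_map(x0, length - 1, u, v)))

def substitute(series, segment, start):
    """Return a copy of `series` with `segment` written in from 0-based index `start`."""
    out = series.copy()
    out[start:start + len(segment)] = segment
    return out

rng = np.random.default_rng(0)

# IS1: Equation (4) with u = 3.8, data n = 601..630 replaced by Gaussian noise.
base1 = logistic_series(1000, u=3.8)
noise = rng.normal(0.5, 0.2, size=30)        # hypothetical noise parameters
is1 = substitute(base1, noise, start=600)    # 0-based 600 is n = 601

# IS2: Equation (5) with u = 3.7, data n = 601..630 generated by Equation (6).
base2 = logistic_series(1000, u=3.7)
segment6 = iterate_map(base2[599], 30, u=3.7, v=0.85)   # assumed to continue from n = 600
is2 = substitute(base2, segment6, start=600)
```

Outside n = 601 to 630 both `is1` and `is2` are identical to their base series, so any change in the scaling exponent comes from the 30 substituted points alone.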
Figure 3. Plot of log10 F(n) versus log10 n (see the description of the DFA algorithm in Section 2) for four different model time series: two correlated time series without abrupt dynamic change, generated by the Logistic map for u = 3.8 (p = 0.19) and u = 3.7 (p = 0.15), respectively, and two time series, IS1 and IS2, with abrupt dynamic change.
It is worth noting that the scaling exponent is larger for a larger parameter u in the Logistic map: p = 0.19 for u = 3.8 and p = 0.15 for u = 3.7 (Figure 3). This shows that different dynamic equations have different scaling exponents, which can be used to identify abrupt dynamic change. In the following, we show some examples that serve as test scenarios for detecting abrupt dynamic changes in correlated time series.
4. Abrupt dynamic change detection in correlated time series

4.1. Performance tests of MC-DFA on model time series

Figure 4 shows the MC-DFA results for the model time series IS1 with different window sizes. Figure 4(a) shows that the evolutionary curve of the scaling exponents consists of three segments: in the first and the third, the scaling exponents are clearly larger than in the second (n ∈ [616, 630]). As the cutting procedure cannot significantly affect the scaling exponent even when up to 50% of the points are removed from a correlated time series, the relatively larger scaling exponents in the first and third segments in Figure 4(a) arise because there is an abrupt dynamic change at n = 601, where the evolution of the population changes abruptly from the deterministic Equation (4) to a stochastic one. When the window size M is 10, 15 or 30, the results are similar to those for M = 5 (Figure 4). Furthermore, as the window size increases, the scaling exponents in the second segment come closer to the real one.

As in He et al. (2008), in order to identify the time-instants of abrupt changes quantitatively, the variances of the scaling exponents are analysed on the basis of the MC-DFA results. In this way, the effect of the cut data on the calculated scaling exponents can be estimated. If the variance contribution at a certain time is clearly larger than at other times, the cut data possess dynamic characteristics different from those of the other data, so the abrupt dynamic change in the original time series can be identified. In this paper, the variance threshold is defined as three times the average variance of the scaling exponents. If the variance contributions exceed the threshold, there is an abrupt dynamic change. Figure 5 shows the variance analysis of the scaling exponents shown in Figure 4.
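The variance analysis can be sketched as follows. The exact formula for the variance contribution is not given in the text, so this sketch assumes that the contribution of window i is its squared deviation from the mean exponent divided by the sum of squared deviations, and that the threshold is three times the mean contribution; the function names are hypothetical.

```python
import numpy as np

def variance_contributions(exponents):
    """Assumed definition: normalised squared deviation of each p_i from the mean."""
    p = np.asarray(exponents, dtype=float)
    dev2 = (p - p.mean()) ** 2
    total = dev2.sum()
    return dev2 / total if total > 0 else np.zeros_like(dev2)

def flag_windows(exponents, window, factor=3.0):
    """Return 1-based (first, last) data indices of windows above the threshold."""
    contrib = variance_contributions(exponents)
    threshold = factor * contrib.mean()          # three times the average contribution
    flagged = np.flatnonzero(contrib > threshold)
    regions = [(int(j) * window + 1, (int(j) + 1) * window) for j in flagged]
    return regions, threshold

# Toy exponent series: 33 windows of size M = 30, one of which stands out.
toy = np.full(33, 0.36)
toy[20] = 0.25
regions, threshold = flag_windows(toy, window=30)
print(regions)   # [(601, 630)]
```

In the toy series only the window covering n = 601 to 630 has a different exponent, so it carries almost all of the variance and is the only region flagged. Under the assumed definition the threshold equals 3 divided by the number of windows.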
It is easy to see in Figure 5(a) that the variance contributions in the segment n ∈ [616, 630] are far greater than those in other regions, where the variance contributions are almost zero, and that they clearly exceed the defined threshold. According to the definition of the variance analysis, there are abrupt dynamic changes in this region. When the window size M is 10, 15 and 30, the corresponding detected regions of abrupt dynamic change are n ∈ [611, 630], n ∈ [616, 630] and n ∈ [601, 630], respectively. These detected regions lie within the real abrupt change region, namely n ∈ [601, 630]. For small window sizes, however, the MC-DFA method cannot detect the whole abrupt change range in the model time series IS1. To clarify this, the time series shown in Figure 2(a) and (c) were compared. We find that the evolutionary trends of the two signals are almost completely consistent in the region n ∈ [601, 615], and that the differences between their values, namely the population sizes x(i) in Equation (4), are very small (figure omitted). Therefore, it is natural that MC-DFA cannot detect all of the abrupt dynamic change in IS1 for small window sizes.

Figure 4. The MC-DFA1 results for the model time series IS1. (a) The fixed window size of the removed segments is M = 5; (b) M = 10; (c) M = 15; (d) M = 30.

Figure 5. The variance contributions of the scaling exponents for IS1, shown with the threshold of the variance contribution. (a) M = 5; (b) M = 10; (c) M = 15; (d) M = 30.

Another example, IS2 (Figure 2(f)), contains a first abrupt change at n = 601, where the dynamic equation of the population changes abruptly from Equation (5) to Equation (6), and a second change-point at n = 631, where the dynamic equation changes from Equation (6) back to Equation (5). For this case of a small change of a parameter in the dynamic equation, the MC-DFA results for IS2 still consist of three segments, and the second segment overlaps almost completely with the abrupt dynamic change in IS2, whether the window size M is large (e.g. M = 30) or small (e.g. M = 10) (Figure 6). The detection results are much clearer in Figure 7, where the variance contributions in the segment n ∈ [601, 630] are clearly greater than those in other regions, in which the variance contributions are almost zero, and far greater than the threshold of the variance contributions. The results indicate that the data in the region n ∈ [601, 630] have a significant effect on the estimated scaling exponents of the model time series IS2. On this basis, we conclude that there are two abrupt dynamic change-points in IS2, namely n = 601 and n = 631.

Figure 6. The MC-DFA1 results for the model time series IS2. (a) The window size of the removed segments is M = 5; (b) M = 10; (c) M = 15; (d) M = 30.

4.2. Comparison with traditional methods
We have thoroughly tested the presented measure MC-DFA on various kinds of correlated model time series generated by the Logistic map, and obtained satisfactory results. Furthermore, in order to compare the performance of the present method with well-known methods, we have also applied the MDFA method, ApEn and other traditional methods, such as the moving t-test, the Cramer method, the Yamamoto method and the Mann–Kendall test, to the model time series.

We first use the MDFA method (He et al., 2008) to analyse the model time series IS1. The MDFA results for IS1 are shown in Figure 8. According to the differences between the scaling exponents, Figure 8(a) shows that the scaling exponents can be divided roughly into three segments: (1) (200, 620); (2) (621, 820); (3) (821, 1000). The scaling exponents in the second segment are larger than those in the first and third (Figure 8). Further, the variance contributions of the scaling exponents have been analysed (figures omitted), and the results indicate that the first time-instant of abrupt change is at n = 624 and the second is at n = 824. The second detected time-instant is clearly a false one. The reason is that the subseries contain part of the data in the abrupt dynamic change region n ∈ [601, 630] when they move into the vicinity of that region, and the DFA method is very sensitive to data from different dynamic equations, as can be seen in Figure 3. Similar results are obtained when the length of the subseries is L = 300 (Figure 8(b)). Moreover, the abrupt dynamic change in IS2 has been analysed by MDFA1, and the results are similar to those for IS1 (figures omitted). Therefore, MDFA cannot exactly detect the abrupt dynamic change in correlated time series, although the MDFA results can provide some information, such as a single time-instant of abrupt change.

Figure 7. The variance contributions of the scaling exponents for IS2, shown with the threshold of the variance contribution. (a) M = 5; (b) M = 10; (c) M = 15; (d) M = 30.

Figure 8. The MDFA1 results for IS1. (a) The length of subseries L = 200; (b) L = 300.

ApEn is a measure that quantifies system complexity (Pincus, 1991; Pincus and Goldberger, 1994); it reflects the likelihood that 'similar' patterns of observations will not be followed by additional 'similar' observations. A time series containing many repetitive patterns has a relatively small ApEn. A less predictable (i.e. more complex) process has a higher ApEn. In recent years, ApEn has been used to detect abrupt change in meteorological observation data (Wang and Zhang, 2008). The detection algorithm is similar to MDFA. For a time series, first select a subseries and calculate the ApEn of that subseries. Next, move the subseries progressively while keeping its length unchanged, and repeat the computation until the end of the analysed time series. Because ApEn reflects the complexity of a system, it will change sharply when there is an abrupt dynamic change in a time series at a specific time. Figure 9 shows the ApEn results for IS1. ApEn increases abruptly at about n = 601, whether the length of the subseries is 200 or 300. This is mainly because a subseries that contains part of the data in the segment n ∈ [601, 630] consists of both the random signal and the regular signal generated by the Logistic map, and the complexity of a subseries composed of segments with different dynamic properties is higher than that of a subseries containing only signals generated by a single dynamic equation. Like MDFA, ApEn cannot exactly detect all of the abrupt change regions.
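The moving ApEn procedure described above can be sketched as follows. The embedding dimension m = 2, the tolerance r = 0.2 times the standard deviation of each subseries, and the step between successive subseries are common choices but are assumptions here, since the text does not state the values used.

```python
import numpy as np

def approximate_entropy(x, m=2, r_factor=0.2):
    """ApEn(m, r) in the sense of Pincus (1991), with r = r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()

    def phi(dim):
        n_vec = len(x) - dim + 1
        emb = np.array([x[i:i + dim] for i in range(n_vec)])       # embedded vectors
        # Chebyshev distance between every pair of embedded vectors
        dist = np.max(np.abs(emb[:, None, :] - emb[None, :, :]), axis=2)
        counts = np.mean(dist <= r, axis=1)                        # self-matches included
        return np.mean(np.log(counts))

    return phi(m) - phi(m + 1)

def moving_apen(x, length=200, step=10, **kwargs):
    """ApEn of each subseries of fixed `length`, moved forward by `step` points."""
    x = np.asarray(x, dtype=float)
    ends = range(length, len(x) + 1, step)
    return np.array([approximate_entropy(x[e - length:e], **kwargs) for e in ends])
```

A strictly repetitive series gives an ApEn close to zero, while white noise gives a much larger value, which is the contrast the detection procedure relies on.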
Figure 9. The ApEn results for IS1. (a) The length of subseries L = 200; (b) L = 300.

The performance of other traditional methods for detecting abrupt change, including the moving t-test, the Cramer method and the Yamamoto method, has also been tested. The results for IS1 at a significance level of 0.01 are summarised in Table I, which indicates that these traditional methods cannot provide any information on the abrupt dynamic change. The results for IS2 are the same as those for IS1 (table and figures omitted). The reason is that these traditional methods are mainly used to detect an abrupt change on the basis of statistically significant differences in subseries means. Moreover, we also test the performance of the Mann–Kendall test on IS1; this test can usually only be used to detect whether a change of trend in a time series is statistically significant. Figure 10 shows the Mann–Kendall test results for IS1, and we find that the Mann–Kendall test cannot exactly detect the time-instant of the abrupt dynamic change in the model time series IS1, whether the sample size is 800 or 1000. So these four traditional methods are not suitable for detecting an abrupt dynamic change in a correlated time series, and the present method is a more effective measure for this purpose.

Table I. The number of abrupt changes in IS1 for different lengths of subseries by using traditional methods with a significance level of 0.01.

Traditional method     Length of subseries
                       50     100    200    300
Moving t-test          0      0      0      0
Cramer method          0      0      0      0
Yamamoto method        0      0      0      0

Figure 10. The detection results of the Mann–Kendall test (UF and UB curves) for IS1 with a significance level of 0.01 (dotted lines). (a) The sample size is L = 800; (b) L = 1000.

4.3. Influences of noise on MC-DFA

In observational data, noise is inevitable and may arise from external conditions. The existence of noise, especially strong noise, may influence the intrinsic dynamics of the system. In this case, it is important to distinguish the noise from the normal intrinsic fluctuations of the system, so it is necessary to analyse the influence of strong noise on MC-DFA. In this section, the effects of strong noise on MC-DFA are investigated. Figure 11 shows the variance analyses of the scaling exponents calculated by MC-DFA for the model time series IS1 when the signal-to-noise ratio (SNR) is 20 dB, which is very strong noise compared with IS1. Figure 11 shows that the noise causes some false abrupt change points to be detected, especially when the length of the subseries is relatively short. Further analyses show that the positions of these false abrupt changes shift as the length of the subseries varies. However, abrupt dynamic changes have no relation to specific timescales, so it is easy to identify these abrupt changes as false ones. A similar test on IS2 has been carried out, and the results are similar to those for IS1 (figures omitted). Although strong noise influences the MC-DFA results to some extent, the method can still exactly identify the main time-instants of abrupt dynamic changes.
5. Applications of MC-DFA in observational data

To test the practicability of MC-DFA, it is necessary to apply the method to observational data. Past studies (Bunde et al., 2000; Eichner et al., 2003; Fraedrich and Blender, 2003; Bunde et al., 2005; Cao, 2007; Carvalho et al., 2007) have demonstrated that the evolution of meteorological elements such as air pressure, temperature and rainfall displays long-range correlation. So, daily surface air pressure records are chosen here. These records are from four nearby meteorological stations in Heilongjiang province in China, namely Huma, Heihe, Beian and Hailun (Figure 12). Periodic seasonal trends in daily surface air pressure are so common that they are almost unavoidable, but periodic trends clearly affect the scaling behaviour of long-range correlated signals (Hu et al., 2001). Therefore, we need to eliminate the periodic trends to reduce their effects on MC-DFA. Consider a record PRi, where the index i counts the days in the record, i = 1, 2, ..., N. PRi represents the daily surface air pressure measured at a certain meteorological station.
[Figure 11: four panels, (a) M=5, (b) M=10, (c) M=15, (d) M=30; x-axis n (0 to 1000), y-axis variance contribution; each panel shows the variance contribution and the threshold of variance contribution.]
Figure 11. The variance contributions of scaling exponents for IS1 under SNR = 20 dB. (a) M = 5; (b) M = 10; (c) M = 15; (d) M = 30.
[Figure 12: map marking the Huma, Heihe, Beian and Hailun stations; horizontal axis ticks 122 to 134, vertical axis ticks 44 to 52.]
Figure 12. Locations of the Huma, Heihe, Beian and Hailun meteorological stations in Heilongjiang province, China.
To eliminate the periodic seasonal trends, the departures of $PR_i$ are studied here, $\Delta PR_i = PR_i - \langle P_i \rangle$, where $\langle P_i \rangle$ is the mean daily value for each calendar date $i$, for example 1 May, obtained by averaging the air pressure on 1 May over all years in the record. Subsequently, we use MC-DFA to analyse the departures $\Delta PR_i$, and the results are shown in Figure 13. The evolutionary trends of the scaling exponents are similar for the four meteorological stations, and all have a minimum in 1964. Thus, cutting the data of 1964 has the greatest influence on the scaling exponent of the entire record.
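The deseasonalising step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `calendar_day_departures`, the `(year, month, day)` tuple layout and the toy pressure values are all assumptions made for the example.

```python
from collections import defaultdict

def calendar_day_departures(dates, values):
    """Subtract the multi-year mean of each calendar date from every value.

    dates  : list of (year, month, day) tuples, one per observation (assumed layout)
    values : list of floats of the same length, e.g. daily surface air pressure
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (_, month, day), v in zip(dates, values):
        sums[(month, day)] += v
        counts[(month, day)] += 1
    # mean value for each calendar date, averaged over all years in the record
    climatology = {key: sums[key] / counts[key] for key in sums}
    return [v - climatology[(m, d)] for (_, m, d), v in zip(dates, values)]

# toy record (made-up numbers): two years, two calendar dates
dates = [(1960, 5, 1), (1960, 5, 2), (1961, 5, 1), (1961, 5, 2)]
pressure = [1000.0, 1004.0, 1002.0, 1008.0]
departures = calendar_day_departures(dates, pressure)
print(departures)  # [-1.0, -2.0, 1.0, 2.0]
```

In the toy record the mean for 1 May is 1001 and for 2 May is 1006, so each departure is the distance of that day's value from its own calendar-date mean.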
In order to detect exactly the time-instants of abrupt dynamic change in the air pressure records, the variance contributions of the scaling exponents obtained with MC-DFA2 are analysed for the four meteorological stations. The four evolutionary curves of the variance contributions are similar (Figure 14). The maximum values of the variance contribution are 44.27, 51.63, 71.36 and 45.02% for Huma, Heihe, Beian and Hailun, respectively, and all are far greater than the variance threshold (about 6.52%). Moreover, cutting the data of other years has almost no influence on the scaling exponents. So, in 1964, there is an abrupt dynamic change of surface air pressure at Huma, Heihe, Beian and Hailun. To ensure the reliability of this result, air pressure records from other meteorological stations in Northeast China and North China were also analysed, and the results are all similar to those in Figure 13. The abrupt change may therefore be attributable to a regional climate background rather than to a purely local climate phenomenon. Additional work is needed to clarify this, and it will be a topic of further research.
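The idea of cutting one block of data at a time and watching the scaling exponent can be sketched roughly as below. Several things here are assumptions rather than the paper's procedure: the sketch uses linear detrending (DFA1) instead of the DFA2 used for the air pressure records, the cut window moves in non-overlapping steps, and the variance contribution is taken to be each exponent's squared deviation from the mean divided by the sum of squared deviations, which is a guessed definition. The variance threshold is not reproduced. All function names and the synthetic test series are hypothetical.

```python
import math
import random

def _slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def dfa1_exponent(x, scales):
    """Scaling exponent from DFA with linear detrending (DFA1, an assumption)."""
    mean = sum(x) / len(x)
    profile, total = [], 0.0
    for v in x:                      # cumulative sum of departures from the mean
        total += v - mean
        profile.append(total)
    log_s, log_f = [], []
    for s in scales:
        t = list(range(s))
        sq_sum, n_win = 0.0, len(profile) // s
        for w in range(n_win):       # non-overlapping windows of length s
            seg = profile[w * s:(w + 1) * s]
            b = _slope(t, seg)                       # local linear trend
            a = sum(seg) / s - b * (s - 1) / 2.0
            sq_sum += sum((y - a - b * ti) ** 2 for ti, y in zip(t, seg))
        log_s.append(math.log(s))
        log_f.append(0.5 * math.log(sq_sum / (n_win * s)))  # log F(s)
    return _slope(log_s, log_f)      # slope of log F(s) against log s

def moving_cut_exponents(x, cut_len, scales):
    """Remove one block of cut_len points at a time and rerun DFA on the rest."""
    exponents = []
    for start in range(0, len(x) - cut_len + 1, cut_len):
        remaining = x[:start] + x[start + cut_len:]
        exponents.append(dfa1_exponent(remaining, scales))
    return exponents

def variance_contributions(exponents):
    """Assumed definition: squared deviation of each exponent over the total."""
    mean = sum(exponents) / len(exponents)
    sq = [(e - mean) ** 2 for e in exponents]
    total = sum(sq)
    return [v / total for v in sq] if total > 0 else [0.0] * len(sq)

# synthetic series: white noise with one strongly correlated block (points 400-599)
random.seed(0)
series = [random.gauss(0.0, 1.0) for _ in range(1200)]
level = 0.0
for i in range(400, 600):
    level = 0.95 * level + random.gauss(0.0, 1.0)
    series[i] = level

exponents = moving_cut_exponents(series, cut_len=200, scales=[8, 16, 32, 64])
contrib = variance_contributions(exponents)
print([round(c, 3) for c in contrib])
```

With this construction one would expect the cut that removes the altered block to change the exponent most, although that is not guaranteed for every random seed or choice of scales.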
[Figure 13: four panels, (a) Huma, (b) Heihe, (c) Beian, (d) Hailun; x-axis year (1960 to 2000), y-axis labelled p.]
Figure 13. The MC-DFA2 results for daily surface air pressure records; the length of the subseries is equal to 1 year. (a) Huma station; (b) Heihe station; (c) Beian station; (d) Hailun station.
[Figure 14: four panels, (a) Huma, (b) Heihe, (c) Beian, (d) Hailun; x-axis year (1960 to 2000), y-axis variance contribution.]
Figure 14. Variance contributions of scaling exponents of the MC-DFA2 results for daily surface air pressure records; the dashed line represents the threshold of variance contribution. (a) Huma; (b) Heihe; (c) Beian; (d) Hailun.
6. Conclusions
In summary, a novel measure, MC-DFA, for detecting abrupt dynamic change in correlated time series was proposed in this paper, and the capability of this new method to detect abrupt change time-instants in model time series has been demonstrated. Among recently developed methods, MDFA and ApEn can provide some information, such as a single time-instant of abrupt change, but they cannot exactly detect all of the abrupt change regions. The tests on traditional methods, such as the moving t-test, the Cramer method, the Mann–Kendall test and the Yamamoto method, indicate that these methods cannot provide any information on abrupt dynamic change in the model time series. The present method is therefore superior both to the recently developed methods (MDFA and ApEn) and to traditional methods such as the moving t-test. Moreover, although window size and strong noise have a small effect on the MC-DFA results, the method can still exactly identify the main time-instants of abrupt changes. Therefore, MC-DFA provides a reliable measure for detecting abrupt dynamic change in correlated time series, and makes up for the deficiencies of MDFA and ApEn. The applications to daily surface air pressure records further verify the validity of the present method. Many observational data, such as precipitation records, daily temperature fluctuations, return times of extreme events and heartbeat intervals, exhibit complex behaviour characterised by long-range power-law correlations. The new method is thus suitable for detecting abrupt dynamic change of correlated time series in various fields, and is expected to be widely applied in future.
Acknowledgments
The authors thank the anonymous reviewers and editors for their beneficial and helpful suggestions on this manuscript. This research was jointly supported by the National Natural Science Foundation of China (Grant Nos. 40905034, 40875040 and 40930952), and the Special Scientific Research Fund of Meteorological Public Welfare Profession of China (Grant Nos. GYHY201106015 and GYHY201106016).
References
Abaimov SG, Turcotte DL, Shcherbakov R, Rundle JB. 2007. Recurrence and interoccurrence behavior of self-organized complex phenomena.
Nonlinear Processes in Geophysics 14: 455–464.
Blender R, Fraedrich K, Sienz F. 2008. Extreme event return times in long-term memory processes near 1/f. Nonlinear Processes in Geophysics 15: 557–565.
Bunde A, Havlin S, Kantelhardt JW, Penzel T, Peter JH, Voigt K. 2000. Correlated and uncorrelated regions in heart-rate fluctuations during sleep. Physical Review Letters 85: 3736–3739.
Bunde A, Eichner JF, Kantelhardt JW, Havlin S. 2005. Long-term memory: a natural mechanism for the clustering of extreme events and anomalous residual times in climate records. Physical Review Letters 94: 048701(1–4).
Cao HX. 1993. Self-memorization equation in atmospheric motion. Science in China Series B 36: 845–855.
Cao HX. 2007. Characteristics of long-term climate change in Beijing with detrended fluctuation analysis. Chinese Journal of Geophysics 50: 420–424 (in Chinese).
Carvalho LMV, Tsonis AA, Jones C, Rocha HR, Polito PS. 2007. Anti-persistence in the global temperature anomaly field. Nonlinear Processes in Geophysics 14: 723–733.
Chen Z, Ivanov P, Hu K, Stanley HE. 2002. Effect of nonstationarities on detrended fluctuation analysis. Physical Review E 65: 041107(1–15).
Eichner JF, Koscielny-Bunde E, Bunde A, Havlin S, Schellnhuber HJ. 2003. Power-law persistence and trends in the atmosphere: a detailed study of long temperature records. Physical Review E 68: 046133(1–5).
Fraedrich K, Blender R. 2003. Scaling of atmosphere and ocean temperature correlations in observations and climate models. Physical Review Letters 90: 108501(1–4).
He WP, Feng GL, Wu Q, Wan SQ, Chou JF. 2008. A new method for abrupt change detection in dynamic structures. Nonlinear Processes in Geophysics 15: 601–606.
Hu K, Ivanov P, Chen Z, Carpena P, Stanley HE. 2001. Effect of trends on detrended fluctuation analysis. Physical Review E 64: 011114(1–19).
Jan FE, Jan WK, Bunde A, Havlin S. 2007. Statistics of return intervals in long-term correlated records. Physical Review E 75: 011128(1–9).
Kendall MG, Charles G. 1975. Rank Correlation Methods. Oxford University Press: New York, 202.
Liu SD, Rong PP, Chen Q. 2000. The hierarchical structure of climate series. Acta Meteorologica Sinica 58: 110–114 (in Chinese).
Livina VN, Havlin S, Bunde A. 2005. Memory in the occurrence of earthquakes. Physical Review Letters 95: 208501(1–4).
Makse HA, Havlin S, Schwartz M, Stanley HE. 1996. Method for generating long-range correlation for large systems. Physical Review E 53: 5445–5449.
Mann HB. 1945. Non-parametric tests against trend. Econometrica 13: 245–259.
Maraun D, Rust HW, Timmer J. 2004. Tempting long-memory on the interpretation of DFA results. Nonlinear Processes in Geophysics 11: 495–503.
May R. 1976. Simple mathematical models with very complicated dynamics. Nature 261: 459–467.
Peng CK, Mietus J, Hausdorff JM, Havlin S, Stanley HE, Goldberger AL. 1993. Long-range anticorrelations and non-Gaussian behavior of the heartbeat. Physical Review Letters 70: 1343–1346.
Peng CK, Buldyrev SV, Havlin S, Simons M, Stanley HE, Goldberger AL. 1994. Mosaic organization of DNA nucleotides. Physical Review E 49: 1685–1689.
Peter T, Rudolf OW. 2000. Power spectrum and detrended fluctuation analysis: application to daily temperatures. Physical Review E 62: 150–160.
Pincus SM. 1991. Approximate entropy as a measure of system complexity. Proceedings of the National Academy of Sciences of the United States of America 88: 2297–2301.
Pincus SM, Goldberger AL. 1994. Physiological time-series analysis: What does regularity quantify? American Journal of Physiology-Heart and Circulatory Physiology 266: H1643–H1656.
Wang QG, Zhang ZP. 2008. The research of detecting abrupt climate change with approximate entropy. Acta Physica Sinica 57: 1976–1983 (in Chinese).
Yamamoto RT, Iwashima T, Sanga NK. 1985. Climatic change: a hypothesis in climate diagnosis. Journal of the Meteorological Society of Japan 63: 1157–1160.