
Chaos theory in predicting surge water levels in the North Sea

D. P. Solomatine International Institute for Infrastructural, Hydraulic and Environmental Engineering, Delft, The Netherlands. Email: [email protected]

C. J. Rojas International Institute for Infrastructural, Hydraulic and Environmental Engineering, Delft, The Netherlands

S. Velickov International Institute for Infrastructural, Hydraulic and Environmental Engineering, Delft, The Netherlands

J. C. Wüst North Sea Directorate, P. O. Box 5807, 2280 HV Rijswijk, The Netherlands

ABSTRACT: The problem of predicting surge water levels is important for ship guidance and navigation. Data collected in the coastal waters of the Netherlands (Hook of Holland) are analysed with the objective of making such predictions. It was found that the correlation between data on surge, temperature, air pressure and wind is not sufficient to rely only on input-output (connectionist) models such as neural networks. It appeared that the surge time series in itself carries enough information to make predictions. The linear prediction methods applied, including autocorrelation and ARIMA models, could not provide sufficient accuracy. Features of chaotic behaviour were identified in the surge, and methods of chaos theory were applied. The predictions are quite accurate (RMSE is 3.6 cm for 1 hour and 6.1 cm for 3 hours). Possible techniques for increasing the prediction accuracy and horizon (wavelet analysis, data mining techniques) were also identified.

1 INTRODUCTION

As an opening to his lecture, Edward Lorenz, Emeritus Professor of Meteorology at MIT, holds a piece of paper above the stage and gently lets it go, watching it float leisurely down to the ground. Lorenz repeats the experiment, starting from seemingly the same place, and the paper falls again, landing in a different place on the ground. This serves as an illustration of the theory of chaos.

Proc. 4-th International Conference on Hydroinformatics, Iowa, USA, July 2000.

Chaotic behaviour (high sensitivity to initial conditions) of many systems had been observed by researchers for a number of decades, but was first described as such by Lorenz (1963). In 1961 he discovered a manifestation of chaotic behaviour while working with computer models of weather prediction. It appeared that the model he was using was extremely sensitive to a small change in one of the input values: a change from 0.506127 to 0.506 led to a gradual deviation of the original sequence of output to a very different one. This sensitive dependence on initial conditions is characteristic of chaotic systems. Such a small difference in a measurement might be considered experimental noise, background noise, or an inaccuracy of the equipment.

Since natural systems characterised by water level variability cannot be "restarted" with slightly different initial conditions (consider flood levels or levels in coastal waters), it is reasonable to follow a more practical definition: in a chaotic system, close state space trajectories diverge and never close on themselves. Chaos comprises a class of signals intermediate between regular sinusoidal or quasiperiodic motions and unpredictable, truly stochastic behaviour. Chaotic systems are treated as "slightly predictable" and are normally studied in the framework of non-linear system dynamics. With conventional linear tools such as the Fourier transform, chaos looks like "noise", but chaos has structure that can be seen in the phase (state) space. The main reason for applying chaos theory is the existence of methods that make it possible to predict the future positions of the system in the state space. In this paper we base our considerations on the work of Abarbanel (1996) and Tsonis (1992).

During the past two decades, the theory of chaos has shown its applicability in solving a wide class of problems in many areas of the natural sciences. The discovery that very simple deterministic systems can produce seemingly irregular time series pushed researchers to try to identify such systems and to apply chaos theory in order to predict their behaviour. However, chaotic signal analysis is still a novel approach in many areas related to civil engineering, and to water-related problems in particular. Chaotic behaviour in various hydrological time series and water level data has been analysed for a number of years and reported by e.g. Hense (1987), Jayawardena and Lai (1994), Sivakumar et al. (1999a, 1999b) and Rahman (1999).
Applications of nonlinear dynamic analysis (chaos theory) to coastal waters, and its comparison with other methods, are reported by Frison et al. (1999) and Zaldivar et al. (2000). Often physical reasoning does not easily explain why physical systems behave chaotically. However, regardless of why chaotic behaviour occurs, chaotic signal processing can provide a framework for describing non-linear system behaviour. Coastal ocean water levels are good candidates for chaotic signal processing because the governing Navier-Stokes equations are inherently non-linear, and the observed broadband and continuous Fourier spectra are indicators of chaos. (In principle, because superposition does not hold for non-linear systems, it can be inappropriate to decompose the signals into separate elements, as is done when astronomical tides are removed. However, this is often done implicitly, since the assumption is that the tidal component can be described deterministically and non-linearity is observed on top of it.)

2 DESCRIPTION OF THE CASE STUDY

The present paper covers part of an effort aimed at predicting the surge water levels 1 to 6 hours ahead in the coastal waters of the Netherlands at Hook of Holland (at the entrance to the port of Rotterdam). Such predictions play an important role in decisions on allowing ships to enter the estuary. This work is based on hydrometeorological data collected at several measurement locations along the North Sea. The present study deals with data collected at the Hook of Holland station in the hydrological year 1994-1995. The sampling time is 10 minutes and the available parameters are: astronomical water level, surge water level (measured water level minus astronomical water level), wind speed, wind direction and air pressure. Besides observations, this data set contains analysed wind and air pressure averages representing 6 areas covering the region between the northern part of the North Sea and the English Channel.


The initial experiments performed with artificial neural networks (ANNs) showed their applicability, but the error observed so far (approx. 6.5 cm for 1 hour prediction) was too high, and a considerable phase shift was observed. (In the presented case study the correlation between parameters was found to be too low to allow for an immediate application of connectionist models like ANNs, so the authors decided not to report the intermediate results with ANNs here and to continue the experiments.) The assumption for the subsequent experiments reported below was that the surge time series carries enough information to make predictions. The other data are currently being used in research on other data mining techniques (neural networks, support vector machines, clustering and association rules).

Figure 1 shows surge water levels for the Hook of Holland station in the hydrological year 1994-1995. These data exhibit predominantly high-frequency fluctuations. However, embedded in these fluctuations there are some tendencies that suggest the presence of periodic components. Three techniques were applied for the identification of either the periodic (deterministic) or the stochastic components of the time series: autocorrelation analysis, spectral analysis and chaos theory analysis.

Figure 1: Time series used. Surge water levels (hydrological year 1994/1995), station Hoek van Holland, sampling time 10 minutes; surge water level [cm] against time, October to September.

3 APPLICATION OF LINEAR METHODS OF PREDICTION

The autocorrelation function of the surge water level decreases practically linearly, being 0.802 at a time lag of 36 x 10 min = 6 hours and 0.4 at 108 x 10 min = 18 hours, which suggests that there is serial dependency in this series. Since autocorrelations for consecutive lags are formally dependent, the autocorrelation of the differenced series was also calculated (the time series was differenced with a time lag of 1), giving a correlation coefficient after lag 36 (6 hours) equal to a very low value of 0.072. Some periodic components were identified after 6 and 12 hours, with correlation coefficients of 0.088 and 0.231 respectively. This behaviour can also be confirmed by examining the partial autocorrelation functions.

Spectral analysis of the surge water level shows large spectral density values at the beginning, with sine and cosine components (Figure 2). Analysis of stationarity was not done. The spikes around 14 and 28 days can be explained by the tidal effects, which were not removed at the data preparation stage. There are also spikes at around 6 months and 1 year, which indicate half-yearly and yearly periodicities; these need to be examined in more detail using the whole period of observations (6 years). It is interesting to note that the spectral density function shows small but very clear periodic components of 3 and 6 hours on a small scale. A broad spectrum may indicate chaotic behaviour. It was concluded that a more detailed spectral analysis has to be performed using more sophisticated methods, such as wavelet analysis.
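As a minimal sketch of the autocorrelation analysis of the surge record, the following Python fragment computes the sample autocorrelation of a series and of its first difference. The function name `autocorrelation` and the synthetic series are illustrative assumptions; the series is only a stand-in for a 10-minute record and is not the Hook of Holland data, so the printed values will not match those quoted in the text.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Synthetic stand-in for a 10-minute surge record (an assumption, not real data):
# a slowly wandering component plus a 12-hour oscillation (72 samples).
rng = np.random.default_rng(0)
n = 5000
t = np.arange(n)
series = np.cumsum(rng.normal(scale=0.5, size=n)) + 10.0 * np.sin(2 * np.pi * t / 72)

acf_raw = autocorrelation(series, 108)            # original series
acf_diff = autocorrelation(np.diff(series), 108)  # series differenced with lag 1

for lag in (36, 72, 108):                         # 6, 12 and 18 hours
    print(f"lag {lag:3d}: raw {acf_raw[lag]: .3f}, differenced {acf_diff[lag]: .3f}")
```

Differencing removes most of the slowly varying part of a series, which is why the autocorrelation of the differenced series can be much lower than that of the original one.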



Figure 2: Spectral density function of surge water level

Analysis of cross-correlations between all measured parameters showed that the East-West component of the wind has the largest negative correlation with the surge (-0.509), while the wind speed shows the largest positive correlation (0.418). These values are too low to be used for prediction.

An ARIMA (autoregressive integrated moving average) model was built as well; see Box and Jenkins (1976). Various ARIMA parameters were tried, and it was found that for the whole data set the ARIMA(1,1,1) model gives better results than the others; still, its RMS prediction error for 2 hours ahead reaches 20 cm. Experiments were also performed with ARIMA models applied only to some periods where the process was found to be stationary. ARIMA(4,1,0) performed reasonably well (RMSE in the order of several cm) only for 0.5 hour prediction. Overall, the accuracy of the ARIMA predictions was considered unacceptable. The general reason is that the autocorrelation function of the surge time series, which is based on linear regression, does not clearly represent the amount of information that the system carries from the past. An approach in which the coefficients of the AR part of the ARIMA model were updated in time was also tried. The results were better but still not satisfactory. The obvious next step was to use non-linear prediction methods.

4 BASICS OF CHAOS

One of the important foundations behind the methods of non-linear signal processing and chaos theory is the embedding theorem (Takens, 1981). It shows that a single measured variable x(n) = x(t0 + nτ), with t0 some starting time and τ the sampling time, together with its time delays, provides an N-dimensional space that is a proxy for the full multivariate state space of the observed system. The N-dimensional state vectors x(t) are then defined as:

$$\mathbf{x}(t) = [x(t), x(t-\tau), \ldots, x(t-(N-1)\tau)]$$

where x(t) is the value of the time series at time t, τ is a suitable time delay (expressed in sampling time steps) and N is the embedding dimension. This vector fully represents the non-linear dynamics when N is large enough. The embedding theorem guarantees that full knowledge of the behaviour of a system is contained in the time series of any one measurement, and that a proxy for the full multivariate phase space can be constructed from the single time series. To perform the state space reconstruction, a time delay τ and an embedding dimension N are needed. There are various methods to estimate them (including empirical methods based on global minimisation of an error, like neural networks and genetic algorithms); in this work we used widely used analytical methods.

Estimation of time delay τ. For finding τ we used the average mutual information (AMI) function, and for finding N, the method of false nearest neighbours. The time delay τ must be large enough that each component of the vector carries independent information about the system. However, τ must not be so large that the components of the vectors x(t) are completely independent of each other. Conversely, if the time delay is too short, the vector components will not be independent enough and will not contain any new information. A possible rule for a good time delay τ is to use the first minimum of the AMI (Fraser and Swinney, 1986). Average mutual information is derived from notions of entropy in communication systems (Shannon and Weaver, 1949). It determines how much information the measurements x(t) at some time carry about the measurements x(t+τ) at some other time. The basic philosophy behind the definition of mutual information is that, given a sequence of measurements, the probabilities $P_s(s_i)$ that a measurement of the state s yields $s_i$ are calculated, and the information entropy is then defined as:


$$H(s) = -\sum_i P_s(s_i) \log P_s(s_i)$$

Defining the conditional entropy, we can investigate how x(t+τ) depends on x(t) as a function of τ:

$$H(q \mid s_i) = -\sum_j \frac{P_{sq}(s_i, q_j)}{P_s(s_i)} \log \frac{P_{sq}(s_i, q_j)}{P_s(s_i)}$$

In the equation above, $P_{sq}(s_i, q_j)$ is the probability that measurements of s and q yield $s_i$ and $q_j$. The mutual information is then defined as the amount by which a measurement of s = $s_i$ reduces the uncertainty of q:

$$I(q, s_i) = H(s_i) + H(q) - H(s_i, q)$$
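The selection of τ from the first minimum of the AMI can be sketched in Python as follows. This is only an illustration under stated assumptions: the probabilities are estimated with a simple two-dimensional histogram (16 bins, an assumed value), logarithms are taken to base 2, and the mutual information is averaged over all pairs as $\sum_{i,j} P_{sq}(s_i,q_j)\log[P_{sq}(s_i,q_j)/(P_s(s_i)P_q(q_j))]$, which equals H(s) + H(q) − H(s, q). The function names and the synthetic series are hypothetical and are not taken from the study.

```python
import numpy as np

def average_mutual_information(x, lag, bins=16):
    """Histogram estimate (in bits) of the AMI between x(t) and x(t + lag), lag >= 1."""
    x = np.asarray(x, dtype=float)
    s, q = x[:-lag], x[lag:]
    joint, _, _ = np.histogram2d(s, q, bins=bins)
    p_sq = joint / joint.sum()                   # joint probabilities P_sq(s_i, q_j)
    p_s = p_sq.sum(axis=1, keepdims=True)        # marginal P_s(s_i), shape (bins, 1)
    p_q = p_sq.sum(axis=0, keepdims=True)        # marginal P_q(q_j), shape (1, bins)
    mask = p_sq > 0                              # skip empty cells (0 * log 0 = 0)
    return float(np.sum(p_sq[mask] * np.log2(p_sq[mask] / (p_s @ p_q)[mask])))

def first_minimum(values):
    """Index of the first local minimum of a sequence, or None if there is none."""
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] <= values[i + 1]:
            return i
    return None

# Synthetic stand-in series (an assumption, not the surge data).
t = np.arange(5000)
x = np.sin(0.1 * t) + 0.5 * np.sin(0.023 * t)

lags = list(range(1, 61))
ami = [average_mutual_information(x, lag) for lag in lags]
idx = first_minimum(ami)
tau = lags[idx] if idx is not None else None
print("suggested time delay (in samples):", tau)
```

The histogram estimator is the simplest choice; its result depends on the number of bins, so in practice the location of the first minimum should be checked for several bin counts.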

For the considered case study τ was selected to be 9.

Estimation of embedding dimension N. The global embedding dimension N is the minimum number of time-delay co-ordinates needed so that the trajectories x(t) do not intersect in N dimensions. In dimensions less than N, trajectories can intersect because they are projected down into too few dimensions, and subsequent calculations, such as predictions, may then be corrupted. If N is too large, noise and other contamination may corrupt the calculations, because noise fills any dimension. The method of finding a proper N can be described using geometrical considerations: as N increases, the attractor "unfolds", and vectors that are close in dimension N only because of the projection move a significant distance apart in dimension N+1. Such vectors are "false" neighbours in dimension N. The method of false nearest neighbours (Kennel et al., 1992) measures the percentage of false neighbours as N increases: points that are close in dimension N are marked, and the number of these points that become widely separated in dimension N+1 is counted. System evolution over shorter time intervals can often be adequately described in fewer dimensions than N. The local embedding dimension NL is the number of degrees of freedom that describe the short-term evolution in small regions of the phase space. The idea behind determining NL is to ensure that trajectories associated with close neighbours remain close for some time period.

Assessment of prediction horizon. An approximate estimate of the prediction horizon can be made with the help of Lyapunov exponents. These parameters (their number is equal to the state dimension) are widely used in classical control theory as indicators of stability: they describe the rate at which close trajectories diverge or converge. If all exponents are zero or negative, the system trajectories in the phase space do not diverge and the system is stable. Chaotic systems are not stable, so at least one of their Lyapunov exponents is positive, and the largest Lyapunov exponent λ1 sets an upper limit on the accuracy of a predictive model such as the one considered here. The largest Lyapunov exponent for the considered case was found to be 0.5. It is possible to assess the prediction horizon as T = τ / λ1 = 9 / 0.5 = 18 (that is, 18 x 10 min = 3 hours). This assessment is only an estimate and gives no direct indication of the associated error.

Prediction with the local models. Having identified the parameters τ and N and reconstructed the phase space, one can build the prediction model in the form of multidimensional maps:

$$\mathbf{x}(t+T) = f_T(\mathbf{x}(t))$$

where x(t) is the current state of the system in the phase space, x(t+T) is the state of the system after a time interval T, and fT is a mapping function. The problem is then to find a good expression (local models) for the function fT. A generalised scheme that was applied for constructing and testing the local models in this study is presented in Figure 3. The data are embedded and then divided into training and testing sets. Based on the training set, the embedded data space is quantised (using the K-NN algorithm at this stage). Local data sets are then constructed for each of the prototype vectors.



Finally, local data models (linear in this case study) are constructed based on the local data sets, and these are then used to predict the dynamics of the system (move the system from state x(t) into state x(t+T)).

Several local data models were built in order to predict the surge water level for different time horizons. Data for the hydrological year 1994/95 with a 10-minute interval were used; 50000 samples were used for training and 2000 for testing. The embedding dimension used was 4, and τ was equal to 9 time steps. See Table 1 and Figure 4 for details.
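The idea of a local linear predictor in the reconstructed phase space can be sketched in Python as follows. This is a simplified illustration and not the authors' implementation: instead of prototype vectors obtained by vector quantisation, it fits a linear model to the k nearest embedded training states of each query state. The embedding dimension 4, the delay of 9 steps and the 6-step horizon are the values quoted in the text; k = 20, the function names and the synthetic series are assumptions.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Rows are [x(t), x(t - tau), ..., x(t - (dim - 1) * tau)]."""
    x = np.asarray(x, dtype=float)
    start = (dim - 1) * tau                      # first t with a full history
    return np.column_stack([x[start - j * tau: len(x) - j * tau]
                            for j in range(dim)])

def local_linear_predict(train, query, horizon, dim=4, tau=9, k=20):
    """Predict x(t + horizon) for the state `query` from a linear model
    fitted to the k nearest embedded training states (k is an assumed value)."""
    train = np.asarray(train, dtype=float)
    query = np.asarray(query, dtype=float)
    start = (dim - 1) * tau
    states = delay_embed(train, dim, tau)
    usable = len(states) - horizon               # states whose target is known
    states = states[:usable]
    targets = train[start + horizon:]            # x(t + horizon) for each state
    nearest = np.argsort(np.linalg.norm(states - query, axis=1))[:k]
    design = np.column_stack([np.ones(len(nearest)), states[nearest]])
    coef, *_ = np.linalg.lstsq(design, targets[nearest], rcond=None)
    return float(coef[0] + query @ coef[1:])

# Synthetic stand-in series (not the surge data), split into training and testing parts.
t = np.arange(3000)
series = np.sin(0.1 * t) + 0.5 * np.sin(0.023 * t)
n_train = 2500
train = series[:n_train]
dim, tau, horizon = 4, 9, 6                      # values quoted in the text

errors = []
for now in range(n_train, len(series) - horizon):
    query = series[now - np.arange(dim) * tau]   # [x(t), x(t - tau), ...]
    prediction = local_linear_predict(train, query, horizon, dim, tau)
    errors.append(prediction - series[now + horizon])
print("test RMSE:", float(np.sqrt(np.mean(np.square(errors)))))
```

Searching the neighbours of every query directly is simple but slow for long records; grouping the training states around prototype vectors, as in the scheme of Figure 3, lets the local models be fitted once and reused.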

Figure 3. A generalised scheme for constructing and testing local models. Diagram elements: time series data; embedding; training data and testing data; vector quantization (K-NN, neural gas, SOM); prototype vectors; build local data sets; calculate local models based on local data sets; local predictors; select prototype vector; predict next value (N times).

Figure 4. Local linear model for surge time-series prediction, Hook of Holland coastal station: measured values, predictions and errors (surge and errors in cm against sample number, 0 to 2000). Data for the hydrological year 1994/95 with a 10-minute interval are used; 50000 samples are used for training and 2000 for testing. Embedding dimension = 4, time delay τ = 9 time steps, prediction horizon = 6 time steps (1 hour), RMSE = 3.6 cm.



Table 1. Prediction errors of the model for different time horizons

Error   20 min   1 h     2 h     2.5 h   3 h
RMSE    2.277    3.614   5.481   5.928   6.116
MAE     1.707    2.656   4.005   4.32    4.451
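For reference, the two error measures reported in Table 1 can be computed as in the short Python sketch below, which uses their usual definitions. The function names and the sample values are made up for illustration and are not taken from the study.

```python
import numpy as np

def rmse(measured, predicted):
    """Root mean squared error between measured and predicted values."""
    err = np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def mae(measured, predicted):
    """Mean absolute error between measured and predicted values."""
    err = np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float)
    return float(np.mean(np.abs(err)))

# Made-up surge values in cm, for illustration only.
measured = np.array([12.0, 15.0, 9.0, -4.0])
predicted = np.array([10.0, 18.0, 9.5, -1.0])
print(f"RMSE = {rmse(measured, predicted):.3f} cm, MAE = {mae(measured, predicted):.3f} cm")
```

RMSE weights large errors more heavily than MAE and is never smaller than it, which is consistent with the RMSE row of Table 1 exceeding the MAE row at every horizon.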

A testing set (2000 samples in total) was chosen to contain two types of dynamic behaviour of the system. The first part is characterised by small amplitude and variance of the surge (cases 50000-51400); the second part is characterised by large variations in both the variance and the amplitude of the surge (values between -47 cm and 79 cm). The testing set was selected in this way in order to test the predictive capabilities of the trained local linear models for contrasting dynamic states of the system.

5 CONCLUSIONS

Use of linear methods (autocorrelation and ARIMA models) and non-linear methods showed that the latter are much better at predicting surge water levels in the coastal zone. Results for prediction horizons from 20 minutes to 1 hour are excellent (RMSE between 2.3 and 3.6 cm). Extending the prediction horizon to 2 hours has shown that there is still enough local predictive information embedded in the attractor of the system (RMSE around 5.5 cm). Finally, the 3 hour prediction has shown that the local linear models are able to predict the amplitudes of the surge correctly, in “stormy” situations as well, but with a phase error (this pushed the RMSE up to 6.1 cm). The phase error might be systematic in nature; the presence of low-frequency periodic components and, of course, the linearity of the local model used may also contribute to it. Identification, decomposition and removal of the components that produce this phase error can be done by transforming from the “amplitude-time” domain into the “frequency-time” domain, using techniques such as wavelet analysis. This clearly suggests the need for further investigation of the complex dynamic behaviour of the system presented in this preliminary study. Furthermore, building non-linear local models (such as polynomials and radial-basis functions) of the phase space of the system may improve the accuracy as well. In spite of the very good results achieved with chaos theory, the prediction horizon may be increased further by including other hydrometeorological variables in the analysis. Current research is aimed at using pattern recognition methods: clustering, association rules, neural networks and support vector machines (Vapnik, 1998; Velickov et al., 2000). Some of these experiments have been performed, but the restricted space of this paper does not allow them to be covered; they will be reported elsewhere.

ACKNOWLEDGEMENTS

The authors are grateful to the North Sea Directorate of the Rijkswaterstaat (Ministry of Transport, Public Works and Water Management, the Netherlands) for the permission to use the measurement data and for the partial financial support for this work.

REFERENCES

Abarbanel, H. D. I., “Analysis of Observed Chaotic Data”, Springer-Verlag, New York, 1996.
Box, G. E. P. and G. M. Jenkins, “Time series analysis: Forecasting and control”, Holden-Day, San Francisco, 1976.
Fraser, A. M. and H. L. Swinney, “Independent Coordinates for Strange Attractors from Mutual Information”, Physical Review A 33 (2), pp. 1134-1140, 1986.
Frison, T. W., H. D. I. Abarbanel, M. D. Earle, J. R. Schultz and W. Scherer, “Chaos and predictability in ocean water levels”, Journal of Geophysical Research, 104 (4), pp. 7935-7951, 1999.
Hense, A., “On the possible existence of a strange attractor for the southern oscillation”, Beitr. Phys. Atmosph. 60 (1), pp. 34-37, 1987.


Jayawardena, A. W. and F. Lai, “Analysis and prediction of chaos in rainfall and stream flow time series”, J. Hydrol. 153, pp. 23-52, 1994.
Kennel, M. B., R. Brown and H. D. I. Abarbanel, “Determining embedding dimension for phase-space reconstruction using geometrical construction”, Phys. Rev. A 45, pp. 3403-3411, March 1992.
Lorenz, E. N., “Deterministic nonperiodic flow”, J. Atmos. Sci., 20, pp. 130-141, 1963.
Rahman, M., “Analysis and prediction of chaotic timeseries”, MSc Thesis, IHE/DHI, 1999.
Shannon, C. E. and W. Weaver, “The Mathematical Theory of Communication”, University of Illinois Press, Urbana, 1949.
Sivakumar, B., S. Y. Liong, C. Y. Liaw and K. K. Phoon, “Singapore rainfall behaviour: chaotic?”, J. Hydrol. Eng., ASCE 4 (1), pp. 38-48, 1999.
Takens, F., “Detecting Strange Attractors in Turbulence”, in: Dynamical Systems and Turbulence, D. Rand, L. Young, eds., Springer, Warwick, pp. 366-381, 1981.
Vapnik, V. N., “The Nature of Statistical Learning Theory”, Springer-Verlag, 1998.
Velickov, S., R. K. Price, D. P. Solomatine and X. Yu, “Application of Data Mining Techniques for Remote Sensing Image Analysis”, Proc. 4th Int. Conference on Hydroinformatics, Iowa, USA, July 2000.
Zaldivar, J. M., E. Gutierrez, I. M. Galvan, F. Strozzi and A. Tomasin, “Forecasting high waters at Venice Lagoon using chaotic time series analysis and non-linear neural networks”, Journal of Hydroinformatics, vol. 2, No. 1, pp. 61-84, 2000.

