Using Reservoir Computing for Forecasting Time Series: Brazilian Case Study

Aida A. Ferreira (1), Teresa B. Ludermir (2)

(1) Center of Informatics (CIn), Federal University of Pernambuco (UFPE) and Federal Center of Technologic Education of Pernambuco, Av Professor Luis Freire, 500, Cidade Universitaria, Cep: 50.740-530 – Recife – PE – Brazil, [email protected]
(2) Center of Informatics (CIn), Federal University of Pernambuco (UFPE), [email protected]

Abstract

This paper presents a Brazilian case study of forecasting a wind speed time series with Reservoir Computing (RC). RC is a recent research area in which an untrained recurrent network of nodes is used for the recognition of temporal patterns. In RC, only the weights of the connections in a linear output layer are trained, which reduces the complexity of training a recurrent neural network (RNN) to simple linear regression. In this work we used an Echo State Network (ESN) to create the case study and compared the results with Multilayer Perceptron (MLP) networks and the persistence method. The case study concerns forecasting the wind speed, which is fundamental information in operation planning for electrical wind power systems. The results showed that RC performed significantly better than MLP networks and the persistence method, even though it has a significantly simpler and faster training algorithm.

1. Introduction

This paper investigates a Brazilian case study of forecasting a wind speed time series with Reservoir Computing (RC). RC [1], [2] is a new paradigm with promising results [3], [4], [5], [6]. It offers an intuitive methodology for using the temporal processing power of recurrent neural networks (RNN) without the inconvenience of training them. Originally introduced independently as the Liquid State Machine (LSM) [1] and the Echo State Network (ESN) [2], the basic concept is to randomly construct an RNN and leave its weights unchanged. A separate linear regression function is trained on the response of the reservoir to the input signals using the pseudo-inverse. The underlying idea is that a randomly constructed reservoir offers a complex nonlinear dynamic transformation of the input signals, which allows the readout to extract the desired output using a simple linear mapping [3].

Analysis and forecasting of wind speed series are of utmost importance in operation planning for electrical wind power systems. Wind speed is the most important input for a wind power generation forecasting system, because the power converted by a wind turbine depends mainly on the wind speed [17]. The data set used in this work contains wind speed time series from the city of Triunfo in the state of Pernambuco, Brazil.

This work is divided into: 1. Introduction; 2. Reservoir Computing; 3. Wind Speed Forecasting; 4. Persistence Method; 5. Case Study: Wind Speed Time Series; 6. Development of the Case Study; 7. Results and Comparative Analysis; and 8. Conclusions.

2. Reservoir Computing

Reservoir Computing (RC) offers an intuitive methodology for using the temporal processing power of recurrent neural networks (RNN) without the hassle of training them [3]. RNNs are examples of neural computation models that handle time without the need for preprocessing delay lines. They have recurrent connections between the processing elements (PEs), which internally create the memory required to store the history of the input patterns [7][8]. RNNs have been widely used in many applications, such as system identification and control of dynamical systems [9], [10], [11].

In recent years, a number of approaches for processing time-varying inputs have been proposed that exploit the complex dynamics inherent in some recurrent network architectures. Among the most prominent examples of such architectures are the LSM [1] and the ESN [2]. Here we refer to both as Reservoir Computing (RC). In RC, the reservoir is a fixed, randomly structured recurrent network that receives the time-varying input on which certain computations are to be performed. The reservoir fulfills two functions. First, it nonlinearly transforms input streams into high-dimensional activation patterns. Second, it exhibits a fading memory of recent inputs. These properties are exploited by a simple linear readout mechanism that can be trained to perform interesting computations on the input time series. To this end, the linear readout is often trained with standard linear regression techniques [12].

Fig. 1 shows a diagram of an Echo State Network (ESN) with M input units, N internal PEs and L output units. The value of the input units at time n is u(n), that of the internal units is x(n), and that of the output units is y(n).

[Figure: input layer u(n), input weights Win, dynamical reservoir with internal weights W and states x(n), readout weights Wout, output y(n).]

Fig. 1. Diagram of an echo state network (ESN).
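The structure described in this section can be illustrated with a minimal Python/NumPy sketch. It is not the toolbox used in this work: the function names, the uniform weight distributions, the default parameter values and the update rule x(n) = tanh(Win u(n) + W x(n-1)) are assumptions chosen for illustration, and output feedback and bias terms are omitted.

```python
import numpy as np

def make_reservoir(n_inputs, n_nodes, spectral_radius=0.9, input_scale=0.1, seed=0):
    """Draw fixed random input and reservoir weights; they are never trained."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-input_scale, input_scale, size=(n_nodes, n_inputs))
    w = rng.uniform(-1.0, 1.0, size=(n_nodes, n_nodes))
    # Rescale so that the largest absolute eigenvalue equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, u):
    """Collect the reservoir states x(n) for an input sequence u of shape (T, M)."""
    states = np.zeros((u.shape[0], w.shape[0]))
    x = np.zeros(w.shape[0])
    for n in range(u.shape[0]):
        x = np.tanh(w_in @ u[n] + w @ x)  # assumed update rule (illustrative)
        states[n] = x
    return states

def train_readout(states, targets):
    """Fit the linear readout by least squares via the Moore-Penrose pseudo-inverse."""
    return np.linalg.pinv(states) @ targets

# Toy usage: one-step-ahead prediction of a sine wave (synthetic data).
t = np.arange(300)
u = np.sin(0.2 * t)[:, None]
w_in, w = make_reservoir(n_inputs=1, n_nodes=50)
states = run_reservoir(w_in, w, u)
w_out = train_readout(states[:-1], u[1:])
y = states[:-1] @ w_out
```

Only `w_out` is fitted; `w_in` and `w` stay at their random initial values, which is what reduces training to a linear regression.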

3. Wind Speed Forecasting

The use of wind power generation brings some inconveniences, for instance uncertainty in generation, which results in additional cost for the operation planning of the system, and difficulties in the control systems of the wind power center. These inconveniences arise from the variation in wind speed. Therefore, in order to integrate wind power appropriately into the electrical power system of the utility, it is essential to apply tools or techniques capable of forecasting the energy to be generated by those sources.

Some papers on wind power forecasting incorporate, besides series of wind speed and generated power, the relief characteristics of the land and historical climate information that can be measured by satellites [13][14]. Other models also use wind data from other wind farms upstream from the place where the forecast is made [15]. The wind power model in energy planning is based on the statistical operation of the wind farms considering the wind regime. The local wind regime includes the daily and annual patterns of wind speed, wind direction and temperature. For the Northeast Region of Brazil, it has been shown in [16] that the wind characteristics are very directional.

4. Persistence Method

The persistence method is based upon the high correlation between the present wind speed and the wind speed in the immediate future. This method was developed by meteorologists as a comparison tool to supplement numerical weather prediction (NWP). Since the accuracy of very short-term prediction was historically deemed unimportant, persistence was sufficient. In fact, this simplified method proved to be more effective than an NWP model for very short-term prediction [18]. The persistence forecast tends to be used as the benchmark against which all other models are compared [19].
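As an illustration, the persistence benchmark can be sketched in a few lines of Python. The sketch assumes the usual definition, in which the forecast for time t + h is simply the value observed at time t; the function name and the toy data are hypothetical.

```python
import numpy as np

def persistence_forecast(series, horizon):
    """Forecast x(t + horizon) as x(t): the last observed value persists."""
    series = np.asarray(series, dtype=float)
    forecasts = series[:-horizon]  # value observed at time t
    targets = series[horizon:]     # value actually observed at t + horizon
    return forecasts, targets

# Toy usage with made-up wind speeds (m/s) and a horizon of 2 steps.
speeds = np.array([5.0, 6.0, 7.0, 6.5, 6.0, 5.5])
forecasts, targets = persistence_forecast(speeds, horizon=2)
mae = np.mean(np.abs(forecasts - targets))
```

For an hourly series and a 24-step horizon, this amounts to predicting that the speed at a given hour tomorrow equals the speed at the same hour today.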

5. Case Study: Wind Speed Time Series

In this case study, data from the SONDA project (System of National Organization of Ambient Data, http://www.cptec.inpe.br/sonda/) were used. SONDA is a project of the National Institute for Space Research (INPE) to implement the physical infrastructure and human resources needed to build and improve the database of solar and wind energy resources in Brazil. The wind speed time series chosen for this work is the series from Triunfo, a Brazilian city in the state of Pernambuco. It was chosen because its data were measured by one of the significant anemometers installed in the northeast of Brazil. The city is located at the highest altitude in Pernambuco, 1,123 meters, and its wind speed series has the highest average speed in the state, 11.83 m/s. The series consists of the hourly wind speed, hourly wind direction and hourly temperature obtained by the wind measurement station of Triunfo in the period from January 01, 2006 to April 30, 2007, which amounts to 11,640 patterns.

To create the case study's forecast, it is necessary to analyze the variables that can influence the magnitude of the model's output, in order to determine the main variables that will constitute the developed model and their structure. In this work, models were developed using one, two and three variables, i.e., the models use hourly wind speed, hourly wind direction and hourly temperature as explanatory variables. Fig. 2 presents the behavior of (a) the average hourly wind speed, (b) the average hourly wind direction and (c) the average hourly temperature of the city of Triunfo.

Fig. 2. (a) Average hourly wind speed, (b) Average hourly direction, (c) Average hourly temperature.

In univariate forecast models, the autocorrelation function of the series defines its applicability and is used mainly in statistical models. Using this correlation function, it is possible to identify the dependence among the series data, which facilitates the data analysis.

The database was analyzed with the autocorrelation function of the wind speed series in order to define the forecast horizon. A qualitative analysis of the graph in Fig. 3 shows that forecast models with a small forecast horizon tend to have superior performance. In addition, in the proximity of 24 hours the autocorrelation function of the series increases again. This behavior characterizes the seasonality of winds among the day periods.

Fig. 3. Autocorrelation analysis of wind speed time series.

Cross correlation is a measure of the degree of the linear relationship between two time series. A high correlation between time series at a specific lag might indicate a time delay in the system. The database was also analyzed with the cross correlation function between the wind speed series and the wind direction and temperature series. Fig. 4 (a) shows the cross correlation between the hourly wind speed and the hourly direction, and Fig. 4 (b) shows the cross correlation between the hourly wind speed and the hourly temperature. In the proximity of 24 hours, the cross correlation function of the wind speed and wind direction series decreases and the cross correlation function of the wind speed and temperature series increases (in absolute value).

Fig. 4. (a) Analysis of cross correlation: Wind Speed x Direction, (b) Analysis of cross correlation: Wind Speed x Temperature.
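The lagged correlation analysis can be sketched as follows. This is an illustrative estimate based on the Pearson correlation of the overlapping parts of the two series, which is an assumption and not necessarily the estimator used to produce Fig. 3 and Fig. 4; the function names and the synthetic series are hypothetical.

```python
import numpy as np

def cross_correlation(a, b, lag):
    """Pearson correlation between a(t) and b(t + lag), for lag >= 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if lag > 0:
        a, b = a[:-lag], b[lag:]  # keep only the overlapping samples
    return np.corrcoef(a, b)[0, 1]

def autocorrelation(x, lag):
    """Autocorrelation is the cross correlation of a series with itself."""
    return cross_correlation(x, x, lag)

# Synthetic hourly series with a 24-hour cycle (not the Triunfo data).
hours = np.arange(24 * 30)
speed = 8.0 + 2.0 * np.sin(2 * np.pi * hours / 24)
acf_12 = autocorrelation(speed, 12)
acf_24 = autocorrelation(speed, 24)
```

On this synthetic daily cycle the autocorrelation is strongly negative at a lag of 12 hours and rises back towards one at a lag of 24 hours, the same qualitative return near 24 hours that the text attributes to daily seasonality.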

Before creating the system, the database was preprocessed and the values of the average hourly speeds, directions and temperatures were transformed to fall in a limited range [-1, 1]. The values were transformed as in (1):

$$ x_{trans} = \frac{(y_{max} - y_{min})\,(x - x_{min})}{x_{max} - x_{min}} + y_{min} \tag{1} $$

where xtrans is the value transformed into the limited range [-1, 1]; ymax is the maximum value of the interval, 1; ymin is the minimum value of the interval, 0; xmax and xmin are the maximum and minimum values of the series; and x is the original value. The maximum value of the series was increased by 20%. Thus, the maximum value accepted by the model is equal to the maximum value found in the database increased by 20%, and the minimum value is zero.

A 24-step-forward predictor of the average hourly wind speed was chosen, since the series presents a good correlation index and 24 steps forward is a good interval for planning the system operation.
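The transformation in (1) and its inverse can be sketched in Python as below. The interval bounds are passed as parameters because the text gives both a [-1, 1] range and ymin = 0; the example call uses ymin = 0 and ymax = 1 purely as an assumption. The function names and the toy values are hypothetical.

```python
import numpy as np

def scale(x, x_min, x_max, y_min, y_max):
    """Map x from [x_min, x_max] to [y_min, y_max], as in (1)."""
    return (y_max - y_min) * (x - x_min) / (x_max - x_min) + y_min

def unscale(x_trans, x_min, x_max, y_min, y_max):
    """Inverse of scale(): map a transformed value back to the original range."""
    return (x_trans - y_min) * (x_max - x_min) / (y_max - y_min) + x_min

# Toy usage with made-up wind speeds (m/s).
speeds = np.array([0.0, 6.0, 12.0, 24.0])
x_min = 0.0
x_max = 1.2 * speeds.max()  # series maximum increased by 20%
scaled = scale(speeds, x_min, x_max, y_min=0.0, y_max=1.0)
restored = unscale(scaled, x_min, x_max, y_min=0.0, y_max=1.0)
```

With the 20% margin, the largest observed value maps to 1/1.2 of the way through the interval rather than to its upper bound, which leaves room for values above the observed maximum.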

6. Development of the Case Study

As established in the analysis and preparation of the database, a 24-step-forward predictor of the hourly wind speed was chosen, because the series presents a good correlation coefficient and 24 steps forward is a good interval for planning the system operation. In addition, RC does not need any time window representation of the input data, since it has built-in internal memory resulting from the feedback connections. The data sets for four different models were developed as follows:

a) Model 1: One Input (speed). The system inputs are the sequential hourly wind speeds, from the oldest value to the newest value. The desired output of all models was defined as in (2), since the aim was to obtain a 24-step-forward forecast:

$$ y(t) = x(t + 24) \tag{2} $$

where y is the wind speed 24 steps forward and x is the current wind speed.

b) Model 2: Two Inputs (speed and direction). The system inputs are the sequential hourly wind speed and the sequential hourly wind direction, from the oldest value to the newest value.

c) Model 3: Two Inputs (speed and temperature). The system inputs are the sequential hourly wind speed and the sequential hourly temperature, from the oldest value to the newest value.

d) Model 4: Three Inputs (speed, direction and temperature). The system inputs are the sequential hourly wind speed, the sequential hourly wind direction and the sequential hourly temperature, from the oldest value to the newest value.

To create the systems, an open-source Matlab toolbox for Reservoir Computing was used, which is freely available from http://www.elis.ugent.be/rct [20]. After some experimental evaluations, the parameter configuration of the reservoir topology was chosen as follows:

• The reservoir is composed of 400 nodes, scaled to a spectral radius of λmax = 0.9.
• The input node is connected to the reservoir nodes with weights scaled between [-0.1, 0.1].
• The node type is “analog”.
• The nonlinearity function is “tanh”.
• The bias layer is connected to the output layer.
• The reservoir layer is connected to the reservoir and to the output layer.
• The output layer is connected to the output layer.
• The readout consists of a series of regressors, each of which takes the current reservoir state as input and computes a weighted sum.
• The readout layer has one output unit, which is post-processed by the reverse transformation to the original range of wind speed.

The script that creates the models loads the data set and creates the reservoir with the topology described above. The reservoir was not pre-adapted because the adaptation process did not yield better results. The reservoir was then simulated, and the reservoir responses were saved. Afterwards, a readout function was trained and the performance of the reservoir was evaluated using 10-fold cross-validation. The data set was split into 10 chunks. Each of these chunks was used separately as a testing set, while the rest was used to train the readout. For each step of the 10-fold cross-validation, three different networks (using the same training data and different weight initializations) were created. The readout was trained with the pseudo-inverse method.

In parallel with the creation of the RC models, Multilayer Perceptron (MLP) networks were created and trained with the resilient backpropagation (RPROP) algorithm, using the same training and test data and different weight initializations. RPROP performs a local adaptation of the weight updates according to the behavior of the error function. All of the MLP networks have an input layer, a hidden layer and an output layer. The nodes of the hidden and output layers use the tan-sigmoid activation function. The maximum number of iterations for all training runs was set to 50 epochs. Training stops if the early stopping criterion implemented by MATLAB® is triggered 10 consecutive times, if the maximum number of epochs is reached, if the error gradient reaches a minimum, or if the error goal on the training set is met. After some experimental evaluations, the number of hidden layer processing elements was set to N = 4, which presented the best results.
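The construction of the 24-step-forward targets and the 10-fold evaluation of a pseudo-inverse readout can be sketched in Python as below. The sketch makes several assumptions: the 10 chunks are taken to be contiguous, the feature matrix is a stand-in for saved reservoir responses, only one readout is trained per fold (rather than three networks), and all names and data are hypothetical.

```python
import numpy as np

def make_targets(inputs, speed, horizon=24):
    """Pair each input row at time t with the speed at t + horizon, as in (2)."""
    return inputs[:-horizon], speed[horizon:]

def contiguous_folds(n_samples, n_folds=10):
    """Split the indices 0..n_samples-1 into n_folds contiguous chunks."""
    return np.array_split(np.arange(n_samples), n_folds)

def cross_validate(features, targets, n_folds=10):
    """Train a pseudo-inverse readout on all chunks but one; test on the held-out chunk."""
    all_idx = np.arange(len(features))
    fold_mse = []
    for test_idx in contiguous_folds(len(features), n_folds):
        train_idx = np.setdiff1d(all_idx, test_idx)
        w_out = np.linalg.pinv(features[train_idx]) @ targets[train_idx]
        pred = features[test_idx] @ w_out
        fold_mse.append(np.mean((pred - targets[test_idx]) ** 2))
    return np.array(fold_mse)

# Synthetic scaled hourly series with a daily cycle (not the Triunfo data).
rng = np.random.default_rng(0)
hours = np.arange(24 * 50)
speed = 0.5 + 0.3 * np.sin(2 * np.pi * hours / 24) + 0.01 * rng.standard_normal(hours.size)
# Stand-in features: the current speed plus a constant bias column.
features = np.column_stack([speed, np.ones_like(speed)])
X, y = make_targets(features, speed, horizon=24)
fold_mse = cross_validate(X, y)
```

The mean and standard deviation of `fold_mse` across the folds correspond to the kind of "Avg." and "Std" summary statistics reported for each model.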

7. Results and Comparative Analysis

The performance of the models was measured by the percentage mean-square error (MSE) specified in (3), the mean absolute percentage error (MAPE) specified in (4), and the mean absolute error (MAE) specified in (5):

$$ \mathrm{MSE\%} = 100 \times \frac{L_{max} - L_{min}}{N \cdot P} \sum_{p=1}^{P} \sum_{i=1}^{N} (L_{pi} - T_{pi})^2 \tag{3} $$

where Lmax and Lmin are the maximum and minimum of the hourly speed values in the data, respectively; N is the number of output units of the ANN; P is the total number of patterns in the database; and Lpi and Tpi are the actual and desired outputs of the ith neuron in the output layer for pattern p, respectively.

$$ \mathrm{MAPE\%} = \frac{1}{P} \sum_{p=1}^{P} \left| \frac{L_p - T_p}{T_p} \right| \times 100 \tag{4} $$

$$ \mathrm{MAE} = \frac{1}{P} \sum_{p=1}^{P} \left| L_p - T_p \right| \tag{5} $$

In (4) and (5), P is the total number of patterns in the database, and Lp and Tp are the actual and desired output values for a given input, respectively.

In this work, the performance of the persistence method was equivalent to that found in other forecast models for the hourly wind speed [21], [22]; however, the models created with RC performed better. As shown in Table I, the MAPE, MAE and MSE errors on the test set presented by the RC models were lower than the errors presented by the MLP and the persistence method. The MAPE, MAE and MSE errors of the best RC models were 8.10, 0.88 and 0.95, respectively. These results were identical for two different models: the model with only one input (wind speed) and the model with two inputs (wind speed and temperature). All models created with RC presented very close MSE values on the test set. The hypothesis tests (t-test) showed that all models created with RC have the same MSE at a significance level of 5%. The hypothesis tests also showed that all RC models were better than the MLP and the persistence model at a significance level of 5%. The results of the MLP on the test set were 24.42, 2.52 and 7.57, respectively. The results of the persistence model on the test set were 18.39, 2.02 and 4.92, respectively.

In addition, the standard deviation of the errors of the RC models was smaller, indicating that the experiments with RC models were more stable than those with the MLP and the persistence method.
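The three error measures in (3), (4) and (5) can be computed with a short Python sketch. The function names and toy values are hypothetical, and the sketch assumes that the outputs and targets are expressed on the scale for which Lmax and Lmin are given and that no target is zero (otherwise the MAPE is undefined).

```python
import numpy as np

def mse_percent(outputs, targets, l_min, l_max):
    """Percentage mean-square error as in (3); inputs have shape (P,) or (P, N)."""
    outputs = np.asarray(outputs, dtype=float).reshape(len(outputs), -1)
    targets = np.asarray(targets, dtype=float).reshape(len(targets), -1)
    n_patterns, n_outputs = targets.shape
    squared_error = np.sum((outputs - targets) ** 2)
    return 100.0 * (l_max - l_min) / (n_outputs * n_patterns) * squared_error

def mape_percent(outputs, targets):
    """Mean absolute percentage error as in (4); targets must be nonzero."""
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return 100.0 * np.mean(np.abs((outputs - targets) / targets))

def mae(outputs, targets):
    """Mean absolute error as in (5)."""
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return np.mean(np.abs(outputs - targets))

# Toy usage with two made-up patterns on a [0, 1] scale.
outputs = [0.5, 0.4]
targets = [0.4, 0.5]
mse_value = mse_percent(outputs, targets, l_min=0.0, l_max=1.0)
mape_value = mape_percent(outputs, targets)
mae_value = mae(outputs, targets)
```

For these two patterns the squared errors are 0.01 each, the relative errors are 25% and 20%, and the absolute errors are 0.1 each.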

TABLE I
PERFORMANCE OF RESERVOIR COMPUTING SYSTEM IN TEST

Model                                       Statistic   MAPE (%)   MAE (m/s)   MSE (%)
RC with Speed                               Avg.          8.10       0.88        0.95
                                            Std           2.63       0.05        0.16
RC with Speed and Direction                 Avg.          8.28       0.89        0.99
                                            Std           2.75       0.07        0.19
RC with Speed and Temperature               Avg.          8.10       0.88        0.95
                                            Std           2.65       0.05        0.15
RC with Speed, Direction and Temperature    Avg.          8.22       0.89        0.98
                                            Std           2.71       0.06        0.18
MLP with Speed                              Avg.         24.42       2.52        7.57
                                            Std          10.39       0.60        3.97
Persistence                                 Avg.         18.39       2.02        4.92
                                            Std           4.88       0.22        1.14

Fig. 5 shows the result of the forecasting using RC (with wind speed input) and the persistence method on test data (fold 10). The black line shows the measured values of the time series, and the white line shows the forecast average hourly wind speed.

Fig. 5. Testing patterns: black line (target) and white line (forecaster output).

According to Fig. 6, the predictor model created by RC was able to learn and capture most of the unstable behavior present in this time series, which does not appear to be an easy task for any predictor.

8. Conclusions

This paper presents a case study of forecasting a wind speed time series using RC, which has a highly interconnected and recurrent topology of nonlinear processing elements.

To our knowledge, RC has not previously been applied to the wind speed forecasting problem. Considering this and the simplicity and speed of its training, we decided to analyze this new paradigm. The comparison of the results obtained from the RC models with the persistence method is important because the persistence method was developed by meteorologists as a comparison tool to supplement numerical weather prediction (NWP).

Here, RC is used to forecast the hourly wind speed, which is fundamental information for the operation planning of electrical wind power systems. The results of the RC forecasting models were compared with the MLP and the persistence method. They showed that RC performed significantly better than the MLP and the persistence method at a significance level of 5%, even though it has a significantly simpler and faster training algorithm. The predictor models created by RC were able to learn and capture most of the unstable behavior present in this time series, which does not appear to be an easy task for any predictor.

9. References

[1] W. Maass, T. Natschlager, and H. Markram. “Real-time computing without stable states: A new framework for neural computation based on perturbations”, Neural Computation, 14(11):2531-2560, 2002.
[2] H. Jaeger. “The echo state approach to analyzing and training recurrent neural networks”, Technical Report GMD 148, German National Resource Center for Information Technology, 2001.
[3] B. Schrauwen, J. Defour, D. Verstraeten and J. V. Campenhout. “The introduction of time-scales in reservoir computing, applied to isolated digits recognition”, LNCS, 4668, Part I, pp. 471-479, 2007.
[4] E. A. Antonelo, B. Schrauwen, X. Dutoit, D. Stroobandt, and M. Nuttin. “Event detection and location in mobile robot navigation using reservoir computing”, LNCS, 4668, Part II, pp. 660-669, 2007.
[5] I. Ilies, H. Jaeger, O. Kosuchinas, M. Rincon, V. Sakenas, N. Vaskevieius. “Stepping forward through echoes of the past: forecasting with echo states networks”. Available: http://www.neural-forecastingcompetition.com/downloads/methods/27NN3_Herbert_Jaeger_report.pdf
[6] H. Jaeger, W. Maass, and J. Principe. Special issue on echo state networks and liquid state machines. Neural Networks, 20(3), 2007. doi:10.1016/j.neunet.2007.04.0001.
[7] H. Jaeger and H. Haas. “Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication”, Science, 304(5667):78–80, 2004.
[8] R. Sacchi, M. C. Ozturk, J. C. Principe, A. A. F. Carneiro and I. N. da Silva. “Water Inflow Forecasting Using the Echo State Network: a Brazilian Case Study”, Proceedings of International Conference on Neural Networks, Orlando, Florida, 2007.
[9] G. Kechriotis, E. Zervas, and E. S. Manolakos. “Using recurrent neural networks for adaptive communication channel equalization”, IEEE Transactions on Neural Networks, 5(2):267–278, 1994.
[10] G. V. Puskorius and L. A. Feldkamp. “Neurocontrol of nonlinear dynamical systems with kalman filter trained recurrent networks”, IEEE Transactions on Neural Networks, 5(2):279–297, 1994.
[11] A. Delgado, C. Kambhampati and K. Warwick. “Dynamic recurrent neural network for system identification and control”, IEEE Proceedings of Control Theory and Applications, 142(4):307–314, 1995.
[12] A. Lazar, G. Pipa, J. Triesch. “Fading Memory and Time Series Prediction in Recurrent Networks with Different Forms of Plasticity”, Neural Networks, vol. 20, pp. 312-322, 2007.
[13] Ernst, B., Rohrig, K. Online-Monitoring and Prediction of Wind Power in German Transmission System Operation Center.
[14] Sideratos, G., Hatziargyriou, N. D. Application of Radial Basis Function Networks for Wind Power Forecasting. Lecture Notes in Computer Science, ICANN Greek, 4132, pp. 726-735, 2006.
[15] Alexiadis, M. C., Dokopoulos, P. S., Sahsamanoglou, H. S. Wind Speed and Forecasting based on Spatial Correlation Models, IEEE Transactions on Energy, 4(3), pp. 885-8969, 1998.
[16] Rodrigues, G. “Wind Characteristics of the Northeast Region – Analysis, models and application to wind farm projects” (in Portuguese), M.Sc. dissertation, Dep. of Mechanical Engineering, UFPE, Recife, Brazil, 2003.
[17] M. M. Schawartz, M. Wan, Y. “Statistical wind power forecasting models: Results for U.S Wind Farms”, National Renewable Energy Laboratory, 2003, CP 50033956.
[18] D. Milborrow. “Forecasting for scheduled delivery”, Windpower Monthly, p. 37, Dec. 2003.
[19] Simon Waston. “Fresh Forecast”, IEE Power Engineer, pp. 36-38, 2005.
[20] B. Schrauwen, D. Verstraeten and M. D’Haene. Reservoir Computing Toolbox Manual. Available: http://www.elis.ugent.be/rct
[21] Ioannis G. Damousis, Minas C. Alexiadis, John B. Theocharis and Petros S. Dokopoulos. “A fuzzy model for wind speed prediction and power generation in wind parks using spatial correlation”, IEEE Transaction on Energy Conversion, vol. 19, no. 2, pp. 352-361, 2004.
[22] R. R. B. Aquino, J. B. Oliveira, O. N. Neto, M. M. S. Lira, A. A. Ferreira, P. A. C. Rosas, G. S. M. Santos. “Assessment of Conventional Methods and Artificial Intelligence for Wind forecast and Wind Power Generation” (in Portuguese), SNPTEE, Rio de Janeiro, 2007.
