Real-time water demand forecasting using support vector machine and adaptive Fourier series

B. Brentan(⋆)∗, E. Luvizotto Jr.(⋆), M. Herrera(†), J. Izquierdo and R. Pérez-García(◦)

(⋆) Computational Hydraulic Laboratory, University of Campinas, Brazil
(†) EDEn - Dept. of Architecture and Civil Eng., University of Bath, UK
(◦) Fluing - IMM, Universitat Politècnica de València, Spain

November 30, 2015

1 Introduction

Both the safe operation of water supply systems (WDSs) and the design of new WDSs require water demand forecasting. For a monitored WDS, data can provide deep knowledge of water demand and become a powerful management tool for improving system efficiency. ARIMA models have traditionally been applied to model water demand. However, this approach usually considers only linear correlations among variables. This hypothesis does not always yield a model able to make predictions with the required accuracy, which harms control processes. Recent works propose the use of artificial intelligence and machine learning tools to model the non-linearity among the variables. Herrera et al. [5] compare various predictive methods applied to hourly water demand forecasting and suggest support vector machines as one of the best models. The arrival of new data can render off-line models obsolete, and application to operation requires a great improvement in accuracy, which can be obtained with on-line models.


Modelling for Engineering & Human Behaviour 2015


The main feature of on-line models is their ability to improve accuracy by recalculating the whole process each time new data is embedded. Several on-line methods have been proposed in the literature, such as sliding-window methodologies, which use kernel regression with fast parameter optimization, and hybrid models in which ARIMA and ANN work together, applied to daily demand forecasting. A similar approach is used in [4] to model water demand subject to interventions (e.g. valve opening and closing manoeuvres). Based on [7, 3], our work uses support vector regression (SVR) for short-term water demand forecasting. Built on this model, an on-line process based on Fourier time series is launched to improve the predictions. The error associated with the SVR model is investigated and, using adaptive Fourier series (AFSs), the SVR prediction is adjusted by the error predicted by the AFS. In addition, an optimal training cycle is defined using an efficiency parameter, which is critical for updating the off-line SVR model.

2 Hybrid model: Support Vector Regression - Adaptive Fourier Series

2.1 Support Vector Regression (SVR)

Kernel-based learning methods use an implicit mapping φ into a high-dimensional (feature) space and convert non-linear relations into linear ones. Learning then takes place in the feature space, and the learning algorithm can be expressed so that the data points appear only inside dot products with other points, which are readily calculated via the kernel. The key characteristic of SVR is a specified margin ε, within which we are willing to accept errors in the sample data without their affecting the quality of the prediction. The SVR predictor is defined by those points or vectors that lie outside the band of size ±ε around the regression. These vectors are the so-called support vectors. The goal is to find a function

$$\hat{f}(x) = \langle w, \phi(x) \rangle + b, \tag{1}$$

that deviates at most ε from the observed output y_i and, at the same time, minimizes the so-called model complexity, which depends only on the support vectors. This method of tolerating errors is said to be ε-insensitive [8]. To complete this approach, slack variables are included in the ε-regression, fostering the chances of achieving better predictions in a more relaxed space.
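The ε-insensitive idea can be illustrated with a minimal Python sketch. The function names and the sample values below are hypothetical and chosen only for illustration; this is not the training procedure of the paper, just the loss that SVR tolerates inside the ε-tube.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Mean epsilon-insensitive loss: residuals inside the +/-epsilon tube cost nothing."""
    losses = [max(0.0, abs(yt - yp) - epsilon) for yt, yp in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


def outside_tube(y_true, y_pred, epsilon):
    """Indices of samples whose absolute residual exceeds epsilon."""
    return [
        i
        for i, (yt, yp) in enumerate(zip(y_true, y_pred))
        if abs(yt - yp) > epsilon
    ]


# Illustrative demand values in l/s (invented, not taken from the case study).
observed = [30.0, 32.5, 41.0, 28.0]
predicted = [30.2, 32.0, 38.0, 28.1]

print(epsilon_insensitive_loss(observed, predicted, epsilon=0.5))  # 0.625
print(outside_tube(observed, predicted, epsilon=0.5))              # [2]
```

Only the third sample, with a residual of 3.0 l/s, lies outside the tube and contributes to the loss; the other three residuals are ignored.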

2.2 Adaptive Fourier Series (AFS)

The Fourier series set of equations presented here is based on [6], where trigonometric adjustment is applied to data coming from both equally and unequally spaced measurements. Taking equally spaced values of t in the period of interest, and normalizing times to the interval [0, 2π], the squared error e between the real deviation d, sampled as f(t_i), and the Fourier value, accumulated over the time steps t_i, is written

$$e = \sum_{i=0}^{N-1} \left\{ f(t_i) - \left[ a_0 + \sum_{j=1}^{M} a_j \cos(j t_i) + \sum_{j=1}^{M} b_j \sin(j t_i) \right] \right\}^2, \tag{2}$$

where N is the number of time steps, M is the length of the Fourier polynomial, and a_0, a_j and b_j are the adjustable Fourier coefficients. Applying the least squares method to minimize e, and taking into account the orthogonality conditions of the trigonometric functions, it is possible to obtain each adjustable coefficient of the series.
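As a rough illustration of how the coefficients in (2) follow from least squares plus orthogonality, the sketch below computes them in closed form for equally spaced samples. It assumes the samples are placed at t_i = 2πi/N (end point excluded) and that N > 2M, so that the discrete sines and cosines are orthogonal; the function names `fit_fourier` and `evaluate_fourier` are hypothetical, and the adaptive update scheme of [6] is not reproduced here.

```python
import math


def fit_fourier(values, n_harmonics):
    """Least-squares Fourier coefficients for equally spaced samples.

    Assumes t_i = 2*pi*i/N and len(values) > 2 * n_harmonics, so the normal
    equations decouple and each coefficient is a simple projection.
    """
    n = len(values)
    times = [2.0 * math.pi * i / n for i in range(n)]
    a0 = sum(values) / n
    a = [
        2.0 / n * sum(v * math.cos(j * t) for v, t in zip(values, times))
        for j in range(1, n_harmonics + 1)
    ]
    b = [
        2.0 / n * sum(v * math.sin(j * t) for v, t in zip(values, times))
        for j in range(1, n_harmonics + 1)
    ]
    return a0, a, b


def evaluate_fourier(t, a0, a, b):
    """Value of the fitted series at normalized time t in [0, 2*pi]."""
    return a0 + sum(
        aj * math.cos(j * t) + bj * math.sin(j * t)
        for j, (aj, bj) in enumerate(zip(a, b), start=1)
    )


# Synthetic periodic deviation (invented for illustration), 24 samples per period.
n_samples = 24
sample_times = [2.0 * math.pi * i / n_samples for i in range(n_samples)]
deviation = [1.5 + 2.0 * math.cos(t) - 0.5 * math.sin(3 * t) for t in sample_times]

a0, a, b = fit_fourier(deviation, n_harmonics=3)
print(round(a0, 6), [round(x, 6) for x in a], [round(x, 6) for x in b])
```

For unequally spaced measurements the projections above no longer apply, and the coefficients would have to come from solving the full least-squares system.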

2.3 Hybrid model

The hybrid model can be fast enough to respond to abrupt changes in water demand conditions. Using SVR for the standard prediction, we propose an additive layer, the Fourier layer, to adjust the deviation. First, a calibration process is required to define the SVR parameters:

• Parameter C, which sets the trade-off between model complexity and the extent to which deviations larger than ε are tolerated, and which is also responsible for the robustness of the regression model.
• Parameter ε, which regulates the radius of the ε-tube around the regression function.

To train the model, a mesh is created with pairs of ε and C, with corresponding ranges 0.05 ≤ ε ≤ 0.9 and 1 ≤ C ≤ 1500. A grid search method is applied to tune the parameters, in which each pair represents a possible solution. Once tuned, the model is run and the deviations d between predicted and observed demand are computed. Deviation is typically larger at demand peaks and, as expected, has periodic behavior. The final value of water demand is obtained by correcting the SVR value y_SVR with the error d_AFS predicted by the AFS model.

An optimal cycle of model regeneration is presented as a novelty. The off-line component of the model can become obsolete once the training data is far from the prediction time. However, continuous updating has a high computational cost and is deemed unnecessary. The cycle is therefore determined by controlling both model accuracy and total CPU time.
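The calibration and correction steps can be sketched in a few lines of Python. Several things here are assumptions rather than details given in the text: the mesh resolution (5 × 5, evenly spaced), the callable `validation_error`, which stands in for training an SVR with a given (ε, C) pair and scoring it on held-out data, and the sign convention d = observed − predicted, under which the correction is additive.

```python
def build_mesh(eps_range=(0.05, 0.9), c_range=(1.0, 1500.0), n_eps=5, n_c=5):
    """Evenly spaced (epsilon, C) pairs covering the quoted ranges.

    The number of mesh points per axis is an assumption of this sketch.
    """
    eps_lo, eps_hi = eps_range
    c_lo, c_hi = c_range
    eps_values = [eps_lo + k * (eps_hi - eps_lo) / (n_eps - 1) for k in range(n_eps)]
    c_values = [c_lo + k * (c_hi - c_lo) / (n_c - 1) for k in range(n_c)]
    return [(eps, c) for eps in eps_values for c in c_values]


def grid_search(mesh, validation_error):
    """Return the (epsilon, C) pair with the smallest validation error."""
    return min(mesh, key=lambda pair: validation_error(*pair))


def correct_forecast(y_svr, d_afs):
    """Add the AFS-predicted deviation to the SVR forecast.

    Assumes d is defined as observed minus predicted demand.
    """
    return [y + d for y, d in zip(y_svr, d_afs)]


def toy_validation_error(eps, c):
    """Stand-in error surface; a real run would train and validate an SVR here."""
    return abs(eps - 0.3) + abs(c - 700.0) / 1500.0


best_eps, best_c = grid_search(build_mesh(), toy_validation_error)
print(best_eps, best_c)
print(correct_forecast([30.0, 40.0], [1.5, -2.0]))
```

In practice `validation_error` could wrap any SVR implementation (for example scikit-learn's `SVR`, which exposes `C` and `epsilon` parameters); the grid search logic itself does not depend on that choice.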

3 Experimental Study

This study uses water demand data collected from a real district metered area (DMA) in Franca, Brazil, corresponding to data metered at the DMA's inlet every 20 minutes from May 2012 until December 2013. Previous studies in the literature use weather and calendar variables to build models that predict water demand [1, 2, 5]. Our SVR model uses rain, temperature, humidity, and wind velocity as the most important physical variables involved in water demand forecasting, and weekday, hour of day, month of year and holiday occurrence as calendar variables. The best training parameter values for C and ε found by grid search are 50 and 0.05, respectively. The demand predicted by the best validated model is presented in Figure 1a, which confirms that the largest deviations occur at the maxima and minima. Finally, the SVR model prediction with the deviation corrected via the AFS model is presented in Figure 1b. Statistical evaluation shows that the error decreases from 12.91 l/s to 3.45 l/s when the AFS adjustment is applied and that the correlation coefficient increases from 0.745 to 0.974, a clear improvement in the quality of the results obtained by the proposed AFS-SVR hybrid model.

The off-line model structure needs to be updated periodically with new data. Thus, determining an optimal updating cycle is of paramount importance. Let T be the total CPU time spent running the calibrated hybrid model, covering both SVR prediction and AFS deviation adjustment. We define the efficiency as the trade-off between accuracy and computational cost, which can be written as the ratio between the training data size N_tr and the product e × T. The optimum training cycle we found was after 3000 registers, corresponding to 125 days.
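A small sketch of the efficiency criterion follows, reading "relation" as the ratio N_tr / (e × T). The candidate tuples are invented purely to exercise the functions and are not results from the study; the helper `registers_to_days` assumes hourly registers (24 per day), which is the sampling under which 3000 registers correspond to the quoted 125 days.

```python
def efficiency(n_train, error, cpu_time):
    """Efficiency as the ratio N_tr / (e * T); larger is better."""
    return n_train / (error * cpu_time)


def best_training_size(candidates):
    """Pick the training size with the highest efficiency.

    candidates: iterable of (n_train, error, cpu_time) tuples.
    """
    return max(candidates, key=lambda c: efficiency(*c))[0]


def registers_to_days(n_registers, registers_per_day=24):
    """Convert a number of registers to days, assuming hourly registers."""
    return n_registers / registers_per_day


# Invented (n_train, error in l/s, CPU time in s) triples, for illustration only.
candidates = [
    (1000, 6.0, 2.0),
    (3000, 3.5, 5.0),
    (6000, 3.4, 14.0),
]

print(best_training_size(candidates))
print(registers_to_days(3000))  # 125.0
```

The criterion rewards larger training sets only while the gain in accuracy outpaces the growth in CPU time, which is why an interior optimum can appear.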

[Figure 1. Panel (a), SVR model results: observed and SVR series, water demand (l/s) versus time. Panel (b), Hybrid model results: observed and AFS series, water demand deviation (l/s) versus time.]

Figure 1: (a) deviation between real and SVR predicted demand and (b) AFS approximation of this deviation

4 Conclusions

This work presents a hybrid model for hourly water demand forecasting in WDSs. The model builds on an off-line Support Vector Regression approach, which provides a base forecast, and an on-line Adaptive Fourier Series adjustment, responsible for correcting the SVR deviation. SVR accounts for the physical and calendar information necessary for water demand forecasting. However, it does not capture the extremes well and, as a result, accuracy diminishes at the demand peaks. The Adaptive Fourier Series gives the SVR model a way to update the prediction in near-real time by correcting the demand predicted by the off-line base model. A simple way to determine the optimum training data size for the off-line model is presented. This cycle can help water companies to schedule interruptions of the model for updating with new data. Updating the model is important because new data supplement the model with new correlations between demand and input variables.

References

[1] J. Bougadis, K. Adamowski, and R. Diduch. Short-term municipal water demand forecasting. Hydrological Processes, 19(1):137–148, 2005.

[2] M. Firat, M. A. Yurdusev, and M. E. Turan. Evaluation of artificial neural network techniques for municipal water consumption modeling. Water Resources Management, 23(4):617–632, 2009.

[3] M. Herrera, S. Canu, A. Karatzoglou, R. Pérez-García, and J. Izquierdo. An approach to water supply clusters by semi-supervised learning. In D. A. Swayne, W. Yang, A. Voinov, A. Rizzoli, and T. Filatova, editors, International Congress on Environmental Modelling and Software, pages 1925–1932. International Environmental Modelling and Software Society (iEMSs), 2010. Student's award.

[4] M. Herrera, J. García-Díaz, J. Izquierdo, and R. Pérez-García. Municipal water demand forecasting: tools for intervention time series. Stochastic Analysis and Applications, 29(6):998–1007, 2011.

[5] M. Herrera, L. Torgo, J. Izquierdo, and R. Pérez-García. Predictive models for forecasting hourly urban water demand. Journal of Hydrology, 387(1-2):141–150, 2010.

[6] E. Luvizotto Jr. Representation of Characteristic Curves of Hydraulic Machines for Computational Simulations. PhD thesis, University of São Paulo, 1992.

[7] I. S. Msiza, F. V. Nelwamondo, and T. Marwala. Artificial neural networks and support vector machines for water demand time series forecasting. In Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on, pages 638–643. IEEE, 2007.

[8] B. Schölkopf and A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (Adaptive Computation and Machine Learning). The MIT Press, Cambridge, MA, USA, 2002.