A Short-term Load Forecasting Model for Demand Response Applications
Jonathan Schachter and Pierluigi Mancarella
The University of Manchester, Manchester, United Kingdom
Email: {jonathan.schachter, p.mancarella}@manchester.ac.uk

Abstract—This paper discusses a new algorithm and defines the functionality required for developing a short-term load forecasting module for demand response applications. Feedforward artificial neural network (ANN) algorithms are used to provide high forecasting performance when dealing with nonlinear and multivariate problems involving large datasets. The approach is thus suitable for short-term load prediction for disaggregated sites to optimize the demand response process when the data relating to the operating regime or load characteristics of the individual devices and loads connected are unavailable. A detailed description of the relevant external data needed for the forecast is explained. In particular, the algorithm considers weather data for the corresponding time period. The model is tested on data from actual ground source heat pump (GSHP) and heating, ventilation and air conditioning (HVAC) loads of various non-residential buildings at several real sites in the United Kingdom (U.K.). The sensitivity of the parameters of the algorithm, including the number of hidden layers used, is also researched. The proposed algorithm is tested against a linear regression and proves to outperform the latter in all cases. The performance of the algorithm is quantitatively assessed using mean absolute percent error and mean absolute error metrics. Further analysis plots a comparison of actual and forecasted loads and R-values to determine forecast accuracy.

Keywords—Artificial neural networks, demand response, energy load forecasting, multi-layer perceptron.
I. INTRODUCTION
Accurate load forecasts are essential in both energy planning and operation. In planning, they provide the basis for appropriate decision-making for the dispatch of generation, for the scheduling of spinning reserve and for preventing system failures [1], [2]. In operations, and in particular in a demand response system for domestic and thermal loads, the data relating to the operating regime and load characteristics of the individual devices and loads are unknown to the operator at the time of the response. Optimizing the response of loads therefore relies upon forecasts for controlling and shifting energy from devices such as ground source heat pump (GSHP) [3] and heating, ventilation and air-conditioning (HVAC) devices [4], [5]. The necessity to forecast power consumption from these devices will be even greater as more customers wish to participate in demand response programs [6]. Indeed, scheduling, dispatching or correctly contracting demand response requires accurate forecasts of expected power demand days, if not weeks, in advance. Forecasts must also be made only with the information available to the forecaster at the time of forecasting. This requires new short-term load forecasting models capable of accurately modeling the non-linear relationships present in these types of loads [7]–[9]. Notwithstanding the importance of this type of load for providing demand response, the benefit of forecasting GSHP and HVAC loads for the dispatch of demand response has not yet been researched.

This paper provides an algorithm for short-term load forecasting of actual GSHP and HVAC power demand and describes the functionality required for developing an algorithm for demand response applications. By using limited information, the model is easily implementable.

In the next section, we discuss the linear regression and artificial neural network (ANN) algorithms used for forecasting in this paper. In Section III a detailed description of the relevant external data needed for the forecast is presented. Section IV presents results for several case studies and analyses the performance of each forecasting method. The sensitivity of the input variables to the forecast accuracy is tested. A discussion of our results and conclusions are presented in Section V.

II. FORECASTING ALGORITHMS
The authors would like to thank Toby Powell and Rachel Stanley of E.ON New Build & Technology for support to carry out this study. The main author would also like to thank the PhD support provided by EPSRC and APS.
Two main categories for forecasting loads are common in the literature, although hybrid models are gaining in popularity [10], [11]. The first category forecasts future loads using time series and usually does not include any weather variables [12], [13]. It assumes that the load is a signal with known seasonal, weekly and daily periods and any difference with the actual load can be modeled as a stochastic process. The other category considers the dependence between load and weather information and focuses on their relationship. In the case of GSHP and HVAC devices, whose power demands are highly dependent on external temperatures and humidity [14], the latter category is better suited. A. Linear Regression Regression falls within the category that relies on the dependence between weather information and load. Regression tries to find relationships between a dependent variable and one or more explanatory variables. The most common method is
978-1-4799-6095-8/14/$31.00 ©2014 IEEE
linear regression, as it is simple to implement and the relationship between an input matrix x and an output vector y is easy to understand [13]. However, when weather variables are included, linear regression algorithms assume a linear relationship between weather and load. Yet, this relationship is neither linear nor stationary [1], [7], as seen from Table I. Table I checks for linear correlations at site1 between actual load and external temperature, dew point temperature, day of the week, hour of the day and the previous 20th minute load. Values near 0 show no linear relationship while values close to 1 or -1 show strong linear correlation. It is clear that very little linear correlation (-0.065 and -0.073) exists between temperatures and load. Using the previous 20th minute load highlights that lagged values have a high linear correlation. The relevance of this is discussed later.

TABLE I. LINEAR CORRELATION WITH LOAD FROM SITE1

        Temperature   Dew Point   Day     Hour    20th Min Load
Load    -0.065        -0.073      0.077   -0.28   0.884
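The kind of check summarized in Table I can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function name pearson_r and the toy values are hypothetical and are not taken from the paper's sites. The example shows that a variable can drive the load strongly and still show a linear correlation near 0 when the dependence is non-linear.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data (assumption, not measured values).
temperature = [-2.0, -1.0, 0.0, 1.0, 2.0]
load = [t * t for t in temperature]  # load depends on temperature, but not linearly

r_nonlinear = pearson_r(temperature, load)          # close to 0
r_linear = pearson_r(temperature, [3.0 * t + 1.0 for t in temperature])  # close to 1
```

A value near 0 here therefore says nothing about whether a non-linear model could exploit the variable, which is the motivation for the ANN discussed next.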
The relationship between weather and load variables depends on temporal variations, which linear models cannot fully consider [7]. A more efficient method is the artificial neural network (ANN), which models non-linear relationships between variables and can hence more accurately model the relationship here between load and weather variables. B. Artificial Neural Network Artificial neural networks (ANN) use previous load data to predict future load patterns, like time series, but they are also coupled with regression techniques that require no linear assumption. They can perform complex nonlinear mappings between input variables xi and output variables yi. Inspired by biological nervous systems, they create connections between elements, known as neurons or nodes [15], to perform a task or function by adjusting the values of the connections, weights wi, between elements so that a particular input leads to a specific target output. The multi-layer perceptron (MLP) is the most common ANN in many forecasting applications. It is composed of several layers j of nodes n, where input and output layers are separated by processing stages known as hidden layers [8], [15], and shown in Fig. 1.
explanation of wavelet filtering the reader is referred to [2]. The output of each neuron is a function of the input signal, representing the sum of the weighted inputs, combined with a bias term T and mathematically written in (1).

$$ y_j = f\left( \sum_{i=1}^{n} w_{ji} x_i + T \right) \qquad (1) $$
The adjustment of the weights is done based on training samples taken from different operating points of the electricity load forecast. Each node receives information from a number of input nodes, contained in the input layer, processes it locally, first linearly and then through a nonlinear activation or transfer function f, to produce a transferred output signal to other nodes until it reaches the final output layer. The activation function used here is a logistic sigmoid function described in (2), but can be different as in [16].
$$ f(x) = \frac{1}{1 + e^{-x}} \qquad (2) $$
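Equations (1) and (2) translate directly into code. The sketch below is a minimal illustration, assuming the bias term is added to the weighted sum; the function names (sigmoid, neuron_output, layer_output) and the numeric values are hypothetical and not part of the paper.

```python
import math


def sigmoid(x):
    """Logistic sigmoid of equation (2), arranged to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def neuron_output(weights, inputs, bias):
    """Equation (1): activation applied to the weighted input sum plus the bias term."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)


def layer_output(weight_rows, inputs, biases):
    """Outputs of one layer: one neuron per row of weights, all fed the same inputs."""
    return [neuron_output(w, inputs, b) for w, b in zip(weight_rows, biases)]


# Hypothetical values: two inputs feeding a layer of two neurons.
inputs = [0.4, -1.2]
hidden = layer_output([[0.5, 0.1], [-0.3, 0.8]], inputs, [0.0, 0.2])
```

In a feed-forward network the list returned by layer_output becomes the input list of the next layer, which is how the signal reaches the final output layer.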
The output of each neuron acts as input for the transfer function at each node (dashed circle in Fig. 1). Starting from a random initial point, the learning algorithm determines the weights so that the error for mapping the inputs of the training samples to their outputs is minimized with the expectation that a low error will be obtained for an unseen test sample. The errors are quantified using the mean absolute error (MAE) calculated by taking the difference between forecasted load values for each half-hour period $\hat{y}_i$ and actual load values $y_i$ for the respective period, and averaging them over the number of forecasted values n. The mean absolute percent error (MAPE) is also calculated and formulated in (3).

$$ \mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \times 100 \qquad (3) $$
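Both error metrics can be computed as sketched below. This is an illustrative implementation under stated assumptions: the function names mae and mape and the sample values are hypothetical, and the MAE is taken as the mean of the absolute differences. The second example illustrates a general property of the percentage metric: when actual values are close to zero, the MAPE can become very large even though the MAE stays small.

```python
def mae(actual, forecast):
    """Mean absolute error, in the units of the load (e.g. kW)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percent error of equation (3), in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 / len(actual) * sum(abs((f - a) / a) for a, f in zip(actual, forecast))


# Hypothetical values (assumption, not data from the paper).
actual = [10.0, 20.0, 40.0]
forecast = [11.0, 18.0, 40.0]
mae_example = mae(actual, forecast)
mape_example = mape(actual, forecast)

# Near-zero actual load: small absolute error, very large percentage error.
mae_small = mae([0.01, 5.0], [0.11, 5.0])
mape_small = mape([0.01, 5.0], [0.11, 5.0])
```

Reporting both metrics side by side, as the paper does, helps separate forecasts that are poor in absolute terms from forecasts that are penalized mainly by small denominators.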
In this paper, the algorithm used is an MLP with feedforward architecture, meaning that the output from one layer is used as the input of the following layer. The MLP is particularly useful for solving nonlinear and multivariate problems involving large datasets [9]. For training, the Levenberg-Marquardt back-propagation algorithm is applied, where the error is propagated backwards and the fitting functions are adjusted with regard to their ‘frequencies’, ‘phases’ and ‘amplitudes’ [17]. Since load forecasting is a function fitting problem, the Levenberg-Marquardt algorithm is used: it performs well on function fitting and non-linear regression problems and is fast because it works with the Jacobian matrix and so avoids computing the Hessian matrix directly.

III. DATA ANALYSIS
Fig. 1. Diagram of an artificial neuron (left) [15] and an MLP feed-forward neural network (right) [8]
The MLP is a multi-resolution decomposition technique for wavelet filtering, which identifies different sources of useful information embedded in a load time series. For a detailed
A critical part of defining the forecasting algorithm is determining what data are available and what data are relevant. A description of the relevant data is now presented.
A. Input Data
The data used in the study include the actual power demand measured in kilowatts (kW) for different sites in minute resolution (detailed in Table II), and the dates and times at which the load is recorded.

TABLE II. DATA AND SITE INFORMATION

Site Information    Data Information
Site                Load Type    Number of Data Points
Site1               GSHP         112,920
Site2               HVAC         114,013
Site3               HVAC         115,678
Site4               HVAC         3,942*

*Data for this site is relatively new, hence the limited data

Two groups of input variables xi are used in the algorithm:

a) Input variables that are site-independent:
• Date: The date of historical load data allows the consideration of patterns in a given season (winter, spring, summer or autumn).
• Time: The time for each historical data point, in particular, the hour of the day allows considering patterns in night and day consumption.
• Day of the week: The day of the week is set from Monday to Sunday while a flag indicates a workday, a weekend or a holiday. A list of U.K. bank holidays is also inputted so that these days are not considered outliers. Office HVAC loads, for example, would commonly be on during workdays and off during weekends, which the algorithm needs to consider.

b) Input variables that are site-dependent:
• Past load data: The load data is differenced once, so that the original values at time t, $v_t$, are transformed into the difference between consecutive values $d_t = v_t - v_{t-1}$. This helps with stationarity, removes any seasonality and allows a coherent conclusion to be reached [2]. Serial correlation in the data can then be checked using the autocorrelation and partial autocorrelation function (ACF and PACF) measuring the association between a data point with past values of itself, indicating which past values are most useful in predicting future ones. From the ACF and PACF, the minutes of previous loads of each site with the highest significant correlations are chosen as lagged values. For example, Fig. 2 shows the ACF and PACF for site1. The highest correlations correspond to the previous minute, 10th minute and 20th minute load on the same day. A lagged vector is built using these values, as well as with the previous 20th minute average on the same day.
• Temperature: The external temperature, in degrees Celsius, is taken from sites nearby and corresponds to the same day, hour and time as the load data. The temperature greatly affects the consumption of GSHP and HVAC loads. For instance, a hot day in summer is likely to present a higher air-conditioning consumption than a warm day in spring.
• Dew point temperature: The dew point temperature, in degrees Celsius, is taken from sites nearby. This captures the level of humidity in the air, and therefore people’s comfort level with respect to perspiration, and gives an indication of how GSHP and HVAC loads are used. For instance, higher consumption of air-conditioning is expected on a hot humid day than on a dry day.
Fig. 2. 48-hour lag sample autocorrelation (top) and partial-autocorrelation (bottom) functions at Site1
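The differencing and lag-selection step described for the past load data can be sketched as follows. This is a simplified illustration, not the authors' procedure: it uses only the sample autocorrelation (the partial autocorrelation is omitted), ranks lags by absolute correlation without a significance test, and all function names and the toy series are assumptions introduced here.

```python
def difference(values):
    """First difference d_t = v_t - v_(t-1)."""
    return [b - a for a, b in zip(values, values[1:])]


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def top_lags(series, max_lag, k):
    """The k lags in 1..max_lag with the largest absolute autocorrelation."""
    scored = [(abs(autocorrelation(series, lag)), lag) for lag in range(1, max_lag + 1)]
    scored.sort(reverse=True)
    return sorted(lag for _, lag in scored[:k])


def lagged_rows(series, lags):
    """Build (features, targets): each row holds the values at the chosen lags."""
    start = max(lags)
    features = [[series[t - lag] for lag in lags] for t in range(start, len(series))]
    return features, series[start:]


# Hypothetical toy series with a period of 4 samples (assumption, not site data).
toy = [0.0, 1.0, 0.0, -1.0] * 10
chosen = top_lags(toy, max_lag=5, k=2)
features, targets = lagged_rows(toy, chosen)
```

With real minute-resolution data, the same functions would be applied to the differenced load, and the selected lags would then be used as the lagged input vector of the forecasting model.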
IV. SIMULATION RESULTS
The following section describes the results given in Tables III and IV of the algorithm for four sites in the U.K., which include one GSHP and three HVAC loads for non-residential buildings. Each site is modeled first using a linear regression on the x matrix before being modeled with the ANN algorithm developed in this paper. The ANN for site1 was run with 10 hidden layers, while the ANNs for sites 2 to 4 were run with 2 hidden layers in Sections A and B. Several tests are made to show the effects of previous load data, weather data and the number of hidden layers on the accuracy of the forecasts.

A. Effect of previous load data
Here, we test the importance of using serial autocorrelation in the data. Usually the previous day (previous 24 hours) and previous week (previous 168 hours) data are used. However, when forecasting GSHP and HVAC loads, the ACF and PACF show no significant correlation at the 24th and 168th hours; these are respectively 0.0003 and -0.0019. As a result, these forecasts are extremely inaccurate, as can be seen from Table III. The linear regression has a MAPE of 11,188% and a MAE of 2.34 kW, while an ANN with 5 hidden layers, although improving the MAPE, still has a very large MAPE of 9,705% and a MAE of 2.38 kW. On the other hand, using the data points with highest correlation (determined from Fig. 2), at the 1st, 10th and 20th minute load for site1, significantly improves results.
TABLE III. MAE (KW) AND MAPE (%) FOR FORECASTING METHODS USING DIFFERENT LAGGED VALUES

Linear Regression
         Using 24th and 168th hour lags    Using correlated lags
Load     MAE (kW)    MAPE (%)              MAE (kW)    MAPE (%)
Site1    2.34        11,188                0.12        453.73
Site2    17.95       577.46                0.86        6.48
Site3    9.92        418.45                0.83        8.90
Site4    2.67        200.52                1.21        15.98

Artificial Neural Network
         Using 24th and 168th hour lags    Using correlated lags
Load     MAE (kW)    MAPE (%)              MAE (kW)    MAPE (%)
Site1    2.38        9,705                 0.11        121.11
Site2    11.01       127.16                0.86        5.84
Site3    7.81        225.18                1.00        9.73
Site4    6.96        509.19                0.81        41.43

TABLE IV. MAE (KW) AND MAPE (%) FOR FORECASTING METHODS WITH AND WITHOUT WEATHER DATA

Linear Regression
         Excluding weather data            Including weather data
Load     MAE (kW)    MAPE (%)              MAE (kW)    MAPE (%)
Site1    0.12        453.73                0.13        523.75
Site2    0.86        6.48                  1.38        9.08
Site3    0.83        8.90                  0.88        9.67
Site4    1.21        15.98                 1.20        14.48

Artificial Neural Network
         Excluding weather data            Including weather data
Load     MAE (kW)    MAPE (%)              MAE (kW)    MAPE (%)
Site1    0.11        121.11                0.09        109.46
Site2    0.86        5.84                  0.85        3.92
Site3    1.00        9.73                  0.84        8.36
Site4    0.81        41.43                 1.10        49.20

Fig. 3. Site1 – Daily GSHP forecast vs. actual load (top) and errors (bottom)
Fig. 4. Site2 – Daily HVAC forecast vs. actual load (top) and errors (bottom)
Fig. 5. Site3 – Histogram of errors: Most errors are contained within -8.35 kW and 8.39 kW
Fig. 6. Site3 – Regression plots with R-values for the linear regression (left) and the artificial neural network (right)
The MAPE is now 453% for the linear regression and 121% for the ANN. Similarly, the MAEs are significantly reduced in both models. Another way of verifying the accuracy of the forecast is to plot the forecasted data against the actual data as shown in Fig. 3 for site1 and Fig. 4 for site2. Alternative ways include plotting error histograms (Fig. 5) or plotting the regression of the results over the targets that the neural network aims at (Fig. 6). Exact forecasts (outputs) have a correlation coefficient, or R-value, equal to 1 with the targets. As shown in Fig. 6 for site3, the ANN achieves a higher R-value of 0.99462 than the linear regression, which has an R-value of 0.99382. Results for each site are given in Table III, where the smallest error values are highlighted in bold. It is worth noting that forecasts of site4 are actually better under linear regression than under the ANN. This is due to the limited amount of data
available for this site (only 1 month of data) and hence there is not enough to accurately train the network algorithm. This highlights the importance of having sufficient data to include in the ANN. Since ANNs are data-driven, the more data are available, the more accurate the results. Satisfactory results would typically require at least 4 months of data, though this is dependent on the type of load predicted. The GSHP load at Site1, for example, proves to be much more difficult to predict than the HVAC loads at the other sites.

B. Effect of weather data
It is widely argued that load consumption, particularly for GSHP and HVAC devices, closely follows weather patterns throughout the day. It is therefore essential to take weather information into account. To quantify this effect,
two forecasts are compared: one including weather data, the other excluding it. The resulting MAE and MAPE values are compared in Table IV. It is worth noting that the inclusion of weather actually worsens the results of the linear regression.

TABLE V. SITE3 – MAE (KW) AND MAPE (%) FOR ANNS WITH DIFFERENT NUMBER OF HIDDEN LAYERS

Number of hidden layers    MAE (kW)    MAPE (%)    Training Time (mm:ss)
2                          0.84        8.36        07:35
5                          0.87        7.64        06:42
10                         0.83        6.49        09:46
15                         1.01        10.53       07:11
This is due to the fact that the relationship between weather and load data is non-linear. Yet, when including the weather data in the ANN algorithm, both the MAE and the MAPE are reduced. This highlights the advantage of using ANNs, which are capable of modeling this non-linear relationship.

C. Effect of ANN model parameters
Most load forecasting ANN algorithms recommend the use of two hidden layers [18]. However, ANN algorithms have rarely been applied to actual GSHP or HVAC power demand data. These may require more layers to fully capture the dynamics of the load and to improve the accuracy of the forecast. The algorithm is therefore run for a different number of hidden layers, ranging from 2 hidden layers to 15. For site3, the most accurate results are achieved with 10 hidden layers. The resulting MAE and MAPE values are given in Table V. In comparison to the linear regression method, which is almost instantaneous, the length of training for ANNs can be significant. However, training usually takes less than 10 minutes for more than 100,000 points. After the training, the adjusted weights and activation functions are recorded in a model, which is then used on the new data to instantaneously produce the forecasts. Table V also shows the different training times based on the number of hidden layers chosen.

V. CONCLUSIONS
Forecasts are essential for the demand response optimization process since the data relating to the operating regime or characteristics of the individual devices or loads connected are unknown. As a result, demand response measures can only provide the right benefits if available demand is correctly forecasted. From the studies conducted, it has become clear that artificial neural networks (ANNs) require large amounts of data, without which, training is inadequate and can result in large errors. Nonetheless, when large datasets are available, the implementation of ANN algorithms always outperforms linear regression and achieves a very high forecasting performance. Using lags that have most significant serial autocorrelation achieves the best results in terms of smallest mean absolute error (MAE) and mean absolute percent error (MAPE) for both the linear regression and neural network, compared with arbitrary lags of 24 hour and 168 hour lags which do not present high correlation for the types of loads considered here. Furthermore, including weather data in the model as well as choosing the correct number of hidden layers can further improve forecast accuracy. The ANN algorithm also has the advantage that training can be repeated as new data is gathered. This is particularly useful for new sites, which have very little data to begin with, helping to improve the forecast over time. As a result, these more accurate forecasts lead to more accurate decision-making as they allow precise amounts of demand to be contracted for demand response services.

REFERENCES
[1] D. C. Park, M. A. El-Sharkawi, R. J. Marks, L. E. Atlas, and M. J. Damborg, “Electric load forecasting using an artificial neural network,” IEEE Trans. Power Syst., vol. 6, no. 2, pp. 442–449, May 1991.
[2] A. P. A. da Silva and V. H. Ferreira, “Short-term load forecasting,” in Electric Power Systems: Advanced Forecasting Techniques and Optimal Generation Scheduling, J. P. S. Catalao, Ed. Boca Raton, FL, USA: CRC Press, 2012, pp. 1–22.
[3] H. Esen, M. Inalli, A. Sengur, and M. Esen, “Performance prediction of a ground-coupled heat pump system using artificial neural networks,” Expert Syst. Appl., vol. 35, no. 4, pp. 1940–1948, Nov. 2008.
[4] A. Beghi, L. Cecchinato, M. Rampazzo, and F. Simmini, “Load forecasting for the efficient energy management of HVAC systems,” in 2010 IEEE International Conference on Sustainable Energy Technologies (ICSET), 2010, pp. 1–6.
[5] Y. Yao, Z. Lian, Z. Hou, and W. Liu, “An innovative air-conditioning load forecasting model based on RBF neural network and combined residual error correction,” Int. J. Refrig., vol. 29, no. 4, pp. 528–538, Jun. 2006.
[6] K. Bruninx, D. Patteeuw, E. Delarue, L. Helsen, and W. D’haeseleer, “Short-term demand response of flexible electric heating systems: The need for integrated simulations,” in 2013 10th International Conference on the European Energy Market (EEM), 2013, pp. 1–10.
[7] J. Moral-Carcedo and J. Vicéns-Otero, “Modelling the non-linear response of Spanish electricity demand to temperature variations,” Energy Econ., vol. 27, no. 3, pp. 477–494, May 2005.
[8] S. A. Kalogirou, “Applications of artificial neural-networks for energy systems,” Appl. Energy, vol. 67, no. 1–2, pp. 17–35, Sep. 2000.
[9] G. Zhang, B. Eddy Patuwo, and M. Y. Hu, “Forecasting with artificial neural networks: The state of the art,” Int. J. Forecast., vol. 14, no. 1, pp. 35–62, Mar. 1998.
[10] A. P. A. da Silva and L. S. Moulin, “Confidence intervals for neural network based short-term load forecasting,” IEEE Trans. Power Syst., vol. 15, no. 4, pp. 1191–1196, 2000.
[11] M. Khashei and M. Bijari, “An artificial neural network (p,d,q) model for timeseries forecasting,” Expert Syst. Appl., vol. 37, no. 1, pp. 479–489, Jan. 2010.
[12] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel, Time Series Analysis: Forecasting and Control, 4th ed. 2008, p. 784.
[13] H. Hahn, S. Meyer-Nieberg, and S. Pickl, “Electric load forecasting methods: Tools for decision making,” Eur. J. Oper. Res., vol. 199, no. 3, pp. 902–907, Dec. 2009.
[14] F. Gugliermetti, G. Passerini, and F. Bisegna, “Climate models for the assessment of office buildings energy performance,” Build. Environ., vol. 39, no. 1, pp. 39–50, Jan. 2004.
[15] H. S. Hippert, C. E. Pedreira, and R. C. Souza, “Neural networks for short-term load forecasting: a review and evaluation,” IEEE Trans. Power Syst., vol. 16, no. 1, pp. 44–55, 2001.
[16] S. J. Kiartzis, A. G. Bakirtzis, and V. Petridis, “Short-term load forecasting using neural networks,” Electr. Power Syst. Res., vol. 33, no. 1, pp. 1–6, Apr. 1995.
[17] R. Hecht-Nielsen, “Theory of the backpropagation neural network,” in International Joint Conference on Neural Networks, 1989, pp. 593–605 vol. 1.
[18] J. N. Fidalgo and M. A. Matos, “Forecasting Portugal global load with artificial neural networks,” in Artificial Neural Networks - ICANN 2007, 17th International Conference, pp. 728–737.