
Optimal combined short-term building load forecasting

Cruz E. Borges, Yoseba K. Penya, and Iván Fernández

Abstract: Short-term load forecasting (STLF) is one of the main pillars of the smart grid vision, since a reliable prediction helps reduce the deviation in generation and, consequently, increases overall efficiency. Classic STLF methods range from statistical models to more complicated Artificial Intelligence approaches. All of them present remarkable records in certain situations while failing in others and, moreover, each one offers different information and precision. Analysing the results of the models therefore gives us the chance to 1) learn which model should be applied when, 2) correct these results, and 3) combine them to obtain a prediction of higher quality. We focus here on building STLF, a special branch that presents additional requirements, especially regarding the need for simplicity, and we explore these three post-process alternatives on the most popular STLF techniques. Specifically, we present a comparison between 4 forecasting methods and 6 forms of post-processing their results. We have tested all of them thoroughly with 4 different datasets and shown that, in this problem domain, the best forecasting method can be improved by post-processing only when it does not clearly outperform the rest, since all the analysed post-processing methods use the difference in precision among the methods to correct them.

Index Terms: Demand forecasting, Energy management, Smart grids, Power demand, Power system dynamics

Cruz E. Borges and Iván Fernández are with DeustoTech (Energy Unit), University of Deusto, Avenida de las Universidades 24, 48007 Bilbao, Spain. {cruz.borges,ivan.fernandez}@deusto.es
Yoseba K. Penya is with University of Deusto, Avenida de las Universidades 24, 48007 Bilbao, Spain. [email protected]

I. INTRODUCTION

Short-term load forecasting is a cornerstone of the everyday operation of current power networks. The accuracy of the prediction is crucial for their correct operation, and even the smallest deviation may lead to a loss of hundreds of thousands or even millions of dollars [1].

Historically, there has been intense research activity around short-term load forecasting (STLF). The solutions proposed can be divided into two main groups, depending on the strategy followed. On the one hand, statistical methods try to estimate a regression function that fits the points registered in the historical load data (i.e. the consumption record). They are very effective at approximating linear curves but, since load forecasting is non-linear, statistical methods present poorer results than their counterparts [1], [2]. On the other hand, Artificial Intelligence has produced a number of techniques, methods, and models that deal with risk and uncertainty (the main aspects behind forecasting and prediction). The most popular, due to their efficiency, are Support Vector Machines (SVM) and Neural Networks (NN) (see Section II for a more detailed description). Still, despite their accuracy, artificial intelligence methods present a number of drawbacks such as difficult parametrisation, non-obvious selection of variables and over-fitting, and they require much historical data to learn the patterns inherent in it [2]. The most popular and accurate of these methods, NN, presents inconveniences such as a very time-consuming learning process with a risk of local minima, the lack of an exact rule for setting the number of hidden neurons to avoid over-fitting or under-fitting, the inability to generate explanations for its results, and poor scalability [3].

Moreover, some models may present very good records in a certain situation but fail in others. Similarly, each one offers different information and precision. If we simply choose the method whose error is minimal as the optimum method, we may lose some important information. Model combination addresses this issue: it is a well-established procedure for improving forecasting accuracy [4] and has already been applied in other disciplines (see [5] for a survey). According to [5], [6], past research in model combination has produced two primary conclusions, one expected and one surprising. The expected conclusion is that combining forecasts reduces the error compared to the average error of the component forecasts (a conclusion also highlighted in [4], [7]). The surprising conclusion is that a simple average of the component forecasts performs as well as more sophisticated statistical approaches. Our experiments confirm both conclusions, as we will see. This technique has already been applied to STLF with combiners such as the average [8], multiple linear combination [9], or diverse machine learning techniques to determine the weights [8], [10], [11] (please see Section II for more details).

We focus here on non-residential building STLF (bSTLF), meaning schools, universities, public buildings and companies' facilities.
This novel branch presents special features. For instance, in normal country-wide STLF, the non-linearity of the load is smoothed, since expected consumption that does not take place is compensated by non-expected consumption that does; at the scale of a single building this compensation is much weaker. Moreover, the consumption curve of a building tends to be stationary, seasonal and regular, coinciding with the times the building is used. Hence, there is no consumption at night (or it is negligible) and, in any case, there is a notable gap between idle and activity times. Further, many of these buildings are not yet fully automated: either the HVAC is manually controlled or it is switched on and off remotely. In either case, it does not adapt to sudden weather changes, and this influence is already contained in the consumption data. Another critical aspect is that there is usually scarce (if any) historical data on hourly load, and the load profile is sure to vary and evolve over time (just think of the equipment an office used to have 10 years ago compared to today's fully equipped, on-line ones).

Having a good bSTLF is not as important in a grid environment (unless the owner of the building buys their own electricity) as in a micro-grid environment, where summing up the predictions of every node of the micro-grid can give better forecasts than using the aggregate data [12]. Moreover, the method or algorithm chosen for this purpose must be simple to tailor to every single case (e.g. a school cannot be expected to employ a neural network specialist to control and periodically adapt the NN that predicts its load profile). In addition, the method must not be computationally expensive, as it must scale to forecast all the nodes in the micro-grid. Finally, there is a decisive difference: non-residential buildings offer a very simple method to classify day types, since they are occupied only during working time. Therefore, the work calendar can act as a day-type predictor, avoiding in this way much of the non-linearity of the forecasting process [13], [14].

There has been remarkable research on building STLF, backed especially by forecast competitions such as [15] or [16] but, to our knowledge, this is the first attempt to apply combined forecasting to bSTLF. Against this background, we advance the state of the art by designing and developing six different meta-models for non-residential building short-term load forecasting and testing them thoroughly on four different datasets.

The remainder of the paper is structured as follows. Section II analyses related work in STLF and building STLF. Section III presents the model behaviour learners and the models themselves, and details the features of the datasets used. Section IV details the tests and discusses the obtained results.
Finally, Section V concludes and outlines avenues of future work.

II. RELATED WORK

Short-term load forecasting has a long research tradition applied to country loads (see [17], [2], [18] for comprehensive surveys on STLF), but much less applied to more specific targets (e.g. buildings). Research on STLF mainly focuses on two branches. The first one deals with statistical methods and causal models such as dynamic linear or non-linear models, ARMAX models [19], or non-parametric regression [20], with ARIMA as the method that achieves the most promising results [21]. The second group is related to artificial intelligence methods that try to cope with the non-linear characteristics of the historical data (e.g. support vector machines [22], [23] or neural networks [24], [25], [26]).

Nevertheless, STLF in buildings is a different problem domain, and there have been a number of interesting initiatives, such as using SVM to predict the load of a building complex [27], or a feedback NN that used the temperature to obtain a remarkable MAPE of 1.945% [24]. This result, however, is the load forecast for a single week in a whole year, which is not representative, and it has not been validated with other data patterns.

As mentioned above, meta models are not a new approach. The branch of work that has gathered the most attention is focused on meta-heuristics (a term coined in 1986 [28]), an upper-level strategy that controls and modifies other heuristics in order to produce solutions of higher quality [29], [30]. Still, to our knowledge, there is no single work applying this technique to non-residential building STLF. As for normal (country-wide) STLF, research on meta-heuristics has concentrated on two areas. The first one uses a meta-heuristic to calculate the best set of parameters of an SVM or a NN [31], [32], [33], [34], [35], [36], [37] (see [3] for a survey on NN-based hybrid methods), but these works suffer from the same flaws as single models (i.e. models without heuristics). The second area has explored the optimal way of combining the output of the single models, usually by assigning weights (see [7] for different approaches to this end).

This second research line in meta models is the combination of forecasts. For instance, a very simple but effective approach consists of defining equal weights, usually referred to as the Simple Average (SA) combination method (which, despite being simple, has been shown to be surprisingly effective [6], [8]). More sophisticated approaches include linear combination [9] (including diverse machine learning techniques to determine the weights [8], [10]), dynamic optimal weight combination [11], a genetic algorithm as best model selector [38], or rule-based best model selection [39], [40], [41] (which is similar to the first classifier we have designed). Please see [42] for a survey on meta-heuristics and forecast combination applied to power systems in general.

III. ADVANCED STLF IN NON-RESIDENTIAL BUILDINGS

In previous works [13], [14], we analysed the nature of our problem domain based on the validation of 3 hypotheses, all of them related to the non-linearity of the load data. The first hypothesis claimed that the influence of weather variables on the load consumption in this specific domain is negligible.
The second hypothesis defended the work calendar as a more effective and accurate day-type predictor than any clustering technique usually applied to this task. Finally, the third hypothesis maintained that the work calendar provides enough information to solve the non-linearity of the day-type prediction. Our experiments validated all three of them [13].

The methodology put forward comprises two steps. First, we classify the day whose load we want to predict according to the date and the work calendar; then, we fit the load curve of that day with the models and methods for each hour. To this end, we use three types of days: weekdays, Saturdays, and Sundays. A widespread mistake in regression-based forecasting is to predict the load from chronologically consecutive days of different types (see for instance [43]): it is not accurate to compare the load on weekends to that of weekdays. This means that predicting the load for 11am on a Monday implies selecting data from previous days of a similar type, such as Friday, Thursday, and Wednesday if the learning window is 3 days.

The next step was taken in [14], where we carried out an in-depth tuning and validation of the proposed methodology. We performed a grid search in the parameter space of each of the proposed models, following the advice given in [44], and validated our results on four different datasets.
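The day-type selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the optional `holidays` argument (treated like Sundays) are assumptions introduced only for illustration, and the work calendar is reduced to the plain weekday rule.

```python
from datetime import date, timedelta


def day_type(d, holidays=frozenset()):
    """Classify a date as 'weekday', 'saturday' or 'sunday'.

    Assumption: dates listed in `holidays` behave like Sundays.
    """
    if d in holidays or d.weekday() == 6:
        return "sunday"
    if d.weekday() == 5:
        return "saturday"
    return "weekday"


def previous_same_type_days(target, q, holidays=frozenset()):
    """Return the q most recent days before `target` that share its day type."""
    wanted = day_type(target, holidays)
    found = []
    current = target - timedelta(days=1)
    while len(found) < q:
        if day_type(current, holidays) == wanted:
            found.append(current)
        current -= timedelta(days=1)
    return found


# A Monday with a learning window of 3 days: Friday, Thursday and Wednesday.
window = previous_same_type_days(date(2009, 3, 9), 3)
```

With this rule the weekend is skipped when building the window for a Monday, which is the behaviour the text asks for; a real deployment would replace the weekday rule with the building's own work calendar.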


In this paper we move one step ahead and apply the concept of forecast combination. The idea is to post-process the outputs of every model in order to create a better forecast while remaining as simple as possible (in line with the requirements of bSTLF). With this objective in mind, we propose the methodology illustrated in Fig. 1, in which the outputs and past errors of all forecasting algorithms are the input of the post-process methods.

[Fig. 1. Flowchart of the proposed methodology. Blocks: Data Base; Algorithm 1, ..., Algorithm n; Model Prediction; Forecasted Data; Real Data; Post-Process.]

A. Forecasting algorithms

We have used the output of the following methods as inputs to feed the post-process methods. As we have seen in Section II, all of them have been widely used in the literature for this problem domain.

1) Time Series model: We have chosen an Autoregressive Model (which is commonly used for modelling univariate time series) for every hour and day type:

$$s_t^{h,d} = \sum_{i=1}^{q} \varphi_i^{h,d} \, s_{t-i}^{h,d},$$

where $\varphi_i^{h,d}$ are the model parameters. In the adjusting step, we take the last $q$ values of the same day type (e.g. with $q = 3$, for a Tuesday: the previous Monday, Friday and Thursday) and not the last $q$ chronological values (e.g. for a Tuesday: the previous Monday, Sunday and Saturday). Moreover, we assign weights (controlled by the coefficient $l$) to the days of the prediction window, in order to give a higher priority to the latest data over the oldest values, by polynomial or exponential methods. Polynomial methods produce the following parameters:

$$\varphi_i = \frac{(q-i)^l}{\sum_{i=0}^{q} (q-i)^l},$$

whereas the exponential method produces:

$$\varphi_i = \frac{2^{(q-i)}}{\sum_{i=0}^{q} 2^{(q-i)}},$$

where $q$ is the length of the learning window and $l \in \mathbb{Z}$ in the polynomial case. We have used different values for the parameter $l$. Namely, we have carried out our tests with $l \in \{0, 1, 2, 3, 4, 5, \mathrm{exp}\}$. Note that $l = 0$ corresponds to the mean of the previous values and exp denotes the exponential method.

2) Polynomial model: The second model consists of a univariate polynomial that tries to (clumsily) capture the load curve. It is defined as follows:

$$l_d(x) = \sum_{i=0}^{d} \alpha_i x^i.$$

Here, $d$ is the degree of the polynomial. It is adjusted to every single day and hour by using the least squares technique. We have tested several degrees, namely $d \in \{4, 5, 6, 7, 8\}$.

3) Neural network: NNs are non-linear structures of perceptrons (simple information processors) that adapt according to the external or internal information that flows through the network during the learning phase. Their output is a linear or non-linear function of the inputs and, therefore, they have been widely used for predicting non-linear data (as in STLF [1], [2], [24]). We have performed the tests using only one hidden layer composed of {10, 30, 50, 100} neurons with the tanh activation function.

4) Support Vector Machines: An SVM constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks. SVMs have been used for load forecasting in buildings [27]. In this case we have used a ν-SVR with a Radial Basis Function kernel and parameters $\nu = 0.9$, $\varepsilon = 10^{-2}$, $C \in \{1, 10, 100\}$ and $\gamma \in \{1, 10^{-1}, 10^{-2}\}$.

B. Post-process Methods

We have grouped the different post-process methods into three families:

1) Learning Methods: These methods use different strategies to select the forecasting algorithm that has produced the best forecasts for a given day type and hour; that algorithm is then used to make the final forecast.

a) Rule-based learning: The rule-based learner induces its rules from the historical load data. More precisely, it determines the forecasting algorithm that issues the best prediction for a given day type and hour. To this end, we store a learning window with the last $h$ errors incurred by the forecasting algorithms and choose the one that presents the lowest mean error. We have tried different learning window lengths as well as several weights. Namely, we have conducted our experiments with $h \in \{5, 10, 20, 30, 90, 180\}$ and with weights similar to those in the Time Series model.
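The window weights of the Time Series model and their reuse in the rule-based learner can be sketched as follows. This is a hedged illustration, not the authors' code: `window_weights` and `select_best` are hypothetical names, the weight formula is applied literally for $i = 0, \dots, q$, and mapping $i = 0$ to the most recent value is an assumption, since the text does not spell out the indexing.

```python
def window_weights(q, l):
    """Weights (q - i)^l or 2^(q - i), i = 0..q, normalised to sum to 1.

    `l` is an integer for the polynomial method or the string "exp".
    Assumption: index i = 0 corresponds to the most recent value.
    """
    if l == "exp":
        raw = [2.0 ** (q - i) for i in range(q + 1)]
    else:
        raw = [float(q - i) ** l for i in range(q + 1)]
    total = sum(raw)
    if total == 0:  # guard for the degenerate window q = 0 with l > 0
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def select_best(past_errors, h, l=0):
    """Pick the algorithm with the lowest weighted mean of its last h errors.

    `past_errors` maps an algorithm name to its absolute errors, oldest first,
    for one day type and hour.
    """
    scores = {}
    for name, errors in past_errors.items():
        recent = errors[-h:][::-1]  # most recent error first
        weights = window_weights(len(recent) - 1, l)
        scores[name] = sum(w * e for w, e in zip(weights, recent))
    return min(scores, key=scores.get)


errors = {"time_series": [1, 1, 5], "svm": [4, 4, 2]}
plain = select_best(errors, h=3, l=0)     # unweighted mean error
weighted = select_best(errors, h=3, l=1)  # recent errors count more
```

In this toy example the unweighted rule prefers `time_series`, whereas the linear weighting prefers `svm` because its most recent error is the smallest; this is the effect the weights are meant to have.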
b) Bayesian Network learning: A Bayesian Network is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph. In this case, the quantity computed by the Bayesian Network is the probability of each method being the one with the lowest error for a particular day type and hour. A Bayesian Network needs a training period in order to build the statistical model. As with the rule-based algorithm, we have tested different training periods; in fact, we have used the same ones. In order to avoid the data ageing


problem, we continuously buffer the training data and re-train the Bayesian Network every time the buffer is filled. Then we empty the buffer and repeat the process.

c) Neural Network and Support Vector Machine learning: We also use Neural Networks and Support Vector Machines to the same end. In order to avoid the data ageing problem, we use the same strategy as with Bayesian Networks.

2) Bias Correction: This post-process method simply adds a Gaussian random value with the same mean and standard deviation as the error measured in the learning window for this particular method, hour and day type. The aim of this strategy is to mimic the behaviour of the model given in Equation (1), in the hope of correcting the typical historical error of the output.

3) Forecast Combinations: This post-process method combines the predictions of the forecasting algorithms to improve them. Each single prediction is weighted according to the (normalised) historical error, obtained by polynomial or exponential methods. Namely:

$$w_i := \frac{e_i^{-l}}{\sum_{i=1}^{n} e_i^{-l}} \quad \text{or} \quad w_i := \frac{l^{-e_i}}{\sum_{i=1}^{n} l^{-e_i}},$$

where $w_i$ is the weight applied to the prediction of algorithm $i$, $n$ is the number of algorithms, $e_i$ is the average error of algorithm $i$ in the selected training window, and $l$ is a parameter that enables tuning this method. For the polynomial method we have tested the following values: $l = 0$ is the simple average, $l = 1$ gives the linear combination with respect to the historical error, and $l = 3$ stresses predictions with lower error. For the exponential method we have tested $l = 2$, which strengthens this bias. We have named these modalities sa, li, cu and ex, respectively (see Tab. IV and V).

IV. DISCUSSION

This section describes the results of the experiments we carried out to determine the most suitable meta model for STLF in non-residential buildings.

A. Experimental Design

We have designed two experiments in order to test the following hypotheses:
1) If there is a model that always presents better forecasting results than the rest, we cannot improve its results.
2) If the models present similarly distributed errors, the methodology described herein can improve the performance of the forecasting algorithms.

As we have seen in [13], [14], the Time Series model offers better forecasting results than the rest of our models. So, in order to test the first hypothesis, we executed all the post-process methods detailed in Subsection III-B using all the forecasting algorithms explained in Subsection III-A. Since the polynomial and NN methods present a similar (that is, poor) performance, in order to test the second hypothesis we conducted a second experiment with only these two forecasting algorithms and the post-process methods.
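The error-based weights $w_i$ of the Forecast Combinations method (Subsection III-B) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and the `method` argument are hypothetical, and the average errors $e_i$ are assumed to be strictly positive.

```python
def combination_weights(errors, l, method="poly"):
    """Normalised weights from average historical errors e_i (all > 0).

    method="poly": w_i proportional to e_i ** (-l)  (sa: l=0, li: l=1, cu: l=3)
    method="exp":  w_i proportional to l ** (-e_i)  (ex: l=2)
    """
    if method == "poly":
        raw = [e ** (-l) for e in errors]
    elif method == "exp":
        raw = [l ** (-e) for e in errors]
    else:
        raise ValueError("method must be 'poly' or 'exp'")
    total = sum(raw)
    return [r / total for r in raw]


def combine(predictions, errors, l, method="poly"):
    """Weighted combination of the single forecasts for one hour."""
    weights = combination_weights(errors, l, method)
    return sum(w * p for w, p in zip(weights, predictions))


# Two algorithms with average errors 2.0 and 4.0 predicting 100 and 130.
sa = combine([100.0, 130.0], [2.0, 4.0], l=0)  # simple average
li = combine([100.0, 130.0], [2.0, 4.0], l=1)  # linear in the inverse error
```

With $l = 0$ both forecasts count equally, while $l = 1$ gives the algorithm with half the error twice the weight; larger $l$ pushes the combination further towards the historically better algorithm.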

1) Methodology Description: As mentioned above, we selected the day type according to the work calendar. Then we fed the model with the previous $q$ days (depending on the length of the learning window) of the same type and adjusted the model to that data using the different forecasting methods. After that, we used the meta-models to choose the best forecast for every hour. We iterated over the whole dataset following this procedure. Then, we compared the predicted result with the real consumption value and computed the Mean Absolute Percentage Error (MAPE) to measure it. We have selected this error to measure the performance of the models since it is unit free; that is, it allows comparisons between forecasting errors from different measurement units. Moreover, it is the error measure most widely used in forecasting [24]. It is calculated as follows:

$$\mathrm{MAPE} := \frac{1}{\mathit{days}} \sum_{j=1}^{\mathit{days}} \left( \frac{1}{24} \sum_{i=1}^{24} \frac{|r_i^j - p_i^j|}{r_i^j} \right) \times 100,$$

where $p_i^j$ is the predicted value of the load for hour $i$ of day $j$, $r_i^j$ is the actual one, and $\mathit{days}$ is the number of days in that particular dataset.

2) Datasets: We have recorded the energy consumption of the University of Deusto in Donostia-San Sebastián (Basque Country). We downloaded this data directly from the meter, which Spanish law (54-1997) places directly at the transformer, using the IEC 60870-5-102 standard protocol [45]. This building presents a special feature, since its heating system is not regulated according to the weather: from autumn to spring, it is manually turned on every day at approximately the same time and it works until the campus is closed at night. Therefore, meteorological conditions do not influence the electricity consumption at all (the season, on the contrary, does), or this influence is somehow dissolved (that is, already represented) in the data. Furthermore, a new building was added to the campus in July 2009 (our first records date from March 2009).
This fact makes forecasting more difficult due to the noise it introduces; nevertheless, the tested algorithms also demonstrate their ability to adapt to evolving data, as we will see. For this reason, we have split the full dataset into two datasets (donosti1 and donosti2), because there is a big difference due to the increase in consumed energy after the construction of the new building. Fig. 2 and Fig. 3 show the average daily load curve for datasets donosti1 and donosti2. As shown in these figures, the curve presents quite a regular profile on working days, with consumption from 7am to 10am (open hours go from 8am to 9pm). On Saturdays, it shows a peak at noon and on Sundays it is almost flat. The first one (donosti1), with a length of 6 months (March to September 2009), is more regular and homogeneous. The second one (donosti2), with a length of 12 months (September 2009 to September 2010), shows quite an irregular profile with frequent noisy values due to the construction of the new building.
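The MAPE defined in the Methodology Description can be computed with the short Python sketch below. It is only an illustration: the function name `mape` and the list-of-days data layout are assumptions, and every real load value is assumed to be non-zero (the text notes that night consumption can be negligible, which would need special handling).

```python
def mape(real, predicted):
    """Mean Absolute Percentage Error over days of hourly loads, in percent.

    `real` and `predicted` are lists of days; each day is a list of hourly
    loads (24 values in the paper). All real values must be non-zero.
    """
    daily_errors = []
    for real_day, predicted_day in zip(real, predicted):
        hourly = [abs(r - p) / r for r, p in zip(real_day, predicted_day)]
        daily_errors.append(sum(hourly) / len(hourly))
    return 100.0 * sum(daily_errors) / len(daily_errors)


# Two toy days: a 10% error on the first day and a 5% error on the second.
real = [[100.0] * 24, [200.0] * 24]
predicted = [[110.0] * 24, [190.0] * 24]
error = mape(real, predicted)
```

Averaging first within each day and then across days mirrors the nested sums of the definition; for days of equal length this equals the plain mean over all hours.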


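TABLE I. MINIMUM EXPECTED MAPE FOR EVERY DATASET (%).

| Dataset  | Week | Saturdays | Sundays | Total |
|----------|------|-----------|---------|-------|
| donosti1 | 3.89 | 6.47      | 5.52    | 5.06  |
| donosti2 | 7.16 | 8.66      | 9.52    | 8.05  |
| ashrae   | 4.24 | 5.67      | 6.70    | 4.96  |
| eunite   | 9.22 | NA        | 8.68    | 9.20  |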

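TABLE II. BEST PARAMETERS FOR DAY-AHEAD FORECASTING.

| Models              | Donosti1 | Donosti2 | Ashrae | Eunite |
|---------------------|----------|----------|--------|--------|
| AR (l)              | exp      | exp      | exp    | 1      |
| SVM (C, γ)          | 10, 1    | 10, 1    | 10, 1  | 10, 1  |
| Polynomial (d)      | 8        | 6        | 6      | 8      |
| Neural Net (hidden) | 10       | 10       | 10     | 10     |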

Moreover, we ran our experiments on other datasets. Specifically, we used the data provided in the Eunite [16] and Ashrae [15] competitions. The Ashrae competition dataset (ashrae) comes from an unknown building and has a length of only 6 months (September 1989 to February 1990). It presents a profile similar to that of the University of Deusto datasets (donosti1 and donosti2). In contrast, the Eunite competition dataset (eunite) has a length of 24 months (January 1997 to December 1998) and comes from the whole Eastern Slovakian electricity demand. It only presents two patterns, one for weekdays and one for holidays. We extracted the work calendar from the load data. Fig. 4 and 5 portray their profiles. Note that the results obtained in both competitions are not comparable to ours. The Eunite competition asks participants to forecast the maximum daily electrical load of one month, while the Ashrae competition asks participants to predict certain gaps in the dataset and uses a custom error function. Finally, we have estimated the best MAPE that can be expected, as summarised in Table I (see the Appendix for a detailed explanation of the procedure).

B. Experimental Result

Tab. IV summarises the MAPE results obtained first by the forecasting algorithms [13] and then by the meta-models. The number in parentheses shows the length of the training window and, in the case of the ERLC (the error-respective linear combinator of Section III-B), the value of $l$. As we can see, only in one case does a meta model (the ERLC) beat the mark of the Time Series algorithm, due to the fact that the respective error rates in that situation are quite similar (which makes the Ashrae dataset closer to Experiment 2). In the rest of the tests, the meta models cannot improve the results, validating in this way our first hypothesis. Remarkably, the NN manages to learn the best algorithm and equals the results of the Time Series in 2 cases. Tab. V shows the MAPE results obtained in the second experiment, including only the results of the polynomial and the NN. Here, the ERLC again performs better than the rest, improving in every case the best result. In the Ashrae dataset, the remarkable record obtained by the rule-based model responds to the fact that the Neural Network is exceptionally good on Saturdays and Sundays but fails on

TABLE III. MAPE RESULTS IN DAY-AHEAD FORECASTING (%). IN BRACKETS, RESULTS WITH BIAS CORRECTION POST-PROCESS.

| Algorithm   | Donosti1      | Donosti2      | Ashrae      | Eunite      |
|-------------|---------------|---------------|-------------|-------------|
| Time Series | 7.34 (7.51)   | 13.78 (13.76) | 5.76 (5.86) | 6.69 (6.66) |
| SVM         | 7.92 (8.53)   | 14.25 (14.67) | 5.88 (6.06) | 7.34 (7.45) |
| Polynomial  | 11.91 (11.19) | 19.78 (18.02) | 6.94 (6.58) | 7.36 (7.30) |
| Neural Net  | 13.46 (12.23) | 17.64 (16.51) | 6.63 (6.67) | 7.78 (7.94) |

TABLE IV. MAPE RESULTS FOR EXPERIMENT 1 (%). IN BRACKETS, BEST PARAMETERS.

| Algorithm  | Donosti1     | Donosti2       | Ashrae        | Eunite       |
|------------|--------------|----------------|---------------|--------------|
| ERLC       | 7.93 (30,li) | 14.11 (180,cu) | 5.62 (180,cu) | 7.09 (30,li) |
| Rule-based | 8.42 (10,m)  | 14.99 (10,li)  | 5.87 (30,li)  | 7.74 (30,li) |
| Bayesian   | 13.54 (10)   | 16.78 (180)    | 6.72 (30)     | 8.3 (10)     |
| Meta-NN    | 7.34 (10)    | 13.78 (10)     | 5.61 (10)     | 8.15 (10)    |
| Meta-SVM   | 7.42 (180)   | 13.79 (180)    | 5.61 (180)    | 7.21 (180)   |

weekdays (worsening the average); therefore, the rule-based model learns to use the NN on weekends and the polynomial on weekdays. Finally, it is worth mentioning that both the rule-based and SA models beat more complicated methods (BN, NN and SVM), confirming the tendency observed in other domains [4], [5], [6], [7].

V. CONCLUSIONS

We have focused here on non-residential building short-term load forecasting. This special problem presents several added requirements compared to the more general STLF. Basically, any bSTLF method should be easily adaptable and not require any tedious trial-and-error customisation, be able to work with scarce and evolving historical data, and be as accurate as possible. Under these premises, we have continued previous work and applied the concept of model selection, meta-heuristic or model combination. This methodology uses the outputs of the (simple) forecasting algorithms as inputs of a model that decides the way of combining those results (if any). We have developed two meta models that combine the forecasts (as in [7], [8], [9], [10], [11]) and four that choose the best algorithm for each day type and hour. We have conducted two experiments with 4 popular forecasting algorithms applied to 4 different datasets, whose outputs are then selected or combined by one of the six meta models. Confirming previous works [4], [5], [6], [7], our results highlight that in bSTLF an error-respective linear combinator suffices to improve the forecast if and only if no forecasting algorithm clearly outperforms the rest.

Future work includes widening our scope to focus on normal STLF (free from simplicity requirements). Additionally, we plan to compare and choose the best meta model by using yet another upper level (i.e. a meta model of the meta models).

APPENDIX

In this section we present how we have computed the estimation of the minimum MAPE. Suppose that the load curve $l(h)$ of a specific day type has the following expression:

$$l(h) := f(h) + \xi_h, \qquad (1)$$

where $f$ is an unknown function and $\xi_h$ is a Gaussian random variable with mean 0 and variance $\sigma_h^2$. In our experiments we

[Fig. 2. Average daily load for dataset donosti1, one panel per day of the week (Monday to Sunday). Error bars denote ±σ.]

[Fig. 3. Average daily load for dataset donosti2, one panel per day of the week (Monday to Sunday). Error bars denote ±σ.]

[Fig. 4. Average daily load for dataset ashrae, one panel per day of the week (Monday to Sunday). Error bars denote ±σ.]

[Fig. 5. Average daily load for dataset eunite, one panel per day of the week (Monday to Sunday). Error bars denote ±σ.]

TABLE V. MAPE RESULTS FOR EXPERIMENT 2 (%). IN BRACKETS, BEST PARAMETERS.

| Algorithm  | Donosti1      | Donosti2      | Ashrae       | Eunite       |
|------------|---------------|---------------|--------------|--------------|
| ERLC       | 10.65 (30,li) | 16.2 (10,li)  | 7.27 (30,ex) | 6.06 (30,ex) |
| Rule-based | 10.85 (10,ex) | 14.36 (10,ex) | 6.09 (10,ex) | 7.74 (30,ex) |
| Bayesian   | 14.36 (10)    | 19.42 (10,li) | 6.87 (10)    | 8.17 (30)    |
| Meta-NN    | 11.85 (10)    | 17.29 (10)    | 8.05 (10)    | 6.49 (30)    |
| Meta-SVM   | 12.06 (10)    | 18.1 (10)     | 8.16 (10)    | 6.69 (30)    |

have measured (via a Gaussian test) that this is a reasonable hypothesis, at least in the case of the Time Series model. Any method that successfully forecasts the load curve $l$ will have learned $f(h)$. We may estimate the minimum expected MAPE for that case as follows:

$$\mathit{min} := E\left[\frac{100}{24}\sum_{h=1}^{24}\frac{|f(h)+\xi_h-f(h)|}{f(h)+\xi_h}\right] = \frac{100}{24}\sum_{h=1}^{24}E\left[\frac{|\xi_h|}{f(h)+\xi_h}\right]. \qquad (2)$$

Up to this point we do not have any evidence on how to compute the exact value of this expectation. Note that this would be the best theoretical error we may achieve. Our next steps aim at giving a rough estimate of Equation (2). Suppose the following bound applies:

$$f(h) + \xi_h < \max(l). \qquad (3)$$

Using the bound in Equation (3) in Equation (2) leads to:

$$\mathit{min} \geq \frac{100}{24}\,\frac{1}{\max(l)}\sum_{h=1}^{24}E[|\xi_h|].$$

As $E[|\xi_h|] = \sqrt{\frac{2}{\pi}}\,\sigma_h$ (see [46] for example) we have that:

$$\mathit{min} \geq \frac{100}{24}\sqrt{\frac{2}{\pi}}\,\frac{1}{\max(l)}\sum_{h=1}^{24}\sigma_h.$$
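As a rough numerical illustration of the final bound, the following Python sketch evaluates it from observed loads. It is not the authors' procedure: the function name and data layout are hypothetical, and $\sigma_h$ is assumed to be estimated by the population standard deviation of the loads observed at hour $h$ for one day type.

```python
import math
from statistics import pstdev


def min_mape_bound(loads_by_hour):
    """Lower bound (100/H) * sqrt(2/pi) * sum(sigma_h) / max(l), in percent.

    `loads_by_hour` holds one list per hour (H = 24 in the paper) with the
    loads observed at that hour across the days of a single day type.
    Assumption: sigma_h is estimated with the population standard deviation.
    """
    peak = max(max(observed) for observed in loads_by_hour)
    sigma_sum = sum(pstdev(observed) for observed in loads_by_hour)
    hours = len(loads_by_hour)
    return (100.0 / hours) * math.sqrt(2.0 / math.pi) * sigma_sum / peak


# Toy data: every hour was observed twice, at 90 and 110 units (sigma_h = 10).
bound = min_mape_bound([[90.0, 110.0] for _ in range(24)])
```

Because the bound divides by the peak load rather than by each hourly load, it is deliberately loose; it should be read as a floor under the achievable MAPE, not as a prediction of it.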

We may then estimate $\sigma_h$, for instance, from the sample variance of the load at hour $h$, taking $\sigma_h \approx \sqrt{\mathrm{Var}(l(h))}$.

REFERENCES

[1] H. Alfares and M. Nazeeruddin, “Electric load forecasting: literature survey and classification of methods,” International Journal of Systems Science, vol. 33, no. 1, pp. 23–34, 2002.
[2] V. Hinojosa and A. Hoese, “Short-term load forecasting using fuzzy inductive reasoning and evolutionary algorithms,” IEEE Transactions on Power Systems, vol. 25, no. 1, pp. 565–574, 2010.
[3] A. ul Asar, S. R. ul Hassnain, and A. U. Khattack, “A multiagent approach to short term load forecasting problem,” International Journal of Intelligent Control and Systems, vol. 10, pp. 52–59, 2005.
[4] M. Hibon and T. Evgeniou, “To combine or not to combine: selecting among forecasts and their combinations,” International Journal of Forecasting, vol. 21, no. 1, pp. 15–24, 2004.
[5] R. Clemen, “Combining forecasts: A review and annotated bibliography,” International Journal of Forecasting, vol. 5, pp. 559–583, 1989.
[6] J. Scott-Armstrong, “Combining forecasts: The end of the beginning or the beginning of the end,” International Journal of Forecasting, vol. 5, pp. 585–588, 1989.
[7] L. DeMenezes, D. Bunn, and J. Taylor, “Review of guidelines for the use of combined forecasts,” European Journal of Operational Research, no. 120, pp. 10–204, 2000.
[8] R. Prudencio and T. Ludermir, “Using machine learning techniques to combine forecasting methods,” in Lecture Notes in Artificial Intelligence, 2004, pp. 1122–1127.
[9] K.-B. Song, Y.-S. Baek, and D.-H. Hong, “Short-term load forecasting for the holidays using fuzzy linear regression method,” IEEE Transactions on Power Systems, vol. 20, no. 1, pp. 96–101, 2005.
[10] R. Prudencio and T. Ludermir, “A machine learning approach to define weights for linear combination of forecasts,” in 16th International Conference on Artificial Neural Networks, 2006, pp. 274–283.
[11] Y.-X. Jin and J. Su, “Similarity clustering and combination load forecasting techniques considering the meteorological factors,” in Proceedings of the 6th WSEAS International Conference on Instrumentation, Measurement, Circuits and Systems. World Scientific and Engineering Academy and Society (WSEAS), 2007, pp. 115–119.
[12] E. F. Sánchez-Úbeda and A. Berzosa, “Modeling and forecasting industrial end-use natural gas consumption,” Energy Economics, vol. 29, no. 4, pp. 710–742, 2007, Modeling of Industrial Energy Consumption.
[13] Y. Penya, C. Borges, D. Agote, and I. Fernandez, “Short-term load forecasting in air-conditioned non-residential buildings,” in Proceedings of the 20th IEEE International Symposium on Industrial Electronics (ISIE). IEEE, 2011, pp. 1359–1364.
[14] I. Fernández, C. Borges, and Y. Penya, “Efficient building load forecasting,” in Proceedings of the 16th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), 2011, in press.
[15] J. Kreider and J. Haberl, “Predicting hourly building energy usage: The great energy predictor shootout – overview and discussion of results,” ASHRAE Trans. 100, part, no. 2, p. 1104, 1994.
[16] E. C. Comite, “Electricity load forecast using inteligent adaptative technology,” 2001.
[17] E. Feinberg and D. Genethliou, “Load forecasting,” in Applied Mathematics for Power Systems, Chapter 12, 2005, pp. 269–285.
[18] E. Kyriakides and M. Polycarpou, “Short term electric load forecasting a tutorial,” Studies in Computational Intelligence (SCI), vol. 35, pp. 391–418, 2009.
[19] H. Yang and C. Huang, “A new short-term load forecasting approach using self-organizing fuzzy ARMAX models,” IEEE Transactions on Power Systems, vol. 13, no. 1, pp. 217–225, 1998.
[20] W. Charytoniuk, M. Chen, and P. Van-Olinda, “Non parametric regression based short-term load forecasting,” IEEE Transactions on Power Systems, vol. 13, no. 3, pp. 725–730, 1998.
[21] M. Hagan and S. Behr, “The time series approach to short term load forecasting,” IEEE Transactions on Power Systems, vol. 2, no. 3, pp. 785–791, 1987.
[22] A. Jain and B. Satish, “Clustering based short term load forecasting using support vector machines,” in Proceedings of the IEEE Bucharest PowerTech, 2009, pp. 1–8.
[23] S. Lin, Z. Lee, S. Chen, and T. Tseng, “Parameter determination of support vector machine and feature selection using simulated annealing approach,” Applied Soft Computing, vol. 8, no. 4, pp. 1505–1512, 2008.
[24] P. González and J. Zamarreño, “Prediction of hourly energy consumption in buildings based on a feedback artificial neural network,” Energy and Buildings, vol. 37, no. 6, pp. 595–601, 2005.
[25] S. E. Papadakis, J. B. Theocharis, S. J. Kiartzis, and A. G. Bakirtzis, “A novel approach to short-term load forecasting using fuzzy neural networks,” IEEE Transactions on Power Systems, vol. 13, no. 2, pp. 480–492, 1998.
[26] R. Sadownik and E. P. Barbosa, “Short-term forecasting of industrial electricity consumption in Brazil,” International Journal of Forecasting, vol. 18, no. 3, pp. 215–224, 1999.
[27] B. Dong, C. Cao, and S. Lee, “Applying support vector machines to predict building energy consumption in tropical region,” Energy and Buildings, vol. 37, no. 5, pp. 545–553, 2005.
[28] F. Glover, “Future paths for integer programming and links to artificial intelligence,” Computers and Operations Research, no. 5, pp. 533–549, 1986.
[29] C. Johnson, “A design framework for metaheuristics,” Artificial Intelligence Review, vol. 29, pp. 163–178, 2008.
[30] C. Lemke and B. Gabrys, “Meta-learning for time series forecasting and forecast combination,” Neurocomputing, vol. 73, no. 10-12, pp. 2006–2016, 2010.
[31] Z. Liuzhang, “Short-term electric load forecasting with combined data mining algorithm,” Automation of Electric Power Systems, 2006.
[32] C.-H. Wu, G.-H. Tzeng, and R.-H. Lin, “A novel hybrid genetic algorithm for kernel function and parameter optimization in support vector regression,” Expert Syst. Appl., vol. 36, pp. 4725–4735, April 2009.
[33] G.-C. Liao, “Hybrid chaos search genetic algorithm and meta-heuristics method for short-term load forecasting,” Electrical Engineering (Archiv für Elektrotechnik), vol. 88, pp. 165–176, 2006.
[34] Z. Yang, X. Nie, W. Xu, and J. Guo, “An approach to spam detection by naive Bayes ensemble based on decision induction,” in Proc. of the 6th International Conference on Intelligent Systems Design and Applications (ISDA’06), 2006, pp. 861–866.
[35] G.-s. Hu, F.-f. Zhu, and Y.-z. Zhang, “Short-term load forecasting based on fuzzy c-mean clustering and weighted support vector machines,” in Proceedings of the Third International Conference on Natural Computation - Volume 05, ser. ICNC ’07. Washington, DC, USA: IEEE Computer Society, 2007, pp. 654–659.
[36] Z. Ismail, F. Jamaluddin, and F. Jamaludin, “Time series regression model for forecasting Malaysian electricity load demand,” Asian Journal of Mathematical Statist, no. 1, pp. 139–149, 2008.
[37] V. Ferreira and A. Pinto-Alves-da Silva, “Automatic kernel based models for short term load forecasting,” in 15th International Conference on Intelligent System Applications to Power Systems (ISAP), 2009, pp. 1–6.
[38] P. Cortez, M. Rocha, and J. Neves, “A meta-genetic algorithm for time series forecasting,” 2001.
[39] F. Collopy and J. S. Armstrong, “Rule-based forecasting: development and validation of an expert systems approach to combining time series extrapolations,” Management Science, vol. 10, pp. 1394–1414, 1992.
[40] K. Hwang, “A STLF expert system,” in Proceedings of 5th Russian-Korean IEEE International Symposium on Science and Technology, 2001, pp. 112–116.
[41] X. Wang, K. Smith-Miles, and R. Hyndman, “Rule induction for forecasting method selection: Meta-learning the characteristics of univariate time series,” Neurocomputing, vol. 72, no. 10-12, pp. 2581–2594, 2009, Lattice Computing and Natural Computing (JCIS 2007) / Neural Networks in Intelligent Systems Design (ISDA 2007).
[42] M. M. Teresa, M. Teresa, P. Leo, J. T. Saraiva, J. Nuno, F. V. Mir, J. Lus, P. J. Peas, L. J. Rui, F. Jorge, and M. C. Pereira, “Meta-heuristics applied to power systems,” 2001.
[43] C. García-Ascanio and C. Maté, “Electric power demand forecasting using interval time series: A comparison between VAR and iMLP,” Energy Policy, vol. 38, no. 2, pp. 715–725, February 2010.
[44] C.-W. Hsu, C.-C. Chang, and C.-J. Lin, “A practical guide to support vector classification,” 2010.
[45] IEC60870: Telecontrol equipment and systems - Part 5: Transmission protocols - Section 102: Companion standard for the transmission of integrated totals in electric power systems, 1st ed., IEC, 1996.
[46] J. Patel and C. Read, Handbook of the normal distribution, ser. Statistics, textbooks and monographs. Marcel Dekker, 1996.


Cruz E. Borges was born in 1983 in the Canary Islands. In 2005 he received his diploma degree in Mathematics from the University of La Laguna (Spain). He then joined the Ph.D. program “Mathematics and its Applications” at the University of Cantabria under the supervision of Luis M. Pardo and J. Luis Montaña. His thesis dealt with root-finding and symbolic regression problems. He has also worked on Genetic Programming and on Numerical and Evolutionary Methods for Decision Problems. He currently works in the Energy Unit of DeustoTech on prediction models of power demand and energy consumption, as well as on the introduction of Automatic Modelling in catalyst design processes.

Yoseba K. Penya was born in 1977 in the Basque Country. In 2000 he received his diploma degree in Computer Science (Dipl.-Ing.) from the University of Deusto (Bilbao, Basque Country) and joined the Institute of Computer Technology (Vienna University of Technology), where he had also completed his diploma thesis. In the summer of 2003 he spent a 3-month sabbatical at the University of Southampton with his PhD co-advisor Nick Jennings, carrying out research on agent-based markets for energy management. In 2004 he moved to the Vienna University of Economics and Business Administration (Department of Information Systems - New Media Lab) as an assistant. In April 2006 he received his doctorate with honours under the supervision of Prof. Dietmar Dietrich (Austrian IEEE President, Institute of Computer Technology, TU Vienna) and Prof. Nick Jennings (School of Electronics and Computer Science, University of Southampton). Since September 2007 he has been back at the University of Deusto as a lecturer (at the Faculty of Economics and Business Administration-ESTE) and research fellow (within the Deusto Technology Foundation), where he leads the Smart Grids group of the Energy research area. He also holds a degree in Anthropology from the UNED (National Spanish Distant University) and a postgraduate degree in Basque Culture and Language Transmission from the HUHEZI (Faculty of Humanities and Education, University of Mondragon). He is currently a member of the IES Technical Committee on Building Automation, Control and Management (TC BACM).

Iván Fernández is a Telecommunications Engineer. He is currently pursuing a Master's degree in Integration of Renewable Energy in Electric Grids. His research is related to microgrids and electrical safety modelling, as well as prediction models of energy consumption. He has also worked as a Telecommunications Consultant at Telefónica.
