Applied Energy 87 (2010) 925–933
A data-driven approach for steam load prediction in buildings

Andrew Kusiak *, Mingyang Li, Zijun Zhang

Department of Mechanical and Industrial Engineering, 3131 Seamans Center, The University of Iowa, Iowa City, IA 52242-1527, USA
Article history: Received 18 June 2009; Received in revised form 29 August 2009; Accepted 3 September 2009; Available online 30 September 2009

Keywords: Data mining; Building load estimation; Steam load prediction; Neural network ensemble; Energy forecasting; Monte Carlo simulation; Parameter selection
Abstract

Predicting building energy load is important in energy management. This load is often the result of steam-based heating and cooling of buildings. In this paper, a data-driven approach to the development of a daily steam load model is presented. Data-mining algorithms are used to select significant parameters for model development. A neural network (NN) ensemble with five multi-layer perceptrons (MLPs) performed best among all data-mining algorithms tested and was therefore selected to develop the predictive model. To meet the constraints of existing energy management applications, Monte Carlo simulation is used to investigate how uncertainty in the weather forecast data propagates through the model. Based on the formulated model and weather forecast data, future steam consumption is estimated. The latter allows optimal decisions to be made in fuel purchasing, steam boiler scheduling, and building energy management.

© 2009 Elsevier Ltd. All rights reserved.
1. Introduction

Predicting the load of heating, ventilating and air-conditioning (HVAC) systems is important for energy management, especially during peak demand hours [1]. Prior research covers both long- and short-term prediction of heating and cooling loads. The approaches discussed in the literature include exponential smoothing [2], multiple regression [3], Kalman filters [4], and state estimation [5]. The autoregressive integrated moving average (ARIMA) model [6,7] is an example of a statistical approach applied to predictions several hours ahead. Time-series models capture the relationship between energy usage and time given time-series data. However, statistical time-series models may become numerically unstable and inaccurate if highly correlated factors such as weather-related variables are ignored [8]. Traditional models usually reflect stationary linear relationships between the load and weather variables; the nonlinearity and complexity of the weather–load relationship make them impractical, especially for long-term forecasting [9]. In recent years, considerable attention has been given to data-driven methods, e.g., neural networks (NNs) [9–11] and support vector machines (SVMs) [12]. Unlike energy analysis based on analytical models [13–15] and industrial methods [16,17], data-mining algorithms offer powerful tools for the discovery of models from large volumes of data. Another advantage of data-derived models is that they can be easily updated with new data. This is especially important as the relationship
between the weather and the load is nonstationary and nonlinear. Elkateb et al. [18] applied a fuzzy neural network to medium-term load forecasting and enhanced prediction performance by introducing a time-index feature. Hou et al. [19] combined rough set theory and support vector machines to predict the cooling load; the rough set approach was applied to find the parameters relevant to the load. Based on building cooling loads calculated by software, Li et al. [20] compared different supervised learning algorithms for load prediction. Yang et al. [21] evaluated the performance of an adaptive NN in modeling unexpected pattern changes in incoming data and presented a real-time approach to on-line building energy prediction. In this paper, the steam load is used to represent the heating and cooling load of buildings, as most of the steam consumption is absorbed by these two loads. Data-mining algorithms are used to develop a nonlinear mapping between the steam load and the outside weather data. Computational experiments determined that a multi-layer perceptron (MLP) ensemble outperformed the other algorithms, and it was therefore selected for the development of a daily steam load predictive model. The future steam load is estimated by feeding the predictive model with weather forecast data. The steam prediction model is of interest to building energy management, fuel management, and boiler operations and maintenance scheduling.

2. Data description and methodology for steam load prediction

This research is based on data provided by the University of Iowa (UI) Energy Management Department. The UI Power Plant
produces steam consumed by over 100 buildings, including the University Hospital. In the cold season, steam provides heat to the buildings; in the cooling season, it is predominantly used to run chillers. Weather patterns impact the total steam consumption. The weather and steam load data are stored in a data historian (PI system). A predictive model is built from the historical steam consumption and weather data collected from 2004 to 2007. Once weather forecast data becomes available, the steam load can be computed with the predictive model. The diagram in Fig. 1 illustrates the model extraction and steam load estimation process. Though the historical data has been collected at a higher frequency (here 1 min), the weather forecast frequency is a limiting factor, and therefore daily intervals are considered. However, once validated for daily loads, the methodology proposed in this paper can be applied to estimate loads at any frequency supported by the data, e.g., 1 min, 10 min, or 1 h. A basic description of the data used in this research is provided in Table 1. The data in Table 1 has been preprocessed: missing and abnormal (e.g., outside of the physical range) values have been removed.

3. Case study

In this section, the parameters, training data, and algorithm are selected for the development of a steam load model.

3.1. Parameter selection and dimensionality reduction

The historical steam load and outside weather data stored in the PI system at 1-min time intervals are used. The dataset of 1000
[Fig. 1. Modeling process: outside weather patterns and the actual steam load are input to a supervised learning algorithm, producing a steam load predictive model; with fixed HVAC operations, weather forecasting data is fed to the model to obtain the estimated steam load.]
data points contains values of 13 parameters. Six of them are related to the power plant sideline loads. The daily total steam load is calculated by summing up these values. The remaining parameters include outside air temperature, outside air humidity, barometric pressure, wind speed, rain gauge, solar radiation, and wind position. Unfortunately, for some parameters most of the data was missing. Only two parameters, air temperature and humidity, were somewhat complete. To represent the daily weather patterns, four attributes are computed for each parameter: the mean, standard deviation, maximum, and minimum value. After data transformation, nine parameters are derived from the original data set, as illustrated in Table 2. Among the eight input parameters (the last eight rows in Table 2), some may be redundant or even irrelevant to the steam load prediction, and therefore parameter selection is needed. Eliminating parameters that are less important may improve the comprehensibility, scalability, and possibly, accuracy of the resulting models [22]. The correlation data shown in Table 3 demonstrates the strength of linear relationships among different parameters. Assuming each parameter is a random variable, correlation quantifies the strength and direction of the linear relationship among random variables [23]. As shown in Table 3, when the correlation threshold is set at 0.2, four variables, namely Temp_max, Temp_mean, Humidity_std and Humidity_min, correlate to the output Total_day. Among these four variables, Temp_mean and Temp_max have a strong linear relationship. The Humidity_std and Humidity_min are also strongly correlated. Therefore, the input dimension can be reduced to two, one reflecting the outside air temperature and the other humidity. Correlation indicates a degree of linear relationship among variables; however, the total dependence structure cannot be fully conveyed by this statistical measure. 
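The correlation-based screening described above can be sketched as follows. This is an illustrative Python sketch on synthetic data: the variable names and the 0.2 threshold follow Section 3.1, but the data and the 0.9 redundancy cutoff are assumptions, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily records standing in for the transformed parameters of
# Table 2 (illustrative only; not the actual UI Power Plant data).
n = 365
temp_mean = rng.normal(50.0, 20.0, n)
humidity_min = rng.uniform(20.0, 90.0, n)
temp_max = temp_mean + rng.normal(10.0, 3.0, n)   # strongly tied to temp_mean
total_day = (6000.0 - 40.0 * temp_mean + 15.0 * humidity_min
             + rng.normal(0.0, 200.0, n))

inputs = {"Temp_mean": temp_mean, "Temp_max": temp_max,
          "Humidity_min": humidity_min}

# Keep inputs whose |correlation| with the output exceeds the 0.2 threshold.
relevant = [name for name, x in inputs.items()
            if abs(np.corrcoef(x, total_day)[0, 1]) > 0.2]

# Among the relevant inputs, drop one variable of each strongly correlated
# pair (the |r| > 0.9 cutoff is assumed), mirroring the reduction of
# Temp_max/Temp_mean and Humidity_std/Humidity_min to two inputs.
selected = []
for name in relevant:
    if all(abs(np.corrcoef(inputs[name], inputs[kept])[0, 1]) <= 0.9
           for kept in selected):
        selected.append(name)

print(selected)
```

On this synthetic data, one temperature variable and one humidity variable survive, matching the two-input reduction of the paper.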
Data mining offers ways to select parameters and overcome the ''curse of dimensionality'' phenomenon [24]. High-dimensional data may be redundant, and some of the data may not be useful, thus increasing the complexity of data mining. Dimensionality reduction (parameter selection) discards unrelated and redundant data. In this paper, a boosting tree algorithm is used, as it shares the advantages of decision tree induction and tends to be robust in the removal of irrelevant parameters [25,26]. In the boosting method, a sequence of binary trees is built. Each tree focuses on learning the instances mispredicted by the previous trees, based on the prediction error. All trees are integrated, with different weights, in a single model. In the boosting tree algorithm, a split at every node of any regression tree is based on certain criteria, e.g., minimization of the total regression error, as used in this paper. In the process of generating successive trees, the statistical importance of each variable at each split of every tree is accumulated and normalized. Predictors with a higher importance rank contribute more to the predicted output parameter. Table 4 lists the predictor importance produced by the boosting tree algorithm. As shown in Table 4, if the threshold value for the rank is set at 40, the five variables above this value are considered important. Based on the correlation analysis and the results produced by the boosting tree algorithm (see Tables 3 and 4), two variables, Temp_mean and Humidity_min, are selected as inputs to the steam load prediction model.

Table 1
Data set description.

Data set | Time period | Description
Training data set | 1/1/2004–12/31/2005 | 722 observations; used for parameter selection, algorithm selection, and data split
Validation data set | 1/1/2006–12/31/2006 | 357 observations; used for parameter selection, validation of the algorithm, and data split
Test data set | 1/1/2007–12/31/2007 | 364 observations; used to test models

Table 2
Descriptions of transformed parameters.

Transformed parameter | Description | Unit
Total_day | Total steam load per day | klbs
Temp_mean | Mean value of the outside air temperature per day | °F
Temp_std | Standard deviation of the outside air temperature per day | °F
Temp_max | Maximum value of the outside air temperature per day | °F
Temp_min | Minimum value of the outside air temperature per day | °F
Humidity_mean | Mean value of the outside air humidity per day | RH
Humidity_std | Standard deviation of the outside air humidity per day | RH
Humidity_max | Maximum value of the outside air humidity per day | RH
Humidity_min | Minimum value of the outside air humidity per day | RH

Table 3
Correlation coefficient values between different parameters.

Parameter pair | Correlation coefficient
Temp_mean, Total_day | 0.3906
Temp_std, Total_day | 0.1789
Temp_max, Total_day | 0.3928
Temp_min, Total_day | 0.1458
Humidity_mean, Total_day | 0.162
Humidity_std, Total_day | 0.2228
Humidity_max, Total_day | 0.0034
Humidity_min, Total_day | 0.2067
Temp_std, Temp_mean | 0.2772
Temp_max, Temp_mean | 0.9188
Temp_min, Temp_mean | 0.4702
Humidity_mean, Temp_mean | 0.1008
Humidity_std, Temp_mean | 0.2821
Humidity_max, Temp_mean | 0.0931
Humidity_min, Temp_mean | 0.2097
Temp_max, Temp_std | 0.5584
Temp_min, Temp_std | 0.5739
Humidity_mean, Temp_std | 0.439
Humidity_std, Temp_std | 0.4232
Humidity_max, Temp_std | 0.222
Humidity_min, Temp_std | 0.4808
Temp_min, Temp_max | 0.2409
Humidity_mean, Temp_max | 0.1988
Humidity_std, Temp_max | 0.3696
Humidity_max, Temp_max | 0.0401
Humidity_min, Temp_max | 0.3149
Humidity_mean, Temp_min | 0.315
Humidity_std, Temp_min | 0.1433
Humidity_max, Temp_min | 0.2977
Humidity_min, Temp_min | 0.2627
Humidity_std, Humidity_mean | 0.5734
Humidity_max, Humidity_mean | 0.7742
Humidity_min, Humidity_mean | 0.9187
Humidity_max, Humidity_std | 0.0072
Humidity_min, Humidity_std | 0.8027
Humidity_min, Humidity_max | 0.554

Table 4
Predictor rank and importance produced by the boosting tree algorithm.

Predictor | Variable rank | Importance
Temp_mean | 100 | 1.000000
Temp_max | 85 | 0.852059
Temp_min | 68 | 0.678650
Humidity_mean | 56 | 0.560133
Humidity_min | 48 | 0.481238
Humidity_max | 39 | 0.386233
Temp_std | 37 | 0.372589
Humidity_std | 35 | 0.347607

3.2. Algorithm selection

After parameter transformation and selection, the steam prediction model is expressed in (1).

y_steam = f(x_Temp_mean, x_Humidity_min)    (1)

To extract the mapping among these variables, several data-mining algorithms, namely CART, CHAID, exhaustive CHAID, boosting tree, MARSplines, random forest, SVM, MLP, MLP ensemble, and k-NN, are used. The implementations included in the data-mining software Statistica have been used. CART (classification and regression tree) is a common method for building statistical models in a tree-building fashion. Developed by Breiman et al. [27], CART constructs binary trees for both classification and
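The importance ranking produced by the boosting tree can be illustrated with scikit-learn's gradient boosting as a stand-in for the Statistica implementation (an assumption; the two implementations differ in detail). The data below is synthetic, and the attribute names are reused from Table 2 purely for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Synthetic stand-ins for three candidate weather attributes
# (illustrative only; not the actual data set).
n = 700
X = rng.normal(size=(n, 3))
# The output depends strongly on column 0, weakly on column 2,
# and not at all on column 1.
y = -3.0 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.3, size=n)

# Learning rate 0.1, 200 additive trees, and a 0.5 subsample per tree
# match the boosting-tree settings reported in Section 3.2.
model = GradientBoostingRegressor(learning_rate=0.1, n_estimators=200,
                                  subsample=0.5, random_state=1)
model.fit(X, y)

# Split-based importance per predictor, normalized to sum to 1; inputs
# whose rank falls below a chosen threshold would be discarded.
names = ["Temp_mean", "Humidity_max", "Humidity_min"]
for name, imp in sorted(zip(names, model.feature_importances_),
                        key=lambda p: -p[1]):
    print(f"{name}: {imp:.3f}")
```

The dominant synthetic driver receives almost all of the importance mass, mirroring how Temp_mean tops the ranking in Table 4.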
regression. It predicts continuous as well as categorical variables, with minimization of the squared prediction error as the split criterion. Chi-squared automatic interaction detection (CHAID) [28] and exhaustive CHAID [29] are decision-tree algorithms allowing multiple splits of a node. Exhaustive CHAID is derived from CHAID and involves more thorough merging than standard CHAID. Boosting tree [25,26] is a data-mining algorithm applied to regression and classification. Multivariate adaptive regression splines (MARSplines) [30] is a nonparametric procedure for solving regression-type problems; it predicts continuous values based on a set of predictors. Random forest is a data-mining method for classification and regression introduced by Breiman and Cutler [31]. Unlike a standard classification tree, which uses the best split among all variables at each node, the random-forest algorithm uses the best split among a subset of randomly selected predictors at that node. Support vector machine (SVM) [26] is a supervised learning method based on kernel functions used for classification and function approximation. A kernel function transforms the original parameter space into a high-dimensional space where a separating hyperplane with the maximum margin is constructed. Multi-layer perceptron (MLP) [32,33] is a commonly used feed-forward neural network with numerous units organized into multiple layers. By adaptively adjusting the weights among units under supervised learning, the MLP identifies and learns patterns from input data sets and the corresponding target values. Due to the high variance of individual MLPs, the MLP ensemble method [24] combines multiple models to achieve better prediction accuracy than any individual MLP could on its own.
The k-nearest neighbor (k-NN) algorithm is a simple machine-learning method [24] based on the concept that objects similar to each other are likely to belong to the same category; it is used for the prediction of continuous and categorical variables. As shown in Table 1, the data set of 2004–2005 is used to build a model, while the 2006 data is used to validate it. The following four metrics, defined in (2)–(7), are used to measure the prediction accuracy of the model: the mean absolute error (MAE), standard deviation of the absolute error (Std_AE), mean absolute percentage error (MAPE), and standard deviation of the absolute percentage error (Std_APE) [34]:
AE = |ŷ − y|    (2)

MAE = (1/N) Σ_{i=1}^{N} AE(i)    (3)

APE = |ŷ − y| / y    (4)

MAPE = (1/N) Σ_{i=1}^{N} APE(i)    (5)

Std_AE = sqrt( Σ_{i=1}^{N} (AE(i) − MAE)² / (N − 1) )    (6)

Std_APE = sqrt( Σ_{i=1}^{N} (APE(i) − MAPE)² / (N − 1) )    (7)
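For reference, the four metrics of Eqs. (2)–(7) can be computed as in the sketch below. The loads in the example are hypothetical; note the N − 1 denominator in both standard deviations, and that MAPE is returned as a fraction (multiply by 100 for percent).

```python
import numpy as np

def prediction_errors(y, y_hat):
    """MAE, Std_AE, MAPE, and Std_APE as defined in Eqs. (2)-(7)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ae = np.abs(y_hat - y)                 # Eq. (2)
    ape = ae / np.abs(y)                   # Eq. (4)
    return {"MAE": ae.mean(),              # Eq. (3)
            "Std_AE": ae.std(ddof=1),      # Eq. (6), N - 1 denominator
            "MAPE": ape.mean(),            # Eq. (5), as a fraction
            "Std_APE": ape.std(ddof=1)}    # Eq. (7), N - 1 denominator

# Example with made-up daily loads (klbs):
stats = prediction_errors([4000, 5000, 3000], [4100, 4800, 3150])
print(stats)
```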
where AE in (2) is the absolute error, ŷ is the predicted value obtained from the model, y is the actual measured value, and N is the number of data points used for model training, validation, or testing. Table 5 presents the training and test accuracy of the models built with the different data-mining algorithms.

Table 5
Training and testing accuracy results for models extracted with different data-mining algorithms.

Algorithm | Data set | MAE | Std_AE | MAPE (%) | Std_APE (%)
CART | Training 2004–2005 | 133.2204 | 165.6246 | 3.30 | 4.49
CART | Testing 2006 | 649.1558 | 530.2855 | 18.15 | 16.71
CHAID | Training 2004–2005 | 405.3211 | 352.1727 | 9.90 | 10.24
CHAID | Testing 2006 | 577.5735 | 440.2995 | 16.20 | 13.70
Exhaustive CHAID | Training 2004–2005 | 398.4697 | 347.7473 | 9.71 | 10.12
Exhaustive CHAID | Testing 2006 | 569.5217 | 434.9203 | 15.87 | 13.28
Boosting tree regression | Training 2004–2005 | 365.7726 | 334.6696 | 9.18 | 10.34
Boosting tree regression | Testing 2006 | 540.3095 | 408.6700 | 15.18 | 12.88
Random forest | Training 2004–2005 | 360.6245 | 319.8897 | 9.05 | 10.14
Random forest | Testing 2006 | 561.6843 | 407.1481 | 15.99 | 13.38
MARSplines | Training 2004–2005 | 344.1872 | 345.7752 | 8.75 | 10.77
MARSplines | Testing 2006 | 512.4989 | 413.0148 | 14.47 | 12.98
SVM | Training 2004–2005 | 439.9067 | 353.3947 | 11.35 | 12.18
SVM | Testing 2006 | 648.5277 | 431.8684 | 18.56 | 14.67
MLP | Training 2004–2005 | 340.5511 | 341.5077 | 8.65 | 10.57
MLP | Testing 2006 | 512.6047 | 414.7041 | 14.53 | 13.14
MLP ensemble | Training 2004–2005 | 338.7780 | 340.8402 | 8.60 | 10.52
MLP ensemble | Testing 2006 | 510.1599 | 412.7694 | 14.44 | 13.06
k-NN | Training 2004–2005 | 299.6001 | 311.2201 | 7.57 | 9.36
k-NN | Testing 2006 | 548.8362 | 423.5867 | 15.29 | 13.18

In the CART algorithm, minimization of the misclassification cost is used as the splitting criterion; splitting continues as long as the minimum number of cases at a node is more than five and the maximum number of nodes does not exceed 1000. For the CHAID and exhaustive CHAID algorithms, the p-value is set at 0.05 for splitting and merging. The boosting tree regression uses a learning rate of 0.1, and the maximum number of additive trees equals 200. To avoid overfitting, each consecutive tree is built using a subset of the data, with the subset proportion set to 0.5. In the random-forest algorithm, the maximum number of trees in the forest is set to 200, and the subset proportion is 0.5. For the MARSplines algorithm, the number of basis functions and the corresponding weight coefficients are adjusted to minimize the least-squares error; the maximum number of basis functions is set to 21. The radial basis function (RBF) is used as the kernel function in the SVM algorithm, with the capacity factor set to 10 to avoid overfitting. For the hidden and output neurons of the MLP neural network, five different activation functions are considered, namely the logistic, identity, tanh, exponential, and sine functions. The number of hidden units is set between 5 and 18, and the weight decay for both the hidden and output layers varies from 0.0001 to 0.001. The MLP ensemble involves five MLPs. In the k-NN algorithm, k is set to 5. Based on the training and testing errors, the computational results reported in Table 5 show that the MLP ensemble performs best on the MAE and MAPE metrics. Therefore, it is selected as the algorithm for building the steam load model in the remainder of the paper.

To demonstrate the importance of parameter selection, along with the two variables selected in Section 3.1, three other scenarios choosing different input variables based on the results of the correlation matrix (Table 3) and the boosting tree (Table 4) are shown in Table 6. The training and test MAPEs of the models generated by the MLP ensemble for the four scenarios of Table 6 are compared in Fig. 2.

Table 6
Four scenarios of parameter selection.

Scenario | Selected parameters | Description
1 | Temp_mean, Humidity_min | Based on the selection procedure of Section 3.1
2 | Temp_mean, Temp_std, Temp_max, Temp_min, Humidity_mean, Humidity_std, Humidity_max, Humidity_min | All eight temperature and humidity inputs
3 | Temp_max, Temp_mean, Humidity_std, Humidity_min | Based on the correlation coefficients of Table 3
4 | Temp_mean, Temp_max, Temp_min, Humidity_mean, Humidity_min | Based on the boosting tree algorithm

[Fig. 2. MAPEs for the four parameter selection scenarios of Table 6.]
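As a rough illustration of the selected approach, the sketch below builds an ensemble of five MLPs with scikit-learn and averages their predictions. The synthetic data, the hidden-layer sizes, and the plain-averaging combiner are assumptions for illustration; the actual architectures and activation functions of the four deployed ensembles are listed later in Table 10, and the Statistica configuration may combine members differently.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)

# Synthetic mapping from (Temp_mean, Humidity_min) to daily steam load
# (illustrative only; not the actual UI data).
X = np.column_stack([rng.uniform(0.0, 90.0, 600),
                     rng.uniform(10.0, 95.0, 600)])
y = 7000.0 - 45.0 * X[:, 0] + 4.0 * X[:, 1] + rng.normal(scale=150.0, size=600)

# Five MLPs with differing hidden-layer sizes (5-18 hidden units, as in
# the search range of Section 3.2); sizes themselves are assumptions.
members = [MLPRegressor(hidden_layer_sizes=(h,), max_iter=3000,
                        random_state=h).fit(X, y)
           for h in (5, 8, 11, 14, 18)]

def ensemble_predict(X_new):
    # Combine the members by averaging their individual predictions.
    return np.mean([m.predict(X_new) for m in members], axis=0)

print(ensemble_predict(np.array([[32.2, 56.4]])))
```

Averaging reduces the variance of the individual networks, which is the stated motivation for using the ensemble instead of a single MLP.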
As shown in Fig. 2, though the two-parameter scenario produces a larger training error than any other scenario, its test result has the smallest mean absolute percentage error. It should also be noted that the two-parameter scenario has the smallest gap between the training and testing error. A smaller number of inputs not only leads to stable prediction accuracy but also tends to reduce model variance.
3.3. Data split selection

In this section, different data split scenarios are investigated to improve the model's accuracy. In Section 3.2, the data from 2004 to 2005 was used as the training data set. In predicting the steam load of a certain month, seasonal effects may strongly influence the results; for example, predicting the steam load in winter differs much from predicting it in summer. Since the training data includes all seasons, much of its information is redundant or even irrelevant for predicting the steam load of a specific month. Three scenarios are proposed and described below.

3.3.1. Data split scenario I
Due to the seasonality effect, the data from 2004 to 2005 is divided into the four subsets shown in Table 7, and a separate model is built from each subset. For load estimation at a future horizon, the model matching the month is selected.

3.3.2. Data split scenario II
As shown in Tables 3 and 4, the mean temperature is a significant parameter in estimating the steam load. Two-dimensional plots, with the horizontal axis referring to the mean temperature and the vertical axis representing the steam consumption of 2004 and 2005, are shown in Figs. 3 and 4. As illustrated in Figs. 3 and 4, the relationship between the steam load and the mean temperature changes at a rough split threshold of 55 °F. Therefore, the 2004 and 2005 data is divided into two subsets at this split point, and a model is built for each subset. Reflecting the current practice, the 55 °F split point is used to select the appropriate model.

3.3.3. Data split scenario III
Unlike scenario II, where the data was divided into two parts based on a single split temperature, here the data is grouped into temperature bins with equal intervals of 10 °F. The details are shown in Table 8.

The previously selected MLP ensemble with five MLPs is used to develop a model for each data split scenario. To test model accuracy, the 2 months with the best and the worst test MAPE in 2006 are selected. Fig. 5 shows the test MAPE for each month of 2006 obtained with the best model built in Section 3.2 from the 2004 and 2005 data. The results in Fig. 5 illustrate that May and December of 2006 have the largest and smallest MAPE, respectively. Therefore, these 2 months are used for testing the models developed under the data split scenarios. The MAE, Std of MAE, MAPE, and Std of MAPE for the test data sets are shown in Table 9, which also includes two baselines. Baseline I uses the 2004 load data directly as the estimate of the load on the same day of 2006 (e.g., the steam load on May 1st, 2004 estimates the load on May 1st, 2006). Baseline II uses the 2005 load data in the same manner. As shown in Table 9, data split scenario I has the best overall performance of all the scenarios. Therefore, the seasonal data split is used to develop the steam load model. Note that though data split scenario I uses significantly fewer data points for training the model and predicting the future load, it has much better accuracy in May 2006 and acceptable accuracy in December 2006. Removing redundant data demonstrates that the model accuracy
Table 7
Data split scenario I.

Data set | Description
1 | January–March of 2004 and 2005
2 | April–June of 2004 and 2005
3 | July–September of 2004 and 2005
4 | October–December of 2004 and 2005
[Fig. 3. Relationship between the mean temperature (°F) and steam consumption (klbs) in 2004; daily points are shown by month, January through December.]
[Fig. 4. Relationship between the mean temperature (°F) and steam consumption (klbs) in 2005.]
Table 8
Data split scenario III.

Data set | Mean temperature bin (°F) | No. of observations
1 | Less than 10 | 26
2 | 10–20 | 28
3 | 20–30 | 62
4 | 30–40 | 120
5 | 40–50 | 103
6 | 50–60 | 105
7 | 60–70 | 151
8 | 70–80 | 127

Table 9
Test results with different data split scenarios.

Data split scenario | Test data set | MAE | Std of MAE | MAPE (%) | Std of MAPE (%)
I | Test May 2006 | 279.5387 | 173.6689 | 7.74 | 4.83
I | Test December 2006 | 234.0529 | 223.0083 | 4.90 | 4.46
II | Test May 2006 | 874.2131 | 594.0196 | 29.02 | 20.38
II | Test December 2006 | 193.2253 | 164.5481 | 4.00 | 3.07
III | Test May 2006 | 792.9835 | 519.6384 | 26.63 | 17.93
III | Test December 2006 | 266.8676 | 213.3769 | 5.71 | 4.61
Baseline I | Test May 2006 | 1217.8438 | 820.3141 | 26.28 | 15.67
Baseline I | Test December 2006 | 1276.3474 | 729.7703 | 25.70 | 14.86
Baseline II | Test May 2006 | 647.1931 | 354.5781 | 17.51 | 8.82
Baseline II | Test December 2006 | 1043.6175 | 717.7555 | 18.08 | 11.59
All 2004, 2005 data | Test May 2006 | 808.0371 | 469.6589 | 27.20 | 16.59
All 2004, 2005 data | Test December 2006 | 186.3490 | 172.1987 | 3.81 | 3.22

[Fig. 5. Test MAPE for each month of 2006, January through December.]
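At prediction time, each data split scenario implies a different rule for picking the model trained on the matching subset. A minimal sketch of this dispatch follows, with hypothetical model identifiers standing in for the trained ensembles (the names and the floor-based bin labels are assumptions, not the paper's notation):

```python
# Quarterly seasonal split of scenario I (Table 7): months 1-3 -> model 1,
# 4-6 -> model 2, 7-9 -> model 3, 10-12 -> model 4.
SEASON_MODELS = {m: f"model_{(m - 1) // 3 + 1}" for m in range(1, 13)}

def select_model(month, temp_mean=None, scenario="I"):
    """Pick the model trained on the data subset matching the input."""
    if scenario == "I":          # seasonal split (Table 7)
        return SEASON_MODELS[month]
    if scenario == "II":         # single 55 F split point
        return "cold_model" if temp_mean < 55.0 else "warm_model"
    if scenario == "III":        # 10 F-wide temperature bins (Table 8)
        return f"bin_model_{int(temp_mean // 10)}"
    raise ValueError(f"unknown scenario: {scenario}")

print(select_model(5))                                  # May, scenario I
print(select_model(1, temp_mean=32.2, scenario="II"))
print(select_model(1, temp_mean=32.2, scenario="III"))
```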
could be improved. Figs. 6 and 7 show the test results of scenario I for these 2 months.

4. Test results and sensitivity analysis

An independent data set of 2007 was used as the test set to evaluate the steam load prediction model. Based on the selected parameters, algorithm, and data split, both the training and validation data of Table 1 are used as the training set. The MLP ensemble with five MLPs is the mapping algorithm, and data split scenario I is used to divide the training set into subsets. Four models are built to predict the steam consumed in the different seasons; Table 10 contains their details. Model 1 is used for predicting the steam consumed from January to March 2007, and Model 2 from April to June 2007. Model 3 is used for predicting the steam consumed
from July to September 2007, and Model 4 from October to December 2007.

[Fig. 6. Test results of scenario I for May 2006: observed and predicted steam load (klbs) by day.]

[Fig. 7. Test results of scenario I for December 2006: observed and predicted steam load (klbs) by day.]

Table 10
Descriptions of the four MLP ensemble models.

Model | NN structure | Activation function at hidden layer | Activation function at output layer
1 | MLP 2-8-1; MLP 2-7-1 (3); MLP 2-9-1 | Logistic; hyperbolic tangent; logistic | Sine; sine; sine
2 | MLP 2-7-1 (3); MLP 2-8-1 (2) | Hyperbolic tangent; logistic | Sine; identity
3 | MLP 2-4-1; MLP 2-4-1; MLP 2-6-1; MLP 2-6-1; MLP 2-8-1 | Exponential (all five) | Logistic; identity; sine; identity; logistic
4 | MLP 2-7-1 (3); MLP 2-4-1; MLP 2-8-1 | Hyperbolic tangent; logistic; hyperbolic tangent | Sine; hyperbolic tangent; sine

The prediction statistics for each month of 2007 are shown in Table 11. As shown in Table 11, the accuracy of the load prediction in winter is better than in summer.

Table 11
Test results for each month of 2007.

Month | MAE | Std of MAE | MAPE (%) | Std of MAPE (%)
January | 229.6125 | 127.6796 | 4.07 | 2.08
February | 470.6864 | 218.3952 | 7.39 | 2.61
March | 376.4847 | 202.0408 | 10.71 | 8.11
April | 729.2874 | 547.4329 | 17.10 | 10.23
May | 591.0668 | 497.8888 | 17.64 | 16.10
June | 640.7030 | 417.8491 | 16.19 | 10.48
July | 692.3318 | 104.5319 | 17.82 | 3.21
August | 717.4539 | 144.6213 | 18.61 | 4.55
September | 1030.7757 | 439.0330 | 31.74 | 17.99
October | 297.9699 | 253.0848 | 10.31 | 11.81
November | 1319.8354 | 562.1025 | 52.62 | 26.18
December | 356.0059 | 111.7109 | 6.15 | 1.79

[Fig. 8. Test results of January, February, and March of 2007: observed and predicted steam load (klbs) by day.]

The observed and predicted loads for the four models are shown in Figs. 8–11. As demonstrated in Fig. 8, the predicted and observed loads from January to March match each other well. In Fig. 11, the results are almost the same, except for November. This is due to the fact that during the heating season the heating load is roughly equal to the steam consumed, and a mapping between the weather pattern and the steam load can be clearly established. The prediction error is relatively constant in November; however, the trend remains the same. In Figs. 9 and 10, as the outside temperature increases, the prediction error becomes larger. Note that the observed values are smaller than the predicted ones most of the time. One possible explanation is that during the cooling season only some of the chillers are run by steam; electricity-driven chillers also account for a major proportion of the cooling load. The predicted load is based on the assumption
[Fig. 9. Test results of April, May, and June of 2007: observed and predicted steam load (klbs) by day.]
[Fig. 10. Test results of July, August, and September of 2007: observed and predicted steam load (klbs) by day.]
Table 12
Selected instance for Monte Carlo simulation.

Date | Actual load | Temp_mean | Humidity_min | Predicted load
1/1/2007 | 4802.6609 | 32.1764 | 56.3725 | 4698.9778

[Fig. 11. Test results of October, November, and December of 2007: observed and predicted steam load (klbs) by day.]
that the steam consumption approximately equals the total cooling load. In fact, the observed steam load is less than the total cooling load, as it represents the cooling load only partially, which explains why the observed load is smaller than the predicted load. The load data does not contain the electricity consumed by the chillers during this period; to estimate the steam load during the cooling season, the electric chiller data should be available.

As indicated in Section 2, the model inputs for the predicted load are the weather forecast data. Since forecasted weather data is not highly accurate, each input can be considered a random variable. In that case, the deterministic model is transformed
into a stochastic one. The input variables may affect the probability distribution of the outcome. Therefore, uncertainty propagation needs to be investigated in this model. Due to the complexity of the NN ensemble model, Monte Carlo simulation [35,36] is applied for analyzing uncertainty propagation. Data of 1/1/2007 has been used in the Monte Carlo simulation. Table 12 describes the data. By incorporating Gaussian noise for each input variable, the mean is set as 0 while the standard deviation is set as 1. The total 1000 simulated points for each input variable have been generated, and their distribution plots are described in Figs. 12 and 13. Based on Model 1 described in Table 10, the steam load output can be calculated, and its distribution is presented in Fig. 14. Note that short bar in the middle of Fig. 14 indicates the position of initial predicted load. As shown in Fig. 14, the predicted load distribution can be considered as the approximate normal distribution. The mean of the load distribution is 4702, while the initial predicted load is 4698. It has been demonstrated that if the confidence of the forecasted weather data is relatively high, though noise and uncertainty exist,
Fig. 12. Distribution of the simulated Temp_mean.
Fig. 13. Distribution of the simulated Humidity_min.
Fig. 14. Distribution of the predicted load.
the predicted load can also be trusted, since the mean of the load distribution is close to the initial prediction. Small input uncertainty does not significantly affect the quality of the predicted results.
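The uncertainty-propagation procedure described above can be sketched as follows. Since the NN ensemble (Model 1) is not reproduced here, a simple stand-in surrogate function of Temp_mean and Humidity_min is used in its place; its coefficients are purely illustrative and were chosen only so that the point prediction is near the value reported in Table 12:

```python
import numpy as np

rng = np.random.default_rng(0)

# Forecast inputs for 1/1/2007 (Table 12).
temp_mean = 32.1764      # Temp_mean
humidity_min = 56.3725   # Humidity_min

def surrogate_model(temp, humidity):
    """Stand-in for the NN ensemble (Model 1); coefficients are illustrative."""
    return 6000.0 - 45.0 * temp + 2.6 * humidity

n = 1000  # number of Monte Carlo samples
# Gaussian noise with mean 0 and standard deviation 1 added to each input.
temp_samples = temp_mean + rng.normal(0.0, 1.0, n)
humidity_samples = humidity_min + rng.normal(0.0, 1.0, n)

# Propagate each noisy input pair through the model.
loads = surrogate_model(temp_samples, humidity_samples)

point_prediction = surrogate_model(temp_mean, humidity_min)
print(f"point prediction: {point_prediction:.1f} Klbs")
print(f"simulated mean:   {loads.mean():.1f} Klbs")
print(f"simulated std:    {loads.std():.1f} Klbs")
```

With small input noise the mean of the simulated load distribution stays close to the deterministic point prediction, mirroring the comparison of 4702 versus 4698 discussed for Fig. 14.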
5. Conclusion
In this paper, a data-driven approach to steam load prediction was presented. A correlation-coefficient matrix and the boosting tree algorithm were used for parameter reduction. The performance of 10 different data-mining algorithms was studied, and the ensemble of five MLPs was selected as the best mapping algorithm. Different training-data split scenarios were also investigated. A steam load prediction model was developed using 3 years of data, and test results for the following year were discussed. The current steam prediction model is limited to the heating season; the lower accuracy in the cooling season was due to the lack of data on the electric chillers that supplement the cooling load. Once data on system operations, occupancy, and chiller running schedules become available, prediction accuracy should improve.
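The NN ensemble summarized above averages the outputs of five independently trained MLPs. A minimal sketch of this ensemble structure is given below; the toy data, single-hidden-layer architecture, and training settings are illustrative assumptions and do not reproduce the model developed in the paper:

```python
import numpy as np

# Toy data: a smooth nonlinear load-vs-temperature curve (illustrative only).
x = np.linspace(-1, 1, 200).reshape(-1, 1)
y = 0.5 * x**2 - 0.3 * x + 0.2

def train_mlp(seed, hidden=8, epochs=5000, lr=0.05):
    """Train one single-hidden-layer MLP (tanh) with full-batch gradient descent."""
    r = np.random.default_rng(seed)
    w1 = r.normal(0.0, 0.5, (1, hidden)); b1 = np.zeros(hidden)
    w2 = r.normal(0.0, 0.5, (hidden, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)         # hidden activations
        err = (h @ w2 + b2) - y          # prediction error
        # Backpropagation for the mean-squared-error loss.
        g2 = h.T @ err / len(x); gb2 = err.mean(0)
        dh = (err @ w2.T) * (1.0 - h**2)
        g1 = x.T @ dh / len(x); gb1 = dh.mean(0)
        w2 -= lr * g2; b2 -= lr * gb2
        w1 -= lr * g1; b1 -= lr * gb1
    return lambda xs: np.tanh(xs @ w1 + b1) @ w2 + b2

# Five members trained from different random initializations.
members = [train_mlp(seed) for seed in range(5)]

def ensemble_predict(xs):
    """The ensemble output is the average of the five member predictions."""
    return np.mean([m(xs) for m in members], axis=0)

mse = float(np.mean((ensemble_predict(x) - y) ** 2))
```

Averaging members trained from different random initializations reduces the variance of any single network's prediction, which is the usual rationale for preferring an ensemble over a single MLP.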
Acknowledgement
This research has been supported by the Iowa Energy Center, Grant No. 08-01.
References
[1] Bida M, Kreider JF. Monthly-averaged cooling load calculations – residential and small commercial buildings. ASME Trans: J Sol Energy Eng 1987;109(4):311–20.
[2] Christiaanse WR. Short-term load forecasting using exponential smoothing. IEEE Trans Power Ap Syst 1971;PAS-90:900–10.
[3] Thompson RP. Weather sensitive demand and energy analysis on a large geographically diverse power system: application to short-term hourly electric demand forecasting. IEEE Trans Power Ap Syst 1976;PAS-95:384–93.
[4] Irisarri GD, Widergren SE, Yehsakul PD. On-line load forecasting for energy control center application. IEEE Trans Power Ap Syst 1982;PAS-101:71–8.
[5] Toyoda J, Chen MS. An application of state estimation to short-term load forecasting, Parts 1 and 2. IEEE Trans Power Ap Syst 1970;PAS-89:1678–88.
[6] Kimbara A, Kurosu S, Endo R. On-line prediction for load profile of an air-conditioning system. ASHRAE Trans 1995;101(2):198–207.
[7] Kawashima M, Dorgan CE, Mitchell JW. Hourly thermal load prediction for the next 24 h by ARIMA, EWMA, LR, and an artificial neural network (Part 1). ASHRAE Trans 1995;101:186–200.
[8] Park DC, El-Sharkawi MA, Marks II RJ. Electric load forecasting using an artificial neural network. IEEE Trans Power Syst 1991;6(2):442–9.
[9] Islam SM, Al-Alawi SM, Ellithy KA. Forecasting monthly electric load and energy for a fast growing utility using an artificial neural network. Electr Power Syst Res 1995;34(1):1–9.
[10] Kawashima M, Dorgan CE, Mitchell JW. Optimizing system control with load prediction by neural networks for an ice-storage system. ASHRAE Trans 1996;102(1):1169–78.
[11] González PA, Zamarreño JM. Prediction of hourly energy consumption in buildings based on a feedback artificial neural network. Energy Build 2005;37(6):585–601.
[12] Li Q, Meng Q, Cai J, Yoshino H, Mochida A. Applying support vector machine to predict hourly cooling load in the building. Appl Energy 2009;86(10):2249–56.
[13] Zhai H, Dai YJ, Wu JY, Wang RZ. Energy and exergy analyses on a novel hybrid solar heating, cooling and power generation system for remote areas. Appl Energy 2009;86(9):1395–404.
[14] Difs K, Danestig M, Trygg L. Increased use of district heating in industrial processes – impacts on heat load duration. Appl Energy 2009;86(11):2327–34.
[15] Yildiz A, Güngör A. Energy and exergy analyses of space heating in buildings. Appl Energy 2009;86(10):1939–48.
[16] Desideri U, Proietti S, Sdringola P. Solar-powered cooling systems: technical and economic analysis on industrial refrigeration and air-conditioning applications. Appl Energy 2009;86(9):1376–86.
[17] Ruan Y, Liu Q, Zhou W, Firestone R, Gao W, Watanabe T. Optimal option of distributed generation technologies for various commercial buildings. Appl Energy 2009;86(9):1641–53.
[18] Elkateb MM, Solaiman K, Al-Turki Y. A comparative study of medium-weather-dependent load forecasting using enhanced artificial/fuzzy neural network and statistical techniques. Neurocomputing 1998;23(1–3):3–13.
[19] Hou Z, Lian Z, Yao Y, Yuan X. Cooling load prediction based on the combination of rough set theory and support vector machine. HVAC&R Res 2006;12(2):337–52.
[20] Li Q, Meng Q, Cai J, Yoshino H, Mochida A. Predicting hourly cooling load in the building: a comparison of support vector machine and different artificial neural networks. Energy Convers Manage 2009;50(1):90–6.
[21] Yang J, Rivard H, Zmeureanu R. On-line building energy prediction using adaptive artificial neural networks. Energy Build 2005;37(12):1250–9.
[22] Wang J. Data mining: opportunities and challenges. Hershey, PA: IGI Global; 2003.
[23] Rodgers JL, Nicewander WA. Thirteen ways to look at the correlation coefficient. Am Stat 1988;42(1):59–66.
[24] Tan PN, Steinbach M, Kumar V. Introduction to data mining. New York: Addison Wesley; 2005.
[25] Friedman JH. Stochastic gradient boosting. Stanford University, Statistics Department; 1999.
[26] Hastie T, Tibshirani R, Friedman JH. The elements of statistical learning. New York: Springer; 2001.
[27] Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and regression trees. Monterey, CA: Wadsworth International; 1984.
[28] Kass GV. An exploratory technique for investigating large quantities of categorical data. Appl Stat 1980;29(2):119–27.
[29] Biggs D, de Ville B, Suen E. A method of choosing multiway partitions for classification and decision trees. J Appl Stat 1991;18(1):49–62.
[30] Friedman JH. Multivariate adaptive regression splines. Ann Stat 1991;19(1):1–67.
[31] Breiman L. Random forests. Mach Learn 2001;45(1):5–32.
[32] Hertz JA, Krogh A, Palmer RG. Introduction to the theory of neural computation. Boulder, CO: Westview Press; 1999.
[33] Haykin S. Neural networks: a comprehensive foundation. Englewood Cliffs, NJ: Prentice Hall; 1998.
[34] Casella G, Berger R. Statistical inference. 2nd ed. Pacific Grove, CA: Duxbury Press; 1990.
[35] Metropolis N, Ulam S. The Monte Carlo method. J Am Stat Assoc 1949;44(247):335–41.
[36] Goodman L. On the exact variance of products. J Am Stat Assoc 1960;55(292):708–13.