WIND ENERGY Wind Energ. 2009; 12:275–293 Published online 24 September 2008 in Wiley Interscience (www.interscience.wiley.com) DOI: 10.1002/we.295
Research Article
Wind Farm Power Prediction: A Data-Mining Approach Andrew Kusiak*, Haiyang Zheng and Zhe Song, Department of Mechanical and Industrial Engineering, 3131 Seamans Center, The University of Iowa, Iowa City, IA 52242–1527, USA
Key words: Wind farm power prediction; data mining; neural network; weather forecasting data; long-term prediction; short-term prediction
In this paper, models for short- and long-term prediction of wind farm power are discussed. The models are built using weather forecasting data generated at different time scales and horizons. The maximum forecast length of the short-term prediction model is 12 h, and the maximum forecast length of the long-term prediction model is 84 h. The wind farm power prediction models are built with five different data mining algorithms. The accuracy of the generated models is analysed. The model generated by a neural network outperforms all other models for both short- and long-term prediction. Two basic prediction methods are presented: the direct prediction model, whereby the power prediction is generated directly from the weather forecasting data, and the integrated prediction model, whereby the prediction of wind speed is generated with the weather data, and then the power is generated with the predicted wind speed. The direct prediction model offers better prediction performance than the integrated prediction model. The main source of the prediction error appears to be contributed by the weather forecasting data. Copyright © 2008 John Wiley & Sons, Ltd. Received 25 April 2008; Revised 14 August 2008; Accepted 22 August 2008
* Correspondence to: Andrew Kusiak, Department of Mechanical and Industrial Engineering, 3131 Seamans Center, The University of Iowa, Iowa City, IA 52242–1527, USA.

Introduction

The wind power industry is rapidly expanding, and accurate power forecasting is essential. Wind power forecasts are used as input for various simulation tools, including market operations, unit commitment and economic dispatch. Therefore, short- and long-term wind farm power predictions are significant in transforming a wind farm into a wind power plant.

A number of different approaches have been used in forecasting wind speed and wind farm power on different time scales. Landberg et al.1 built a model to predict the power produced by a wind farm using the data from the weather prediction model (HIRLAM, High Resolution Limited Area Model) and the local weather model (WASP, Wind Atlas Analysis and Application Program). Mohandes et al.2 compared the performance of neural network and autoregressive models applied to wind speed prediction. The neural network model outperformed the autoregressive model in terms of both the prediction plots and the root mean squared error. Lange et al.3 presented various models for short-term wind power prediction, including physics-based, fuzzy and neurofuzzy models. Using meteorological data, Barbounis et al.4 constructed a local recurrent neural network model for long-term wind speed and power forecasting. Hourly wind park forecasts for up to 72 h ahead were produced. Damousis et al.5 developed a fuzzy logic model that was trained with a genetic algorithm. The model was then used to forecast wind speed over horizons ranging from 0.5 to 2 h. Sfetsos6 presented a novel method to forecast the mean hourly wind speed based on a time series analysis, and showed that the developed model outperformed the conventional forecasting models. Torres et al.7 built an Auto-Regressive Moving Average (ARMA) model based on time series data after transformation and standardization, and predicted mean hourly wind speed up to 10 h ahead.

Physics-based and statistical modeling approaches have been widely used to forecast wind speed and wind farm power. Both methods have advantages and disadvantages. Development of prediction models for wind speed and wind power, for either short-term or long-term horizons, poses a challenge because of the stochastic nature of wind. Even assuming that an accurate wind speed prediction exists, a satisfactory wind farm power forecast cannot be guaranteed, as the status of each wind turbine determines the ultimate power output. Frequent updates of the prediction models for wind speed or wind farm power pose another challenge.

Data mining is a promising approach to model wind farm performance. Numerous successful applications of data mining in manufacturing, marketing, medical informatics and the energy industry have been reported in the literature.8–12 The models trained and built by data mining algorithms can be easily updated.

In this paper, a data-mining approach is applied to build models for wind farm power prediction over both a short-term horizon (1 to 12 h ahead) and a long-term horizon (3 to 84 h ahead). The short- and long-term prediction models are constructed with data from the Rapid Update Cycle (RUC) model3,13 and the North American Mesoscale (NAM)3,14 model, respectively. Both the RUC and the NAM are Numeric Weather Prediction models and can generate weather forecasting data. Two different methodologies for power prediction are compared and analysed. The models are built using historical data collected by Supervisory Control and Data Acquisition (SCADA) systems installed at a wind farm and weather forecasting data records for 16 locations surrounding the wind farm.
Data Description and Methods for Wind Farm Power Prediction

Weather Forecasting Data

The data from two different National Weather Service Forecast models are used for wind farm power prediction. The models provided data for different locations surrounding the wind farm. Figure 1 shows the spatial layout of the 16 model data points around the wind farm considered in this research. The immediate neighbourhood of the wind farm includes data points 6, 7, 10 and 11. Note that data on the specific location, terrain and grid points were not available in this research.

RUC Model and Data

The RUC model is designed to provide accurate short-range numerical forecast guidance to various users. In this research, the RUC model data is the source for constructing a short-term wind farm power prediction model. The basic features of this model are as follows:
Figure 1. Location of the data points surrounding the wind farm
• Short-term predictions with a maximum forecast length of 12 h;
• the spacing between model grid points is 20 km;
• a new forecast is issued every hour, with a new 12 h forecast issued at 00:00, 03:00, 06:00, 09:00, 12:00, 15:00, 18:00 and 21:00 GMT; and
• at all other hours, a 9 h forecast is issued.

Table 1 describes the parameters of the RUC model. In this research, each model data point has 10 parameters, and there are 16 model data points. Therefore, the RUC data has, in total, 160 variables for predicting short-term wind farm power.

NAM Model and Data

The NAM model is designed to provide day-ahead weather forecast guidance. In this research, the NAM model data is the source for building a long-term wind farm power prediction model. The basic features of this model are as follows:

• Day-ahead forecasting with a maximum forecast length of 84 h;
• the spacing between model grid points is 40 km;
• a forecast value is issued every 3 h; and
• a new 84 h forecast is issued four times daily at 00:00, 06:00, 12:00 and 18:00 GMT.

Table 2 describes the parameters of the NAM model. In this research, each model data point has 12 parameters, and there are 16 model data points. Therefore, the NAM data has, in total, 192 variables for predicting long-term wind farm power.
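The RUC issue schedule and the variable counts stated above can be checked with a few lines of Python. This is only an illustrative sketch: the function and constant names are hypothetical and do not come from the RUC or NAM documentation.

```python
def max_forecast_length_hours(issue_hour_gmt: int) -> int:
    """Longest RUC forecast horizon (h) for a run issued at the given GMT hour."""
    if not 0 <= issue_hour_gmt <= 23:
        raise ValueError("issue hour must be in 0..23")
    # 12 h forecasts are issued at 00, 03, ..., 21 GMT; 9 h forecasts otherwise.
    return 12 if issue_hour_gmt % 3 == 0 else 9


def total_variables(params_per_point: int, n_points: int = 16) -> int:
    """Number of raw predictors: parameters per data point times data points."""
    return params_per_point * n_points


RUC_VARIABLES = total_variables(10)  # 10 parameters at each of 16 points
NAM_VARIABLES = total_variables(12)  # 12 parameters at each of 16 points
```

With these definitions, `RUC_VARIABLES` evaluates to 160 and `NAM_VARIABLES` to 192, matching the counts given in the text.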
Table 1. Data description of the RUC model

Parameter          Unit      Description
Spd_10m            m s−1     Wind speed 10 m above the surface
Dir_10m            deg       Wind direction 10 m above the surface
Spd_XXmb           m s−1     Average wind speed in the lowest XX mb of the atmosphere (XX is 30, 60 and 90, respectively)
Dir_XXmb           deg       Average wind direction in the lowest XX mb of the atmosphere (XX is 30, 60 and 90, respectively)
AD_30mb            kg m−3    Average air density in the lowest 30 mb of the atmosphere
PTdiff_30mb_sfc    K         Potential temperature difference between the surface and 30 mb above the surface; measure of atmospheric stability in lower spaces
Table 2. Data description of the NAM model

Parameter          Unit      Description
Spd_10m            m s−1     Wind speed 10 m above the surface
Dir_10m            deg       Wind direction 10 m above the surface
Spd_XXmb           m s−1     Average wind speed in the lowest XX mb of the atmosphere (XX is 30, 60 and 90, respectively)
Dir_XXmb           deg       Average wind direction in the lowest XX mb of the atmosphere (XX is 30, 60 and 90, respectively)
AD_30mb            kg m−3    Average air density in the lowest 30 mb of the atmosphere
PTdiff_30mb_sfc    K         Potential temperature difference between the surface and 30 mb above the surface; measure of atmospheric stability in lower spaces
SHTFL              W m−2     Sensible heat flux at the surface; indicator of surface heating or cooling
VEG                %         Percentage of the surface that is covered by vegetation
SCADA Data Description

The data used in this research were generated at a wind farm with dozens of turbines. The data were collected by a SCADA system installed at each wind turbine. Each SCADA system collects data for more than 120 parameters. Though the data are sampled at high frequency, e.g. every 2 s, they are averaged and stored at 10 min intervals (referred to as the 10 min average data). The data used in this research were collected over a period of 3 months for all turbines of the wind farm. Due to current industrial data practices, only a 3 month long data set is available in this research. However, the proposed methodology for short- and long-term power prediction applies to data sets of any size.

In this research, the wind speed and wind power are considered as dependent variables for the power prediction model, while the weather forecasting data are used as predictors. The wind speed data used in this research were measured by nacelle anemometers and are stored as 10 min average SCADA data.
Feature Selection

To obtain an accurate prediction model with a data-mining approach, the original high-dimension data need to be reduced to low-dimension data. The significant parameters for each of the 16 model data points need to be selected first, as not all the data contribute to an accurate prediction. Data mining is a powerful tool for extracting knowledge and solving problems from voluminous data. Data mining offers different algorithms to perform the feature selection task, e.g. the boosting tree algorithm15,16 and the wrapper approach, integrated with the genetic or the best first search algorithms.17,12

The boosting tree algorithm is used in this research for feature selection. The algorithm assigns an importance score to each predictor, and the predictors with the highest scores are selected. Not surprisingly, the importance of the weather forecasting predictors is ranked according to the closeness of the model points to the wind farm: the closer the model point is to the wind farm, the more significant the predictor. Based on the results produced by the boosting tree algorithm, the four closest model points, 6, 7, 10 and 11, were selected as predictors for the wind farm power. As a result of the feature selection, the original 192-dimension NAM data was reduced to a 48-dimension predictor for wind farm power prediction, while the RUC data was reduced from 160 dimensions to 40 dimensions.
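The effect of keeping only model points 6, 7, 10 and 11 on the dimensionality can be sketched as follows. The sketch does not implement the boosting tree ranking itself; it only applies the resulting selection. The column naming scheme (`P<point>_<parameter>`) and the helper names are assumptions made for illustration.

```python
SELECTED_POINTS = {6, 7, 10, 11}  # model points closest to the wind farm

RUC_PARAMETERS = [
    "Spd_10m", "Dir_10m",
    "Spd_30mb", "Spd_60mb", "Spd_90mb",
    "Dir_30mb", "Dir_60mb", "Dir_90mb",
    "AD_30mb", "PTdiff_30mb_sfc",
]
NAM_PARAMETERS = RUC_PARAMETERS + ["SHTFL", "VEG"]


def column_names(parameters, n_points=16):
    """Hypothetical column names, one per (data point, parameter) pair."""
    return [f"P{p}_{name}" for p in range(1, n_points + 1) for name in parameters]


def select_columns(columns, points=SELECTED_POINTS):
    """Keep only the columns that belong to the selected model points."""
    kept = []
    for col in columns:
        point_id = int(col.split("_", 1)[0][1:])  # "P10_Spd_10m" -> 10
        if point_id in points:
            kept.append(col)
    return kept


ruc_selected = select_columns(column_names(RUC_PARAMETERS))
nam_selected = select_columns(column_names(NAM_PARAMETERS))
```

Here `len(ruc_selected)` is 40 and `len(nam_selected)` is 48, consistent with the reductions from 160 and 192 dimensions reported above.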
Principal Component Analysis

Even with the feature selection, the dimensionality of the predictors for power prediction is still high. To gain more insight into the data, the correlation coefficients among the weather forecasting parameters (40-dimension RUC data and 48-dimension NAM data) were computed. The results show that the parameters measured with the same unit are highly correlated. To further reduce the input dimensionality, the principal component analysis (PCA)18 was applied. The PCA expresses the variance–covariance structure of a set of variables by a few linear combinations. The basic steps of the PCA are as follows:18

1. Compute a correlation matrix for all parameters.
2. Compute the eigenvectors and eigenvalues of the correlation matrix.
3. Select the components to form an eigenvector.
4. Derive the new data comprised of the principal component of the original data.

The weather forecasting data have different units, and therefore, the principal components are determined separately for each group of parameters with the same unit. The groups are: wind speed (m s−1), wind direction (º), air density (kg m−3), temperature difference (K), SHTFL (W m−2) and VEG (%).

Table 3 presents the eigenvalues of the correlation matrix and the related statistics of the RUC wind speed data. Based on the eigenvalue statistics, the first two principal components explain 94.4% of the total variance, and therefore, only these two components are retained. Thus, the dimensionality of the wind speed data stream (16 inputs) is reduced to 2. The two principal components, which are uncorrelated linear combinations of the 16 original RUC wind speeds, form the new coordinates and the input for the wind power prediction model discussed in the following section.
Table 3. Eigenvalues of the correlation matrix and the related statistics of the RUC wind speed data

Value number   Eigenvalue   Total variance (%)   Cumulative eigenvalue   Cumulative %
1              13.6954      85.5968              13.6955                 85.5968
2              1.4035       8.7719               15.0989                 94.3687
3              0.4166       2.6041               15.5156                 96.9727
4              0.1791       1.1191               15.6947                 98.0918
5              0.1151       0.7196               15.8098                 98.8114
6              0.0837       0.5234               15.8936                 99.3348
7              0.0314       0.1964               15.9249                 99.5312
8              0.0223       0.1394               15.9473                 99.6706
9              0.0194       0.1218               15.9668                 99.7924
10             0.0116       0.0729               15.9784                 99.8653
11             0.0072       0.0473               15.9861                 99.9127
12             0.0048       0.0303               15.9909                 99.9431
13             0.0037       0.0229               15.9945                 99.9659
14             0.0031       0.0195               15.9976                 99.9854
15             0.0014       0.0091               15.9991                 99.9945
16             0.0008       0.0054               16.0001                 100
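The cumulative-variance reasoning behind retaining two components can be reproduced from the eigenvalues in Table 3. The sketch below is a minimal illustration; the function names are hypothetical, and because the eigenvalues are used as printed (rounded to four decimals), the computed percentages agree with the table only approximately.

```python
from itertools import accumulate

# Eigenvalues of the RUC wind speed correlation matrix, as printed in Table 3.
EIGENVALUES = [
    13.6954, 1.4035, 0.4166, 0.1791, 0.1151, 0.0837, 0.0314, 0.0223,
    0.0194, 0.0116, 0.0072, 0.0048, 0.0037, 0.0031, 0.0014, 0.0008,
]


def cumulative_variance_percent(eigenvalues):
    """Cumulative share of total variance explained by the leading components."""
    total = sum(eigenvalues)
    return [100.0 * partial / total for partial in accumulate(eigenvalues)]


def components_needed(eigenvalues, threshold_percent):
    """Smallest number of leading components reaching the variance threshold."""
    for k, cum in enumerate(cumulative_variance_percent(eigenvalues), start=1):
        if cum >= threshold_percent:
            return k
    return len(eigenvalues)


cumulative = cumulative_variance_percent(EIGENVALUES)
```

Rounded to one decimal, `cumulative[1]` is 94.4, and `components_needed(EIGENVALUES, 90.0)` returns 2. The 90% threshold is an arbitrary choice for this example; the paper does not state the cut-off it applied.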
Figure 2. PCA transformation of the wind speed and direction
Following the same steps as in the PCA transformation of the RUC wind speed, all other RUC and NAM parameters with the same unit can be transformed into principal components. Figure 2 shows the PCA transformation of wind speed and wind direction. Note that WS is wind speed, WD is wind direction, WSPC is the principal component of wind speed and WDPC is the principal component of wind direction. Table 4 shows the number of principal components (PCs) produced by the PCA algorithm. The dimensionality of the original data has been significantly reduced by combining the feature selection (the boosting tree algorithm in the previous section) and the PCA algorithm. The dimensionality of the RUC data has been reduced from 160 to 6, and the dimensionality of the NAM data has been reduced from 192 to 8.
Table 4. PCA transformation of the weather forecasting data

Parameter          Unit      Original no. of dimensions   Number of PCs
Wind speed         m s−1     16                           2
Wind direction     º         16                           2
Air density        kg m−3    4                            1
PTdiff_30mb_sfc    K         4                            1
SHTFL              W m−2     4                            1
VEG                %         4                            1

Wind Farm Power Prediction Model

The original dimensionality of both the NAM and the RUC data has been significantly reduced by feature selection and PCA transformation. In this research, the RUC model data is used for building the short-term wind farm power prediction, and the NAM model data is used for long-term prediction. Two ways to predict the power generated by the wind farm are proposed in this research. One is to directly use the weather forecasting data (the direct prediction model described in Section 2.5.1), and the other is to use the weather forecasting data to predict the future wind speed first, and then use it to predict the wind farm power (the integrated prediction model described in Section 2.5.2).

The Direct Prediction Model of Wind Farm Power

The direct prediction model is used to predict wind farm power based on the weather forecasting data. The short-term prediction model is expressed in equation (1).

$$ \hat{y}(t+T) = f[\mathrm{WSPC}(t+T), \mathrm{WDPC}(t+T), \mathrm{ADPC}(t+T), \mathrm{PTPC}(t+T)] \tag{1} $$

The function f(.) describes the underlying relationship between the RUC data and wind farm power. The function will be learned in Section 3.1 by the data-mining algorithms using the SCADA and RUC data. In equation (1), the meanings of the variables are as follows:

• ŷ is the predicted wind farm power in the future.
• T is the prediction horizon.
• t is the run-time indicating the time at which a model forecast is started; t + T is the timestamp indicating the time at which a particular forecast is valid.
• WSPC(t + T) denotes the PCs transformed from the RUC wind speed data, and there are two PCs for wind speed; the other predictors in the function f(.) are the PCs of wind direction, air density and PTdiff_30mb_sfc of the RUC data, respectively.
• The output of equation (1) is obtained from SCADA while the inputs are the RUC data.

The long-term prediction model is shown in equation (2).

$$ \hat{y}(t+T) = f[\mathrm{WSPC}(t+T), \mathrm{WDPC}(t+T), \mathrm{ADPC}(t+T), \mathrm{PTPC}(t+T), \mathrm{VEGPC}(t+T), \mathrm{SHTFLPC}(t+T)] \tag{2} $$

The function in equation (2) is the same as the one in equation (1) except for two minor differences. One is that the predictors in this function are the PCs transformed from NAM data. The other is that two more predictors, VEG and SHTFL, have been added to this function. The function f[.] in equations (1) and (2) will be learned in Sections 3.1 and 4.1 by data-mining algorithms. To derive an accurate power prediction model for a wind farm, the prediction performance of the models learned by data-mining algorithms will be evaluated based on the accuracy metrics described in Section 2.6.

The Integrated k-Nearest Neighbour (k-NN) and Wind Speed Prediction Model

The basic equation for the wind power density19 is shown in equation (3)

$$ P_w = 0.5\,\rho v^{3} \tag{3} $$

where P_w is the power density (W m−2), ρ is the air density (kg m−3) and v is the horizontal component of the mean free-stream wind velocity (m s−1).
As the nacelle of the turbine is usually located at 60 to 80 m above the ground, the air density ρ is considered as constant at that height. Though the wind direction changes, the nacelle is controlled to face the wind to capture the maximum energy from the wind. Therefore, the wind speed is a significant predictor of the wind farm power, and thus, much research has been devoted to predicting wind speed for wind farms.

In this research, the wind speed prediction model follows almost the same method as the direct prediction model for wind power in Section 2.5.1; the only difference is that ŷ in equations (1) and (2) becomes the predicted wind speed. The wind speed prediction model is also learned by data-mining algorithms in Sections 3.2.2 and 4.2.

Predicting the wind farm power curve with the wind speed as input has been discussed in the literature.20 Kusiak et al.20 showed that the k-NN model accurately predicts the wind farm power curve, given the wind speed. In this paper, the wind speed prediction model and the k-NN model are combined to predict wind farm power as shown in Figure 3. The wind speed and wind farm power from the SCADA data are used to train and test the k-NN model. The wind speed is measured by the turbine anemometers, while the weather forecasting wind data is provided by the RUC or the NAM models.
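The cubic dependence in equation (3) can be made concrete with a short calculation. The air density value and the two wind speeds below are illustrative assumptions, not values taken from the wind farm data.

```python
def wind_power_density(air_density: float, wind_speed: float) -> float:
    """Equation (3): P_w = 0.5 * rho * v**3, in W m^-2."""
    return 0.5 * air_density * wind_speed ** 3


RHO = 1.225            # assumed air density (kg m^-3), for illustration only
true_speed = 10.0      # hypothetical actual wind speed (m s^-1)
forecast_speed = 11.0  # hypothetical forecast, a 10 % overestimate

# Relative error in power density caused by the 10 % error in wind speed.
relative_error = (
    wind_power_density(RHO, forecast_speed) / wind_power_density(RHO, true_speed)
    - 1.0
)
```

Under these assumptions the power density at 10 m s−1 is 612.5 W m−2, and the 10% overestimate of wind speed produces a power density about 33.1% too high (1.1³ = 1.331), because the constant density cancels in the ratio.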
Metrics for Prediction Accuracy

Different data-mining algorithms and two different methods are used to build a prediction model for wind farm power. Two metrics, the mean absolute error (MAE) and the standard deviation of the absolute error (Std), are used to measure prediction accuracy. They are computed to select the most accurate models (1) and (2) extracted with data-mining algorithms, and to compare the direct prediction model and the integrated prediction model. The absolute error (AE), MAE and Std are expressed in equations (4)–(6).

$$ \mathrm{AE} = \frac{\left| \hat{y}(t+T) - y(t+T) \right|}{\mathrm{NRP}} \times 100\% \tag{4} $$

$$ \mathrm{MAE} = \frac{\sum_{i=1}^{N} \mathrm{AE}(i)}{N} \tag{5} $$

$$ \mathrm{Std} = \sqrt{\frac{\sum_{i=1}^{N} \left[ \mathrm{AE}(i) - \mathrm{MAE} \right]^{2}}{N-1}} \tag{6} $$

where ŷ(t + T) is the predicted wind farm power, y(t + T) is the observed (measured) wind farm power, and NRP is the nameplate rating power, i.e. the rated capacity of the wind farm (the sum of the rated power of all turbines on the wind farm), which is a constant. N is the number of test data points used for the prediction model. The data set for the short- and long-term prediction models will be divided into training and test data sets to train the model and test its accuracy, respectively.
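The metrics in equations (4)–(6) can be computed as in the following sketch. The power values and the nameplate rating are hypothetical, and the function names are introduced here for illustration only; `statistics.stdev` is used because it divides by N − 1, as in equation (6).

```python
from statistics import mean, stdev


def absolute_errors_percent(predicted, observed, nameplate_kw):
    """Equation (4): absolute error as a percentage of the nameplate rating."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return [abs(p - o) / nameplate_kw * 100.0 for p, o in zip(predicted, observed)]


def mae_and_std(predicted, observed, nameplate_kw):
    """Equations (5) and (6): mean and sample standard deviation of AE."""
    ae = absolute_errors_percent(predicted, observed, nameplate_kw)
    return mean(ae), stdev(ae)  # stdev divides by N - 1


# Hypothetical hourly power values in kW and a hypothetical 100 MW rating.
predicted = [60000.0, 80000.0, 100000.0]
observed = [70000.0, 80000.0, 85000.0]
mae, std = mae_and_std(predicted, observed, nameplate_kw=100000.0)
```

For these three points the absolute errors are 10%, 0% and 15%, giving an MAE of about 8.33% and a Std of about 7.64%.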
Figure 3. The structure of the integrated prediction model
Case Study of Short-term Prediction

For short-term prediction, the model is expressed in equation (1), with T = 1, 2, 3, . . . , 12 h. The prediction model built here follows the same forecasting steps and horizon as the RUC model. To predict wind farm power 1 to 12 h ahead, 12 prediction models will be constructed, one for each prediction horizon. The short-term wind farm power prediction model has the following properties:

• The maximum forecast length is 12 h.
• A new forecast is issued every hour.
• A new 12 h forecast is issued at 00:00, 03:00, 06:00, 09:00, 12:00, 15:00, 18:00 and 21:00 GMT, and a 9 h forecast is issued at all other hours.

Note that the output of the prediction model is hourly power (i.e. the average power over an hour), and ŷ(t + T) is the predicted hourly power between t + (T − 1) and t + T. For example, if the run-time t is 00:00, the short-term model can generate predicted hourly power from 01:00 to 12:00. Figure 4 shows an example of the output of the short-term prediction model issued at 00:00. The model used to generate this output is discussed in Sections 3.1 and 3.2.

As each RUC model run takes less than 1 h to compute, the operational forecasting horizon of the short-term wind farm power prediction begins at t + 2. In this paper, a 1 h-ahead prediction is selected for testing; however, a 2 h-ahead power prediction can also be realized.
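The time-stamp convention, in which ŷ(t + T) is the average power over the hour ending at t + T, can be sketched with the standard `datetime` module. The function name and the example run-time are illustrative assumptions.

```python
from datetime import datetime, timedelta


def valid_interval(run_time: datetime, horizon_h: int):
    """Return (start, end) of the hour covered by the T-hour-ahead prediction."""
    end = run_time + timedelta(hours=horizon_h)
    return end - timedelta(hours=1), end


# Hypothetical run issued at 00:00, with horizons T = 1, ..., 12 h.
run = datetime(2006, 5, 29, 0, 0)
intervals = [valid_interval(run, T) for T in range(1, 13)]
```

For this run, the first interval spans 00:00 to 01:00 and the last one ends at 12:00, so the twelve predictions are stamped 01:00 through 12:00 as described above.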
Figure 4. Example of the short-term prediction generated at 00:00 (vertical axis: power in kW; horizontal axis: prediction horizon, 1:00 to 12:00)

Direct Prediction Model of Wind Farm Power

Algorithm Selection

Five data-mining algorithms that appeared to be the most promising were used to construct the direct prediction model (1) for wind farm power: support vector machine regression (SVMreg),15,21 the multilayer perceptron (MLP) network,16,22 the radial basis function (RBF) network,16,12 the classification and regression (C&R) tree23,24 and the random forest25 algorithm. The SVM is a supervised learning algorithm used in classification and regression. It constructs a linear discriminant function that separates instances as widely as possible. The MLP algorithm is usually used to model complex relationships between inputs and outputs or to find patterns in data. The C&R tree builds a decision tree to predict either classes (classification) or Gaussians (regression). The random forest algorithm grows many classification trees to classify a new object from an input vector. Each tree casts a 'vote' for a class, and the forest chooses the classification having the most votes over all the trees in the forest. The RBF is usually used in non-linear regression and classification modeling.

To fulfil the task of short-term prediction, 12 prediction models with different forecasting horizons (t + 1, t + 2, . . . , t + 12) need to be constructed. In order to select one uniform algorithm to train the 12 prediction models, the prediction model that predicts 6 h ahead (t + 6) was selected to investigate the five data-mining algorithms. As the original power recorded in the SCADA data is at 10 min intervals, every six consecutive data points were aggregated into hourly power (the average of six measured power values). The principal components transformed from RUC data for 6 h-ahead prediction and the hourly power from SCADA data resulted in 2250 instances (data set 1 in Table 5). The run-time of data set 1 in Table 5 considered for analysis starts at '5/29/06 00:00' and continues to '8/31/06 17:00'. During this time period, the overall wind farm performance was normal.

Data set 1 was divided into two subsets, data set 2 and data set 3. Data set 2 contains 1798 data points and was used to develop a prediction model with data-mining algorithms. Data set 3 includes 452 data points and was used to test the prediction performance of the model learned from data set 2. Note that the time stamp used in Table 5 is the run-time t, which is different from the time stamp t + T.

The MAE in equation (5) and Std in equation (6) were used to select the most suitable algorithm for building the short-term prediction model (1). Small values of the MAE and Std indicate accurate, stable and robust prediction performance. Table 6 summarizes the prediction accuracy of the models trained by different data-mining algorithms on data set 3 of Table 5. The MLP algorithm outperformed the other four algorithms. The C&R tree and random forest algorithms performed worst. The MLP network algorithm was therefore selected for building the direct prediction model for short-term wind farm power. The RBF and MLP are two different neural network (NN) algorithms; however, the MLP outperformed the RBF in short-term prediction.
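The aggregation of 10 min SCADA values into hourly power and the chronological division into training and test sets can be sketched as follows. This is a minimal illustration: the helper names are hypothetical, dropping an incomplete trailing hour is an assumption, and the integer list merely stands in for the 2250 time-ordered instances.

```python
def hourly_means(ten_minute_values):
    """Average every six consecutive 10 min values into one hourly value.

    An incomplete trailing hour is dropped (an assumption of this sketch).
    """
    n_hours = len(ten_minute_values) // 6
    return [sum(ten_minute_values[6 * h:6 * h + 6]) / 6.0 for h in range(n_hours)]


def chronological_split(records, n_train):
    """Split time-ordered records into an earlier training part and a later test part."""
    return records[:n_train], records[n_train:]


# Two hours of made-up 10 min power readings.
ten_min_power = [float(i) for i in range(12)]
hourly = hourly_means(ten_min_power)

# Stand-in for the 2250 hourly instances of data set 1.
train, test = chronological_split(list(range(2250)), 1798)
```

Here `hourly` is `[2.5, 8.5]`, and the split yields 1798 training and 452 test instances, the sizes reported for data sets 2 and 3.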
Table 5. The data set description of 6 h-ahead prediction

Data set   Start time stamp   End time stamp    Description
1          5/29/06 00:00      8/31/06 17:00     Total data set; 2250 observations
2          5/29/06 00:00      8/12/06 17:00     Training data set; 1798 observations
3          8/12/06 18:00      8/31/06 17:00     Test data set; 452 observations

Table 6. Error statistics of different models based on data set 3 of Table 5

Algorithm       MAE (%)   Std (%)
SVMreg          16.89     17.92
MLP             10.94     9.99
RBF             20.32     19.68
C&R tree        25.43     22.57
Random forest   22.19     19.89

The NN structure, including the number of hidden units and the types of activation functions for hidden and output neurons, has a significant impact on prediction accuracy. In this research, 100 NNs with different structures were trained for each of the MLP and RBF algorithms. Only the best-performing MLP and RBF networks were retained; their prediction error statistics are reported in Table 6. Figure 5 shows the first 100 observed and predicted wind farm power values from data set 3 of Table 5 (the 6 h-ahead prediction). The predicted power follows the trend of the observed power. Some of the predicted values match the observed values closely, while others do not.

Figure 5. The 6 h-ahead prediction of wind farm power (observed and predicted power in kW for the first 100 hourly-average test data points)

Short-term Prediction Based on MLP Algorithm

The MLP algorithm has been selected for training all 12 prediction models (1) in this research. Following the steps of Section 3.1.1, 12 direct prediction models for short-term wind farm power were built. Table 7 summarizes the prediction accuracy of the models built for each prediction horizon. Note that in Table 7, t + 1 means 1 h-ahead prediction and t + 12 means 12 h-ahead prediction.

Table 7. Error statistics of the MLP direct prediction model of short-term wind farm power

Prediction   MAE (%)   Std (%)
t + 1        9.28      8.12
t + 2        9.35      8.21
t + 3        9.76      8.69
t + 4        9.36      8.32
t + 5        9.97      8.93
t + 6        10.49     9.99
t + 7        9.82      9.19
t + 8        10.57     9.91
t + 9        8.41      8.73
t + 10       11.06     10.63
t + 11       11.19     9.08
t + 12       11.49     10.53

The direct prediction model for short-term wind farm power from t + 1 to t + 12 can be realized by building 12 MLP prediction models. The results in Table 7 show that the prediction performance is stable and robust at the prediction horizons t + 1 through t + 12. Figure 6(a)–(c) illustrates the first 100 predicted and observed power values of the models for the 3, 8 and 12 h-ahead predictions, respectively. Figure 7(a) and (b) illustrates the MAE and Std, two important metrics of prediction accuracy, for the 12 forecasting horizons of the short-term MLP prediction model.
Figure 6. Direct prediction of short-term wind farm power by MLP: (a) 3 h-ahead prediction; (b) 8 h-ahead prediction; (c) 12 h-ahead prediction (each panel shows observed and predicted power in kW for the first 100 hourly-average test data points)

The Integrated Prediction Model and Comparison

The k-NN Model for Wind Farm Power Curve Prediction

Previous research20 has shown that the k-NN model is accurate for prediction of the wind farm power curve given the wind speed as input. The predictor for the k-NN model is the average wind speed measured at the nacelle of every turbine of the wind farm. Using the average wind speed as input to the k-NN model, the wind farm power can be predicted accurately when the wind farm is operating under normal conditions. The normal conditions exclude wind speed that is too low or too high, turbines undergoing maintenance and low power output due to control issues and environmental issues.

To predict hourly power, every six consecutive data points were aggregated into hourly power (the average of six measured power values), and the hourly wind speed was aggregated in the same way. The data set used in the analysis is shown in Table 5. Note that to develop a prediction model with the k-NN algorithm, the weather forecasting data in Table 5 is not used; only the hourly power and wind speed from the SCADA data are used. Data set 1 was divided into two data subsets, data set 2 and data set 3. Data set 2 was used to develop a prediction model with the k-NN algorithm. Data set 3 was used to test the prediction performance of the model learned from data set 2. Table 8 shows the error statistics of the k-NN model over data set 3 of Table 5. The prediction model trained by k-NN provides accurate, stable and robust predictions given the hourly wind speed as predictor.
Figure 7. MAE and Std of the direct prediction model for short-term wind farm power by MLP: (a) MAE; (b) Std (vertical axes: MAE (%) and Std (%); horizontal axis: prediction horizon, t + 1 to t + 12)
Table 8. Error statistics of the k-NN model

Algorithm        MAE (%)   Std (%)
k-NN (k = 25)    1.556     1.465
Comparison of the Integrated and Direct Model

Predicting wind farm power with the k-NN power curve model calls for a wind speed prediction model. Following the method described in Section 2.5.2, a wind speed prediction model can be built with the same procedure as the direct prediction model for wind farm power. The wind speed prediction model uses the same predictors as the direct prediction model expressed in equation (1) of Section 2.5.1, namely the RUC data after feature selection and PCA transformation. However, the output ŷ in equation (1) is the wind speed rather than the wind farm power. The five data-mining algorithms used in building the direct prediction model were used to train different wind speed prediction models, and again the MLP network algorithm outperformed the other four algorithms. The integrated model for short-term power prediction is therefore composed of the MLP and k-NN algorithms: the wind speed is first predicted with the MLP model, and the predicted wind speed is then used to predict the wind farm power with the k-NN model. As the model-building procedure is straightforward, the detailed process is not shown in this paper. The statistics of the prediction performance of the integrated model are shown in Table 9.

Table 9. Error statistics of the integrated prediction model for short-term wind farm power

Prediction   MAE (%)   Std (%)   Prediction   MAE (%)   Std (%)
t + 1        10.21     8.93      t + 7        10.21     9.56
t + 2        10.28     9.03      t + 8        10.99     10.31
t + 3        10.54     9.38      t + 9        9.67      10.04
t + 4        10.11     8.98      t + 10       12.72     12.22
t + 5        11.17     10.01     t + 11       12.08     9.81
t + 6        11.75     11.19     t + 12       12.41     11.37

Figure 8. Example of the long-term prediction generated at 00:00

Comparing the error statistics of Tables 7 and 9 shows that the integrated model is less accurate than the direct prediction model at all 12 prediction horizons. The computational experience reported in Section 3.2.1 showed that the k-NN algorithm provided accurate power curve predictions. Although the k-NN model and the wind speed prediction model performed well individually, the integrated model produced a larger error when predicting future power. This could be because power is a cubic function of the wind speed, as indicated by the wind power density function (3) of Section 2.5.2. As a result, the wind farm power predicted by the k-NN model is highly sensitive to its wind speed input, so a small error in wind speed prediction may lead to a large error in wind farm power prediction. Integrating the two models therefore did not improve prediction accuracy. Even accurate wind speed prediction cannot guarantee accurate wind farm power prediction; it is therefore better to predict wind farm power directly than to predict wind speed first.
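The sensitivity argument can be made concrete with a first-order error propagation. The sketch below assumes the idealized cubic relation between power P and wind speed v; the symbols P, v and c and the 5% example are illustrative and are not taken from the paper. A farm-level power curve flattens near rated power, so the amplification applies only in the cubic region of the curve.

```latex
P = c\,v^{3}
\quad\Longrightarrow\quad
\frac{\mathrm{d}P}{P} = 3\,\frac{\mathrm{d}v}{v},
\qquad
\frac{P(1.05\,v)}{P(v)} = 1.05^{3} \approx 1.158 .
```

Under this assumption, a 5% overestimate of the wind speed corresponds to an overestimate of roughly 16% in power, that is, about three times the relative error of the input.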
Case Study of Long-term Prediction

For the long-term prediction model expressed in equation (2), T = 3, 6, 9, . . . , 84 h. The prediction model built here follows the same forecasting steps and horizon as the NAM model. To predict wind farm power from 3 to 84 h ahead, 28 prediction models need to be constructed, one for each long-term prediction horizon. The long-term wind farm power prediction model has the following features:
• Day-ahead prediction with a maximum forecast length of 84 h.
• A new 84 h forecast is issued four times daily, at 00, 06, 12 and 18 GMT.
• A forecast value is saved every 3 h.
Note that the output of the prediction model is the 3 h-power (average power over a 3 h interval), and ŷ(t + T) is the predicted 3 h-power between t + (T − 3) and t + T. For example, if the run-time t is 5/29/06 00:00, the long-term prediction can generate predicted 3 h-power from 5/29/06 03:00 to 6/1/06 12:00. Figure 8 shows an example of the forecasting horizon of the long-term prediction model at 00:00, and this model will be built in Sections 4.1 and 4.2. Note that not all 28 prediction horizons from t + 3 to t + 84 are shown in Figure 8. As each run of the NAM model takes less than 6 h to compute, the operational forecasting horizon of the long-term wind farm power prediction starts from t + 6. Predictions 6 h or more ahead can therefore be realized in practice; in this paper, however, the 3 h-ahead prediction is also considered in order to validate the methodology for long-term prediction.
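The forecast schedule described above can be enumerated directly. The following sketch uses only the values stated in the text (3 h steps, 84 h maximum length, run-time 5/29/06 00:00); the function name `forecast_intervals` is illustrative.

```python
from datetime import datetime, timedelta

STEP_H = 3    # a forecast value is saved every 3 h
MAX_H = 84    # maximum forecast length


def forecast_intervals(run_time, step_h=STEP_H, max_h=MAX_H):
    """Return (T, start, end) for each horizon T = step_h, ..., max_h.
    The 3 h-power for horizon T covers t + (T - 3) to t + T."""
    intervals = []
    for T in range(step_h, max_h + 1, step_h):
        start = run_time + timedelta(hours=T - step_h)
        end = run_time + timedelta(hours=T)
        intervals.append((T, start, end))
    return intervals


run_time = datetime(2006, 5, 29, 0, 0)
intervals = forecast_intervals(run_time)
print(len(intervals))                      # number of horizons
print(intervals[0][2], intervals[-1][2])   # first and last end time stamps
```

Each tuple corresponds to one of the prediction models, so the length of the list equals the number of models that must be trained.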
Direct Prediction Model for Wind Farm Power

Algorithm Selection

Five data-mining algorithms (the same as used previously) were selected to train the direct prediction model (2) for long-term wind farm power. For long-term prediction, 28 prediction models with different forecasting horizons (t + 3, t + 6, . . . , t + 84) need to be constructed. To select one uniform algorithm for training the 28 prediction models, the model that predicts 45 h ahead (t + 45) was used to evaluate the five data-mining algorithms. As the original power data recorded by SCADA are at 10 min intervals, every 18 consecutive data points were aggregated into a 3 h-power (the average of 18 measured power values). The principal components transformed from the NAM data for the 45 h-ahead prediction and the 3 h-power from the SCADA data resulted in 141 instances (data set 1 in Table 10). As only a 6-week-long NAM data set was available in this research, and the power record is aggregated from 10 min data to 3 h data, the number of training and testing data points for long-term prediction is much smaller than that for short-term prediction. The run-time of data set 1 in Table 10 starts at 5/29/06 00:00 and continues to 7/13/06 18:00. During this time period, the overall wind farm performance was normal. Data set 1 was divided into two subsets, data set 2 and data set 3. Data set 2 contains 113 data points and was used to develop a prediction model with the data-mining algorithms. Data set 3 includes 28 data points and was used to test the prediction performance of the model learned from data set 2. Note that the time stamps used in Table 10 are the run-time t rather than the time stamp t + T. The five data-mining algorithms are compared using the metrics for short-term prediction in Section 3.1.1 (MAE in equation (5) and Std in equation (6)). Table 11 summarizes the prediction accuracy of the models trained by the different data-mining algorithms on data set 3 of Table 10. The MLP outperformed the other algorithms in both short- and long-term prediction. Therefore, the MLP network algorithm was selected for building the direct prediction models for long-term wind farm power.
Table 10. The data set description of 45 h-ahead prediction

Data set   Start time stamp   End time stamp   Description
1          5/29/06 00:00      7/13/06 18:00    Total data set; 141 observations
2          5/29/06 00:00      6/26/06 12:00    Training data set; 113 observations
3          6/26/06 18:00      7/13/06 18:00    Test data set; 28 observations
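The data preparation behind Table 10 can be sketched as two small steps: averaging blocks of 18 consecutive 10 min readings into a 3 h-power, and splitting the instances chronologically into 113 training and 28 test points. The helper names and the synthetic readings below are illustrative assumptions; dropping a trailing incomplete block is also an assumption, as the paper does not say how incomplete intervals were handled.

```python
def block_average(values, block):
    """Average consecutive, non-overlapping blocks of `block` values.
    A trailing incomplete block is dropped (an assumption)."""
    n_blocks = len(values) // block
    return [sum(values[i * block:(i + 1) * block]) / block
            for i in range(n_blocks)]


def chronological_split(instances, n_train):
    """Keep time order: the first n_train instances train, the rest test."""
    return instances[:n_train], instances[n_train:]


# Synthetic 10 min power readings (kW), purely illustrative.
ten_min_power = [1000.0 * (i % 36) for i in range(18 * 4)]
three_hour_power = block_average(ten_min_power, block=18)

# 141 placeholder instances stand in for data set 1 of Table 10.
data_set_1 = list(range(141))
data_set_2, data_set_3 = chronological_split(data_set_1, n_train=113)
print(len(three_hour_power), len(data_set_2), len(data_set_3))
```

Splitting by time rather than at random keeps the test period strictly after the training period, which matches the time stamps in Table 10.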
Table 11. Error statistics of different models based on data set 3 of Table 10

Algorithm       MAE (%)   Std (%)
SVMreg          15.93     17.84
MLP             11.87     9.87
RBF             31.51     27.36
C & R tree      28.76     26.94
Random forest   25.69     21.86
Figure 9 shows the observed and predicted wind farm power from data set 3 of Table 10 (the 45 h-ahead prediction). The predicted power follows the decreasing and increasing trend of the observed power. Some of the predicted values match the observed values closely, while others do not.

Long-term Prediction Results

The MLP network algorithm was selected to train all 28 prediction models (2) in this research. Following the same steps as in Section 4.1.1, 28 direct prediction models for long-term wind farm power were built. Table 12 summarizes the prediction accuracy of the models built for different prediction horizons. Note that only 16 of the 28 prediction results are shown in this table, as 16 results are sufficient to demonstrate the performance of the methods and models built in this research. The direct prediction of long-term wind farm power from t + 3 to t + 84 can thus be realized by building 28 MLP prediction models. The prediction performance is stable and robust across prediction horizons, with the MAE between 5.93% and 13.82% over the 16 horizons shown in Table 12. Figure 10(a)–(c) illustrates the predicted and observed power of models with different prediction horizons, namely the 21, 57 and 81 h-ahead predictions, respectively. Figure 11(a),(b) illustrates the MAE and standard deviation (Std), two important metrics of prediction accuracy, for different forecasting horizons of the long-term prediction model.
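The one-model-per-horizon strategy can be sketched as follows. This is an organizational illustration only: `MeanModel` is a deliberately trivial placeholder learner, not an MLP, and the record layout (T, predictors, observed 3 h-power) and all names are assumptions made for the example.

```python
from collections import defaultdict

HORIZONS = list(range(3, 85, 3))   # T = 3, 6, ..., 84 h


class MeanModel:
    """Placeholder learner (NOT an MLP): predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def train_per_horizon(records, make_model=MeanModel):
    """records: iterable of (T, predictors, observed_3h_power).
    Returns {T: fitted model}, one independent model per horizon."""
    grouped = defaultdict(lambda: ([], []))
    for T, x, y in records:
        grouped[T][0].append(x)
        grouped[T][1].append(y)
    return {T: make_model().fit(X, y) for T, (X, y) in grouped.items()}


# Synthetic records, purely illustrative: two instances per horizon.
records = [(T, [float(T)], 1000.0 * T + d)
           for T in HORIZONS for d in (-50.0, 50.0)]
models = train_per_horizon(records)
print(len(models), models[45].predict([[45.0]]))
```

Because each horizon has its own model, any learner with `fit` and `predict` methods could be passed as `make_model` in place of the placeholder.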
Figure 9. The 45 h-ahead prediction of the wind farm power by MLP. [Plot omitted; observed and predicted power (kW) against the testing data index (3 h average).]
Table 12. Error statistics of direct prediction MLP model of long-term wind farm power

Prediction   MAE (%)   Std (%)   Prediction   MAE (%)   Std (%)
t + 3        5.93      4.23      t + 45       12.87     10.23
t + 9        9.12      8.91      t + 51       10.97     10.92
t + 15       9.92      8.04      t + 57       13.82     9.61
t + 21       9.39      7.28      t + 63       11.88     9.95
t + 27       10.35     6.41      t + 69       9.56      7.68
t + 33       11.81     12.24     t + 75       10.83     9.32
t + 39       11.63     7.79      t + 81       6.37      6.19
t + 42       11.49     10.06     t + 84       10.57     8.78
Figure 10. Direct prediction of long-term wind farm power by MLP: (a) 21 h-ahead prediction; (b) 57 h-ahead prediction; (c) 81 h-ahead prediction
Figure 11. MAE and Std of the direct prediction model for long-term wind farm power by MLP: (a) MAE; (b) Std

Integrated Prediction Model and Comparison

The integrated prediction model for long-term wind farm power follows the same procedure as that of Section 3.2 for the short-term model. The long-term wind speed prediction model uses the same predictors as the direct prediction model expressed in equation (2) of Section 2.5.1, namely the NAM data after feature selection and PCA transformation. However, the output ŷ in equation (2) is the wind speed rather than the wind farm power. The MLP network algorithm outperforms the other four algorithms when training the wind speed prediction model. The integrated model for long-term power prediction is therefore composed of the MLP and k-NN algorithms: the wind speed is first predicted with the MLP model, and the predicted wind speed is then used to predict the wind farm power with the k-NN model. The statistics of the prediction performance of the integrated model are shown in Table 13.

Table 13. Error statistics of integrated prediction models for long-term wind farm power

Prediction   MAE (%)   Std (%)   Prediction   MAE (%)   Std (%)
t + 3        6.22      4.44      t + 45       14.01     11.13
t + 9        9.57      9.35      t + 51       11.93     11.88
t + 15       11.11     9.01      t + 57       16.72     11.62
t + 21       10.52     8.15      t + 63       14.37     12.04
t + 27       11.59     7.18      t + 69       11.09     8.91
t + 33       13.34     13.83     t + 75       12.56     10.81
t + 39       13.14     8.80      t + 81       7.01      6.81
t + 42       12.98     11.37     t + 84       11.62     9.65

Comparing the error statistics of Tables 12 and 13 shows that the integrated model does not improve prediction accuracy at any of the long-term prediction horizons, compared with the direct prediction model in Section 4.1. The reason for the difference in predicting the power output has been discussed in Section 3.2.2. Accurate wind speed prediction can improve wind farm power prediction but cannot guarantee it. Determining power from the predicted wind speed is not a reliable approach to wind farm power prediction, as the power prediction accuracy is sensitive to the wind speed. A small error in wind speed prediction can lead to a large error in wind farm power prediction, so even accurate wind speed prediction and k-NN models cannot guarantee accurate wind farm power prediction.
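The comparison between the two approaches can be checked numerically from the reported values. The sketch below copies the MAE columns of Tables 12 and 13 and computes the per-horizon gap; the variable and function names are illustrative.

```python
HORIZONS = [3, 9, 15, 21, 27, 33, 39, 42, 45, 51, 57, 63, 69, 75, 81, 84]

# MAE (%) copied from Table 12 (direct) and Table 13 (integrated).
MAE_DIRECT = [5.93, 9.12, 9.92, 9.39, 10.35, 11.81, 11.63, 11.49,
              12.87, 10.97, 13.82, 11.88, 9.56, 10.83, 6.37, 10.57]
MAE_INTEGRATED = [6.22, 9.57, 11.11, 10.52, 11.59, 13.34, 13.14, 12.98,
                  14.01, 11.93, 16.72, 14.37, 11.09, 12.56, 7.01, 11.62]


def mae_gap(direct, integrated):
    """Per-horizon increase in MAE (percentage points) of the integrated model."""
    return [round(i - d, 2) for d, i in zip(direct, integrated)]


gaps = mae_gap(MAE_DIRECT, MAE_INTEGRATED)
worse_everywhere = all(g > 0 for g in gaps)
largest = max(zip(gaps, HORIZONS))   # (gap, horizon) with the largest gap
print(worse_everywhere, largest)
```

A positive gap at every listed horizon means the integrated model has a higher MAE than the direct model throughout the 16 reported horizons.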
Conclusion

In this paper, a short-term prediction model with a maximum forecast length of 12 h and a long-term prediction model with a maximum forecast length of 84 h were built using weather forecasting data as predictors. The boosting tree algorithm and PCA transformation were used to reduce the dimension of the predictor data and enhance prediction accuracy. Among the five data-mining algorithms considered in this research, the MLP network algorithm (a neural network algorithm) outperformed the other four in building both short- and long-term prediction models.

Two methods for building prediction models were compared. The integrated models (integrated k-NN and MLP models), which used the wind speed predicted by the MLP model as input to the k-NN model to predict wind farm power, turned out to provide less accurate and less stable predictions than the direct prediction models in both the short and the long term. Both short- and long-term prediction models predicted the wind farm power well at different time scales and horizons. The accuracy of a prediction model depends strongly on its predictors, the weather forecasting data: the more accurate the weather forecasting data, the better the prediction performance of the model. Unlike the time series and persistence models, the prediction models based on weather forecasting data showed no obvious tendency for the error to increase as the prediction horizon became longer. However, for predictions within a 5 h horizon, the time series (persistence) prediction model, using wind farm data as input, could outperform the MLP direct prediction model, using weather forecasting data as input.

One limitation of this paper is that no specific information was available about the terrain, the wind farm location or the weather-forecasting grid points, information that would allow the data-mining results to be explained using existing theories. The other limitation is that only 3 months of SCADA data from one wind farm were available, and thus the seasonal performance of the proposed methodology could not be validated. These two limitations are due to the current data-sharing practices of the wind energy industry. However, once more data are accessible, the proposed models can be further tested.

The long-term prediction models are powerful tools for operation management of the wind energy market, and the short-term prediction model can be helpful to the on-site management of the wind farm. The ultimate goal of the research presented in this paper is to further improve the accuracy of prediction models for wind farms. One avenue to be pursued in future research is to incorporate on- and off-site observations, in addition to weather forecasting data, into the prediction model. Other data-mining algorithms, e.g. cascade neural networks or fuzzy logic, may further enhance performance.
References

1. Landberg L. Short-term prediction of the power production from wind farms. Journal of Wind Engineering and Industrial Aerodynamics 1999; 80:207–220.
2. Mohandes MA, Rehman S, Halawani TO. A neural networks approach for wind speed prediction. Renewable Energy 1998; 13:345–354.
3. Lange M, Focken U. Physical Approach to Short-Term Wind Power Prediction. Springer-Verlag: Berlin, Heidelberg, 2006.
4. Barbounis TG, Theocharis JB, Alexiadis MC, Dokopoulos PS. Long-term wind speed and power forecasting using local recurrent neural network models. IEEE Transactions on Energy Conversion 2006; 21:273–284.
5. Damousis IG, Alexiadis MC, Theocharis JB, Dokopoulos PS. A fuzzy model for wind speed prediction and power generation in wind parks using spatial correlation. IEEE Transactions on Energy Conversion 2004; 19:352–361.
6. Sfetsos A. A novel approach for the forecasting of the mean hourly wind speed time series. Renewable Energy 2002; 27:163–174.
7. Torres JL, Garcia A, De Blas M, De Francisco A. Forecast of hourly average wind speed with ARMA models in Spain. Solar Energy 2005; 79:65–77.
8. Kusiak A, Song Z. Combustion efficiency optimization and virtual testing: a data-mining approach. IEEE Transactions on Industrial Informatics 2006; 2:176–184.
9. Harding JA, Shahbaz M, Srinivas S, Kusiak A. Data mining in manufacturing: a review. ASME Transactions: Journal of Manufacturing Science and Engineering 2006; 128:969–976.
10. Berry MJA, Linoff GS. Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management (2nd edn). Wiley: New York, 2004.
11. Backus P, Janakiram M, Mowzoon S, Runger GC, Bhargava A. Factory cycle-time prediction with data-mining approach. IEEE Transactions on Semiconductor Manufacturing 2006; 19:252–258.
12. Tan PN, Steinbach M, Kumar V. Introduction to Data Mining. Pearson Education/Addison Wesley: Boston, 2006.
13. Rapid update cycle. [Online]. Available: http://en.wikipedia.org/wiki/Rapid_Update_Cycle. (Accessed 20 March 2008).
14. North American mesoscale model. [Online]. Available: http://en.wikipedia.org/wiki/North_American_Mesoscale_Model. (Accessed 20 April 2008).
15. Smola AJ, Schoelkopf B. A tutorial on support vector regression. Statistics and Computing 2004; 14:199–222.
16. Bishop CM. Neural Networks for Pattern Recognition. Oxford University Press: New York, 1995.
17. Espinosa J, Vandewalle J, Wertz V. Fuzzy Logic, Identification and Predictive Control. Springer-Verlag: London, 2005.
18. Johnson RA, Wichern DW. Applied Multivariate Statistical Analysis (4th edn). Prentice Hall: Upper Saddle River, NJ, 2005.
19. Spera DA (Ed.). Wind Turbine Technology: Fundamental Concepts of Wind Turbine Engineering. ASME: New York, 1994.
20. Kusiak A, Zheng HY, Song Z. Models for monitoring of wind farm power. Renewable Energy (forthcoming).
21. Shevade SK, Keerthi SS, Bhattacharyya C, Murthy KRK. Improvements to the SMO algorithm for SVM regression. IEEE Transactions on Neural Networks 2000; 11:1188–1193.
22. Seidel P, Seidel A, Herbarth O. Multilayer perceptron tumor diagnosis based on chromatography analysis of urinary nucleoside. Neural Networks 2007; 20:646–651.
23. Witten IH, Frank E. Data Mining: Practical Machine Learning Tools and Techniques (2nd edn). Morgan Kaufmann: San Francisco, CA, 2005.
24. Breiman L, Friedman J, Olshen RA, Stone CJ. Classification and Regression Trees. Wadsworth International: Monterey, CA, 1984.
25. Breiman L. Random forests. Machine Learning 2001; 45:5–32.
26. Friedman JH. Stochastic gradient boosting. Computational Statistics & Data Analysis 2002; 38:367–378.
27. Friedman JH. Greedy function approximation: a gradient boosting machine. Annals of Statistics 2001; 29:1189–1232.