Ensemble learning for wind profile prediction with missing values

Neural Comput & Applic DOI 10.1007/s00521-011-0708-1

ISNN 2011

Haibo He • Yuan Cao • Yi Cao • Jinyu Wen

Received: 27 February 2011 / Accepted: 5 July 2011
© Springer-Verlag London Limited 2011

Abstract In this paper, we aim to develop computational intelligence approaches for wind profile prediction. Specifically, we focus on two aspects in this work. First, we investigate missing value recovery for wind data. Due to the complexity of data collection in such processes, wind data normally include missing values. Therefore, how to effectively recover such missing values for learning and prediction is an important aspect of wind profile prediction. Second, we develop an ensemble learning approach based on multiple neural network models. Our proposed method uses a new strategy based on temporal information to assign the weights for each model dedicated to wind profile prediction to achieve better prediction performance. Various simulation studies and statistical testing demonstrate the effectiveness of our approach.

Keywords Ensemble learning · Wind profile prediction · Missing value recovery · Neural Network · Machine Learning

This work was performed when Yi Cao was a Visiting Scholar at the Department of Electrical, Computer, and Biomedical Engineering at the University of Rhode Island.

H. He (✉) Department of Electrical, Computer, and Biomedical Engineering, University of Rhode Island, Kingston, RI 02881, USA e-mail: [email protected]
Y. Cao Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA e-mail: [email protected]
Y. Cao School of Electrical and Automation Engineering, Nanjing Normal University, Nanjing, Jiangsu 210042, People’s Republic of China e-mail: [email protected]
J. Wen College of Electrical and Electronic Engineering (CEEE), Huazhong University of Science and Technology, Wuhan 430074, People’s Republic of China e-mail: [email protected]

1 Introduction

With the continuing growth in energy demand and environmental concerns, renewable energy has become a critical research topic worldwide. Consumers supporting national energy independence are confronted with an increasing panoply of alternatives, including solar systems, wind farms, geothermal energy, and others, whose value differs with terrain, regional variations, weather conditions, and many other factors. Furthermore, the driving forces toward lower-carbon generation technologies, improved efficiency of power delivery, and increased user-supplier interaction present ever greater challenges to our society [2, 3, 18]. This has created immense opportunities for academia, industry, and government to develop innovative technologies and solutions to facilitate the development of a smart and efficient energy system.

While the entire smart grid system is an extremely complicated social-technological system, in this paper we focus on wind profile prediction to support wind energy development. Accurate wind profile prediction is one of the important research issues for wind energy systems because it plays an important role in the planning and design of wind farms. Due to the complex interactions among large-scale geometrical parameters such as surface


conditions, pressure, temperature, and direction, wind forecasting has been considered a very challenging task in the community [39]. Many approaches have been presented in the literature, including traditional linear autoregressive moving average (ARMA) models [5, 22], radial basis functions [6], neural networks [29], recurrent neural networks [24], and fuzzy logic methods [23], among others.

In this paper, we investigate this issue to tackle two important challenges. First, how to handle the missing values of wind profile data is of critical importance for wind prediction. Due to the complexity of collecting wind profile data, such data sets normally include missing values. Conventionally, there are two major approaches to handling missing values. The first approach is to simply remove (ignore) all the data examples with any missing feature values. Simple as it is, this approach risks losing useful information needed to support the decision-making process. The second approach includes various imputation techniques to recover the missing feature values [10, 30], including mean imputation [32, 42], hot deck imputation [38, 42, 44], regression-based imputation [4], expectation-maximization (EM)-based imputation [30], multiple imputation [36, 37], and many others. All these methods provide useful tools and have been applied successfully to missing data recovery in many fields. In this paper, we tackle this problem by using the Mahalanobis distance of the data distribution to recover the missing values of wind data.

Second, since ensemble learning has recently demonstrated many successful applications across different domains, we investigate the ensemble learning approach for wind profile prediction. In general, ensemble learning refers to the procedure of developing and integrating multiple hypotheses (e.g., classifiers, experts) to support the decision-making process. Briefly speaking, ensemble learning has the advantage of improved accuracy and robustness compared to single-model-based learning methods [25]; therefore, it has attracted growing attention in the computational intelligence community. There are two critical issues related to ensemble learning. First, how can one develop multiple hypotheses in a principled way? For instance, hypothesis diversity plays a critical role in a successful ensemble learning methodology [26–28, 31, 35]; therefore, how to systematically develop such diversified hypotheses has become a critical issue. Some popular approaches include bootstrap aggregating (bagging) [7], adaptive boosting (AdaBoost) [11, 12], subspace methods [9, 14, 16], stacked generalization [45], mixture of experts [19–21], and ensemble learning with imbalanced data [47]. Second, how can one strategically integrate the outputs of each individual hypothesis for an improved final decision?


This problem is normally referred to as the combining rule; examples include the geometric average, arithmetic average, median value, and majority voting methods, among others. Interested readers can refer to [25, 43] for a detailed discussion. Although there are many results in the literature on ensemble learning, very few have been reported on using such techniques for wind profile prediction. In this work, we adopt a block bootstrap mechanism to demonstrate the power of ensemble learning for wind data prediction. We also propose a new weighting strategy to combine the multiple prediction models, which specifically considers the temporal information for wind prediction and has demonstrated competitive performance.

The rest of this paper is organized as follows. Section 2 presents the Mahalanobis distance-based missing value recovery for wind profile data. In Sect. 3, a multiple neural network model with a new weighting strategy is presented in detail. Section 4 presents the wind data set we used in our current analysis and the simulation results based on our approach. Finally, a conclusion is given in Sect. 5.

2 Missing value recovery for wind profile data

Missing feature values are an unavoidable problem in many real-world applications. In fact, due to the complexity of collecting wind profile data, the missing value issue is an especially common and serious problem in wind prediction. In this work, we adopt an approach based on the Mahalanobis distance to recover the missing values for wind profile analysis [40, 41, 46]. Here, we consider a general case of data $X_L$ with some missing values in the n-dimensional feature space X ($X_L$ can be a single data example or multiple data examples). Therefore, $X_L$ can be represented by:

$$X_L = [X_a \;\; X_b] \tag{1}$$

where $X_a$ represents the features with known values, whereas $X_b$ represents the features with missing values. The objective here is to recover $X_b$ so as to minimize the Mahalanobis distance to the cluster data from the same class [40, 41, 46]:

$$X_b = \arg\min_{\hat{X}_b} \, d(\hat{X}_b) \tag{2}$$

$$d(X) = (X - \mu_c)\, C_c^{-1}\, (X - \mu_c)^T \tag{3}$$

where d(X) is the Mahalanobis distance, and $\mu_c$ and $C_c$ are the mean vector and covariance matrix of all examples from a given class c, respectively. To find the solution of (2), one can set the derivative of d(X) to zero:


$$\left. \frac{\partial d(X)}{\partial X_b} \right|_{X_b = \hat{X}_b} = 0 \tag{4}$$

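Meanwhile, the inverse of the covariance matrix $C_c$ in (3) can be partitioned into known and missing value parts according to the data $X_L$:

$$C_c^{-1} = \begin{bmatrix} P_{aa} & P_{ab} \\ P_{ba} & P_{bb} \end{bmatrix} \tag{5}$$

where $P_{ba} = P_{ab}^T$ because $C_c$ is symmetric. Therefore,

$$\left[ \frac{\partial d}{\partial X_1} \;\cdots\; \frac{\partial d}{\partial X_n} \right] = 2\,\{[X_a \;\; X_b] - \mu_c\} \begin{bmatrix} P_{aa} & P_{ab} \\ P_{ba} & P_{bb} \end{bmatrix} = 0 \tag{6}$$

The components of (6) associated with the missing features give $(X_a - \mu_{ca}) P_{ab} + (X_b - \mu_{cb}) P_{bb} = 0$, so the missing feature part $X_b$ can be recovered by:

$$X_b = -(X_a - \mu_{ca})\, P_{ab}\, P_{bb}^{-1} + \mu_{cb} \tag{7}$$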

where $\mu_{ca}$ and $\mu_{cb}$ are the parts of $\mu_c$ that correspond to the known features and the missing features, respectively. In this paper, we investigate the application of this approach to recover the missing values in wind profile data.

Before we proceed to discuss the proposed ensemble learning method for wind profile prediction, we first give examples on a synthetic data set and a real-world data set to show how the missing value recovery works. The synthetic data set used in this example includes 1,000 data instances randomly drawn from a three-dimensional joint Gaussian distribution. The mean of the distribution is $[20, 25, 30]^T$ and the covariance matrix is [2, 1.5, 1.4; 1.8, 2, 0.5; 2, 0.2, 3]. We first remove 10 feature values from feature 2 and another 10 feature values from feature 3 and verify the recovered results. Figure 1 demonstrates the recovered missing value distribution in this synthetic data set. The black dots represent all the data without missing

Table 1 The original values and recovered values for the synthetic data set

Feature 2  Original   25.01  23.16  21.31  24.72  25.52
           Recovered  24.38  24.43  21.73  24.83  24.98
           Original   24.96  24.49  23.74  26.30  28.04
           Recovered  24.32  25.89  24.61  25.64  26.26
Feature 3  Original   30.97  30.67  30.22  30.69  28.54
           Recovered  28.93  29.66  29.96  30.85  29.32
           Original   31.83  27.29  29.78  29.69  30.60
           Recovered  31.84  29.14  30.34  30.71  29.78

values, the red circles represent the recovered data that were originally missing the second feature value, and the green crosses represent the recovered data that were originally missing the third feature value. Table 1 summarizes the true original values and the values recovered by our method.

The real-world data set used in this paper is the Abalone data set (4177 examples, 8 features, and 29 classes) from the UCI Machine Learning Repository. Because there are no missing values in the original data set, ten examples are randomly selected, and their "whole weight" feature values are removed to serve as missing feature values. After the recovery process by the proposed algorithm, Fig. 2 illustrates the recovered data set in a two-dimensional space ("length" versus "whole weight"), and the corresponding numerical values of the original known and recovered feature values are also presented. From the results in both examples, one can see that the recovered values are reasonably close to their corresponding true original values; therefore, we hope this method can provide a useful technique for missing value recovery in wind profile data.
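The recovery rule in (7) can be illustrated with a short NumPy sketch. This is a minimal illustration under stated assumptions and not the authors' implementation: the function name `recover_missing`, the boolean `missing` mask, and the example mean and covariance are all hypothetical, and the class statistics $\mu_c$ and $C_c$ are assumed to be already estimated from complete examples of the same class.

```python
import numpy as np

def recover_missing(x, missing, mu, cov):
    """Fill the missing entries of one row vector x following Eq. (7).

    x       : 1-D array; entries flagged in `missing` are ignored (may be NaN)
    missing : boolean mask, True where the feature value is missing
    mu, cov : class mean vector and covariance matrix (assumed already estimated)
    """
    x = np.asarray(x, dtype=float).copy()
    mu = np.asarray(mu, dtype=float)
    missing = np.asarray(missing, dtype=bool)
    known = ~missing

    # Inverse covariance, partitioned into known/missing blocks as in Eq. (5).
    P = np.linalg.inv(np.asarray(cov, dtype=float))
    P_ab = P[np.ix_(known, missing)]
    P_bb = P[np.ix_(missing, missing)]

    # Eq. (7): X_b = -(X_a - mu_a) P_ab P_bb^{-1} + mu_b
    x[missing] = mu[missing] - (x[known] - mu[known]) @ P_ab @ np.linalg.inv(P_bb)
    return x

# Hypothetical example values (a symmetric positive definite covariance).
mu_example = np.array([20.0, 25.0, 30.0])
cov_example = np.array([[2.0, 1.0, 0.5],
                        [1.0, 2.0, 0.3],
                        [0.5, 0.3, 3.0]])
x_example = np.array([21.0, np.nan, 29.0])   # feature 2 is missing
recovered = recover_missing(x_example, np.isnan(x_example), mu_example, cov_example)
```

For Gaussian-like class statistics, the value returned by this rule coincides with the conditional mean of the missing features given the known ones, which is one way to sanity-check an implementation.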

Fig. 1 Demonstration of the recovered missing values in the 3-dimensional synthetic data set (axes: Feature 1, Feature 2, Feature 3)

Fig. 2 Demonstration of the recovered missing values in Abalone data set


3 Ensemble learning for wind profile prediction

In this work, we propose to use ensemble learning approaches to forecast the wind speed based on historical weather features such as wind speed, dew point, direction, temperature, and others. Two ensemble learning strategies are presented in this paper. The first strategy is the bootstrap-based multiple neural network model, and the other is the weighted multiple neural network model.

Figure 3 illustrates the idea of the bootstrap-based multiple neural network model. For time series prediction, in order to capture the dependence structure, we use the block bootstrap mechanism for re-sampling. In this approach, blocks of consecutive observations are sampled with replacement from the training period. There are two types of block bootstrap: the nonoverlapping block bootstrap and the overlapping block bootstrap. In the nonoverlapping block bootstrap, the series is divided into a set of nonoverlapping blocks of fixed length. Then, the bootstrap sample is created by sampling (with replacement) from the set of blocks. The overlapping block bootstrap is similar to the nonoverlapping block bootstrap except that the blocks may contain the same examples, i.e., they may overlap. In this way, the blocks are created in a manner similar to a moving block (moving block bootstrap). Detailed discussions about the block size choice and related issues can be found in [34]. For each bootstrap sample, a neural network model is trained with random initial weights. The testing data sets are sent to all these neural networks, and their outputs are combined through a combination function to obtain the final predicted output. In this work, we use the arithmetic average combination function to combine the results from individual hypotheses. Suppose that, for testing data $x_{te}$, each hypothesis $h_k$ in the ensemble of hypotheses $H = \{h_1, h_2, \ldots, h_n\}$ produces an individual predicted result $\hat{y}_k$. Then, the final predicted value is $\hat{y} = \sum_{k=1}^{n} \hat{y}_k / n$.

Fig. 3 The bootstrap-based multiple neural network models

In the weighted multiple neural network models, we use the data of each month in the training data set to train a hypothesis and combine the outcomes from these hypotheses through a weighted combination function to predict the wind profile for the testing data set. Consider that we have historical observations from month $m_1$ to month $m_i$, and the target month is $m_{i+1}$, where $1 \le m_k \le 12$ and $m_k \in \mathbb{Z}$, $k = 1, \ldots, i+1$. Let the data of each month in the training data set be $D_k$, $k = 1, \ldots, i$. For each $D_k$, we train a hypothesis $h_k : X \to$
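The overlapping (moving) block bootstrap and the arithmetic average combination described in this section can be sketched as follows. This is only an illustrative sketch: all function names and default parameters are hypothetical, and a least-squares linear model is swapped in as a lightweight stand-in for the neural networks used in the paper, so that the example needs nothing beyond NumPy.

```python
import numpy as np

def moving_block_bootstrap(n_samples, block_len, rng):
    """Indices for one overlapping (moving) block bootstrap sample."""
    n_blocks = int(np.ceil(n_samples / block_len))
    # Overlapping blocks: any start that leaves room for a full block is allowed.
    starts = rng.integers(0, n_samples - block_len + 1, size=n_blocks)
    idx = np.concatenate([np.arange(s, s + block_len) for s in starts])
    return idx[:n_samples]  # trim to the original series length

def fit_linear(X, y):
    """Stand-in learner (least squares); the paper trains a neural network here."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_linear(w, X):
    return np.column_stack([X, np.ones(len(X))]) @ w

def bootstrap_ensemble_predict(X_train, y_train, X_test,
                               n_models=10, block_len=24, seed=0):
    """Train one model per block-bootstrap sample and average the predictions."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_models):
        idx = moving_block_bootstrap(len(X_train), block_len, rng)
        w = fit_linear(X_train[idx], y_train[idx])
        preds.append(predict_linear(w, X_test))
    # Arithmetic average combination: y_hat = (1/n) * sum_k y_hat_k
    return np.mean(preds, axis=0)
```

Sampling whole blocks rather than single observations keeps neighbouring time steps together within each block, which is the property the block bootstrap relies on to preserve the dependence structure of the series. The block length is a tuning choice; the value used here is arbitrary.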
