International conference on innovation advances and implementation of flood forecasting technology, 17 to 19 October 2005, Tromsø, Norway

IMPROVING DAILY STREAM FLOW FORECASTS BY COMBINING ARMA AND ANN MODELS

Wen Wang (1,2), P. van Gelder (1), J.K. Vrijling (1)
(1) TU Delft, Faculty of Civil Engineering & Geosciences, Section of Hydraulic Engineering, P.O. Box 5048, 2600 GA Delft, Netherlands
(2) Faculty of Water Resources and Environment, Hohai University, Nanjing, 210098, China

Abstract
Combined forecasting has attracted considerable attention in the hydrological community recently. In this study, an autoregressive moving average (ARMA) model, a periodic AR model, a normal multi-layer perceptron artificial neural network model and a periodic artificial neural network model are fitted to a univariate daily stream flow process for the upper Yellow River in China to forecast stream flows one to ten days ahead. Comparing the performance of these models, we find that while no model outperforms the others throughout all seasons of the year, each model shows its strength in some specific season(s). Four combination techniques, i.e., the simple average method (SAM), the rollingly-updated weighted average method, the semi-fixed weighted average method, and the modular semi-fixed weighted average method, are applied to combine the daily stream flow forecasts. The results show that SAM can improve the accuracy of forecasts with a four- to five-day lead time, and it generally performs best among the four competitive combination methods. Owing to its simplicity and robustness, SAM is recommended for improving stream flow forecast accuracy when none of the individual models to be combined performs consistently better or worse than the others.

Key words: ARMA, artificial neural network, forecast combination, stream flow, time series

INTRODUCTION

There are numerous models available for forecasting stream flows. However, when building a forecasting model, it is not an easy task to choose a suitable model, because no single model is powerful and general enough to outperform the others for all types of catchments and under all circumstances, and every model involves some degree of uncertainty, including structure uncertainty and parameter uncertainty. Instead of using a single model, we may alternatively handle the model selection problem by combining the forecasts from several models so as to obtain a more reliable and accurate output than would be obtained by selecting a single model. After the seminal paper of Bates and Granger (1969), many combining methods have been proposed, such as the simple average method, the weighted average method, Bayesian methods (Bunn, 1975; Winkler, 1981), the minimum-variance method (Dickinson, 1975) and regression-based methods (Granger and Ramanathan, 1984). More recently, Deutsch et al. (1994) proposed combining forecasts using changing weights derived from switching regression models or from smooth transition regression models; Donaldson and Kamstra (1996) developed a neural network based approach to the nonlinear combination of forecasts; Fiordaliso (1998) proposed a nonlinear forecast combining method in which a first-order Takagi-Sugeno fuzzy system is used to combine a set of individual forecasts; He and Xu (2005) proposed using self-organizing data mining algorithms to combine forecasts. Many studies and empirical tests have shown the advantage of combining forecasts in practice (see Clemen, 1989). Whilst combined forecasting has a long history in the econometrics community, it has not received much attention in the field of hydrological forecasting until recently. In the pioneering work of McLeod et al. (1987), it was shown that significant improvements in forecast performance can be achieved by combining forecasts produced by different types of models applied to quarter-monthly river flows. Shamseldin et al. (1997) examined three different combination methods in the context of flood forecasting, namely, the simple average method, the weighted average method and the neural network method.


This confirmed that better discharge estimates can be obtained by combining the outputs from different models. See and Openshaw (2000) used four different approaches (i.e., an average, a Bayesian approach, and two fuzzy logic models) to combine the river level forecasts of three models (i.e., a hybrid neural network, an autoregressive moving average model, and a simple fuzzy rule-based model), and found that the addition of fuzzy logic to the crisp Bayesian approach yielded overall results that were superior to the other individual and integrated approaches. Xiong et al. (2001) showed that the first-order Takagi-Sugeno fuzzy system works almost as well as the weighted average method and the neural network method in combining five rainfall-runoff models. Coulibaly et al. (2005) showed that using a weighted average method to combine three dynamically different models can significantly improve the accuracy of daily reservoir inflow forecasts for up to four days ahead. While many studies confirm the effectiveness of the weighted average method (WAM) for hydrological applications (e.g., McLeod et al., 1987; Shamseldin et al., 1997; Xiong et al., 2001), none of the previous studies has examined how the way the weights are estimated affects the effectiveness of the WAM. In this paper, we compare the daily stream flow forecasts of four dynamically different models, and investigate the effectiveness of forecast combination methods with different ways of weighting. After introducing the daily stream flow data used in the study, we give a brief description of the four data-driven forecasting models applied to the daily stream flow process, then compare the performance of these models, and present the experimental results obtained with several forecast combination methods. Finally, discussion and conclusions are given.

CASE STUDY AREA AND DATA USED

The case study area is the headwaters of the Yellow River in China, located in the north-eastern Tibet Plateau. In this area, the discharge gauging station Tangnaihai (TNH) has a 133,650 km2 drainage basin, including a permanently snow-covered area of 192 km2. The length of the main channel of this watershed is over 1,500 km. Most of the area is between 3,000 and 6,000 metres above sea level. The watershed is located in a monsoon area, and most of the rainfall occurs in summer. Because the watershed is partly permanently snow-covered, sparsely populated and free of any large-scale hydraulic works, it is fairly pristine. The average annual runoff volume (1956-2000) at the TNH gauging station is 20.4 billion cubic metres, about 35% of the runoff of the whole Yellow River Basin; hence it produces most of the runoff in the Yellow River basin. Daily average stream flow at TNH has been recorded since 1 January 1956. In this study, data from 1 January 1956 to 31 December 2000 were used. The variations in the daily mean discharge and the daily standard deviation of the stream flow at TNH are shown in Figure 1.

Figure 1  Variation in the daily mean and standard deviation (SD) of the stream flow (discharge in m3/s) at Tangnaihai

SELECTED MODELS FOR DAILY STREAMFLOW FORECASTING


To combine the strengths of the ARMA and ANN models in daily stream flow forecasting, four models, i.e., an ARMA model (see Wang et al., 2005a), a periodic autoregressive (PAR) model (Wang et al., 2004a), a normal multi-layer perceptron (MLP) ANN model and a periodic ANN (PANN) model (Wang et al., 2005b), are applied in this study to the daily stream flow process of the Yellow River at TNH. Descriptions of these models are given in this section.

ARMA model
Amongst the various data-driven univariate stream flow forecasting methods, the autoregressive moving average (ARMA) model is one of the most popular (see Hipel and McLeod, 1994). The procedure of fitting a deseasonalized ARMA model to the daily streamflow of TNH includes two steps. Firstly, a log-transform is performed on the flow series, and the series is "deseasonalized" by subtracting the daily mean values and dividing by the daily standard deviations of the log-transformed series. To alleviate the stochastic fluctuations in the daily means and standard deviations, we smooth them with eight Fourier harmonics before using them for "deseasonalization". Then, according to the ACF (autocorrelation function) and PACF (partial autocorrelation function) structures of the series, as well as the model selection criterion AIC (Akaike, 1973), a linear ARMA(20,1) model is fitted to the log-transformed and deseasonalized daily flow series.
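The sketch below illustrates this two-step procedure in Python. It is not the authors' code: the function and variable names (fourier_smooth, deseasonalize, flow, doy) are illustrative assumptions, and statsmodels is used here only as one possible ARMA implementation.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def fourier_smooth(x, n_harmonics=8):
    """Keep only the first n_harmonics Fourier harmonics of a periodic signal."""
    coeffs = np.fft.rfft(x)
    coeffs[n_harmonics + 1:] = 0.0            # drop the higher-frequency terms
    return np.fft.irfft(coeffs, n=len(x))

def deseasonalize(flow, doy):
    """flow: daily discharges; doy: day-of-year index (0..364) of each value."""
    logq = np.log(flow)
    day_mean = np.array([logq[doy == d].mean() for d in range(365)])
    day_std = np.array([logq[doy == d].std() for d in range(365)])
    day_mean = fourier_smooth(day_mean)       # smoothed daily means
    day_std = fourier_smooth(day_std)         # smoothed daily standard deviations
    return (logq - day_mean[doy]) / day_std[doy], day_mean, day_std

# z, mu, sigma = deseasonalize(flow, doy)
# arma = ARIMA(z, order=(20, 0, 1)).fit()     # ARMA(20,1) on the deseasonalized series
# z_hat = arma.forecast(steps=10)             # 1- to 10-day ahead forecasts
# q_hat = np.exp(z_hat * sigma[next_doy] + mu[next_doy])   # back-transform to discharge
```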

PAR model
The periodic autoregressive (PAR) model is essentially a group of AR models, each of which is fitted to the streamflows that occur in a separate "season" (note that a "season" here does not mean a calendar season; it is a group of neighbouring days over the year). When we build a PAR model for monthly flows, one AR model may be built for each month of the year. However, it is infeasible to fit an AR model for each day of the year when building a PAR model for daily flows. Therefore, an approach was proposed by Wang et al. (2004a) whereby a periodic AR model is fitted to daily stream flow based on a partitioning of the days over the year obtained with clustering techniques. This approach is followed in this study. However, in order to simplify the partitioning procedure, instead of partitioning the daily average stream flow series based on many variables as in Wang et al. (2004a), only the raw average daily discharge data and the autocorrelations at different lag times (1 to 10 days) are used in the clustering analysis in this study. The clustering technique is the fuzzy c-means (FCM) clustering method (Bezdek, 1981); details of the clustering procedure are given in Wang et al. (2005b), and a sketch of the algorithm is given below. The clustering result is shown in Figure 2. Comparing Figure 1 and Figure 2, we see that if we simply follow the clustering result to partition the days of the year into five groups, the dynamics of the stream flow are not well captured, because clusters 2 and 3 in Figure 2 mix the rising limb and the falling limb of the stream flow shown in Figure 1. Hence, according to the FCM clustering result and considering the dynamics of the stream flow process, we partition the 365 days of the year into seven hard segments, listed in Table 1. Based on the partitioning results, one AR model is fitted to each partition. The orders of the AR models are determined according to the AIC criterion (Akaike, 1973), and the parameters of these AR models are estimated with the least squares method. Together, these AR models for different seasons compose a PAR model. When forecasting, a specific AR model is applied depending on which seasonal partition the date to be forecast falls in.
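The following is a hedged, textbook-style sketch of the FCM step, not the code used in the paper; the feature construction (mean discharge plus lag-1 to lag-10 autocorrelations for each day of the year) and all names are assumptions made for illustration.

```python
import numpy as np

def fcm(X, c=5, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Fuzzy c-means. X: (n_samples, n_features). Returns (centres, memberships U)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                 # memberships sum to 1 per sample
    for _ in range(n_iter):
        Um = U ** m
        centres = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        U_new = d ** (-2.0 / (m - 1.0))               # standard FCM membership update
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            return centres, U_new
        U = U_new
    return centres, U

# features: one row per day of the year, [mean discharge, acf(1), ..., acf(10)]
# centres, U = fcm(features, c=5)
# hard = U.argmax(axis=1)   # starting point for defining the seven hard segments
```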


Figure 2  FCM clustering results (cluster membership grade versus day of the year) for the daily streamflow at TNH with 5 clusters

Table 1  Partitioning of the days over the year for the daily streamflow at Tangnaihai

Partition   Day span
1           1-77, 349-365/366
2           78-114
3           115-167
4           168-237
5           238-302
6           303-322
7           323-348
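As a concrete illustration of how the PAR model uses these partitions, the sketch below (an assumption-laden simplification, not the authors' implementation) fits one AR model per partition to the deseasonalized series and selects the model by the partition of the forecast date; it concatenates each partition's values across years and omits the AIC-based order selection for brevity.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

# Day spans from Table 1 (day-of-year, 1-366)
SPANS = {1: [(1, 77), (349, 366)], 2: [(78, 114)], 3: [(115, 167)], 4: [(168, 237)],
         5: [(238, 302)], 6: [(303, 322)], 7: [(323, 348)]}

def partition_of_day(day):
    for p, spans in SPANS.items():
        if any(lo <= day <= hi for lo, hi in spans):
            return p
    raise ValueError(f"day {day} not covered by any partition")

def fit_par(z, doy, lags=10):
    """z: deseasonalized flows (array); doy: day-of-year (1-366) of each value."""
    part = np.array([partition_of_day(d) for d in doy])
    return {p: AutoReg(z[part == p], lags=lags).fit() for p in SPANS}

# models = fit_par(z, doy)
# model = models[partition_of_day(forecast_day)]   # dispatch by seasonal partition
```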

MLP-ANN model
A major disadvantage of the ARMA model is its assumption of linearity for the time series of interest. Consequently, no nonlinear patterns can be captured by ARMA-type models, whereas a linear approximation to a complex real-world problem is not always satisfactory. By contrast, artificial neural networks (ANNs) have shown their promise in stream flow time series forecasting applications with their nonlinear modelling capability, and they have gained more and more popularity over the past decade (see Maier and Dandy, 2000). The feed-forward multi-layer perceptron (MLP) ANN is the most widely used in hydrological modelling (see Maier and Dandy, 2000), and it is also adopted in this study. Following the results of Wang et al. (2005b), the data are pre-processed with the "deseasonalization" procedure described above. The number of inputs of the MLP-ANN is determined with the false nearest neighbour method on the basis of reconstructing the phase space of the daily stream flows (Wang et al., 2005b). The embedding dimension of the reconstructed space is 5, so the number of inputs is chosen to be 5. With a trial-and-error procedure, the chosen configuration for the MLP-ANN is 5-3-1, namely, 5 inputs, one hidden layer with 3 hidden neurons and one output. The estimation of the weights of an ANN is an optimization problem with many local minima. In the face of many local minima, optimization methods yield "optimal" weights that differ greatly from one run to the next when the initial weights change. Consequently, the problem of parameter uncertainty arises in building ANNs. To deal with this problem, a simple but robust ensemble method is used to improve the robustness of the ANN model. That is, for each ANN model, we train it 10 times to obtain 10 networks with the same structure, among which we choose the five networks that have the best training performance. We then take the average of the outputs of these five networks as the final output of the ANN model.
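A minimal sketch of this ensemble strategy is given below. It is not the authors' code: scikit-learn's MLPRegressor is used merely as one possible stand-in for the 5-3-1 network, and the data preparation (five lagged deseasonalized flows as inputs) is assumed from the description above.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def train_ensemble(X, y, n_runs=10, n_keep=5):
    """Train n_runs MLPs (5-3-1) from different random initialisations, keep the best n_keep."""
    nets = [MLPRegressor(hidden_layer_sizes=(3,), max_iter=2000, random_state=seed).fit(X, y)
            for seed in range(n_runs)]
    nets.sort(key=lambda net: np.mean((net.predict(X) - y) ** 2))   # rank by training error
    return nets[:n_keep]

def ensemble_predict(nets, X):
    return np.mean([net.predict(X) for net in nets], axis=0)        # average the outputs

# X: rows of 5 lagged deseasonalized flows, y: the next value of the series
# nets = train_ensemble(X_train, y_train)
# y_hat = ensemble_predict(nets, X_new)
```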

PANN model
Similar to the PAR model, the periodic ANN (PANN) model is essentially a group of MLP-ANN models, each of which is fitted to the stream flows that occur in a separate "season". The seasons, i.e., the partitions of the days over the year, used for building the PANN model are defined the same as those for building the PAR model.


When forecasting, one specific MLP model is used according to which seasonal partition the date to be forecast falls in. A "soft partitioning" technique is used when the PANN model is applied for forecasting, with which each day may lie simultaneously in multiple seasonal partitions. The essence of soft partitioning is the determination of the membership grade (or membership function) of each partition. Following the pattern of the FCM clustering result, the membership grades of the days over the year are formed intuitively, as shown in Figure 3. When soft partitioning is applied, one day may belong to several seasonal partitions; correspondingly, the final output is a weighted average of the outputs of the ANN models fitted for these seasonal partitions, where the weight of each partition is equal to the membership grade obtained from the FCM clustering result.
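The sketch below shows this membership-weighted combination; it is an illustration under assumptions (the partition_models and membership structures are hypothetical names), not the authors' implementation.

```python
import numpy as np

def pann_forecast(x, day, partition_models, membership):
    """x: vector of lagged flows; membership[day]: dict {partition: membership grade}."""
    grades = membership[day]                          # e.g. {3: 0.7, 4: 0.3}
    total = sum(grades.values())
    # weighted average of the partition-specific ANN outputs
    return sum(g * partition_models[p].predict(x[None, :])[0]
               for p, g in grades.items()) / total
```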

Figure 3  Soft partitioning of the days over the year (membership grade of each of the seven seasonal partitions) for the daily streamflow at Tangnaihai, based on the FCM clustering with 5 clusters

COMPARING THE SELECTED MODELS

Measures of model performance
Despite its crudeness and identified weaknesses, the coefficient of efficiency (CE) introduced by Nash and Sutcliffe (1970) is still one of the most widely used criteria for the assessment of model performance. The CE, which provides a measure of the ability of the model to predict values that are different from the mean, has the form:

CE = 1 - \frac{\sum_{i=1}^{n} (Q_i - \hat{Q}_i)^2}{\sum_{i=1}^{n} (Q_i - \bar{Q})^2} ,    (1)

where n is the size of the data, Q_i is the observed value, \hat{Q}_i is the predicted value, and \bar{Q} is the mean value of the observed data. A CE of 0.9 and above is generally considered very satisfactory, 0.8 to 0.9 represents a fairly good model, and below 0.8 is considered unsatisfactory. Because the CE compares the predicted values with the overall mean value, it may overstate model performance when evaluating predictions for series whose mean values change significantly with the seasons, which is almost always the case for hydrological processes. Therefore, a seasonally-adjusted coefficient of efficiency (SACE) (Wang et al., 2004b) is used here as a global measure for evaluating model performance over the entire year. SACE is calculated by:

SACE = 1 - \frac{\sum_{i=1}^{n} (Q_i - \hat{Q}_i)^2}{\sum_{i=1}^{n} (Q_i - \bar{Q}_m)^2} ,    (2)

where m = i mod S (mod is the operator calculating the remainder), ranging from 0 to S-1; S is the total number of "seasons" (note that a "season" here is not a real season; it may be a month or a day, and for a daily streamflow series S equals 365 and each "season" is one day of the year); and \bar{Q}_m is the mean value of season m. Similar to CE, a SACE of 0.9 and above can be considered very satisfactory, 0.8 to 0.9 represents a fairly good model, and below 0.8 is considered unsatisfactory.


Besides SACE, the root mean squared error (RMSE) is another measure used in this study. RMSE is very sensitive to even small errors, which makes it useful for detecting small differences in model performance. In this study RMSE is used as a local measure to compare the performance of the different models for each season, as well as a global measure to compare the overall performance of the competitive models over the whole year. RMSE is calculated by:

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Q_i - \hat{Q}_i)^2} .    (3)
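For reference, equations (1)-(3) translate directly into the following sketch; the array names (q_obs, q_pred, season) are illustrative, and season is assumed to be coded 0..S-1.

```python
import numpy as np

def ce(q_obs, q_pred):
    """Coefficient of efficiency, equation (1)."""
    return 1 - np.sum((q_obs - q_pred) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

def sace(q_obs, q_pred, season):
    """Seasonally-adjusted coefficient of efficiency, equation (2)."""
    q_m = np.array([q_obs[season == s].mean() for s in range(season.max() + 1)])
    return 1 - np.sum((q_obs - q_pred) ** 2) / np.sum((q_obs - q_m[season]) ** 2)

def rmse(q_obs, q_pred):
    """Root mean squared error, equation (3)."""
    return np.sqrt(np.mean((q_obs - q_pred) ** 2))
```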

Comparing the performance of ARMA, PAR, MLP, PANN
The MLP-ANN and PANN models described in the previous section are fitted using the daily discharge data at TNH from 1956 to 1995, and one- to ten-day ahead forecasts are made for the years 1996 to 2000. The ARMA and PAR models are fitted on a rolling-forward basis. Namely, for forecasting stream flow in 1996, we use the data from 1956 to 1995 to fit the models; for forecasting stream flow in 1997, we use the data from 1956 to 1996; and so on. The forecast evaluation results of the four competitive models are listed in Table 2. The performance of the models is evaluated in terms of RMSE on the basis of the seven seasonal partitions shown in Table 1. Note that the values in bold indicate that the corresponding model performs best among the four models for the specific season and lead time.

Table 2  RMSEs of one- to ten-day ahead forecasts of four competitive models

Model  Partition  Day 1  Day 2   Day 3   Day 4   Day 5   Day 6   Day 7   Day 8   Day 9   Day 10
ARMA   1          6.39   9.70    11.36   12.31   13.36   14.50   15.63   16.43   17.25   18.11
       2          17.88  31.58   39.00   44.19   48.80   51.73   59.16   67.33   73.83   80.85
       3          62.30  113.08  151.60  178.51  202.01  227.77  254.93  281.98  307.17  330.51
       4          81.65  140.27  180.33  210.99  235.94  257.13  276.31  293.15  306.18  317.60
       5          33.93  59.45   77.73   94.54   110.95  126.01  140.59  153.75  165.43  174.92
       6          7.59   10.57   12.39   13.51   15.67   17.73   19.22   20.84   22.35   23.52
       7          11.05  16.74   20.68   23.70   26.36   27.53   27.57   26.55   24.81   23.15
PAR    1          6.57   12.68   11.17   12.23   13.38   14.74   16.02   17.12   18.30   19.37
       2          17.40  31.64   42.48   48.44   52.10   54.25   61.96   70.96   77.88   85.26
       3          62.45  113.53  146.53  175.52  200.90  230.17  262.02  295.32  327.11  359.93
       4          80.19  140.82  183.21  219.09  250.77  278.64  304.37  326.04  342.09  356.40
       5          33.43  60.15   76.93   94.21   112.29  127.85  142.94  156.56  169.43  181.03
       6          8.00   11.18   14.41   16.16   18.66   21.03   22.21   23.49   24.62   25.18
       7          10.85  16.01   19.16   21.76   24.24   25.57   25.91   25.51   24.40   23.09
MLP    1          6.35   9.63    11.32   12.38   13.45   14.62   15.87   16.86   17.84   18.75
       2          18.30  32.81   40.80   46.00   50.37   53.24   60.99   69.46   76.10   83.21
       3          60.03  107.39  143.81  169.04  191.40  216.80  243.43  270.66  297.81  325.35
       4          82.86  144.24  187.40  221.14  250.14  274.92  296.36  314.01  327.53  339.76
       5          33.51  58.23   75.21   91.07   106.83  120.53  133.56  145.30  156.05  165.19
       6          7.42   10.26   11.84   12.83   14.68   16.55   17.91   19.42   20.88   21.87
       7          10.93  16.41   19.95   22.64   25.11   26.23   26.41   25.70   24.28   22.94
PANN   1          6.41   9.50    11.04   12.11   13.24   14.43   15.59   16.64   17.73   18.74
       2          17.31  31.58   39.72   45.54   50.41   52.71   59.89   68.14   74.47   81.24
       3          58.33  104.68  141.07  165.98  187.70  211.97  237.22  262.65  287.45  311.89
       4          80.53  142.16  187.66  224.38  254.56  279.89  302.66  322.00  336.96  350.20
       5          33.95  59.36   74.28   88.62   104.02  116.20  128.65  140.37  151.13  160.51
       6          7.71   10.75   12.51   13.94   15.92   18.10   19.82   21.74   23.59   24.91
       7          10.91  16.22   19.59   22.05   24.16   25.15   25.35   24.68   23.38   22.15


Similarly, we compared the model performance for one- to ten-day ahead daily flow forecasts for the period 1991 to 1995. To save space, the results are not shown here. Generally, the model performance evaluation results show that:
1. No model outperforms the other models throughout all the seasonal partitions, and no model is inferior to the others over all the seasonal partitions. This is in agreement with other studies concerning the performance of ARMA models and ANN models. For example, Jain et al. (1999) compared an ARMA model and an ANN model fitted to a monthly inflow series, and found that the ANN modelled the high flows better, whereas low flows were better predicted by the ARMA model.
2. The ARMA model outperforms the other models slightly for partitions 2 and 4, corresponding to the start of the snow-melt season and the heavy-rainfall season.
3. The PAR model outperforms or performs similarly to the others for partitions 1 and 7, corresponding to the low-flow recession season.
4. The normal MLP-ANN and the PANN model slightly outperform the two ARMA-type models in partitions 3, 5 and 6, corresponding to the rising and falling seasons of the stream flow.

COMBINING THE FORECASTS OF ARMA, PAR, MLP, PANN

Methods of combining forecasts
Forecast combination methods may be roughly divided into two categories. The first is the ensemble approach, by which a set of models is trained on the same task and the outputs of the models are then combined. The second is the modular approach, under which a task or problem is divided into a number of subtasks (regimes), and the complete task solution requires the contribution of all of the individual regimes. Both of these approaches are applied to combining different models in this study.

Ensemble approach
Essentially, the ensemble combination is a weighted average of the outputs of the ensemble members. While the ensemble prediction technique is normally used to provide probabilistic predictions, as in ensemble streamflow prediction (ESP) (Day, 1985; Smith et al., 1992), it may also be used as a forecast combination technique. The difference is that in ESP the ensemble members are forecasts from a single model with different inputs, whereas in forecast combination different models (with different parameters, different structures, or even different types) with the same (or basically the same) inputs are used. There are two main issues in ensemble combination: first, how to select a set of models and generate an ensemble of forecasts to be combined; and second, how to estimate the combining weights so as to minimize the out-of-sample forecast errors. The selected ensemble models should provide information on the specific process from different perspectives. In this study, we choose ARMA-type and ANN-type models, aiming to combine the linear and nonlinear approximation abilities of ARMA and ANN respectively. Furthermore, to capture the seasonality of stream flow processes, periodic models (PAR and PANN) are applied. As for the estimation of the combining weights, some studies show that the equally weighted combination, namely, the simple average method (SAM), can produce forecasts that are better than those of the individual models (Makridakis et al., 1982), and that its accuracy depends mainly on the number of models involved and on the actual forecasting ability of the specific models included in the simple average (Makridakis and Winkler, 1983). Owing to its robustness, the SAM has consistently been the choice of many researchers (see Clemen, 1989). However, when some of the individual models selected for combination appear to be consistently more accurate than others, the SAM can be quite inefficient (Armstrong, 1989), and the use of the weighted average method (WAM) should be considered.


One of the most common procedures used to estimate the combining weights is to perform an ordinary least squares (OLS) regression (see e.g., Crane and Crotty, 1967; Winkler and Makridakis, 1983; Granger and Ramanathan, 1984):

y_{t+1} = a_0 + \sum_{j=1}^{k} a_j f_{t,j} + \varepsilon_{t+1} ,    (4)

where f_{t,j} is the one-step ahead forecast of y_{t+1} made at time t with model j; a_0 is a constant term; and a_j is the regression coefficient of model j. Another common method to estimate the combining weights is the optimal method (Bates and Granger, 1969), in which the linear weights are calculated to minimize the error variance of the combination (assuming unbiasedness of the individual forecasts). Granger and Ramanathan (1984) showed that the optimal method is equivalent to a least squares regression (referred to as Equality Restricted Least Squares, ERLS) in which the constant is suppressed and the weights are constrained to sum to one. One more option for estimating the combining weights is the Nonnegativity Restricted Least Squares (NRLS) regression (see Gunter, 1992), in which the weights are constrained to be nonnegative. Aksu and Gunter (1992) examined the relative accuracy of OLS, ERLS, NRLS and SAM combined forecasts using 40 economic series, and the empirical results revealed that NRLS and SAM combinations almost always outperform OLS and ERLS combinations, while NRLS combinations are at least as robust and accurate as SAM combinations.

Modular approach
The modular approach is based on the principle of "divide-and-conquer" (DAC), which deals with a complex problem by breaking it into simple problems whose solutions can be combined to yield a solution to the complex problem (Jordan and Jacobs, 1994). In a narrow sense, the ensemble approach and the modular approach are distinct, in that the modular approach assigns each data point to only one model, whereas with ensemble combination each data point is likely to be treated by all the component models in the ensemble. However, the two approaches may overlap in a broad sense: on the one hand, a component model in the ensemble approach may itself be a modular model (e.g., a PAR model may be viewed as a modular AR model); on the other hand, each component in a modular combination can take the form of an ensemble of models. In fact, the modular approach may be viewed as a modelling strategy as well as a forecast combination approach. When each component in a modular combination consists of a single model rather than an ensemble of several models, the modular approach simply reduces to a hybrid modelling approach. The modular combination may be expressed in the fashion of a switching-regime model (see e.g., Goldfeld and Quandt, chapter 9, 1972), where the model parameters change over time:

y_t = \sum_{j=1}^{k} g(t \in I_j) f_{t,j} ,    (5)

where I_j is a regime; g(t ∈ I_j) = 1 if t ∈ I_j and g(t ∈ I_j) = 0 if t ∉ I_j; and f_{t,j} denotes the forecast for regime I_j at time t. The forecast f_{t,j} may come either from a single model fitted to regime I_j or, more generally, from an ensemble combination of the forecasts of several models. Consequently, there are three major issues in the modular combination method: first, the division of the problem under concern; second, the selection of models for each regime; and third, the method of combination if more than one model is chosen for each regime. A sensible division relies on a clear understanding of the problem. Stream flow generation processes, especially daily stream flow processes, usually have pronounced seasonal means, variances and dependence structures, and the underlying mechanisms of stream flow generation are likely to be quite different during low, medium and high flow periods. Hence, several approaches may be taken to divide a stream flow process: using threshold values to divide the stream flow into regimes; clustering the stream flow process into several domains (e.g., low flow, medium flow and flood); or partitioning the stream flow process according to seasonal differences. Hu et al. (2001) developed a threshold-based ANN model to make streamflow forecasts for the Yangtze River.


In the studies of Zhang and Govindaraju (2000), See and Openshaw (2000) and Xiong et al. (2001), the model combinations are fundamentally based on dividing the hydrologic process into several domains according to the conditions of the hydrologic process. However, a comparison of several hybrid ANN models (Wang et al., 2005b) showed that, among three hybrid ANN models (i.e., a threshold-based hybrid ANN, a cluster-based ANN and a season-based periodic ANN), the season-based periodic ANN performs best, indicating a better generality of dividing the daily stream flow according to seasonal differences. After we divide a time series, we may either choose one optimal model or choose a set of models for each partition of the series. When we choose a set of models, the same methods for estimating the combining weights in the ensemble combination apply here too.
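As an illustration of the weight estimation used in the weighted-average combinations below, the following sketch estimates nonnegative combining weights (with the constant suppressed) from a window of past forecasts and applies them to new forecasts; it is a hedged example using scipy's nnls solver as a stand-in, not the authors' code, and the array names are assumptions.

```python
import numpy as np
from scipy.optimize import nnls

def estimate_weights(past_forecasts, past_obs):
    """past_forecasts: (n_days, k) forecasts of k models; past_obs: (n_days,) observations."""
    w, _ = nnls(past_forecasts, past_obs)   # nonnegativity-restricted least squares, no intercept
    return w

def combine(new_forecasts, w):
    return new_forecasts @ w                # weighted average of the k model forecasts

def simple_average(new_forecasts):
    return new_forecasts.mean(axis=1)       # SAM: equal weights for all models

# RWA: re-estimate w every day from the previous 365 days of forecasts;
# SFWA/MSFWA: estimate w from the previous two years (per seasonal partition for MSFWA)
# and keep it fixed for the whole forecast year.
```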

Results of combined forecasts
In this study, both the ensemble approach and the modular approach are applied. It should be noted that we cannot use the forecasts of the validation period to estimate the weights in either approach; therefore, one important practical issue for estimating the combining weights is which data are used for the estimation. Some studies use fixed weights estimated with the calibration data (e.g., Shamseldin et al., 1997; Xiong et al., 2001), whereas others consider combination with changing weights estimated from a number of previous forecasts (e.g., McLeod et al., 1987). Taking into account both the choice of technique and the data used for estimating the weights, the following four combination methods are compared:
1. Simple average (SA). A simple average of the forecasts of the four competitive models, implying equal weights. In fact, the SAM has already been used in the construction of the MLP-ANN and PANN models in this study, where the ensemble members are composed purely of neural networks of the same structure.
2. Rollingly-updated weighted average (RWA). Weights are updated on the basis of a rolling-forward window. That is, we estimate the weights for each day according to the forecasts of the previous L days, where L is the length of the rolling window. By trying different values of L, we find that the greater the length of the rolling window employed to calculate the weights, the smaller the resulting combined RMSE, but no significant improvement is observed once L is larger than 365. So we choose L to be 365. Weights are estimated with the nonnegative least squares regression method (see Lawson and Hanson, 1974), in which the constant is suppressed.
3. Semi-fixed weighted average (SFWA). Weights are estimated on the forecasts for the previous two years, and these weights are kept unchanged when making forecasts for the current year. When making forecasts for the next year, the weights are updated again.
4. Modular semi-fixed weighted average (MSFWA). Weights are estimated with a modular approach, where the modules are defined on the basis of the seasonal partitions. The weights for each seasonal partition are estimated with the nonnegative regression method based on the previous two years' forecasts. The weights are updated every year when we make forecasts for a new year.
With the above-mentioned methods, we combine the one- to ten-day forecasts of the four models for the years 1996 to 2000. We compare the performance of these combination methods for the seven seasonal partitions, and list the results in Table 3. It should be noted that the values in bold indicate that the corresponding model/method performs best among all models/methods for the specific lead time. At the same time, we compare the overall performance of the four competitive models and the four combination methods over the entire validation period (1996-2000). The results are listed in Table 4. An examination of the two tables reveals the following:


1. The overall performance of PANN is the best for one-day ahead forecasts, as it has the minimum RMSE value, whereas the ARMA model outperforms all the other models/methods for long lead-time forecasts.
2. SAM generally performs best among the four competitive combination methods. It also outperforms all four individual models for forecasts up to 4 to 5 days ahead (except for the PANN model for one-day ahead forecasts), and outperforms three of the models (all except the ARMA model) for forecasts up to 10 days ahead. This result confirms the robustness of the SAM for improving forecast accuracy.
3. Among the other three combination methods (i.e., RWA, SFWA and MSFWA), SFWA performs slightly better than the other two, indicating that for a stationary process, like the daily stream flow process of the Yellow River at TNH, changing the combination weights with a rolling window as in RWA may not be necessary. In addition, RWA has the disadvantage of a much higher computational cost than the other methods. There is no significant improvement with the modular combination approach for this particular case of the Yellow River.
The reason that SAM performs best among the four competitive methods probably results from the fact that no individual model selected for combination appears to be consistently more accurate than the others, and no individual model appears to be consistently poorer than the others. Taking a close look at the semi-fixed weights in either SFWA or MSFWA, we find that the weights may change significantly from one year to the next, which means that the performance of the models is not consistently good or poor throughout the validation period. Evidence from economics also supports the use of equal weights. In an analysis of the forecasts of five econometric models, Pencavel (1971) found no tendency for the models that produced the most accurate forecasts in one year to do so in the next. Similarly, Batchelor (1990) concluded that "all forecasters are equal" in economics. In a comprehensive review, Clemen (1989) found equal weighting to be accurate for many types of forecasting. Armstrong (2001) suggested using equal weights unless one has strong evidence to support unequal weighting of forecasts.


Table 3  RMSEs of one- to ten-day ahead forecasts with four combination methods

Method  Partition  Day 1  Day 2   Day 3   Day 4   Day 5   Day 6   Day 7   Day 8   Day 9   Day 10
SAM     1          6.30   9.77    11.09   12.11   13.20   14.41   15.60   16.58   17.58   18.53
        2          17.60  31.76   40.29   45.77   50.10   52.56   60.03   68.47   74.99   82.00
        3          60.05  108.53  144.09  170.15  192.90  218.44  245.48  273.04  299.64  326.08
        4          80.42  140.24  182.51  216.20  244.37  268.32  289.70  307.63  321.15  333.08
        5          32.58  56.97   73.40   88.89   104.79  118.37  131.66  143.78  154.97  164.60
        6          7.59   10.47   12.28   13.36   15.24   17.15   18.37   19.83   21.24   22.16
        7          10.83  16.14   19.67   22.35   24.77   25.92   26.10   25.37   23.93   22.50
RWA     1          6.48   9.81    11.04   11.99   12.98   14.19   15.37   16.25   17.28   18.50
        2          17.24  31.48   40.69   46.51   50.90   53.19   60.40   68.87   75.40   82.32
        3          59.60  107.55  145.13  171.72  194.79  220.31  246.93  273.77  300.05  325.73
        4          80.85  142.58  187.11  222.75  252.74  278.13  301.01  320.82  336.17  349.80
        5          33.33  58.50   71.74   85.52   100.67  113.05  125.76  137.71  148.89  158.90
        6          8.27   11.11   12.60   13.46   14.84   16.49   17.61   19.22   20.93   22.12
        7          10.88  15.69   19.00   21.82   24.17   25.51   25.72   25.02   23.66   22.34
SFWA    1          6.49   11.41   10.89   11.82   12.81   14.00   15.18   16.10   17.18   18.42
        2          17.24  31.29   41.01   46.73   50.75   52.99   60.29   68.88   75.55   82.64
        3          60.03  107.81  142.87  169.33  192.71  218.87  246.07  273.28  299.50  325.48
        4          80.59  140.57  183.05  217.96  247.05  271.63  293.53  312.15  326.28  338.76
        5          32.95  57.34   72.16   86.74   102.19  114.92  127.53  139.16  149.92  159.27
        6          8.84   11.96   13.90   15.48   17.70   20.16   21.93   23.84   25.60   26.67
        7          11.36  16.75   20.08   22.67   24.72   25.72   25.69   24.74   23.17   21.86
MSFWA   1          6.46   9.92    11.07   12.01   13.05   14.13   15.20   15.99   16.89   17.99
        2          17.39  31.40   41.69   47.57   51.32   53.39   60.70   69.54   76.49   83.52
        3          60.21  107.99  142.19  168.02  190.63  216.09  242.97  270.13  296.39  322.52
        4          80.91  140.66  183.79  219.08  249.03  274.58  297.61  317.19  332.07  345.29
        5          32.82  57.31   75.14   91.46   107.72  122.11  135.97  148.38  159.62  169.02
        6          7.59   10.70   12.43   13.53   15.52   17.62   19.23   21.05   22.81   24.02
        7          11.37  17.19   20.96   24.02   26.37   27.49   27.31   26.14   24.31   22.70

Table 4  Overall performances of the four models and the four combination methods

                       Lead time (days)
Measure  Model/Method  1      2      3       4       5       6       7       8       9       10
RMSE     ARMA          45.93  80.21  104.47  122.87  138.60  153.36  167.89  181.53  193.34  203.99
         PAR           45.38  80.68  104.40  125.05  143.60  161.28  178.93  195.47  209.81  223.80
         MLP           45.86  80.30  104.99  123.93  140.65  156.21  170.98  184.53  196.61  208.26
         PANN          44.75  79.16  104.36  124.10  141.06  156.32  171.12  184.77  196.68  207.86
         SA            44.87  78.95  103.11  122.16  138.70  153.99  168.90  182.72  194.77  206.18
         RWA           45.03  79.76  104.71  124.33  141.42  157.01  172.20  186.40  198.90  210.65
         SFWA          44.99  79.04  102.95  122.38  139.27  154.77  169.77  183.66  195.75  207.18
         MSFWA         45.11  79.05  103.48  123.16  140.30  156.20  171.67  185.94  198.31  209.96
SACE     ARMA          0.980  0.938  0.895   0.855   0.815   0.774   0.729   0.683   0.640   0.599
         PAR           0.980  0.937  0.895   0.849   0.801   0.750   0.692   0.632   0.576   0.518
         MLP           0.980  0.938  0.894   0.852   0.810   0.765   0.719   0.672   0.628   0.582
         PANN          0.981  0.940  0.895   0.852   0.808   0.765   0.718   0.671   0.628   0.584
         SA            0.981  0.940  0.898   0.856   0.815   0.772   0.725   0.679   0.635   0.591
         RWA           0.98   0.939  0.894   0.851   0.807   0.763   0.715   0.665   0.619   0.573
         SFWA          0.981  0.94   0.898   0.856   0.813   0.769   0.723   0.675   0.631   0.587
         MSFWA         0.98   0.94   0.897   0.854   0.81    0.765   0.716   0.667   0.621   0.576


CONCLUSIONS

Complex processes such as stream flow may be controlled jointly by both linear and nonlinear mechanisms, and in different system regimes the dominant mechanism may differ. Owing to the presence of strong seasonality, the mechanism underlying some stream flow processes may change significantly: in some seasons the linear mechanism dominates, while in other seasons the nonlinear mechanism dominates. This is the reason why combined forecasting works for improving the accuracy of stream flow forecasts. In this study, four dynamically different models are used for forecasting a daily stream flow process. The ARMA model captures the overall linear autocorrelation structure of the stream flow process; the PAR model captures the seasonal differences in the linear autocorrelation structure; the MLP-ANN captures the overall nonlinear autocorrelation structure; and the PANN model captures the seasonal differences in the nonlinear autocorrelation structure. The dynamical differences among the four models make it possible to improve the forecast accuracy by combining them. The comparison of the four models shows that no model outperforms the other models over all the seasons, and no model is inferior to the others over all the seasons. The overall performance of PANN is the best for one-day ahead forecasts, whereas the ARMA model outperforms all the other models/methods for long lead-time forecasts. To combine the strengths of the four dynamically different models, four forecast combination methods are adopted in this study: the simple average method (SAM), the rollingly-updated weighted average method, the semi-fixed weighted average method, and the modular semi-fixed weighted average method. The results show that SAM can improve the accuracy of forecasts up to 4 to 5 days ahead, and it generally performs best among the competitive combination methods, which confirms the robustness of the SAM. Owing to its simplicity and robustness, SAM is recommended for improving stream flow forecast accuracy when none of the individual models to be combined performs consistently better or worse than the others.

REFERENCES

Akaike, H. (1973), Information theory and an extension of the maximum likelihood principle. In: Proceedings of the 2nd International Symposium on Information Theory, B.N. Petrov and F. Csaki (eds.), Akademia Kiado, Budapest, 267-281.
Armstrong, J.S. (1989), Combining forecasts: the end of the beginning or the beginning of the end? Int. J. Forecasting, 5, 585-588.
Armstrong, J.S. (2001), Combining forecasts. In: Principles of Forecasting: A Handbook for Researchers and Practitioners, Armstrong, J.S. (ed.). Boston: Kluwer Academic, 417-439.
Aksu, C., Gunter, S.I. (1992), An empirical analysis of the accuracy of SA, OLS, ERLS and NRLS combination forecasts. International Journal of Forecasting, 8(1), 27-43.
Batchelor, R. (1990), All forecasters are equal. Journal of Business and Economic Statistics, 8, 143-144.
Bates, J.M., Granger, C.W.J. (1969), The combination of forecasts. Operational Research Quarterly, 20, 451-468.
Bezdek, J.C. (1981), Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum, New York.
Bunn, D.W. (1975), A Bayesian approach to the linear combination of forecasts. Operational Research Quarterly, 26, 325-329.
Clemen, R.T. (1989), Combining forecasts: A review and annotated bibliography. International Journal of Forecasting, 5, 559-583.
Coulibaly, P., Hache, M., Fortin, V., Bobee, B. (2005), Improving daily reservoir inflow forecasts with model combination. J. Hydrol. Eng., 10(2), 91-99.
Crane, D.B., Crotty, J.R. (1967), A two-stage forecasting model: Exponential smoothing and multiple regression. Management Science, 13, B501-B507.
Deutsch, M., Granger, C.W.J., Terasvirta, T. (1994), The combination of forecasts using changing weights. International Journal of Forecasting, 10, 47-57.


Dickinson, J.P. (1975), Some comments on the combination of forecasts. Oper. Res. Q., 26, 205-210.
Donaldson, G., Kamstra, M. (1996), Forecast combining with neural networks. Journal of Forecasting, 15, 49-61.
Fiordaliso, A. (1998), A nonlinear forecasts combination method based on Takagi-Sugeno fuzzy systems. International Journal of Forecasting, 14, 367-379.
Goldfeld, S.M., Quandt, R.E. (1972), Nonlinear Methods in Econometrics. Amsterdam: North-Holland, 258-277.
Granger, C.W.J., Ramanathan, R. (1984), Improved methods of combining forecasts. Journal of Forecasting, 3, 197-204.
Gunter, S.I. (1992), Nonnegativity restricted least squares combinations. International Journal of Forecasting, 8(1), 45-59.
He, C.-Z., Xu, X.-Z. (2005), Combination of forecasts using self-organizing algorithms. Journal of Forecasting, 24, 269-278.
Lawson, C.L., Hanson, R.J. (1974), Solving Least Squares Problems (Chapter 23). Prentice-Hall, Englewood Cliffs, N.J., pp. 161.
Jain, S.K., Das, A., Srivastava, D.K. (1999), Application of ANN for reservoir inflow prediction and operation. J. Water Resour. Plann. Manage., 125(5), 263-271.
Jordan, M.I., Jacobs, R.A. (1994), Hierarchical mixtures of experts and the EM algorithm. Neural Comput., 6, 181-214.
Maier, H.R., Dandy, G.C. (2000), Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environmental Modelling and Software, 15, 101-124.
Makridakis, S., Anderson, A., Carbone, R., Fildes, R., Hibon, M., Lewandowski, R., Newton, J., Parzen, E., Winkler, R. (1982), The accuracy of extrapolation (time series) methods: results of a forecasting competition. Journal of Forecasting, 1, 111-153.
Makridakis, S., Winkler, R.L. (1983), Averages of forecasts: some empirical results. Manage. Sci., 29(9), 987-996.
Nash, J.E., Sutcliffe, J.V. (1970), River flow forecasting through conceptual models, I, A discussion of principles. J. Hydrol., 10, 282-290.
Pencavel, J.H. (1971), A note on the predictive performance of wage inflation models of the British economy. Economic Journal, 81, 113-119.
Shamseldin, A.Y., O'Connor, K.M., Liang, G.C. (1997), Methods for combining the output of different rainfall-runoff models. J. Hydrol., 197, 203-229.
See, L., Openshaw, S. (2000), A hybrid multi-model approach to river level forecasting. Hydrol. Sci. J., 45(4), 523-536.
Wang, W., Van Gelder, P.H.A.J.M., Vrijling, J.K. (2004a), Periodic autoregressive models applied to daily streamflow. Proceedings of the 6th International Conference on Hydroinformatics. World Scientific, Singapore, 1334-1341.
Wang, W., Van Gelder, P.H.A.J.M., Vrijling, J.K., Ma, J. (2004b), Predictability of streamflow processes of the Yellow River. Proceedings of the 6th International Conference on Hydroinformatics. World Scientific, Singapore, 1261-1268.
Wang, W., Van Gelder, P.H.A.J.M., Vrijling, J.K., Ma, J. (2005a), Testing and modelling autoregressive conditional heteroskedasticity of streamflow processes. Nonlinear Processes in Geophysics, 12, 55-66.
Wang, W., Van Gelder, P.H.A.J.M., Vrijling, J.K., Ma, J. (2005b), Forecasting daily streamflow using hybrid ANN models. Journal of Hydrology, submitted for publication.
Winkler, R.L. (1981), Combining probability distributions from dependent information sources. Management Science, 27, 479-488.
Winkler, R.L., Makridakis, S. (1983), The combination of forecasts. Journal of the Royal Statistical Society, Series A, 146, 150-157.
Xiong, L., Shamseldin, A., O'Connor, K. (2001), A non-linear combination of the forecasts of rainfall-runoff models by the first-order Takagi-Sugeno fuzzy system. J. Hydrol., 245, 196-217.
Zhang, B., Govindaraju, R.S. (2000), Prediction of watershed runoff using Bayesian concepts and modular neural networks. Water Resour. Res., 36(3), 753-762.
