6th International Conference on Hydroinformatics - Liong, Phoon & Babovic (eds) © 2004 World Scientific Publishing Company, ISBN 981-238-787-0

THE INTERVAL ESTIMATION OF PARAMETERS FOR NEURAL NETWORK ON FLOOD FORECASTING

CHAO-CHUNG YANG
Assistant Professor, Hazard Mitigation Research Center, National Chiao Tung University, Hsinchu, Taiwan

CHANG-SHIAN CHEN
Associate Professor, Department of Hydraulic Engineering, Feng Chia University, Taichung, Taiwan

LIANG-CHENG CHANG
Professor, Department of Civil Engineering, National Chiao Tung University, Hsinchu, Taiwan

The nonlinearity and uncertainty of the flood process make estimating or predicting the required hydrologic data tremendously difficult. Consequently, this study employs a Back-Propagation Network (BPN) as the main structure in flood forecasting, to learn and reproduce the sophisticated nonlinear mapping relationship. However, sophisticated natural systems and highly changeable hydrological environments require that the construction of an artificial neural network (ANN) as a forecasting model include a risk analysis to reflect the hydrological situation and/or the physical meaning of the predicted results. In this paper, a Self-Organizing Map (SOM) network with classification ability is applied to the solutions and parameters of the BPN model in the learning stage, to classify the network parameter rules and obtain the winning parameters. Hydrologic data intervals can then be forecasted, with the outcomes from the previous stage used as the ranges of the parameters in the recall stage. Overall, this research develops a methodology that provides the decision-maker with more flexibility in forecasting floods.

INTRODUCTION

ANN has been successfully applied to forecast flood discharge. Yang [4] developed a flood forecasting procedure by integrating the LTF (linear transfer function), the ARIMA (autoregressive integrated moving average) model, and ANN. Bowden et al. [1] explained that the way in which available data are divided into training, testing, and validation subsets can markedly influence the performance of an artificial neural network. In spite of numerous studies, no systematic approach to optimally dividing data for ANN models has yet been developed. They presented two methods for dividing data into representative subsets, namely a genetic algorithm and a self-organizing map. Those papers focused on the forecasting accuracy and the handling of the input


data of ANN. Some researchers have examined establishing a confidence interval for point forecasting using ANN. Hwang and Ding [2] focused on the construction of prediction intervals. They constructed asymptotically valid prediction intervals and also indicated how to use the prediction intervals to choose the number of nodes in the network. Kuo and Liu [5] used a back-propagation neural network to forecast future variation in groundwater quality. Their results reveal that the hidden nodes are insignificant in training a BP network and forecasting. The confidence intervals at each forecasting value were also computed. This study does not emphasize the modification of point forecasting, as discussed above. Rather, this work seeks to build an appropriate procedure for estimating suitable intervals of the parameters (weights and biases) of an ANN, to increase the flexibility and convenience of practical forecasting of flood discharge.

METHODOLOGY

The back-propagation network, the linear transfer function, and the self-organizing map are applied herein to determine intervals of the parameters in an ANN for efficiently forecasting flood discharge. The above-mentioned algorithms are described below.

Back-propagation network

The artificial neural network (ANN) consists of many artificial neurons (commonly referred to as processing units or nodes). The output signal is determined by the algebraic sum of the weighted inputs, i.e.,

$$Y_k = f\Big(\sum_i W_{ij} X_i - \theta_j\Big) \qquad (1)$$

where $Y_k$ is the output signal at node k; $f$ is the transfer function; $W_{ij}$ are the weights between node i and node j; $X_i$ is the input signal at node i; and $\theta_j$ is the bias at node j. The back-propagation network (BPN), an extensively used neural network, contains three layers: the input layer (receives input signals from the external world), the hidden layer (represents the relationship between the input layer and the output layer), and the output layer (releases the output signals to the external world). The number of hidden layer nodes herein is determined as (the number of input layer nodes + the number of output layer nodes)/2. Performing supervised learning, the BPN gradually adjusts its weights, thereby minimizing the error between the known answers and the actual responses [4].
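As a concrete reading of Eq. (1) and the hidden-layer sizing rule, consider the following minimal sketch in Python; the sigmoid transfer function and the sample values are assumptions, since the paper does not name its f:

```python
import math
import numpy as np

def node_output(x, w, theta):
    """Eq. (1): Y_k = f(sum_i W_ij * X_i - theta_j); a sigmoid transfer
    function is assumed here, since the paper does not specify f."""
    net = np.dot(w, x) - theta            # algebraic sum of weighted inputs minus bias
    return 1.0 / (1.0 + np.exp(-net))     # assumed sigmoid transfer function f

# Hidden-layer sizing rule from the text: (input nodes + output nodes) / 2.
n_input, n_output = 4, 1                  # the node counts used later in the paper
n_hidden = math.ceil((n_input + n_output) / 2)   # (4 + 1) / 2 = 2.5 -> "nearly three"

x = np.array([0.2, 0.5, 0.1, 0.7])        # hypothetical input signals X_i
w = np.random.default_rng(0).random(n_input)     # weights W_ij into one node
print(node_output(x, w, theta=0.1))
```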

Linear transfer function

If a time series of interest, say $X_t$, is related to one or more other time series, a model called the linear transfer function (LTF) can be constructed. Such a model uses the information in these other time series to help forecast $X_t$. The LTF applies the least-squares method to estimate the impulse response weights and can be expressed as follows:

$$Y_t = C + \left( \upsilon_0 + \upsilon_1 B + \upsilon_2 B^2 + \cdots + \upsilon_l B^l \right) X_t + a_t \qquad (2)$$

where B denotes the backward shift operator; $Y_t$ represents the output time series; $X_t$ is the input time series; C denotes the constant; $a_t$ represents the white noise process; and the unknown weights $\upsilon_0, \upsilon_1, \upsilon_2, \ldots, \upsilon_l$ are called the impulse response weights [4].
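A minimal sketch of how the impulse response weights of Eq. (2) might be estimated by ordinary least squares, together with the |T| > 1.96 screening used later for input selection; the lag count, the synthetic series, and the normal-approximation standard errors are assumptions of this sketch:

```python
import numpy as np

def impulse_weights_with_tvalues(x, y, max_lag=4):
    """Least-squares fit of Eq. (2): Y_t = C + v0 X_t + v1 X_{t-1} + ... + vl X_{t-l} + a_t,
    returning the impulse response weights and their T-values."""
    n = len(y)
    # Design matrix: constant column plus lags 0..max_lag of the input series X.
    A = np.column_stack([np.ones(n - max_lag)] +
                        [x[max_lag - k : n - k] for k in range(max_lag + 1)])
    b = y[max_lag:]
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    resid = b - A @ coef
    sigma2 = resid @ resid / (len(b) - A.shape[1])           # residual variance
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))   # standard errors
    return coef[1:], (coef / se)[1:]                         # drop the constant C

# A time step is kept as a network input when |T| > 1.96, as in Table 1 below.
# Hypothetical series; real use would pass the gauging-station discharge records.
rng = np.random.default_rng(0)
x = rng.random(200)
y = 2.0 * np.roll(x, 2) + 0.1 * rng.standard_normal(200)    # response lagged two steps
v, t = impulse_weights_with_tvalues(x, y)
print([k for k, tv in enumerate(t) if abs(tv) > 1.96])      # expect lag 2 to stand out
```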

This paper estimates the impulse response weights of every gauging station in a watershed using the LTF, and then performs a parameter-significance T-test on the impulse response weights. These processes are implemented herein to determine the appropriate number of network input elements.

Self-organizing map

Teuvo Kohonen introduced the Self-Organizing Map (SOM) in 1982. The SOM (also known as the Kohonen feature map) algorithm is one of the best-known artificial neural network algorithms. In contrast to several other neural networks that use supervised learning, the SOM is based on unsupervised learning. The SOM usually provides a topology-preserving mapping from a high-dimensional space onto a two-dimensional map. The SOM groups similar input data: data that are near each other in the input space are mapped onto nearby map units. The SOM can thus serve as a clustering tool as well as a tool for visualizing high-dimensional data [3].

STATEMENT OF THE APPLICATION

Data arrangement

The Wu-Shi basin, as depicted in Fig. 1, is selected herein to test the proposed model, with Chien-Feng Bridge as the upstream gauging station and Da-Du Bridge as the downstream gauging station. (The gauging stations of the tributaries are His-Nan Bridge over the Da-Li River and Nan-Gang Bridge over the Mao-Luo River.) Six flood events were considered in this study. Three indicators are used herein to evaluate the accuracy of the proposed model.

Coefficient of efficiency, CE:

$$CE = 1 - \frac{\sum \left( Q_{obs} - Q_{est} \right)^2}{\sum \left( Q_{obs} - \overline{Q}_{obs} \right)^2} \qquad (3)$$

where $Q_{est}$ denotes the estimated flood discharge (cms); $Q_{obs}$ represents the observed flood discharge (cms); and $\overline{Q}_{obs}$ is the mean value of the observed flood discharge (cms). A value of CE closer to 1 implies a more accurate model.

Error of peak discharge, EQp:

$$EQ_p = \frac{Q_{pest} - Q_{pobs}}{Q_{pobs}} \qquad (4)$$

where $Q_{pobs}$ and $Q_{pest}$ are the observed and estimated peak discharges of the flood, respectively. A lower value of EQp implies a more accurate model.

Error of time to peak, ETp:

$$ET_p = T_{pest} - T_{pobs} \qquad (5)$$

where $T_{pest}$ and $T_{pobs}$ denote the estimated and observed times to peak discharge, respectively. A smaller value of ETp implies a more accurate prediction of the occurrence of peak discharge.
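For reference, the three indicators of Eqs. (3)~(5) can be computed directly from an observed and an estimated hydrograph; a minimal sketch with hypothetical discharge series:

```python
import numpy as np

def evaluate(q_obs, q_est):
    """Coefficient of efficiency (Eq. 3), error of peak discharge (Eq. 4),
    and error of time to peak (Eq. 5) for one flood event."""
    ce = 1.0 - np.sum((q_obs - q_est) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)
    eqp = (q_est.max() - q_obs.max()) / q_obs.max()       # relative peak error
    etp = int(np.argmax(q_est)) - int(np.argmax(q_obs))   # peak lag in time steps
    return ce, eqp, etp

# Hypothetical hydrographs in cms, one value per time step.
q_obs = np.array([100, 400, 1440, 900, 500, 300], dtype=float)
q_est = np.array([120, 380, 1300, 1420, 600, 280], dtype=float)
print(evaluate(q_obs, q_est))   # here ETp = +1: the estimated peak is one step late
```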

Figure 1. Wu-Shi basin

Construction of integrated model

First, the representative time steps of every gauging station are chosen by the following procedure:

1. The water in the Wu-Shi basin has an average velocity of 4.95 m/sec, so its corresponding time of concentration is 5.8 hours. Therefore, in this study, one to five time steps are selected as candidate numbers of time steps for every gauging station.

2. Next, the impulse response weights of one to five time steps of every gauging station are calculated by the LTF and subjected to the parameter-significance T-test of impulse response weights, as in Table 1. If the absolute value of the T value of any time step exceeds 1.96, that time step is statistically significant and should be considered as an input neuron in the model.

Table 1. Parameter-significance T-test of impulse response weights

Time step | Da-Du Bridge (T value) | His-Nan Bridge (T value) | Nan-Gang Bridge (T value) | Chien-Feng Bridge (T value)
t         | *18.98                 | *9.60                    | *3.72                     | -5.08
t-1       | -3.85                  | -7.95                    | -1.13                     | 3.33
t-2       | -0.34                  | 4.15                     | -0.45                     | *4.95
t-3       | -0.09                  | -1.35                    | 0.69                      | -5.25
t-4       | 0.59                   | -0.51                    | -0.34                     | 2.48

3. Given the above analysis (with one representative time step of each gauging station selected, to simplify the model and avoid too many input variables), the representative time steps are time t for the Nan-Gang Bridge, t-2 for the Chien-Feng Bridge, t for the Da-Du Bridge, and t for the His-Nan Bridge. (If the velocity of the flood water is lower than 4.95 m/sec, then the travel time of the flow between the His-Nan Bridge and the downstream gauging station is less than 3 hours.) According to the above results, the numbers of input layer nodes are one (St) for the His-Nan Bridge, one (Nt) for the Nan-Gang Bridge, one (Pt-2) for the Chien-Feng Bridge, and one (Ot) for the Da-Du Bridge. The sum of all input layer nodes is four, while the number of output layer nodes is one (Ot+1). The suitable number of hidden layer nodes is nearly three. The first four flood events are taken as the learning data sets, and the criterion for learning is CE > 0.9. The weights and biases of the 200 back-propagation networks that reach the proposed criterion become the input data of the SOM.

4. The learning process of the SOM is as follows.

1. Use the weights and biases of N terms from the BPN as the input nodes (N = 19 herein) and generate a two-dimensional map (grid) of M output nodes (say a 12-by-12 map of 144 nodes). Initialize the SOM weight $W_{ij}$ from input node i to output node j to a random value.

2. Compute the Euclidean distance $d_j$ between the input vector and each output node j:

$$d_j = \sum_{i=1}^{N} \left( X_i(t) - W_{ij}(t) \right)^2 \qquad (6)$$

3. Search the closest output units within the "limited scope" around each node, and select the winning node j*, which yields the minimum $d_j$ within that range:

$$d_{j^*} = \min_j \sum_{i=1}^{N} \left( X_i(t) - W_{ij}(t) \right)^2 \qquad (7)$$

4. Update the weights of node j* and its neighbors, to reduce the distances between them and the input vector $X_i(t)$:

$$\Delta W_{ij} = \eta \left( X_i(t) - W_{ij}(t) \right) \cdot R\_factor_j \qquad (8)$$

$$W_{ij}(t+1) = W_{ij}(t) + \Delta W_{ij} \qquad (9)$$

where $R\_factor_j$ is called the neighborhood function; it has value one when j = j* and falls off with the distance $|d_{j^*} - d_j|$ between units j and j* in the output array, and $\eta$ (0 < $\eta$ < 1) is the learning-rate factor, which declines monotonically over the learning steps. Such updates cause the nodes in the neighborhood of j* to become more similar to the input vector $X_i(t)$.

5. Use an index, $D_P$, to indicate how well the SOM handles the topology of the data set of interest. The stopping criterion of the learning process of the SOM depends on the convergence of this index:

$$D_P = \frac{1}{n} \sum_{p=1}^{n} \min_j d_j^p, \qquad d_j^p = \sum_i \left( X_i^p - W_{ij} \right)^2 \qquad (10)$$

where n is the total number of learning patterns (one for each neural network) and $d_j^p$ is the Euclidean distance between the input vector and output node j in pattern p.

6. These 200 patterns are divided into a two-dimensional map (grid) of M output nodes (say a 12-by-12 map of 144 nodes) by steps 1~5. The net effect of successively presenting the various input patterns to the SOM is that the SOM weights ($W_{ij}$) come to reflect the topological relationships within the input data. The three representative output nodes are then determined from the number of patterns included in each output node. The information of each output node of interest can be designed as the parameter intervals for one flood forecasting neural network, so that the forecasting intervals of the weights and biases of three neural networks are made up from the SOM weights ($W_{ij}$) and the input values of the selected winning patterns, according to the Euclidean distances. Furthermore, feasible solutions are randomly generated from the foregoing predefined intervals, to verify the accuracy of flood forecasting.
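Taken together, steps 1~6 can be sketched as follows; the Gaussian form of R_factor, the linearly decaying learning rate, the shrinking neighborhood radius, and the stand-in patterns are assumptions of this sketch, since the paper states only the qualitative behaviour of these quantities:

```python
import numpy as np

def train_som(patterns, grid=12, epochs=150, eta0=0.5):
    """Steps 1-6: map the N-dimensional patterns (weights and biases of the
    trained BPNs, N = 19 herein) onto a grid x grid output map."""
    rng = np.random.default_rng(0)
    n, dim = patterns.shape
    W = rng.random((grid * grid, dim))                     # step 1: random initial SOM weights
    xy = np.array([(i, j) for i in range(grid) for j in range(grid)], dtype=float)
    for t in range(epochs):
        eta = eta0 * (1.0 - t / epochs)                    # learning rate declines monotonically
        sigma = max(grid / 2.0 * (1.0 - t / epochs), 0.5)  # shrinking "limited scope"
        dp = 0.0
        for xvec in patterns:
            d = np.sum((xvec - W) ** 2, axis=1)            # Eq. (6): distance to every output node
            j_star = int(np.argmin(d))                     # Eq. (7): winning node j*
            dp += d[j_star]
            # Eqs. (8)-(9): assumed Gaussian R_factor, one at j*, falling off with map distance.
            r = np.exp(-np.sum((xy - xy[j_star]) ** 2, axis=1) / (2.0 * sigma ** 2))
            W += eta * r[:, None] * (xvec - W)
        if (t + 1) % 50 == 0:
            print(f"generation {t + 1}: DP = {dp / n:.4f}")  # Eq. (10); stop when DP converges
    return W

# 200 patterns of 19 BPN parameters each (hypothetical stand-ins for the trained networks).
patterns = np.random.default_rng(1).random((200, 19))
som_weights = train_som(patterns)
```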

RESULT

Estimating the intervals of weights and biases

The change of the index $D_P$ converges after 150 generations, and Fig. 2 presents how the 200 patterns are divided over the two-dimensional map (grid) of 144 output nodes in the final generation. From Fig. 2, the three representative output nodes are obtained from the number of patterns included in each output node. These output nodes are (10,2), (9,5), and (9,6), which include ten, eight, and nine patterns, respectively. The forecasting intervals of the weights and biases of the three neural networks are established as in Figs. 3~4, from the SOM weights ($W_{ij}$) and the winning input variables neighboring the SOM weights, according to the Euclidean distance.

Figure 2. Result of grouping by SOM (counts of patterns per output node on the 12-by-12 grid)
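From this grouping, a parameter interval can be read off for each representative output node. A minimal sketch of one plausible construction, assuming a per-parameter min/max envelope over the patterns grouped to a node together with that node's SOM weight vector (the paper does not give the exact rule), with hypothetical stand-in arrays:

```python
import numpy as np

def parameter_intervals(patterns, som_weights, node_index):
    """Envelope of the BPN parameter vectors whose winning node is `node_index`,
    taken together with that node's SOM weight vector (min/max rule assumed)."""
    d = np.sum((patterns[:, None, :] - som_weights[None, :, :]) ** 2, axis=2)
    members = patterns[np.argmin(d, axis=1) == node_index]   # patterns grouped to this node
    stacked = np.vstack([members, som_weights[node_index]])
    return stacked.min(axis=0), stacked.max(axis=0)          # lower/upper bounds per parameter

# Hypothetical stand-ins: 200 patterns of 19 parameters, a trained 144-node map.
patterns = np.random.default_rng(1).random((200, 19))
som_weights = np.random.default_rng(2).random((144, 19))
lo, hi = parameter_intervals(patterns, som_weights, node_index=0)  # flat index is hypothetical
```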


Figure 3. Forecasting intervals of weights and biases at nodes (10,2) and (9,6)

Figure 4. Forecasting intervals of weights and biases at node (9,5)

Verified results of flood forecasting neural networks

Five feasible solutions are generated randomly according to the above predefined intervals, to test the application of each flood forecasting neural network. The verified results of the three flood forecasting neural networks (Tables 2~4, illustrated with the 1998.08.04 event) show that CE lies between 0.823 and 0.929, EQp between -0.267 and 0.249, and ETp between -1 and 0. Overall, the verified results of the proposed models are satisfactory. Among the three output nodes of interest, the best CE is exhibited by node (9,6), which includes nine patterns; the best EQp is exhibited by node (9,5), which includes eight patterns; and the best ETp is exhibited by node (9,6), which includes nine patterns, and node (10,2), which includes ten patterns.

Table 2. Verified results on 1998.08.04 for node (10,2)

Number   | Estimated peak (cms) | Observed peak (cms) | CE    | EQp      | ETp
Random 1 | 1420                 | 1440                | 0.923 | -0.013   | -1
Random 2 | 1336                 | 1440                | 0.923 | -0.071   | -1
Random 3 | 1799                 | 1440                | 0.861 | 0.249    | -1
Random 4 | 1685                 | 1440                | 0.895 | 0.171    | -1
Random 5 | 1434                 | 1440                | 0.921 | -0.003   | -1
Average  | 1534                 | 1440                | 0.905 | |0.101|* | -1

Table 3. Verified results on 1998.08.04 for node (9,6)

Number   | Estimated peak (cms) | Observed peak (cms) | CE    | EQp      | ETp
Random 1 | 1073                 | 1440                | 0.898 | -0.254   | -1
Random 2 | 1294                 | 1440                | 0.928 | -0.101   | -1
Random 3 | 1153                 | 1440                | 0.912 | -0.198   | -1
Random 4 | 1107                 | 1440                | 0.904 | -0.231   | -1
Random 5 | 1054                 | 1440                | 0.891 | -0.267   | -1
Average  | 1136                 | 1440                | 0.907 | |0.201|* | -1

Table 4. Verified results on 1998.08.04 for node (9,5)

Number   | Estimated peak (cms) | Observed peak (cms) | CE    | EQp      | ETp
Random 1 | 1514                 | 1440                | 0.834 | 0.051    | -1
Random 2 | 1493                 | 1440                | 0.835 | 0.036    | -1
Random 3 | 1499                 | 1440                | 0.832 | 0.041    | -1
Random 4 | 1506                 | 1440                | 0.831 | 0.046    | -1
Random 5 | 1530                 | 1440                | 0.823 | 0.063    | -1
Average  | 1508                 | 1440                | 0.831 | |0.047|* | -1

* Average of the absolute values of EQp.
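Generating the random feasible solutions of Tables 2~4 amounts to drawing each weight and bias uniformly within its predefined interval and running the resulting 4-3-1 network; the uniform sampling, the sigmoid transfer functions, and the flat parameter layout below are assumptions of this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
lo = np.full(19, -1.0)                     # hypothetical interval bounds; in practice these
hi = np.full(19, 1.0)                      # come from the SOM grouping of each output node

def sample_network(lo, hi, rng):
    """Draw one feasible BPN parameter vector uniformly within [lo, hi]."""
    return lo + rng.random(lo.shape) * (hi - lo)

def forward(p, x, n_in=4, n_hidden=3):
    """Forward pass of the 4-3-1 BPN using one flat 19-parameter vector:
    12 input-hidden weights, 3 hidden biases, 3 hidden-output weights, 1 output bias."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    w1 = p[:12].reshape(n_hidden, n_in)
    b1 = p[12:15]
    w2 = p[15:18]
    b2 = p[18]
    h = sig(w1 @ x - b1)                   # hidden layer, Eq. (1) applied per node
    return sig(w2 @ h - b2)                # forecast O_{t+1} (output transfer assumed sigmoid)

x = np.array([0.3, 0.5, 0.2, 0.4])         # normalised S_t, N_t, P_{t-2}, O_t (hypothetical)
for k in range(5):                         # the five random feasible solutions of Tables 2-4
    print(f"random {k + 1}: forecast = {forward(sample_network(lo, hi, rng), x):.3f}")
```

Each sampled network would then be run over a verification event and scored with CE, EQp, and ETp as above.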

CONCLUSION

1. This study proposes a novel procedure that integrates the LTF and the SOM to determine efficiently the intervals of the weights and biases of a flood forecasting neural network. This procedure replaces complex sensitivity analysis.

2. Establishing three neural networks according to the grouping information of SOM output nodes (10,2), (9,5), and (9,6) increases the flexibility and convenience of flood forecasting, regardless of whether the problem concerns the trend, the peak, or the time lag.

REFERENCES

[1] Bowden, G.J., Maier, H.R., and Dandy, G.C., "Optimal division of data for neural network models in water resources applications", Water Resources Research, Vol. 38, No. 2, (2002), pp. 2-1~2-11.

[2] Hwang, J.T.G. and Ding, A.A., "Prediction intervals for artificial neural networks", Journal of the American Statistical Association, Vol. 92, No. 438, (1997), pp. 748-757.

[3] Kohonen, T., "The self-organizing map", Proceedings of the IEEE, Vol. 78, No. 9, (1990), pp. 1464-1480.

[4] Yang, C.C., Chang, L.C., and Chen, C.S., "Comparison of integrated artificial neural network to time series modeling for flood forecast", Journal of Hydroscience and Hydraulic Engineering, JSCE, Vol. 17, No. 2, (1999), pp. 37-50.

[5] Kuo, Y.M. and Liu, C.W., "Analysis on variation of groundwater quality in Yun-Lin coastal area (II): back-propagation artificial neural network method", Journal of Taiwan Water Conservancy, Vol. 48, No. 1, (2000), pp. 9-16.