
Mathematical and Computer Modelling 33 (2001) 695-706
www.elsevier.nl/locate/mcm

An Improved Genetic Algorithm for Rainfall-Runoff Model Calibration and Function Optimization

J. G. NDIRITU*
Civil Engineering Department, University of Durban-Westville
Private Bag X54001, Durban 4000, South Africa
[email protected]

T. M. DANIELL
Civil and Environmental Engineering Department
The University of Adelaide, SA 5005, Australia
[email protected]

Abstract--The standard binary-coded genetic algorithm (GA) has been improved using the three strategies of automatic search space shifting to achieve hill-climbing, automatic search space reduction to effect fine-tuning, and the use of independent subpopulation searches coupled with shuffling to deal with the occurrence of multiple regions of attraction. The degrees of search space shifting and reduction are determined by the distribution of the best parameter values in the previous generations and are implemented after every specified number of generations. If the best parameter value in successive generations is clustering in a small part of the search range, a higher level of range reduction is used. The search shift is based on the deviation from the middle of the current search range of the best parameter values of a specified number of previous generations. With each independent subpopulation, a search is performed until an optimum is reached. Shuffling is then performed and new subpopulation search spaces are obtained from the shuffled subpopulations. The improved GA performs remarkably better than the standard GA with three global optimum location problems. The standard GA achieves 11% success with the Hartman function and fails totally with the SIXPAR rainfall-runoff model calibration and the Griewank function, while the improved GA effectively locates the global optima.
Taking the number of function evaluations used to locate the global optimum as a measure of efficiency, the improved GA is about two times less efficient, three times more efficient, and 34 times less efficient than the shuffled complex evolution (SCEUA) method for the SIXPAR rainfall-runoff model calibration, the Hartman function, and the Griewank function, respectively. The modified GA can therefore be considered effective but not always efficient. (c) 2001 Elsevier Science Ltd. All rights reserved.

Keywords--Fine-tuning, Hillclimbing, Independent subpopulation searches, Genetic algorithm.

1. INTRODUCTION

Out of the need to implement optimal designs or decisions, the global optimization problem of the form of (1) to (3) often arises.

*Currently at The Civil Engineering Department, Eastern Cape Technikon, P.O. Box 19712, Tecoma, East London, South Africa. [email protected]
The authors express sincere thanks to S. Sorooshian and V. Gupta of the University of Arizona, U.S.A., and Q. Duan of NWS/NOAA, U.S.A., for providing software and data for the SIXPAR rainfall-runoff model.
0895-7177/01/$ - see front matter (c) 2001 Elsevier Science Ltd. All rights reserved. PII: S0895-7177(00)00273-9



Minimize f(x_1, x_2, ..., x_i, ..., x_n),   (1)

subject to Xmin_i <= x_i <= Xmax_i,  i = 1, 2, ..., n,   (2)

and g_j(x) <= 0.   (3)

4.1. Fine-Tuning

The fine-tuning strategy reduces the search range of each parameter according to the spread of the recent best values,

r_{i,g} = 2 (max x*_{i,g} - min x*_{i,g}) / (xmax_{i,g} - xmin_{i,g}),   (7)

where x*_{i,g} is the value of parameter x_i for the best performing individual in generation g, and max x*_{i,g} and min x*_{i,g} are the maximum and minimum values of x*_{i,g} in the past s2 generations (including the current generation g). The value r_{i,g} is therefore twice the proportion of the decision variable range within which the values of the best performing individuals in the past s2 generations are contained. xmin_{i,g} and xmax_{i,g} are the search range limits before fine-tuning and take on values Xmin_i and Xmax_i, respectively, at the start of the optimization. xmin_{i,g+} and xmax_{i,g+} are the new range limits after generation g. rl_min and rl_max are the fine-tuning control parameters. rl_min serves to check premature convergence as a result of very low r_{i,g} values and rl_max checks an enlargement of the search range in case r_{i,g} values are too high. The values of s1, s2, rl_min, and rl_max used in this study were 5, 5, 0.4, and 0.5, respectively.

4.2. Hillclimbing

The hillclimbing strategy involves a range shift of the search towards more promising regions. After every s3 generations and for all i = 1, 2, ..., n,

sh_{i,g} = [ (sum_{g'=g-s4+1}^{g} x*_{i,g'}) / s4 - 0.5 (xmax_{i,g} + xmin_{i,g}) ] / (xmax_{i,g} - xmin_{i,g}),   (9)

xmax_{i,g+} = xmax_{i,g} + sh_{i,g} (xmax_{i,g} - xmin_{i,g}),   (10)

xmin_{i,g+} = xmin_{i,g} + sh_{i,g} (xmax_{i,g} - xmin_{i,g}).   (11)

The shift sh_{i,g}(xmax_{i,g} - xmin_{i,g}) in (10) and (11) is the deviation from the middle of the current search range of the mean of the 'best' variables of the last s4 generations including the current one. For all the runs reported, a value of 5 was used for both s3 and s4.
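The two range-update rules can be sketched as follows (function names are illustrative; equations (9)-(11) fix the hillclimbing shift exactly, while for fine-tuning the sketch assumes the reduced range, whose width is set by the clipped ratio r_{i,g}, is centred on the recent best values):

```python
def fine_tune_range(best_vals, xmin, xmax, rl_min=0.4, rl_max=0.5):
    """Fine-tuning (Section 4.1) for one parameter.

    best_vals: best values of this parameter from the past s2 generations.
    r is twice the proportion of the current range that best_vals span,
    clipped to [rl_min, rl_max]; rl_min guards against premature
    convergence, rl_max against enlarging the range.  Centring the new
    range on the best values is an assumption of this sketch, and no
    clipping to the original decision-variable bounds is applied here.
    """
    span = max(best_vals) - min(best_vals)
    r = 2.0 * span / (xmax - xmin)
    r = min(max(r, rl_min), rl_max)
    centre = 0.5 * (max(best_vals) + min(best_vals))
    half = 0.5 * r * (xmax - xmin)
    return centre - half, centre + half


def hill_climb_shift(best_vals, xmin, xmax):
    """Hillclimbing shift, equations (9)-(11), for one parameter.

    best_vals: best values of this parameter from the last s4
    generations (including the current one).
    """
    width = xmax - xmin
    mean_best = sum(best_vals) / len(best_vals)
    # (9): normalised deviation of the mean best value from the range middle
    sh = (mean_best - 0.5 * (xmax + xmin)) / width
    # (10), (11): both limits move by the same amount, preserving the width
    return xmin + sh * width, xmax + sh * width
```

With the study's settings both rules use windows of five generations; note that the hillclimbing shift preserves the range width while fine-tuning shrinks it to 40-50% per application.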


4.3. Independent Subpopulation Searches and Shuffling

Parallel genetic algorithms use independent subpopulation searches and have been applied in several studies (e.g., [17-19]). They are mostly implemented on multiprocessors that enable simultaneous independent subpopulation searches. Parallel GAs usually incorporate the migration of high performance individuals among the subpopulations as the search proceeds. Niching [20] is an alternative approach also used in GAs to search for multiple optima. In this study, a single processor was used and the following procedure adopted. The total population p is divided into n_s subpopulations of size p_s each. Independent subpopulation searches to equilibrium (an optimum) without any migration of individuals among the subpopulations are performed. The subpopulations are then shuffled and new parameter ranges are obtained for each subpopulation for the next level of search (epoch). Shuffling involves the following steps. The total population from all the subpopulations is ranked in order of performance to form a matrix [ch(i)], i = 1, 2, ..., p, where ch(1) is the best performing and ch(p) is the worst performing individual. The individuals are allocated to the subpopulations using (12) or (13) (see [3]). A detailed comparison of the two approaches has not been done, but the most recent tests indicate Duan's approach could be more consistent. This differs from the earlier results reported by Ndiritu and Daniell [21]. Equation (12) was, however, applied for the results reported here.

sch(i, j) = ch(j + p_s (i - 1)),   (12)

sch(i, j) = ch(i + n_s (j - 1)),   (13)

where sch(i, j) is the jth (j = 1, 2, ..., p_s) individual in the ith (i = 1, 2, ..., n_s) subpopulation. For all the n dimensions, the least and highest parameter values of each subpopulation are obtained and used as the lower (xmin_i) and upper (xmax_i) bounds of the initial search space of the next epoch. The initial populations for the new epoch are generated randomly except for the first subpopulation which retains the best p_s individuals from the previous epoch and uses them as the initial population.
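The two allocation schemes and the new-epoch bounds can be sketched as follows (0-based Python lists; function names are illustrative):

```python
def shuffle_subpops(ranked, n_s, p_s, use_block=True):
    """Reallocate a best-first ranked population ch(1..p) to n_s
    subpopulations of size p_s each.

    use_block=True  -> equation (12): subpopulation i takes the
        contiguous ranks ch(j + p_s*(i-1)), so the first subpopulation
        gets the p_s best individuals.
    use_block=False -> equation (13), Duan-style: subpopulation i takes
        ch(i + n_s*(j-1)), dealing ranks round-robin so good individuals
        are spread across all subpopulations.
    """
    assert len(ranked) == n_s * p_s
    subs = []
    for i in range(1, n_s + 1):
        if use_block:
            sub = [ranked[(j + p_s * (i - 1)) - 1] for j in range(1, p_s + 1)]
        else:
            sub = [ranked[(i + n_s * (j - 1)) - 1] for j in range(1, p_s + 1)]
        subs.append(sub)
    return subs


def next_epoch_bounds(subpop):
    """Per-dimension least and highest parameter values of a
    subpopulation, used as the initial search space of the next epoch."""
    n = len(subpop[0])
    return ([min(ind[d] for ind in subpop) for d in range(n)],
            [max(ind[d] for ind in subpop) for d in range(n)])
```

The block scheme (12) concentrates quality in the first subpopulation, while the round-robin scheme (13) mixes quality levels within every subpopulation; the paper applied (12) for the reported results.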

5. RAINFALL-RUNOFF MODEL CALIBRATION

The rainfall-runoff process is very complex considering the space-time variability of rainfall, soil moisture, and evapotranspiration. Other relevant factors involved include land use, vegetation cover, land and channel slopes, and soil drainage properties. Rainfall-runoff models simulate the rainfall-runoff process with differing levels and forms of simplifications. Some of the major applications of rainfall-runoff models are

- the extension and filling in of missing runoff data,
- the creation of runoff data for ungauged catchments,
- real time flood forecasting,
- flood peak and volume estimation,
- investigations for gaining understanding of hydrological processes, and
- investigation of the impacts of land use and climate change on water resources.

Wheater et al. [22] and O'Connell and Todini [23] give overviews and point future directions of rainfall-runoff modelling. Grayson and Chiew [24] and Wheater et al. [22] present approaches of rainfall-runoff model classification which use three classes: empirical, conceptual, and process models. Empirical and conceptual rainfall-runoff models are not designed to use parameters that are directly measurable in the field and thus need calibration. Process models, most of which are distributed, are designed to represent the physical processes closely enough to enable the use of measurable parameters only. The lack of adequate data and model imperfections has, however, been found to limit the application of process models in this 'ideal' manner. The computational cost of the global optimization of the parameters of a typical distributed process model would


still be prohibitive, but some form of calibration is generally used in most applications [25-27]. Calibration is thus likely to remain an important ingredient of hydrological modelling even with the increasing application of distributed process models.

Calibration seeks to obtain a parameter set that will give a simulated series that matches the observed series adequately on the basis of some measures such as the simple least squares. Graphical comparisons such as time series plots of observed and simulated discharge (hydrographs) or scatter plots of the observed versus the simulated values are also sometimes applied. Calibration can either be implemented manually or automatically with the use of an optimization method. The possibility of missing superior parameter sets is higher with manual than with effective automatic calibration, and more combinations of good solutions are likely to be obtained with robust automatic approaches and with less effort than manual methods. Automatic rainfall-runoff model calibration has been a subject of research for decades with a recent development, the SCEUA [3], being a notably powerful method [4,6,10].

6. EXPERIMENTATION WITH RAINFALL-RUNOFF MODEL

This problem uses an artificial 200-day rainfall sequence and the SIXPAR model, and has been extensively studied by Duan et al. [3]. The SIXPAR model is a simplified research version of the SAC-SMA model of the National Weather Service River Forecasting System (NWSRFS) of the U.S.A. The SIXPAR uses six parameters; two of these are thresholds of the upper and lower zone storages, two control the rates of recession, and the other two are applied in the modelling of the percolation process. Values of the six parameters were selected and used to derive an artificial 200-day runoff sequence. This then acted as the 'observed' runoff series in the global optimization tests, and the parameter values formed the global optimum set. The sum of the least squares (SLS) of the difference between the concurrent 'observed' and modelled flows was used as the objective function. The global optimum therefore had a SLS value of 0. The reader is referred to [3] for more details of the SIXPAR model problem.

With the SIXPAR and the other two problems, 100 trials were performed in the earlier comparisons by Duan et al. [13]. An optimization was considered a success if the objective function value reduced to at least 0.001 of the global optimum value of 0. A trial was considered a failure if it reached 25,000 evaluations or if the region spanned by the population converged to within 10^-12 of the parameter range in each direction without the 0.001 mark being attained. The number of failures out of 100 and the average number of function evaluations for the successful trials were obtained for each run. In order to ensure a valid comparison, the above criteria were used in the GA optimizations. The optimization parameters used for the modified GA were subpopulation size p_s = 12, crossover probability c = 1.0, probability of mutation m = 0.05, and tournament size to = 4. Trial optimizations were then made with increasing subpopulation numbers starting with one subpopulation.

[Figure 1 near here: percent failure plotted against average number of function evaluations for the five methods, including the Modified GA.]
Figure 1. Comparative performance of five methods in optimization of the SIXPAR model.

Figure 1 presents the results of the previous study obtained from [13] and those obtained with the improved GA. The results indicate that the improved GA was effective but not as efficient as the SCE1, the SCE2, and the CRS2. To achieve 97% success, the improved GA used seven subpopulations and took an average of 6064 function evaluations as compared to 3293 and 3313 for the SCE1 and SCE2, respectively, for 99% success. Compared to the MSX, the improved GA performed more effectively and efficiently at the lower failure levels. The simple GA and the simple GA with hillclimbing and fine-tuning were also tested on the problem with a population p of 84 (p_s x 7) and a tournament size of 28 (to x 7). The simple GA failed in all the 100 trials. With combined fine-tuning and hillclimbing, the GA achieved one success. The simple GA with independent searches (seven subpopulations of size 12 each) and shuffling also failed in all trials.
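The calibration objective and the trial bookkeeping described above can be sketched as follows (a sketch with illustrative names, not the authors' code):

```python
def sls(observed, simulated):
    """Sum of least squares between 'observed' and modelled flows.

    This is the calibration objective; the global optimum has SLS = 0
    here because the 'observed' series was generated by the model itself.
    """
    return sum((o - s) ** 2 for o, s in zip(observed, simulated))


def trial_outcome(best_sls, n_evals, range_spans,
                  tol=0.001, max_evals=25000, collapse=1e-12):
    """Classify one optimization trial using the criteria of Duan et al.:
    success when the objective comes within tol of the optimum (0);
    failure at 25,000 evaluations, or when the population has collapsed
    to within 1e-12 of the parameter range in every direction.
    range_spans holds the population's normalised span per parameter.
    """
    if best_sls <= tol:
        return "success"
    if n_evals >= max_evals or all(s <= collapse for s in range_spans):
        return "failure"
    return "running"
```

A run's statistics then follow by counting failures over the 100 trials and averaging n_evals over the successful ones.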

7. EXPERIMENTATION WITH THE HARTMAN FUNCTION

Equation (14) describes the Hartman function, and the values of the coefficients a_{i,j}, c_i, and p_{i,j} as given by Duan et al. [13] are presented in the Appendix. The Hartman function is defined by

h[x] = 3.32 - sum_{i=1}^{4} c_i exp( - sum_{j=1}^{6} a_{i,j} (x_j - p_{i,j})^2 ),   (14)

[x] = (x_1, ..., x_6)^T,  0 <= x_i <= 1.

[Figure 2 near here: percent failure plotted against average number of function evaluations for the Modified GA, MSX, SCE1, and SCE2.]
Figure 2. Comparative performance of five methods in optimization of the Hartman function.


The global optimum of the Hartman function is 0 and is located at [x] = (0.201, 0.150, 0.477, 0.275, 0.311, 0.657). The same optimization parameters as those used for the SIXPAR were applied. Figure 2 gives a comparison of the modified GA and the MSX, CRS2, SCE1, and SCE2. The modified GA performs better than the other four methods and achieved 100% success with four subpopulations. The simple GA achieved 11% success.
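Equation (14) with the Appendix coefficients can be implemented directly; a sketch (the coefficient arrays below are our transcription of Table 1, rounded as tabulated):

```python
import math

# Coefficients of the six-dimensional Hartman function (Table 1, Appendix).
A = [[10.0, 3.0, 17.0, 3.5, 1.7, 8.0],
     [0.05, 10.0, 17.0, 0.1, 8.0, 14.0],
     [3.0, 3.5, 1.7, 10.0, 17.0, 8.0],
     [17.0, 8.0, 0.05, 10.0, 0.1, 14.0]]
C = [1.0, 1.2, 3.0, 3.2]
P = [[0.1312, 0.1696, 0.5569, 0.0124, 0.8283, 0.5886],
     [0.2329, 0.4136, 0.8307, 0.3736, 0.1004, 0.9991],
     [0.2348, 0.1451, 0.3522, 0.2883, 0.3047, 0.6650],
     [0.4047, 0.8828, 0.8732, 0.5743, 0.1091, 0.0381]]


def hartman(x):
    """Hartman test function, equation (14); 0 <= x_i <= 1."""
    return 3.32 - sum(
        C[i] * math.exp(-sum(A[i][j] * (x[j] - P[i][j]) ** 2
                             for j in range(6)))
        for i in range(4))
```

At the tabulated optimum location the value is close to, but not exactly, 0, because the coordinates are rounded to three decimals and the 3.32 offset only approximates the depth of the global minimum.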

8. EXPERIMENTATION WITH THE GRIEWANK FUNCTION

The Griewank function is given by Duan et al. [13] and has the global optimum at x_i = 0, i = 1, ..., 10. The function possesses thousands of local minima.

g[x] = (1/4000) sum_{i=1}^{10} x_i^2 - prod_{i=1}^{10} cos( x_i / sqrt(i) ) + 1,   (15)

[x] = (x_1, ..., x_10)^T,  -600 <= x_i <= 600.

Applying the criteria used by Duan et al. [13] which specified a maximum of 25,000 evaluations, the modified GA failed in all trials with up to 20 subpopulations of ten individuals each. A further investigation was carried out to find whether the modified GA could perform better allowing more function evaluations. Allowing a maximum of 200,000 evaluations, 20 subpopulations of ten individuals each achieved 100% success with an average of 101,096 function evaluations. The SCE1 method as reported in [13] used an average of 3070 evaluations for 100% success. This is about 34 times more efficient than the modified GA. Muehlenbein et al. [18] obtained an average of 59,520 function evaluations for 100% success with 50 runs of the Griewank function using a parallel GA. Tomassini [28] also achieved 100% success in 30 runs with a massively parallel GA that used 122,880 evaluations. In this study, the simple GA failed in all 100 trials. Tomassini [28] also reports the total failure of the simple GA in the optimization of the function.
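The Griewank function of equation (15) can be sketched as follows, assuming the standard 1/4000 scaling conventionally used with the +/-600 range:

```python
import math


def griewank(x):
    """Ten-dimensional Griewank function, equation (15);
    -600 <= x_i <= 600, global optimum 0 at the origin."""
    s = sum(xi ** 2 for xi in x) / 4000.0
    # product of cos(x_i / sqrt(i)), i = 1..10 (1-based index)
    p = math.prod(math.cos(xi / math.sqrt(i + 1)) for i, xi in enumerate(x))
    return s - p + 1.0
```

The cosine product is what creates the thousands of shallow local minima superimposed on an otherwise smooth quadratic bowl, which is what makes the function hard for local search.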

9. DISCUSSION

The modified GA performed better than the simple GA with all three test problems. The SIXPAR problem results revealed that the search range-varying routines (fine-tuning and hillclimbing) and the use of independent subpopulations with shuffling were both required together for any improvement in performance. Two more recent studies, however, contrast with this observation. Ndiritu [29] found that the individual modifications, acting independently, improved the calibration results of a historical, data-based, rainfall-runoff modelling problem. The GA with search range variation (fine-tuning and hillclimbing) alone gave better results than the GA with shuffling and independent subpopulations alone, although the best result was still obtained with the two modifications applied together. In the other study involving the optimization of the design of a pressure vessel with two discrete and two continuous decision variables [30], shuffling and the use of independent subpopulations were found to give a result almost matching that of the GA incorporating both shuffling with independent subpopulations and the search range-variation routines. In this problem, the GA incorporating the search range variation procedures alone performed only slightly better than the simple GA. For all the optimizations in this problem, however, the search range-variation procedures were only applied to the two continuous variables. It is therefore evident that the significance of the individual modifications is problem dependent. The conceptual basis of the modifications and the test results suggest that all the


modifications should be incorporated for all continuous variables. For discrete variables, fine-tuning does not apply, and hillclimbing, as described in Section 4.2, would require adaptation to confine the search within the feasible regions.

The varied performance of the improved GA and the other optimization algorithms on the test problems is an indication of the complexities of optimization. The relatively poor efficiency of the improved GA with the Griewank function in comparison with the Hartman function could probably be partly attributed to the number of local optima in the two problems. The Hartman function possesses four optima, while the Griewank function possesses greater than a thousand optima in the region of interest according to [13]. The same reasoning, however, would not explain the higher efficiency of the SCEUA with the Griewank in comparison to its performance with the Hartman function. It is possible that a study of the characteristics of global optimization problems and the performance of different optimization methods on them can provide evidence on the appropriate optimization algorithms to apply to specific global optimization problems. The application of such results is, however, likely to be confined to theoretical functions, as practical problems may not lend themselves easily to analysis and classification. An alternative approach is the comparison of different search methods on several test problems with an aim of finding the search method(s) whose performance(s) is/are consistently satisfactory for a wide range of problems. On this basis, the SCEUA would be preferred to the improved GA, as it performed better with two of the three test problems. The superior performance of the SCEUA could probably be attributed to a more efficient local hill climbing concept (the simplex search) as both the improved GA and the SCEUA applied shuffling. The shuffling approaches for the two methods were, however, not strictly identical.
It is noted that the SCEUA was not clearly superior to the improved GA for all the problems. This is in agreement with Duan et al.'s [13] findings where none of the four optimization methods tested on eight problems gave the best performance for all the problems. Where resources allow, it may therefore be appropriate to involve more than one method in the search for the optimal solution.

10. CONCLUSIONS AND RECOMMENDATIONS

The simple genetic algorithm (GA) method has been modified through

- a fine-tuning strategy involving automatic search range reduction,
- a hillclimbing strategy consisting of automatic search range shifts to the promising regions of search, and
- the use of independent subpopulation searches coupled with shuffling.

The modified GA effectively locates the global optima of the three test problems at varying levels of efficiency, while the simple GA performs poorly with all three. It seems that the modified GA can be considered effective but not always efficient. Further tests would be needed to reveal more fully the capabilities and the limitations of the modified GA. Such a study could include an investigation of the value of the individual modifications. Shuffling enabled the GA to locate the global optima in the presence of multiple regions of attraction. The shuffling concept could be applied to other optimization methods that lack an explicit technique of dealing with multiple regions of attraction. The SCEUA method performed better than the improved GA with two of the three optimization problems investigated. The SCEUA would therefore be the first preference for the optimization of unfamiliar continuous variable problems including rainfall-runoff model calibration. A reasonable study area in the future is a comparison of the SCEUA algorithm with the improved GA on discrete variable and mixed discrete-continuous variable problems. For such problems, both the SCEUA and the improved GA would need modifications in order to handle discrete variables efficiently.


APPENDIX

Table 1. The coefficients of the Hartman function (from [13]).

Values of a_{i,j} and c_i:

  i | a_{i,1}  a_{i,2}  a_{i,3}  a_{i,4}  a_{i,5}  a_{i,6} |  c_i
  1 |  10       3       17       3.5      1.7      8       |  1
  2 |   0.05   10       17       0.1      8       14       |  1.2
  3 |   3       3.5      1.7    10       17        8       |  3
  4 |  17       8        0.05   10        0.1     14       |  3.2

Values of p_{i,j}:

  i | p_{i,1}  p_{i,2}  p_{i,3}  p_{i,4}  p_{i,5}  p_{i,6}
  1 | 0.1312   0.1696   0.5569   0.0124   0.8283   0.5886
  2 | 0.2329   0.4136   0.8307   0.3736   0.1004   0.9991
  3 | 0.2348   0.1451   0.3522   0.2883   0.3047   0.6650
  4 | 0.4047   0.8828   0.8732   0.5743   0.1091   0.0381

REFERENCES
1. A. Torn and A. Zilinskas, Global Optimization, Lecture Notes in Computer Science 350, Springer-Verlag, Germany, (1989).
2. J.D. Pinter, Global Optimization in Action, Continuous and Lipschitz Optimization: Algorithms, Implementations and Applications, Kluwer Acad., Netherlands, (1996).
3. Q. Duan, S. Sorooshian and V. Gupta, Effective and efficient global optimization for conceptual rainfall-runoff models, Water Resour. Res. 28 (4), 1015-1031, (1992).
4. S. Sorooshian, Q. Duan and V.K. Gupta, Calibration of rainfall-runoff models: Application of global optimization to the Sacramento soil moisture accounting model, Water Resour. Res. 29 (4), 1185-1194, (1993).
5. B.C. Bates, Calibration of the SFB model using a simulated annealing approach, Proc. Int. Hydrol. and Water Resour. Symp., Adelaide 3, 1-6, (1994).
6. H. Tanakamaru and S.J. Burges, Application of global optimization to parameter estimation of the TANK model, Proc. Int. Conf. on Water Resour. and Environ. Res., Kyoto, Japan 2, 39-46, (1996).
7. P.R. Johnston and D.H. Pilgrim, Parameter optimization for watershed models, Water Resour. Res. 12 (3), 477-486, (1976).
8. Q.J. Wang, F.H.S. Chiew and T.A. McMahon, Calibration of environmental models by genetic algorithms, In Proc. MODSIM 95, Int. Congress on Modell. and Simul., Vol. 3, pp. 185-190, Modell. and Simul. Soc. of Aust., Newcastle, (1995).
9. J.G. Ndiritu and T.M. Daniell, Time-domain tuned rainfall-runoff models optimised using genetic algorithms, In Proc. Int. Conf. on Dev. and Appl. of Computer Techniques to Env. Studies, (Edited by P. Zannetti and C.A. Brebbia), pp. 268-275, Comp. Mech. Publ., Southampton, Boston, (1996).
10. G. Kuczera, Efficient subspace probabilistic parameter optimization for catchment models, Water Resour. Res. 33 (1), 177-185, (1997).
11. GALESIA, First Int. Conf. on Genetic Algorithms in Engineering Systems: Innovations and Applications, Sheffield, (1995).
12. J.A. Nelder and R. Mead, A simplex method for function minimization, Comput. J. 7 (4), 308-313, (1965).
13. Q.Y. Duan, V.K. Gupta and S. Sorooshian, Shuffled complex evolution approach for effective and efficient global minimization, J. Optim. Theory and Appl. 76 (3), 501-521, (1993).
14. J.H. Holland, Adaptation in Natural and Artificial Systems, Univ. of Mich. Press, Ann Arbor, MI, (1975).
15. D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, (1989).
16. M.J.D. Powell, An efficient method for finding the minimum of a function of several variables without calculating derivatives, Comput. J. 7, 155-162, (1964).
17. J.P. Cohoon, S.U. Hegde, W.N. Martin and D.S. Richards, Distributed genetic algorithm for the floorplan design problem, IEEE Trans. on Computer-Aided Design 10 (4), (1991).
18. H. Muehlenbein, M. Schomisch and J. Born, The parallel genetic algorithm as function optimizer, Parallel Computing 17, 619-632, (1991).
19. M.S. Donne, D.G. Tilley and W. Richards, The use of multi-objective parallel genetic algorithms to aid fluid power system design, Proc. Inst. Mech. Engrs. 209, 53-61, (1995).
20. S.W. Mahfoud, Niching methods for genetic algorithms, IlliGAL Report No. 95001, Univ. of Illinois, (1995).


21. J.G. Ndiritu and T.M. Daniell, An improved genetic algorithm for rainfall-runoff model calibration and function optimization, In Proc. MODSIM 97, Int. Congress on Modell. and Simul., Vol. 4, pp. 1683-1688, Modell. and Simul. Soc. of Aust., Hobart, (1997).
22. H.S. Wheater, A.J. Jakeman and K.J. Beven, Progress and directions in rainfall-runoff modelling, In Modelling Change in Environmental Systems, (Edited by A.J. Jakeman et al.), John Wiley & Sons, England, (1993).
23. P.E. O'Connell and E. Todini, Modelling of rainfall, flow and mass transport in hydrological systems: An overview, J. Hydrol. 175, 3-16, (1996).
24. R.B. Grayson and F.H.S. Chiew, An approach to model selection, Proc. Int. Hydrol. and Water Resour. Symp., Adelaide 1, 507-512, (1994).
25. J.C. Refsgaard and J. Knudsen, Operational validation and intercomparison of different types of hydrological models, Water Resour. Res. 32 (7), 2189-2202, (1996).
26. J.C. Refsgaard, Parameterisation, calibration and validation of distributed hydrological models, J. Hydrol. 198, 69-97, (1997).
27. A.W. Western, T.M. Green and R.B. Grayson, Hydrological modelling of the Tarrawarra catchment: Use of soil moisture patterns, In Proc. MODSIM 97, Int. Congress on Modell. and Simul., Vol. 1, pp. 409-416, Modell. and Simul. Soc. of Aust., Hobart, (1997).
28. M. Tomassini, The parallel genetic cellular automata: Application to global function optimization, In Artificial Neural Nets and Genetic Algorithms, (Edited by R.F. Albrecht et al.), pp. 365-391, Springer-Verlag, New York, (1993).
29. J.G. Ndiritu, An improved genetic algorithm for rainfall-runoff model calibration, Ph.D. Thesis, Univ. of Adelaide, Australia, (1998).
30. J.G. Ndiritu and T.M. Daniell, An improved genetic algorithm for continuous and mixed discrete-continuous optimization, Eng. Opt. 23 (1), 63-70, (January 1997).
