Razavi, S., and B. A. Tolson (2013), An efficient framework for hydrologic model calibration on long data periods, Water Resour. Res., 49, 8418–8431, doi:10.1002/2012WR013442.
An Efficient Framework for Hydrologic Model Calibration on Long Data Periods
Saman Razavi¹ and Bryan A. Tolson²

¹Global Institute for Water Security, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
²Department of Civil and Environmental Engineering, University of Waterloo, Waterloo, Ontario, Canada
Abstract:

Long periods of hydrologic data records have become available in many watersheds around the globe. Hydrologic model calibration on such long, full-length data periods is typically deemed the most robust approach for calibration, but at larger computational cost. Determining a representative short period as a "surrogate" of a long data period that sufficiently embeds its information content is not trivial and is a challenging research question. The representativeness of such a short period is not only a function of data characteristics but is also model and calibration error function dependent. Unlike previous studies, this study goes beyond identifying the best surrogate data period to be used in model calibration and proposes an efficient framework that calibrates the hydrologic model to full-length data while running the model only on a short period for the majority of the candidate parameter sets. To this end, a mapping system is developed to approximate the model performance on the full-length data period based on the model performance on the short data period. The basic concepts and the promise of the framework are demonstrated through a computationally expensive hydrologic model case study. Three calibration approaches, namely calibration solely to a surrogate period, calibration to the full period, and calibration through the proposed framework, are evaluated and compared. Results show that, within the same computational budget, the proposed framework leads to calibration performance that is better than or equal to that of the two conventional approaches. Results also indicate that model calibration solely to a short data period may yield performance ranging from poor to very good, depending on the representativeness of the short data period, which is typically not known a priori.
Keywords: Automatic Calibration, Optimization, Hydrologic Model, Surrogate Modeling, Calibration Data
1. Motivation and Objective

The appropriate length of observation data for effective calibration of continuous hydrologic models has been a challenging research question [Andreassian et al., 2001; Gupta and Sorooshian, 1985; Juston et al., 2009; Perrin et al., 2007; Singh and Bardossy, 2012; Vrugt et al., 2006; Xia et al., 2004; Yapo et al., 1996]. In general, longer calibration periods are deemed more robust and reliable for identifying the model parameters and quantifying their uncertainty [Perrin et al., 2007]. However, longer periods demand longer model run-times; given that the calibration process requires a hydrologic model to run a large number of times, utilizing the full period of available data for model calibration and uncertainty estimation may sometimes become computationally demanding or even infeasible. Such computational burdens may become prohibitive for some modern high-fidelity hydrologic models, which simulate detailed representations and processes of the real-world system. As such, computational limitation is typically one motivation for seeking a representative "short period" out of the longer available data for model calibration in a watershed of interest. A representative short period needs to embed diverse climatic and hydrologic conditions to adequately represent hydrologic variability in the watershed of interest, such that all simulated processes in the model are well activated [Juston et al., 2009; Perrin et al., 2007; Singh and Bardossy, 2012].
Several research studies have attempted to determine the minimum adequate length of data for hydrologic model calibration. These studies typically calibrated and compared model parameters for sub-periods of different lengths within a long data period and concluded that several years of data could be adequate, but their proposed required lengths vary relatively widely, in the range of two to eight years [Juston et al., 2009]. Singh and Bardossy [2012] pointed out that none of the proposed required lengths can be generalized, as different models have different complexity levels and different watersheds have different information content in each year of data. In addition, Xia et al. [2004] found that different model parameters may require different lengths of data to achieve calibration results insensitive to the period selected. Moreover, the information contained in observation data is not uniformly distributed along the period, and certain sub-periods may contain information useful for identifying some specific parameters while being irrelevant for other parameters [Singh and Bardossy, 2012]. Singh and Bardossy [2012] proposed a method based on the statistical concept of data depth to identify and extract what they referred to as "unusual events" out of the full data period for hydrologic model calibration. Their identified "unusual events" in a calibration data period, however, can be far apart in time, and as such, it may not be possible to extract a contiguous sub-period containing all the unusual events that is sufficiently shorter than the full calibration period. Given these considerations, there is a tendency among hydrologists to utilize the full period of available data for calibration of hydrologic models despite the possible computational burdens.
Another potential issue with model calibration on a short period is the possible instability of some model parameters over time due to climate and/or environmental changes. In this regard, Merz et al. [2011] showed that some hydrologic model parameters may consistently vary with time and that such variations can be related to climate fluctuations. Zhang et al. [2011] developed a framework for multi-period calibration of hydrologic models based on hydroclimatic clustering, which identifies and utilizes different calibrated parameter sets for different hydroclimatic conditions. Merz et al. [2011], however, argued that explicitly accounting for non-stationary model parameters by predicting them based on climate conditions may not be consistent with the "usual philosophy of dynamic models", which is "to use time-dependent boundary conditions (such as rainfall) and non-time-dependent model parameters to make the model directly applicable to predictions using future (or different) boundary conditions". Wagener [2007] pointed out that to ensure credible modelling of environmental change impacts, significant correlations between model parameters and watershed characteristics should be demonstrated and established. Wagener [2007] added that such a priori knowledge is not currently very accurate, and as such, it will likely lead to very uncertain parameter distributions and consequently very uncertain model predictions. Instead of incorporating different parameter values for different time periods in model predictions, Gharari et al. [2013] presented an approach that tests the model performance on different sub-periods and selects a parameter set that is as close as possible to the optimum parameter values of each individual sub-period. Razavi and Araghinejad [2009] utilized a forgetting-factor approach to account for environmental change in model calibration, in particular to address potential non-stationarity problems in time-series modelling, such that older observations have less influence on calibrated parameter values.
This study develops a computationally efficient framework for automatic calibration of hydrologic models in watersheds with long periods of available data. This framework aims to utilize full-length data periods in model calibration (called the "full period" hereafter) while reducing computational burdens by using representative short periods for most of the model runs and running the model over the full period only for selected candidate parameter sets. Most of these candidates are identified as promising solutions based on the information obtained from model runs on a short period. Like the majority of hydrologic model calibration approaches, this framework aims at finding a single set of parameter values that is optimal over the full calibration period. The proposed framework has roots in surrogate modelling, which is concerned with developing and utilizing more efficient but lower-fidelity models as surrogates of computationally intensive high-fidelity (original) models [Razavi et al., 2012]. There are two general families of surrogate modelling strategies: (1) response surface surrogates, which utilize statistical or data-driven function approximation methods such as kriging and neural networks to emulate one (or multiple) response(s) of an original model [Broad et al., 2010; Johnson and Rogers, 2000; Regis and Shoemaker, 2007; Yan and Minsker, 2011], and (2) lower-fidelity physically based surrogates, which are simplified models of the original system of interest and conceptually or directly preserve its governing physics [Forrester et al., 2007; Mondal et al., 2010; Robinson et al., 2008]. The proposed framework shares some features with lower-fidelity physically based surrogate modelling when used for model calibration. However, a major difference is that instead of using surrogate models, the proposed framework works solely with original high-fidelity models but defines and uses "surrogate data" (i.e., short-period data).
The framework developed in this study enables more efficient navigation through the parameter space of computationally intensive hydrologic models being calibrated on long data periods and identification of the parameter sets of interest (i.e., optimal values over full-length calibration periods) within limited computational timeframes. Like some benchmark studies in the field [e.g., Duan et al., 1992; Tang et al., 2006; Vrugt et al., 2003], this paper focuses on efficient and effective calibration of hydrologic models; addressing the common issues with model validation using optimized parameters is beyond the scope of this paper. In practice, hydrologic model calibration should be followed by a validation step to test the credibility of the calibrated model for its intended purpose. Therefore, as with any other calibration framework, the optimal parameter sets found may need to be validated over an independent set of data (not used for calibration) measured at the gauge used for calibration or at other gauges, depending on the specific modelling objectives. Interested readers are referred to the work of Klemes [1986] and Refsgaard and Knudsen [1996], who developed and formalized different validation methods for hydrologic simulation models for different purposes.
2. Basic Concepts

The proposed framework is based on the principle that the performances of a hydrologic model on different data periods are correlated. The magnitude of such correlations varies for different pairs of data periods, from very low (perhaps even negligible) for periods with distinctly different hydrologic conditions to very high for periods experiencing similar conditions. Assuming the performance of a hydrologic model is measured by an error function (e.g., mean squared error), Figure 1 compares the true error function of a hydrologic model (i.e., running on a full calibration period) with three possible surrogate error functions of the same model when running on three different short periods. Figure 1a represents a case where the short period is a good (almost ideal) representative of the full period, such that there is a high level of correlation (agreement) between the model performance over the full and short periods. Figure 1b shows a moderate level of correlation, where the model performance over the short period almost captures both modes (local and global optima) of the model performance over the full period, but the global optimum of the short-period performance corresponds to the local optimum of the full-period performance. The short periods represented in Figures 1a and 1b can be very helpful in calibrating the hydrologic model. Conversely, Figure 1c represents a case where the short period is quite misleading, as there is a large discrepancy between the two functions. This case may occur when the surrogate data period is unreasonably short and/or when it does not include information that can activate most of the simulated processes that are active in the full period.
Figure 2 hypothetically depicts possible relationships between the performance of a hydrologic model on a full data period (i.e., the full-period error function) and the performance of the same model on a short data period (i.e., the short-period error function). In general, the full-period error function value increases monotonically (linearly or non-linearly) with the short-period error function value (the global relationship; see Figure 2a). Such a relationship can be very useful for predicting the performance of a model on the full period using the results of a model simulation over a short period, and it is the basis of the framework developed in this study. However, there is some level of uncertainty (i.e., discrepancy from the global relationship) associated with such relationships; this uncertainty becomes larger for less representative short periods. Figure 2b demonstrates a case where the short period is so poor that no clear global relationship between the full-period and short-period error functions is identifiable; the uncertainty level is excessively large. Conceptually, Figure 2a corresponds to the cases depicted in Figures 1a and 1b, while Figure 2b corresponds to the case demonstrated in Figure 1c.
Figure 3 depicts a conceptual (but presumably typical) example of the convergence trend of an automatic calibration experiment when the optimization objective function to be minimized is the short-period error function. The corresponding full-period error function is also shown in this figure; the parameter sets found on the optimization convergence trajectory are also evaluated by running the model over the full period. Early in the search, reducing the short-period error function reduces the full-period error function as well; however, after some point in the search, further reduction in the short-period error function increases the full-period error function. This behaviour, which is analogous to "over-fitting" in statistics and curve fitting, indicates that solely relying on a representative short period may be efficient for initially guiding a calibration algorithm to the vicinity of the promising regions in the parameter space, but it may become ineffective or even misleading for extracting near-optimal parameter sets. Note that if a short period were a perfect surrogate of a full period, solely calibrating a hydrologic model to the short period would be accurate.
3. Methodology and Demonstration of Key Components

The framework developed in this study links a search engine through an interface to a hydrologic simulation model that is set up to run over a full period of data and a representative short period of data. A main component of this framework is a performance mapping system, which is based on the concepts illustrated in Section 2. The performance mapping system was designed to minimize the uncertainties associated with predicting full-period model performance from representative short periods. As this framework also evaluates selected parameter sets over full calibration periods, the over-fitting issue raised in Section 2 can be avoided. In this section, an example case study is used to demonstrate the performance mapping system.
3.1. An Example Case Study

To illustrate the proposed framework, a SWAT2000 (Soil and Water Assessment Tool, version 2000) [Neitsch et al., 2001] hydrologic model calibration case study, originally developed by Tolson and Shoemaker [2007a] to simulate streamflows into the Cannonsville Reservoir in New York, was used. This automatic calibration case study seeks to calibrate 14 SWAT2000 model parameters, subject to various range constraints, to measured flow at the Walton gauging station (drainage area of 860 km2) by maximizing the Nash-Sutcliffe coefficient for daily flow. In this case study, a total of nine years of data from January 1987 to December 1995 are used in simulations, such that the first three years are used as the initialization period and the remaining six years are used as the calibration period (see Figure 4). A single model run over the nine-year period requires about 1.8 minutes on average to execute on a 2.8 GHz Intel Pentium processor with 2 GB of RAM running the 32-bit Windows XP operating system.
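The calibration objective above, the Nash-Sutcliffe coefficient for daily flow, can be sketched as follows. This is a generic illustration of the metric itself, not the authors' SWAT2000 calibration interface.

```python
import numpy as np

def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the model's squared
    errors to the variance of the observations around their mean.
    NSE = 1 for a perfect fit; NSE <= 0 means the model is no better than
    predicting the observed mean flow."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((observed - simulated) ** 2) / np.sum(
        (observed - observed.mean()) ** 2)
```

Because NSE is a monotonic (decreasing) transformation of the mean squared error, maximizing NSE is equivalent to minimizing MSE, which is why the correlation analyses later in the paper are carried out on MSE.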
3.2. What is a representative short period?

Ideally, a short period of the full data period is deemed "representative" if the information contained in the short period is sufficiently complete and diverse that model calibration experiments using the short period and the full period would yield almost identical results. Such an ideal representative period with a short duration (reasonably shorter than the full data period) may not necessarily exist in every watershed. In this study, "representative short period" refers to a reasonably short sub-period (contiguous in time) of the full period of data over which the model performance is reasonably well correlated with the model performance over the full period. For demonstration, two different parts of the full period of data available for our case study were selected as representative short periods (see Figure 4).
Representative short period (a) was selected arbitrarily, without any prior knowledge or analysis of the system, to represent a case in which the framework user cannot identify the most appropriate representative short period. The only considerations in choosing this period were to begin from a dry period for better model initialization and to finish at the end of a flow recession period. Representative short period (a) was nine months long, preceded by four months used for model initialization. For the sake of efficiency, this initialization part was considerably shorter than the initialization part of the original calibration problem with the full calibration period. Generally, having a shorter initialization period in calibration may reduce the credibility of calibrated parameter values; however, this does not matter here, as the proposed framework uses such surrogate data only as a guide, and the final calibrated parameter values are selected based on the model performance on the full calibration data with its long initialization period. A preliminary analysis was conducted to evaluate the correlation of the hydrologic model performance over representative short period (a) with the performance of the same model over the full period. A total of 1000 (uniformly) random parameter sets were generated within their specified ranges and evaluated by running the hydrologic model over the full period. Values of mean squared errors (deviations of the simulated flows from observations, MSE) were calculated over the full calibration period (i.e., six years) and over representative short period (a) and plotted in Figure 5. According to this figure, the linear correlation between the MSE values over the two periods is not very strong, and the linear regression has an R2 value of 0.64. This relatively low correlation implies that the error function surfaces for the two periods are somewhat dissimilar and that the minimizer of one may not be close to the minimizer of the other.
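The preliminary correlation analysis above can be sketched as follows. The lognormal arrays are hypothetical stand-ins for the MSE values that, in the study, come from 1000 actual hydrologic model runs over the full and short periods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the 1000 model evaluations: hypothetical MSE values of the
# same random parameter sets over the full period and a candidate short
# period (a real study would obtain these from hydrologic model runs).
mse_full = rng.lognormal(mean=1.0, sigma=0.5, size=1000)
mse_short = mse_full * rng.lognormal(mean=0.0, sigma=0.3, size=1000)

# R^2 of the simple linear regression of full-period MSE on short-period
# MSE, as in Figure 5; for simple linear regression this equals the
# squared Pearson correlation coefficient.
r = np.corrcoef(mse_short, mse_full)[0, 1]
r_squared = r ** 2
print(f"R^2 = {r_squared:.2f}")
```

A scatter plot of `mse_short` versus `mse_full` would correspond to Figure 5; a lower `r_squared` signals a less representative short period.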
Identifying the best representative short period requires prior analysis of the hydrologic model performance over the full calibration data. Such pre-analyses may not be affordable in practice with computationally intensive hydrologic models. However, for demonstration purposes, we attempted to identify and use the best one-year period (preceded by an initialization period) of the full data, utilizing the full-period simulation results of the 1000 random parameter sets. To this end, the second representative short period was selected such that the linear correlation between the short-period MSE values and the full-period MSE values was maximized (see Figure 5): a 12-month window was moved along the six years of data to find the linear regression with the maximum R2 value. The preceding six months (starting from a dry season) were used for initialization. In general, higher correlations indicate higher degrees of similarity between the short-period and full-period error function surfaces. When the two surfaces are very similar, model calibration on the short period is almost equivalent to calibration on the full period. As explained in Section 3.3, the performance mapping system attempts to address and formulate the dissimilarities between the two surfaces, such that any short period demonstrating at least a moderate level of correlation with the full period, such as representative short period (a), can be used in the proposed framework. Note that the correlation analysis throughout Section 3 is based on MSE for the sake of consistency. Similar results for calibration error functions that are monotonic transformations of MSE, such as the sum of squared errors, root mean squared error, and Nash-Sutcliffe coefficient of efficiency, can be derived directly from the MSE-based analysis. Other calibration error functions, including mean absolute error (MAE) and mean log squared error (MLSE), are discussed in Section 4.
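The moving-window selection of representative short period (b) can be sketched as below. The monthly error matrix is a hypothetical stand-in for model output, and the monthly (rather than daily) resolution is a simplification for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

n_sets, n_months = 1000, 72          # parameter sets x months in the full period
# Hypothetical per-month mean squared errors for each parameter set; in the
# actual study these come from full-period runs of the hydrologic model.
monthly_mse = rng.lognormal(0.0, 0.6, size=(n_sets, n_months))
mse_full = monthly_mse.mean(axis=1)

def window_r2(start, width=12):
    """R^2 of regressing full-period MSE on the MSE over a 12-month window."""
    mse_win = monthly_mse[:, start:start + width].mean(axis=1)
    return np.corrcoef(mse_win, mse_full)[0, 1] ** 2

# Slide the 12-month window along the six-year period and keep the window
# whose MSE values correlate best with the full-period MSE values.
scores = [window_r2(s) for s in range(n_months - 12 + 1)]
best_start = int(np.argmax(scores))
print(f"best window starts at month {best_start}, R^2 = {max(scores):.2f}")
```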
3.3. Performance Mapping

The performance mapping system developed in this study is based on the relationships conceptualized in Figure 2 and numerically derived for our case study in Figure 5. However, instead of considering the short-period error function (e.g., the MSE value over the short period) as one single predictor, the mapping system disaggregates this function value into multiple values corresponding to different sub-periods of the representative short period in order to reduce the prediction errors. Each sub-period is intended to represent a distinct hydrologic behaviour or aspect of the data, for example, flood events, recession periods, or dry periods. This disaggregation enables the mapping system to extract most of the information contained in a representative short period. Figure 6 shows the sub-periods used in the representative short periods of our case study. The performance mapping system relates the performance of the model over all these sub-periods to the performance of the model over the full period of data. It consists of a regression model enhanced by a localized Gaussian deviation model, $\varepsilon(\mathbf{x})$, based on the kriging concept, in the form of

$$\hat{f}(\mathbf{x}_i) = \sum_{j=1}^{m} \beta_j \, g_j(\mathbf{x}_i) + \varepsilon(\mathbf{x}_i) \qquad (1)$$

where $\mathbf{x}_i$ is the ith candidate parameter set, $g_j(\mathbf{x}_i)$ is the model error function value for sub-period $j$, $\beta_j$ is the regression coefficient for sub-period $j$, $m$ is the number of sub-periods, and $\hat{f}(\mathbf{x}_i)$ is the full-period error function. We considered $g_j(\mathbf{x}_i) = \log(\mathrm{MSE}_j(\mathbf{x}_i))$, where $\mathrm{MSE}_j$ is the mean squared model error over sub-period $j$, and $\hat{f}(\mathbf{x}_i) = \log(\widehat{\mathrm{MSE}}(\mathbf{x}_i))$, where $\widehat{\mathrm{MSE}}$ is the predicted MSE value over the full period. The logarithmic transformation is used mainly to make the mapping system more sensitive and accurate for smaller MSE values. In automatic calibration, the accuracy of the mapping system becomes more important as the search approaches the main regions of attraction (regions containing parameter sets with reasonably small error function values). Moreover, the distribution of MSE values collected in the course of calibration is typically skewed to the right, mainly due to optimization algorithm search strategies. As such, the logarithmic transformation maps the inputs and outputs of the mapping function in Equation (1) into a more normally distributed variable space. Other transformations may also be used.
The regression-Gaussian model in Equation (1) was adopted from Sacks et al. [1989] and implemented with the software package developed by Lophaven et al. [2002]. The best linear unbiased predictor (BLUP) methodology was used to determine the coefficients of Equation (1). This mapping function represents the global relationships through the regression component, while the Gaussian component attempts to represent the residuals of the regression as well as the non-linear local relationships. The minimum number of input-output data sets mathematically required to develop the mapping function and provide confidence intervals is n+2 (where n is the dimension of the function input space). Interested readers are referred to Sacks et al. [1989] and Lophaven et al. [2002] for details on how to develop and fit the model. Notably, unlike regression models, the mapping system developed here is an exact interpolator for the input-output sets used to develop or update the system: the system generates the exact model performance (i.e., full-period MSE values) for the parameter sets that have already been evaluated over the full data period and included in the input-output sets used for fitting the mapping system.
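A minimal sketch of the regression component of the mapping system is given below; the localized Gaussian (kriging) deviation term that the authors fitted with the Lophaven et al. [2002] toolbox is omitted here, and the MSE arrays are synthetic stand-ins for actual model runs.

```python
import numpy as np

rng = np.random.default_rng(2)

n_sets, n_sub = 100, 5               # fitting parameter sets x sub-periods
# Hypothetical sub-period MSEs and full-period MSEs for 100 parameter sets;
# the full-period value is tied to the sub-period values plus small noise.
sub_mse = rng.lognormal(0.0, 0.5, size=(n_sets, n_sub))
full_mse = np.exp(np.log(sub_mse).mean(axis=1) + rng.normal(0, 0.05, n_sets))

# Regression component of Equation (1): log full-period MSE as a linear
# combination of log sub-period MSEs (plus an intercept). The Gaussian
# deviation term epsilon(x) of the full kriging model is not included.
X = np.column_stack([np.ones(n_sets), np.log(sub_mse)])
beta = np.linalg.lstsq(X, np.log(full_mse), rcond=None)[0]

def predict_full_mse(sub_mse_new):
    """Map sub-period MSEs of a new parameter set to a predicted full-period MSE."""
    x = np.concatenate([[1.0], np.log(sub_mse_new)])
    return float(np.exp(x @ beta))
```

Because the Gaussian term is dropped, this sketch is an ordinary regression rather than an exact interpolator; in the full system, the kriging component reproduces the known full-period MSE values exactly at already-evaluated parameter sets.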
A representative short period may be disaggregated into multiple sub-periods in different ways, based on the hydrologist's understanding of the processes in a case study or on some numerical analyses. The key is that each sub-period should ideally represent a distinct hydrologic behaviour. For example, each flood event (from the beginning of rainfall, or the associated rising limb of the flow, to the end of the flow recession curve) may be considered a distinct sub-period. Alternatively, rising limbs, falling limbs, recession parts, or long dry periods may constitute different sub-periods. Similar events may be treated as one sub-period. In this study, the sub-periods of each representative short period shown in Figure 6 were objectively selected through numerical analyses of the residual time series generated for the 1000 random parameter sets in Section 3.2. All possible configurations of division points in time for numbers of sub-periods ranging from two to six were evaluated for both representative short periods by fitting the mapping system with only the regression component ($\varepsilon(\mathbf{x})$ was assumed to be zero). The sub-period configurations that maximized the regression R2 value were selected. For example, for the case of two sub-periods (i.e., one division point) in representative short period (a), the first sub-period starts on January 1, 1990 (the beginning of the representative short period) and ends at the division point, where the second sub-period starts and continues to the end of the period (i.e., September 30, 1991). Assuming a sub-period could not be shorter than a week, the division point varied from January 8, 1990 to September 23, 1991, resulting in 263 configurations to be tested. In our case study, representative short periods (a) and (b) were disaggregated into five and four sub-periods, respectively. Notably, all the division points of the sub-period configurations that generated high R2 values coincide with either the lowest or the peak flows. We noticed that, for a given representative short period, there are many sub-period configurations starting or ending at peak or lowest flows that can generate a reasonably reliable regression model (i.e., with a high R2).
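The division-point search can be sketched for the two-sub-period case as follows. The array sizes and daily error values are hypothetical, and only a single division point is enumerated here, whereas the study tested configurations with up to five division points.

```python
import numpy as np

rng = np.random.default_rng(3)

n_sets, n_days = 200, 90             # parameter sets x days in the short period (toy)
# Hypothetical daily squared errors for each parameter set over the short
# period; in the study these come from the residual time series of the
# 1000 random parameter sets of Section 3.2.
daily_sq_err = rng.lognormal(0.0, 0.7, size=(n_sets, n_days))
full_mse = daily_sq_err.mean(axis=1) * rng.lognormal(0.0, 0.1, n_sets)
log_full = np.log(full_mse)

def config_r2(split):
    """R^2 of regressing log full-period MSE on the log MSEs of the two
    sub-periods obtained by dividing the short period at 'split'."""
    feats = np.column_stack([
        np.ones(n_sets),
        np.log(daily_sq_err[:, :split].mean(axis=1)),
        np.log(daily_sq_err[:, split:].mean(axis=1)),
    ])
    coef = np.linalg.lstsq(feats, log_full, rcond=None)[0]
    resid = log_full - feats @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((log_full - log_full.mean()) ** 2)

# Sub-periods no shorter than a week: candidate division points 7..n_days-7.
best_split = max(range(7, n_days - 6), key=config_r2)
```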
To demonstrate the accuracy of the developed mapping system, 1000 new parameter sets were generated randomly within their specified ranges, and the model was then run for each parameter set over the full period and over representative short periods (a) and (b) independently. For each representative short period, the mapping system was developed using 100 of these generated input-output sets (selected randomly) and then tested on the remaining 900 input-output sets. Figure 7 shows the scatter plots of the actual full-period mean squared errors (MSE) versus the predicted full-period MSE values for the 900 testing parameter sets. As can be seen, the results of the mapping system are almost unbiased, as the linear regressions plotted in this figure are very close to the ideal (1:1) line. The dispersion of the points around the ideal line is also reasonably low for both representative short periods, with R2 values of 0.92 and 0.96 for representative short periods (a) and (b), respectively. Replicating this analysis with different sets of 100 randomly selected input-output sets out of the 1000 generated sets yielded similar performance mapping accuracies (results not reported here).
3.4. Search Engine

Any optimization algorithm may be used as the search engine in the framework; however, in this study, the covariance matrix adaptation evolution strategy (CMA-ES) [Hansen and Ostermeier, 2001] was used. CMA-ES is a stochastic, derivative-free optimization algorithm that generates candidate solutions according to an adaptive multivariate normal distribution. The covariance matrix of this distribution, which represents pairwise dependencies between variables, is updated iteratively in the course of optimization. Notably, CMA-ES does not need the actual objective function values of the candidate solutions; instead, it uses only their rankings to learn the specifications of the multivariate normal distribution. The algorithm has the form (μ, λ)-CMA-ES, where μ and λ are the parent number and population size, respectively. At each iteration, λ new candidate solutions are generated following the multivariate normal distribution, and then a weighted combination of the μ best out of the λ solutions is used to adapt the parameters of the distribution (i.e., its mean and covariance matrix) such that the likelihood of previously successful candidate solutions and search steps is increased. Following the recommendations of Hansen and Ostermeier [2001] for the algorithm parameters, (5, 11)-CMA-ES was used for our case study.
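The (μ, λ) sample-rank-recombine cycle that CMA-ES builds on can be sketched as below. For brevity, this sketch uses an isotropic sampling distribution with a crude step-size decay and omits the covariance-matrix and step-size adaptation mechanisms of the full CMA-ES [Hansen and Ostermeier, 2001]; for real use, a maintained CMA-ES implementation should be preferred.

```python
import numpy as np

def mu_lambda_es(f, x0, sigma=0.3, mu=5, lam=11, iters=200, seed=0):
    """Simplified (mu, lambda) evolution strategy: sample lam candidates from
    a normal distribution around the current mean, rank them by objective
    value (as in CMA-ES, only ranks drive the update), and recombine the mu
    best with log-linear weights. Covariance adaptation is omitted."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(x0, dtype=float)
    weights = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    weights /= weights.sum()
    for _ in range(iters):
        pop = mean + sigma * rng.standard_normal((lam, mean.size))
        order = np.argsort([f(x) for x in pop])      # ranking only
        mean = weights @ pop[order[:mu]]             # weighted recombination
        sigma *= 0.995                               # crude step-size decay
    return mean

# Usage: minimize a simple quadratic (a stand-in for a calibration error function).
best = mu_lambda_es(lambda x: float(np.sum((x - 1.0) ** 2)), x0=np.zeros(4))
```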
3.5. Calibration Framework using Surrogate Data

The framework developed in this study consists of two phases (see Figure 8). Phase 1 is basically a conventional automatic calibration procedure that simply links the optimization algorithm to the hydrologic model, but with the model running over the representative short period. As such, the optimization algorithm generates candidate parameter sets, xi, for the hydrologic model with the objective of minimizing the short-period error function, fs(xi). In phase 1, only the parameter sets that improve the current best short-period error function value (i.e., fsbest) are also evaluated through the hydrologic model over the full period of data. Phase 1 continues until a sufficient number of parameter sets have been evaluated over both the short and full periods; the simulation results of these parameter sets are then used to develop the performance mapping system. The performance mapping system enables the framework to map the performance of new candidate parameter sets on the short period (the short-period error function) to their corresponding performance on the full period (the full-period error function). Notably, phase 1 is only active early in the search, when minimizing the short-period error function mainly corresponds to minimizing the full-period error function (the falling part of the full-period error function in Figure 3).
359
In phase 2, the optimization algorithm seeks the minimum of a new response surface
360
which is a surrogate of the full-period error function such that the majority of candidate
361
parameter sets are still only evaluated by the hydrologic model over the short period, but the
362
approximated full-period error function values, ̂ ( ), are returned to the optimizer. If the
363
approximated full-period error function value for a candidate parameter set is better than the
364
current best full-period error function value (i.e., ffbest), the parameter set is also evaluated by
365
running the hydrologic model over the full period.
Phase 2 can begin by re-starting the optimization algorithm from the best parameter set found in phase 1 (e.g., by keeping the best parameter set of phase 1 in the initial population). Alternatively, the optimization trial of phase 1 can continue through phase 2. However, switching from one response surface to a different one, potentially with a different response scale, when moving from phase 1 to phase 2 may corrupt some of the information the optimization algorithm has collected and as such degrade the optimizer performance. For example, the mean and covariance matrix of the multivariate normal distribution, which have evolved in the course of a CMA-ES run in phase 1, may change abruptly at the beginning of phase 2. Defining the "correction factor" at the end of phase 1 is intended to align the two surfaces in the vicinity of the parameter sets in the last generation of the optimization algorithm in phase 1. The correction factor, CF, is calculated as:

CF = f̄s / f̄f    (2)

where f̄s is the best short-period error function value and f̄f is its corresponding full-period error function value in the last population of the optimization algorithm in phase 1. At the end of phase 1, the correction factor is calculated, and thereafter (in phase 2) it acts as a constant multiplier for all the error function values returned to the optimization algorithm (see Figure 6). Notably, the correction factor defined in Equation (2) is only one simple way out of many possible ways to handle the scale differences. In case the optimization algorithm re-starts at the beginning of phase 2, CF can be ignored (CF = 1). Moreover, depending on the optimization algorithm used, there may be other conditions to account for in order to better preserve the optimization algorithm behaviour. For example, in population-based optimization algorithms like CMA-ES, it is advisable to evaluate all the individuals in a generation with the same function. To this end, the framework developed only proceeds to phase 2 at the beginning of a new CMA-ES generation.
Both phase 1 and phase 2 reserve a small chance for any candidate parameter set to be evaluated over the full period regardless of its performance over the short period. This chance is controlled by a framework parameter, α, ranging between zero and one; zero indicates no chance, and one indicates that all parameter sets are also to be evaluated over the full period. As shown in Figure 8, after each model run over the short period, a uniform random number between zero and one (i.e., rnd) is generated; if rnd is smaller than α, the candidate parameter set is also evaluated by running the model over the full period. This capability is intended to minimize the bias of the performance mapping system by training the system on a range of possible performances. In addition, this capability diminishes the risk of framework failure in case the selected short period is not very informative. Larger α values increase the sample size of data used for developing the mapping system and make the framework more robust, but at the expense of increasing the overall computational burden. Preliminary experiments with our case study demonstrated that α values slightly greater than zero (e.g., 0.05-0.1) provide a proper compromise between framework efficiency and robustness. The optimal α value is problem-specific and not known without extensive experimentation. The default α value for the experiments in this study was set to 0.05.
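The overall two-phase logic, including the improvement-triggered full-period runs and the α-controlled random full-period runs, can be sketched as below. This is an illustrative reading of the framework, not the authors' code: random sampling stands in for the optimization algorithm, all function names (`sample_params`, `run_short`, `run_full`, `fit_mapper`) are hypothetical placeholders, and the Equation (2) correction factor is set to one because each candidate here is sampled independently rather than generated by a continuing optimizer run.

```python
import random

def surrogate_data_calibration(sample_params, run_short, run_full, fit_mapper,
                               n_phase1=30, n_phase2=100, alpha=0.05, seed=0):
    """Sketch of the two-phase surrogate-data framework (illustrative names).
    run_short/run_full return the error function over the short and full data
    periods; fit_mapper builds the performance mapping system from
    (short-period, full-period) error pairs."""
    random.seed(seed)
    pairs, fs_best = [], float("inf")
    ff_best, x_best = float("inf"), None

    # Phase 1: calibrate on the short period; run the full period only for
    # new short-period best solutions, or with a small chance alpha.
    for _ in range(n_phase1):
        x = sample_params()
        fs = run_short(x)
        if fs < fs_best or random.random() < alpha:
            ff = run_full(x)
            pairs.append((fs, ff))
            fs_best = min(fs_best, fs)
            if ff < ff_best:
                ff_best, x_best = ff, x

    mapper = fit_mapper(pairs)  # performance mapping system
    # Eq. (2)'s correction factor would rescale values returned to a
    # continuing optimizer; CF = 1 here since the search restarts.

    # Phase 2: the optimizer sees approximated full-period errors; the full
    # period is run only when the approximation beats the current best,
    # or with a small chance alpha.
    for _ in range(n_phase2):
        x = sample_params()
        ff_hat = mapper(run_short(x))
        if ff_hat < ff_best or random.random() < alpha:
            ff = run_full(x)
            if ff < ff_best:
                ff_best, x_best = ff, x
    return x_best, ff_best
```

With a mapper that fits the (short, full) error pairs well, nearly all candidates cost only a short-period run, while every apparent improvement is verified against the full period.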
4. Results and Discussion
4.1. Comparative Assessment
The three different approaches to model calibration, namely calibration solely on a representative short period, calibration on the full period of data, and calibration through the proposed framework, were evaluated and compared. Seven independent calibration experiments were run for each approach with seven different random seeds and random starting points. The comparisons were made within four different computational budget scenarios. The computational budget was quantified by the total number of years simulated by the hydrologic model over the course of each automatic calibration experiment, and budgets of 450, 900, 1800, and 2250 years were considered. The smallest computational budget is equivalent to 50, 415, and 300 hydrologic model runs (optimization function evaluations) over the full period, representative short period (a), and representative short period (b), respectively. The computational budget of 2250 years of simulation is equivalent to 250 (2250 years / 9 years), 2075 (2250 years / 1.083 years), and 1500 (2250 years / 1.5 years) objective function evaluations with the full period, representative short period (a), and representative short period (b), respectively.
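The budget accounting above is a simple division of the simulation-year budget by the length of the simulated period. The short sketch below reproduces it; note that the counts for short period (a) depend on the exact period length assumed (13 months ≈ 1.083 years including the initialization period), so truncation can differ by a couple of evaluations from the rounded figures reported in the text.

```python
# Translate simulation-year budgets into equivalent objective function
# evaluations. Period lengths (years) include the model initialization
# period: 9.0 for the full period, ~1.083 for representative short
# period (a), and 1.5 for representative short period (b).
period_years = {"full": 9.0, "short (a)": 1.083, "short (b)": 1.5}

def evaluations(budget_years, period):
    """Number of whole model runs affordable within the budget."""
    return int(budget_years / period_years[period])

for budget in (450, 900, 1800, 2250):
    print(budget, {p: evaluations(budget, p) for p in period_years})
```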
Figure 9 shows the results of the numerical experiments. For easier interpretation, the best MSE values found in the calibration experiments were transformed to their associated Nash-Sutcliffe (N-S) coefficient values. The maximum Nash-Sutcliffe coefficient value (referred to as "global optimum" in this figure) for this case study, previously identified through extensive optimization, is 0.85. As can be seen, model calibration on the full period improves with the increase in the computational budget. However, model calibration solely on representative short period (a) is not as robust, since the performance results of the calibrated model when tested over the full period are more variable and, more importantly, do not improve (and may even degrade) with the increase in computational budget. Such degradation is obvious between the computational budgets of 900 and 1800 years of simulation, as the average N-S value degrades from 0.68 to 0.63. This degradation is the result of over-fitting, as illustrated in Section 2 and Figure 3. Moreover, in one experiment, model calibration on representative short period (a) failed to even find a parameter set with an N-S value greater than 0.4. The proposed framework with representative short period (a) performed consistently well in all the computational budgets and outperformed the other two approaches in terms of the average performance over the seven calibration experiments. At the largest computational budget considered, the proposed framework with representative short period (a) was able to find a parameter set close to the global optimum.
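The MSE-to-N-S transformation used for Figure 9 follows directly from the definition of the Nash-Sutcliffe coefficient: since N-S = 1 − Σ(obs − sim)² / Σ(obs − mean(obs))², dividing numerator and denominator by the record length gives N-S = 1 − MSE / Var(obs), with Var(obs) the population variance of the observed flows. A minimal sketch with a worked check:

```python
def mse_to_nse(mse, obs_variance):
    """Nash-Sutcliffe coefficient from MSE:
    NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)
        = 1 - MSE / Var(obs),
    where Var(obs) is the population variance of the observed series."""
    return 1.0 - mse / obs_variance

# Worked check on a tiny series.
obs = [3.0, 5.0, 7.0, 9.0]
sim = [2.5, 5.5, 6.5, 9.5]
n = len(obs)
mse = sum((o - s) ** 2 for o, s in zip(obs, sim)) / n          # 0.25
var = sum((o - sum(obs) / n) ** 2 for o in obs) / n            # 5.0
nse = mse_to_nse(mse, var)                                     # 0.95
```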
Figure 9 also demonstrates that representative short period (b) is a near-perfect surrogate of the full period, as model calibration solely on this short period performed consistently well and better than calibration on the full period at all computational budgets. The proposed framework with representative short period (b) also performed very well. In terms of the average N-S values over the seven calibration experiments, the proposed framework worked better than or almost equally as well as the short-period calibration at all computational budgets. Moreover, the range of N-S values obtained by the proposed framework is smaller than the corresponding ranges obtained by the two other approaches at all the computational budgets except the smallest one.

The highest computational budget considered was not sufficient to enable the conventional full-period calibration to approach the global optimum (the maximum N-S value found with this approach is 0.75). Therefore, to further evaluate and compare the performance of this approach, its seven calibration experiments were continued to a total budget of 1000 objective function evaluations (i.e., equivalent to 9,000 years of simulation). The average, best, and worst N-S values found in these seven calibration experiments were 0.79, 0.83, and 0.76, respectively. In comparison, the proposed framework with representative short period (b) at the computational budget of 2,250 years of simulation attained average, best, and worst N-S values of 0.80, 0.84, and 0.77. As such, one may conclude that model calibration on the full period with a budget of 9,000 years of simulation and the proposed framework with a budget of 2,250 years of simulation performed comparably.
4.2. Impact of the Parameter α
To evaluate the impact of the framework parameter α on the calibration results, the seven replicates of the experiment with the proposed framework on representative short period (a) were repeated using an α value of zero. Figure 10 compares the average calibration results with α values of 0, 0.05, and 1 for different computational budgets. Note that for α = 1, the results shown are those of the calibration experiment directly on the full period. For the computational budget of 450 years of simulation, the α value of zero helped the framework make the most of this very limited computational budget, as it outperformed the framework with the α value of 0.05. However, for larger computational budgets, the framework with the α value of 0.05 is more effective. This demonstrates that the optimal α value for any given case study also depends on the available computational budget. Obtaining the optimal α value for a case study may not be computationally feasible; however, any small non-zero α value (say in the range of 0.05 to 0.1) seems to be an appropriate compromise between framework efficiency and robustness.
4.3. Impact of Calibration Error Functions
As demonstrated, the proposed framework is based on the correlations between the model performance on the full period and the performance of the same model on a representative short period. As the model performance may be evaluated through different calibration error functions, the significance of these correlations and the representativeness of a given short period may vary across error functions. Figure 11 shows such variations for three inherently different calibration error functions: mean squared error (MSE), mean absolute error (MAE), and mean log squared error (MLSE). Using the 1,000 randomly sampled parameter sets generated in Section 3.2, linear regression models were fitted to map the performance of the hydrologic model on the possible short periods to the model performance on the full period. For Figure 11a, a window of nine months was moved along the full period to evaluate all possible nine-month short periods. The length of the moving window in Figure 11b was 12 months. As can be seen, the R2 value (i.e., the representativeness) of any given short period differs significantly across error functions. However, Figure 11b suggests that these differences are reduced for longer representative short periods. The MLSE error function, which focuses on low flows, demonstrates a totally different behavior than MSE and MAE, because the model performance on high-flow periods may not be representative of the model performance in reproducing low flows over the full period.
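The three error functions can be written compactly as below. The MLSE definition is our reading of "mean log squared error" as the mean of squared differences of log-transformed flows, which is one common low-flow-emphasizing formulation; the paper does not spell out the formula, so treat this as an assumption.

```python
import math

def mse(obs, sim):
    """Mean squared error: emphasizes large (typically high-flow) errors."""
    return sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs)

def mae(obs, sim):
    """Mean absolute error: weights all errors linearly."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def mlse(obs, sim):
    """Assumed mean log squared error: squared differences of
    log-transformed flows, emphasizing low-flow errors
    (flows must be strictly positive)."""
    return sum((math.log(o) - math.log(s)) ** 2
               for o, s in zip(obs, sim)) / len(obs)
```

Because the log transform compresses high flows and stretches low ones, a short period that is highly representative under MSE can rank poorly under MLSE, which is consistent with the divergent R2 curves in Figure 11.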
Further calibration experiments were conducted to evaluate the proposed framework with the MAE error function. To be consistent with the previous experiments, representative short period (a) with the same sub-periods as shown in Figure 6a was used. Similar to Figure 9, Figure 12 compares the calibration results of seven randomly seeded replicates for the three approaches under different computational budgets when the calibration error function is MAE. As shown, calibration solely on representative short period (a) may fail to identify optimal parameter sets in terms of MAE on the full period. However, the proposed framework enabled with this short period is quite effective and is able to converge to the optimal parameter sets more efficiently than calibration directly on the full period.
4.4. Implications for Parameter Instabilities
Figure 13 shows the final parameter sets optimized with respect to the MAE error function on representative short period (a) and on the full period (through the proposed framework). The parameter values shown are those obtained in the seven replicates at the highest computational budget considered. The parameter ranges are scaled to the range of zero to one. As can be seen, the issue of "non-uniqueness" is significant for the majority of the parameters in this case study (consistent with our previous work with this case study). The parameters SFTMP, SMTMP, and AWC_f appear to be the most sensitive to the calibration period. On the other hand, the parameters GW_DELAY and CN2_f appear to be the most stable. The details of the parameters and their actual ranges can be found in Tolson and Shoemaker [2007b].
5. Relations to Surrogate Modelling
The framework proposed in this work shares certain features with strategies for lower-fidelity, physically based surrogate modelling. Lower-fidelity surrogates are simplified models which directly or conceptually preserve the physical processes in the real-world system. In general, there are many different ways, involving multiple subjective decisions, to develop a lower-fidelity surrogate model for a given case study. For example, in the hydrologic modelling context, a fully distributed hydrologic model may be deemed the highest-fidelity model, while a semi-distributed model or a lumped model may be considered a lower-fidelity model. A low-fidelity hydrologic model may also ignore some physical processes simulated in a high-fidelity hydrologic model. One common challenge with any surrogate modelling strategy is the identification of the best way to develop a surrogate model for a given case study. The other challenge in developing and utilizing some (not all) lower-fidelity surrogate models is how to map their parameter space to the original model parameter space [Bandler et al., 2004; Robinson et al., 2008]. Such a mapping is not always trivial and may introduce a great deal of uncertainty into a calibration experiment.

The proposed framework parallels the primary objective of surrogate modelling, which is to circumvent the computational burden of optimization attempts involving computationally demanding models. However, instead of a lower-fidelity model as the surrogate, the proposed framework utilizes the original, high-fidelity model but runs it over a surrogate calibration period, which is a representative short period of the full available data period. As such, the proposed framework evades the two aforementioned challenges, as it preserves the original model parameters, complexity, and processes simulated.
As outlined in Razavi et al. [2012], there are various methodologies in the literature to make use of surrogate models in model calibration and optimization. In some studies, surrogate models, after being developed, are treated as if they were high-fidelity representations of the real-world systems of interest and fully replace the original models [e.g., McPhee and Yeh, 2008; Siade et al., 2010; Vermeulen et al., 2005; Vermeulen et al., 2006]. Such a strategy can be deemed analogous to the hydrologic model calibration strategy that identifies and solely relies on a representative short data period. There are other studies addressing the discrepancies between surrogate and original models [e.g., Bandler et al., 2004; Cui et al., 2011; Forrester et al., 2007; Robinson et al., 2006]. Similar to the proposed framework, these studies develop strategies to predict the response (e.g., a performance metric) of a high-fidelity model as a function of parameter values and the corresponding response of a surrogate model. However, the performance mapping system developed in the proposed framework is unique in two ways, unlike common surrogate modelling strategies: (1) model parameters are not directly involved in the mapping system as predictors, and (2) the response of the faster-to-run model is disaggregated into multiple factors (i.e., performance over independent events) input into the mapping system to extract the most out of the available response. Importantly, since the mapping system does not involve the model parameters in the approximation procedure (as inputs to the mapping function), the limitations of common surrogate modelling strategies associated with high-dimensional problems (e.g., calibration problems with large numbers of model parameters; see Razavi et al. [2012] for surrogate modelling limitations) may be reduced or diminished.

Unlike the proposed framework, most strategies for surrogate modelling use Design of Experiments (DoE) at the beginning of the search to empirically capture the general form of the underlying function, e.g., a "correction function" representing the difference between the responses of the lower-fidelity and original models [Razavi et al., 2012]. DoE is an essential part of many surrogate modelling strategies and is very helpful for generating an unbiased estimate of the general form of the function, but at relatively large computational cost (e.g., 10-20% of the total computational budget). The proposed framework does not fundamentally need such computationally demanding experiments since, unlike the common correction function approaches used in lower-fidelity surrogate modelling, which are purely empirical [see Razavi et al., 2012 for details], the developed performance mapping system makes use of the hydrological and statistical characteristics of the model responses to establish the relationships.
6. Conclusions
This study proposed a framework for efficient hydrologic model calibration on long data periods. This framework becomes appealing when a hydrologic model of interest is computationally expensive and its run-time on such long periods is a few minutes or more. The framework is based on the fact that there are certain relationships between a hydrologic model's performance on the full-length data and its corresponding performance on a representative short period of data (i.e., surrogate data). By extracting such relationships, a mapping system was developed to approximate the full-period model performance given the model performance on the surrogate data. A calibration framework utilizing this mapping system was proposed, in which the majority of candidate parameter sets are evaluated by running the model on the surrogate data, and the full-length data are only used to evaluate select candidate parameter sets in the course of calibration. Numerical experiments showed that this framework can increase calibration efficiency while remaining robust: within a smaller computational budget, it can lead to calibration solutions close to the optimal solution of the full-period calibration problem. Results also demonstrated that the framework can work very well when the selected surrogate data period has only a moderate level of representativeness (i.e., not the best representative sub-period of a full-length data period), which is typically identifiable by expert knowledge. In general, expert knowledge is necessary to select an appropriate representative sub-period of a full-length data period.

The proposed framework may be categorized under the large umbrella of surrogate modeling, with the major distinction that, unlike common surrogate modeling strategies which utilize a fast-to-run surrogate model of an original high-fidelity model, this framework defines surrogate data and only uses the original high-fidelity model. Future extensions to the proposed framework may incorporate the approximation uncertainty estimate for model performance predictions (i.e., the variance of errors associated with each prediction) that the developed mapping system is capable of producing [see Sacks et al., 1989]. Moreover, the proposed framework can be extended and modified to be used in conjunction with uncertainty-based approaches to hydrologic model calibration, such as GLUE or different Markov chain Monte Carlo techniques. Such extensions may also involve the approximation uncertainty of performance mapping.
Acknowledgements
The authors would like to thank the four anonymous reviewers whose comments significantly helped in improving this paper. The first author was financially supported by an Ontario Graduate Scholarship, which is hereby acknowledged.
References:
Alexandrov, N. M., and R. M. Lewis (2001), An overview of first-order approximation and model management in optimization, Optimization and Engineering, 2, 413-430.
Andreassian, V., C. Perrin, C. Michel, I. Usart-Sanchez, and J. Lavabre (2001), Impact of imperfect rainfall knowledge on the efficiency and the parameters of watershed models, Journal of Hydrology, 250(1-4), 206-223.
Bandler, J. W., Q. S. S. Cheng, S. A. Dakroury, A. S. Mohamed, M. H. Bakr, K. Madsen, and J. Sondergaard (2004), Space mapping: The state of the art, IEEE Transactions on Microwave Theory and Techniques, 52(1), 337-361.
Broad, D. R., H. R. Maier, and G. C. Dandy (2010), Optimal operation of complex water distribution systems using metamodels, Journal of Water Resources Planning and Management-ASCE, 136(4), 433-443.
Cui, T., C. Fox, and M. J. O'Sullivan (2011), Bayesian calibration of a large scale geothermal reservoir model by a new adaptive delayed acceptance Metropolis Hastings algorithm, Water Resources Research, doi:10.1029/2010WR010352, in press.
Duan, Q. Y., S. Sorooshian, and V. Gupta (1992), Effective and efficient global optimization for conceptual rainfall-runoff models, Water Resources Research, 28(4), 1015-1031.
Forrester, A. I. J., A. Sobester, and A. J. Keane (2007), Multi-fidelity optimization via surrogate modelling, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 463(2088), 3251-3269.
Gharari, S., M. Hrachowitz, F. Fenicia, and H. H. G. Savenije (2013), An approach to identify time consistent model parameters: sub-period calibration, Hydrology and Earth System Sciences, 17(1), 149-161.
Gupta, V. K., and S. Sorooshian (1985), The relationship between data and the precision of parameter estimates of hydrologic models, Journal of Hydrology, 81(1-2), 57-77.
Hansen, N., and A. Ostermeier (2001), Completely derandomized self-adaptation in evolution strategies, Evolutionary Computation, 9(2), 159-195.
Johnson, V. M., and L. L. Rogers (2000), Accuracy of neural network approximators in simulation-optimization, Journal of Water Resources Planning and Management-ASCE, 126(2), 48-56.
Juston, J., J. Seibert, and P.-O. Johansson (2009), Temporal sampling strategies and uncertainty in calibrating a conceptual hydrological model for a small boreal catchment, Hydrological Processes, 23(21), 3093-3109.
Klemes, V. (1986), Operational testing of hydrological simulation models, Hydrological Sciences Journal, 31(1), 13-24.
Lophaven, S. N., H. B. Nielsen, and J. Sondergaard (2002), DACE: A MATLAB Kriging toolbox, version 2.0, Technical Report IMM-TR-2002-12, Technical University of Denmark, Copenhagen.
McPhee, J., and W. W. G. Yeh (2008), Groundwater management using model reduction via empirical orthogonal functions, Journal of Water Resources Planning and Management-ASCE, 134(2), 161-170.
Merz, R., J. Parajka, and G. Bloeschl (2011), Time stability of catchment model parameters: Implications for climate impact analyses, Water Resources Research, 47.
Mondal, A., Y. Efendiev, B. Mallick, and A. Datta-Gupta (2010), Bayesian uncertainty quantification for flows in heterogeneous porous media using reversible jump Markov chain Monte Carlo methods, Advances in Water Resources, 33(3), 241-256.
Neitsch, S. L., J. G. Arnold, J. R. Kiniry, and J. R. Williams (2001), Soil and Water Assessment Tool theoretical documentation version 2000: Draft-April 2001, 506 pp., Agric. Res. Serv., U.S. Dep. of Agric., Temple, Tex.
Perrin, C., L. Oudin, V. Andreassian, C. Rojas-Serna, C. Michel, and T. Mathevet (2007), Impact of limited streamflow data on the efficiency and the parameters of rainfall-runoff models, Hydrological Sciences Journal, 52(1), 131-151.
Razavi, S., and S. Araghinejad (2009), Reservoir inflow modeling using temporal neural networks with forgetting factor approach, Water Resources Management, 23(1), 39-55.
Razavi, S., B. A. Tolson, and D. H. Burn (2012), Review of surrogate modeling in water resources, Water Resources Research, 48.
Refsgaard, J. C., and J. Knudsen (1996), Operational validation and intercomparison of different types of hydrological models, Water Resources Research, 32(7), 2189-2202.
Regis, R. G., and C. A. Shoemaker (2007), A stochastic radial basis function method for the global optimization of expensive functions, INFORMS Journal on Computing, 19(4), 497-509.
Robinson, T. D., M. S. Eldred, K. E. Willcox, and R. Haimes (2006), Strategies for multifidelity optimization with variable dimensional hierarchical models, in 47th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA Paper 2006-1819, Newport, Rhode Island, May 1-4.
Robinson, T. D., M. S. Eldred, K. E. Willcox, and R. Haimes (2008), Surrogate-based optimization using multifidelity models with variable parameterization and corrected space mapping, AIAA Journal, 46(11), 2814-2822.
Sacks, J., W. J. Welch, T. J. Mitchell, and H. P. Wynn (1989), Design and analysis of computer experiments, Statistical Science, 4(4), 409-435.
Siade, A. J., M. Putti, and W. W. G. Yeh (2010), Snapshot selection for groundwater model reduction using proper orthogonal decomposition, Water Resources Research, 46.
Singh, S. K., and A. Bardossy (2012), Calibration of hydrological models on hydrologically unusual events, Advances in Water Resources, 38, 81-91.
Tang, Y., P. Reed, and T. Wagener (2006), How effective and efficient are multiobjective evolutionary algorithms at hydrologic model calibration?, Hydrology and Earth System Sciences, 10(2), 289-307.
Thokala, P., and J. R. R. A. Martins (2007), Variable-complexity optimization applied to airfoil design, Engineering Optimization, 39(3), 271-286.
Tolson, B. A., and C. A. Shoemaker (2007a), Cannonsville Reservoir Watershed SWAT2000 model development, calibration and validation, Journal of Hydrology, 337(1-2), 68-86.
Tolson, B. A., and C. A. Shoemaker (2007b), Dynamically dimensioned search algorithm for computationally efficient watershed model calibration, Water Resources Research, 43(1).
Vermeulen, P. T. M., A. W. Heemink, and J. R. Valstar (2005), Inverse modeling of groundwater flow using model reduction, Water Resources Research, 41(6).
Vermeulen, P. T. M., C. B. M. te Stroet, and A. W. Heemink (2006), Model inversion of transient nonlinear groundwater flow models using model reduction, Water Resources Research, 42(9).
Vrugt, J. A., H. V. Gupta, L. A. Bastidas, W. Bouten, and S. Sorooshian (2003), Effective and efficient algorithm for multiobjective optimization of hydrologic models, Water Resources Research, 39(8).
Vrugt, J. A., H. V. Gupta, S. C. Dekker, S. Sorooshian, T. Wagener, and W. Bouten (2006), Application of stochastic parameter optimization to the Sacramento Soil Moisture Accounting Model, Journal of Hydrology, 325(1-4), 288-307.
Wagener, T. (2007), Can we model the hydrological impacts of environmental change?, Hydrological Processes, 21(23), 3233-3236.
Xia, Y. L., Z. L. Yang, C. Jackson, P. L. Stoffa, and M. K. Sen (2004), Impacts of data length on optimal parameter and uncertainty estimation of a land surface model, Journal of Geophysical Research-Atmospheres, 109(D7).
Yan, S., and B. Minsker (2011), Applying dynamic surrogate models in noisy genetic algorithms to optimize groundwater remediation designs, Journal of Water Resources Planning and Management-ASCE, 137(3), 284-292.
Yapo, P. O., H. V. Gupta, and S. Sorooshian (1996), Automatic calibration of conceptual rainfall-runoff models: Sensitivity to calibration data, Journal of Hydrology, 181(1-4), 23-48.
Zhang, H., G. H. Huang, D. Wang, and X. Zhang (2011), Multi-period calibration of a semi-distributed hydrological model based on hydroclimatic clustering, Advances in Water Resources, 34(10), 1292-1303.
698 699
29
700
30
Model Error Function
Full period
Short period
(a) Parameter
701
Model Error Function
Full period Short period
(b) Parameter
702
Model Error Function
Full period Short period
(c)
703
Parameter
704
Figure 1 – Conceptual examples of the performance of a hydrologic model with one parameter
705
running over different short calibration periods out of a long data period (full period) along with
706
the model performance over the full period: (a) a well representative short period, (b) a
707
moderately representative short period, and (c) a poor (misleading) short period
708 709
Figure 2 – Hypothetical relationship examples between the performance of a hydrologic model over a full period and the performance of the same model over (a) a representative and (b) a poor short period. Each point in the scatter plots represents a random parameter set that is evaluated by the model over the different periods. [Axes: error function over the short period versus error function over the full period; panel (b) highlights the extent of discrepancy (uncertainty) between the model performances.]
Figure 3 – A hypothetical example of the optimization convergence trend when minimizing the model error function over a short period of data, along with the corresponding error function values calculated over the full period. [The plot shows the model error function for the full period and the short period against optimization progress (e.g., iteration number).]
Figure 4 – Streamflow time series at Walton gauging station in Cannonsville Reservoir watershed over a nine-year period (01-Jan-1987 to 31-Dec-1995) used for SWAT model calibration. [The hydrograph of daily flow (cms) marks a three-year initialization period, a six-year calibration period, representative short period (a) (9 months, preceded by a 4-month initialization period), and representative short period (b) (12 months, preceded by a 6-month initialization period).]
Figure 5 – Scatter plots of the mean squared errors (MSE) calculated over the full period versus the MSE calculated over the representative short periods for 1000 randomly generated parameter sets. [Fitted lines: y = 0.86x + 185, R² = 0.64 for short period (a); y = 0.98x - 52.7, R² = 0.94 for short period (b).]
Figure 6 – Two different representative short periods of the full period of data and the sub-periods used for performance mapping. [Top panel: representative short period (a), daily flow (cms) from 01-Sep-1989 to 30-Sep-1990, divided into an initialization period and sub-periods 1–5. Bottom panel: representative short period (b), daily flow from 06-Sep-1993 to 05-Mar-1995, divided into an initialization period and sub-periods 1–4.]
Figure 7 – Scatter plots of the actual mean squared error (MSE) values over the full period versus the predicted MSE values for the same data period for 900 randomly generated parameter sets. The mapping (prediction) system is developed based on the model performance results over 100 randomly generated parameter sets (independent of the 900 parameter sets). [Fitted lines: y = 1.02x - 8.57, R² = 0.92 for predictions based on representative short period (a); y = 0.99x - 0.88, R² = 0.96 for predictions based on representative short period (b).]
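The mapping illustrated in Figure 7 can be sketched as a simple linear regression from short-period error to full-period error, fitted on a small sample of parameter sets and then used to predict the full-period MSE for new candidates. The synthetic data, the regression coefficients (borrowed from Figure 5), and the function name below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for model evaluations: short-period MSE for 100
# "training" parameter sets and the corresponding full-period MSE,
# generated here with the slope/intercept reported for short period (b).
mse_short_train = rng.uniform(100.0, 2000.0, size=100)
noise = rng.normal(0.0, 50.0, size=100)
mse_full_train = 0.98 * mse_short_train - 52.7 + noise

# Fit the linear mapping: full-period MSE ~ a * short-period MSE + b.
a, b = np.polyfit(mse_short_train, mse_full_train, deg=1)

def predict_full_mse(mse_short):
    """Approximate the full-period MSE from a short-period MSE."""
    return a * mse_short + b

# Goodness of fit (R^2) of the mapping on the training sample.
pred = predict_full_mse(mse_short_train)
ss_res = np.sum((mse_full_train - pred) ** 2)
ss_tot = np.sum((mse_full_train - mse_full_train.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
```

In the framework, such a mapping lets most candidate parameter sets be screened with only a short-period model run, reserving full-period runs for the few candidates whose predicted full-period performance looks promising.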
[Framework flowchart (Phase 1 and Phase 2): in Phase 1, the optimization algorithm generates candidate parameter sets xi and evaluates them with the hydrologic model over the short period, yielding fs(xi); in Phase 2, the hydrologic model is run over the full period to obtain ff(xi).]