Water Resour Manage (2009) 23:1835–1842 DOI 10.1007/s11269-008-9354-5
Bayesian Rating Curve Inference as a Streamflow Data Quality Assessment Tool Asgeir Petersen-Øverleir · André Soot · Trond Reitan
Received: 17 November 2007 / Accepted: 16 September 2008 / Published online: 3 October 2008 © Springer Science + Business Media B.V. 2008
Abstract A streamflow time-series is normally obtained by transforming a time-series of recorded stage to discharge using an estimated rating curve. The accuracy of this streamflow time-series depends on the characteristics of the available stage-discharge measurements used to fit the rating curve. The Norwegian Water Resources and Energy Directorate (NVE) has developed a method based on rating curve uncertainty for performing objective quality assessment of streamflow time-series. The method, which is based on a Bayesian statistical framework, uses the available stage-discharge measurements and the corresponding stage time-series to derive statistics utilised for a quality assurance of the streamflow time-series. Nearly one thousand streamflow time-series periods have been classified using the method. This paper presents the results.

Keywords Gauging station · Discharge measurement · Rating curve · Streamflow time-series · Quality assurance · Bayesian statistics

1 Introduction

The amount of water passing a hydrometric gauging station, the discharge, is seldom measured directly. In most cases one measures and records the stage. The records of stage are subsequently transformed to discharge by means of an estimated stage-discharge relationship, referred to as the rating curve. The rating curve is in most cases determined by correlating measurements of discharge with the corresponding observations of stage. However, since a discharge “measurement” is only an estimate
A. Petersen-Øverleir (B) · A. Soot · T. Reitan
Hydrology Department, Norwegian Water Resources and Energy Directorate, P.O. Box 5091, Majorstua, 0301 Oslo, Norway

T. Reitan
Department of Mathematics, University of Oslo, P.O. Box 1053, Blindern 0316, Norway
of the true discharge, and the employed rating curve model is only an approximation to the true rating curve, all discharges calculated by a rating curve are subject to uncertainty. There are also other factors that contribute to the imprecision in discharge data, e.g. inaccurate stage measurement, low time-resolution in the stage recording, the fill-in for missing data and the manipulation of stage data affected by ice. These error sources are not considered in this study. Uncertainty in streamflow data is seldom addressed in hydrological and hydraulic modelling. This is mainly due to four factors: (1) scientists and engineers who apply streamflow data for modelling purposes often have little knowledge about hydrometric data production and the associated inaccuracies; (2) the hydrological and hydraulic modelling community is still reluctant to embrace the fact that hydrometric data is subject to uncertainty; (3) hydrometric offices are not, in general, very good at providing information concerning the quality of the collected streamflow data; and (4) there has been little research on objective methods for quantifying uncertainty in discharge data, and the methods which are known are sparsely applied in practice. Many recent studies (e.g. Van der Keur et al. 2008) point out the need for incorporating uncertainty assessment as an essential part of water resources management. Tools for identifying and quantifying imprecision in streamflow data are important components in this process. This paper applies a Bayesian approach to rating curve estimation. The method allows for the calculation of 95% credibility intervals (CI) for single discharges derived from the fitted rating curve. This methodology, in conjunction with the recorded time-series of stage, is then used to quantify the uncertainty (caused by rating curve variability) in the mean annual flood, the mean annual low-flow and the mean annual flow. 
These statistics are finally utilised for quality assuring streamflow time-series. Results from the Norwegian hydrological database Hydra 2, held by the NVE, are presented.

To the knowledge of the authors, no similar study is to be found in the literature, though some related studies do exist. Cole et al. (2003) used flow duration curves for the detection of irregularities in hydrometric data in Northern Ireland. They found that the data quality was suspect at 36 of the 51 stations they checked. Of these 36, 11 were found to require more measurements to improve the rating curves. Mosley and McKerchar (1989) described a quality assurance programme for hydrometric data in New Zealand, but presented no results for the New Zealand network. Hudson et al. (1999) give many useful perspectives and references concerning quality assurance of hydrometric data.
2 Methods

2.1 Bayesian Rating Curve Fitting

The power-law stage-discharge relationship is used by most hydrometric offices as a rating curve model. It is given by

$$
Q = \begin{cases}
A_1 (H + c_1)^{b_1} & \text{if } H \le h_1 \\
A_2 (H + c_2)^{b_2} & \text{if } h_1 \le H \le h_2 \\
\quad \vdots \\
A_n (H + c_n)^{b_n} & \text{if } h_{n-1} \le H
\end{cases}
\tag{1}
$$
where Q is discharge, H is stage, $(A_1, \ldots, A_n, b_1, \ldots, b_n, c_1, \ldots, c_n)$ are rating curve parameters and $(h_1, \ldots, h_{n-1})$ are limits separating the rating curve segments. A rating curve is often segmented as in (1) since the stage-discharge relationship can shift at certain stages, e.g. when a low-flow control is drowned or when the water surface tops the main banks and enters the floodplains. In Norway, as in most other countries, the segmentation limits are decided subjectively and each segment fitted separately. We will therefore proceed by only considering a single rating curve segment. The single-segment logarithmic regression model

$$
\log(Q_i) = a + b \log(H_i + c) + \varepsilon_i
\tag{2}
$$
where log(A) = a, is typically used for fitting the rating curve to measurements. The noise term ε is assumed to be normally distributed with zero expectation and constant standard deviation σ. It must be stated, however, that in some cases the noise will also include violations of model assumptions. These may include: ignoring slope as a determinant when the effects of unsteadiness and/or variable backwater are present and significant; temporal changes in the channel geometry and hydraulic resistance; or a true stage-discharge relationship that is noticeably irregular and not describable by a smooth curve. Some of the above-mentioned phenomena might be classified as non-repeating systematic errors, but if one has no knowledge about them, they will be seen as part of the regression noise. Another factor that can affect the magnitude of ε is inaccurate determination of the concurrent stage value, which can be significant if the water surface near the staff gauge is subject to waves or secondary currents. It is therefore acknowledged that the assumption of a constant relative regression error is not always warranted. The assumption of homoscedastic measurement errors might also be violated if there is a large variation in gauging technique and in the procedure for selecting measuring sites among the field hydrologists who have performed the measurements at the station.

In most cases, the frequentist least squares approach is applied for estimating θ = (a, b, c). However, this approach does not allow the use of prior knowledge about the hydraulic characteristics of the channel section containing the gauging station. Moreover, frequentist methods can give infinite parameter estimates (Reitan and Petersen-Øverleir 2006). To overcome these weaknesses, Bayesian methods can be invoked.
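The frequentist fit of model (2) mentioned above can be illustrated with a minimal sketch. It assumes a grid search over c with ordinary least squares for a and b at each candidate value, natural logarithms, and synthetic measurements; the function name `fit_rating_curve_ls` and the grid are illustrative choices, not the procedure used at NVE.

```python
import numpy as np

def fit_rating_curve_ls(H, Q, c_grid):
    """Profile least squares for log(Q) = a + b*log(H + c) + eps.

    For each candidate c, a and b follow from ordinary least squares;
    the c with the smallest residual sum of squares is kept.
    """
    H = np.asarray(H, dtype=float)
    y = np.log(np.asarray(Q, dtype=float))
    best = None
    for c in c_grid:
        if np.any(H + c <= 0):           # logarithm undefined: skip this c
            continue
        X = np.column_stack([np.ones_like(H), np.log(H + c)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        if best is None or rss < best[0]:
            best = (rss, coef[0], coef[1], c)
    rss, a, b, c = best
    sigma = np.sqrt(rss / (len(H) - 3))  # three fitted parameters
    return a, b, c, sigma

# Synthetic, noise-free measurements generated with a = 1.0, b = 2.0, c = 0.5
H_obs = np.array([0.2, 0.5, 1.0, 1.5, 2.0, 3.0])
Q_obs = np.exp(1.0) * (H_obs + 0.5) ** 2.0
c_grid = np.round(np.arange(-0.1, 1.51, 0.01), 2)
a_hat, b_hat, c_hat, sigma_hat = fit_rating_curve_ls(H_obs, Q_obs, c_grid)
```

With noisy or sparse measurements the residual sum of squares can keep decreasing towards the edge of the grid, which is one way to see the estimation problem that motivates the Bayesian treatment.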
Bayesian rating curve fitting consists of three steps: (1) the definition of prior distributions for the rating curve parameters, including the noise standard deviation; (2) updating the prior distribution with data (stage-discharge measurements), i.e. calculating the posterior distribution; and (3) the derivation of parameter estimates and uncertainty measures using the posterior distributions. Reitan and Petersen-Øverleir (2008) described a detailed framework for performing these three steps. Formally, the method used for the present study starts with defining the prior distribution for the vector of parameters θ, which now includes σ. That is

$$
\pi_0(\theta) = \pi_0(a, b \mid \sigma)\, \pi_0(c)\, \pi_0(\sigma)
\tag{3}
$$
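The three steps can be sketched in simplified form as follows. This is not the framework of Reitan and Petersen-Øverleir (2008): it assumes that σ is held fixed instead of being given an inverse gamma prior, that natural logarithms are used, that the prior covariance of (a, b) does not scale with σ, and that c lies on a short grid with a flat prior. The prior means, standard deviations and correlation for a and b are the values stated in this section; the function names and the synthetic data are hypothetical.

```python
import numpy as np

# Prior for (a, b): means, standard deviations and correlation as stated in the paper
M0 = np.array([2.84, 2.46])
SD = np.array([1.38, 0.92])
RHO = -0.42
S0 = np.array([[SD[0] ** 2, RHO * SD[0] * SD[1]],
               [RHO * SD[0] * SD[1], SD[1] ** 2]])

def posterior_on_c_grid(H, Q, c_grid, sigma):
    """Steps 1-2: conjugate update of (a, b) for each c, with sigma fixed (assumption).

    Returns posterior weights for c (flat prior on the grid) and, for each c,
    the Gaussian posterior mean and covariance of (a, b).
    """
    y = np.log(Q)
    n = len(y)
    S0_inv = np.linalg.inv(S0)
    log_ml, means, covs = [], [], []
    for c in c_grid:
        X = np.column_stack([np.ones(n), np.log(H + c)])
        Sn = np.linalg.inv(S0_inv + X.T @ X / sigma ** 2)
        mn = Sn @ (S0_inv @ M0 + X.T @ y / sigma ** 2)
        # Marginal likelihood of y given c: N(X M0, sigma^2 I + X S0 X^T)
        C = sigma ** 2 * np.eye(n) + X @ S0 @ X.T
        r = y - X @ M0
        _, logdet = np.linalg.slogdet(C)
        log_ml.append(-0.5 * (n * np.log(2 * np.pi) + logdet + r @ np.linalg.solve(C, r)))
        means.append(mn)
        covs.append(Sn)
    log_ml = np.array(log_ml)
    w = np.exp(log_ml - log_ml.max())    # subtract the maximum to avoid underflow
    return w / w.sum(), np.array(means), np.array(covs)

def discharge_interval(h, c_grid, w, means, covs, n_draws=4000, seed=0):
    """Step 3: posterior median and 95% interval of the rating curve at stage h.

    Obtained by simulation; measurement noise is not added to the draws.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(c_grid), size=n_draws, p=w)
    logq = np.empty(n_draws)
    for k, i in enumerate(idx):
        a, b = rng.multivariate_normal(means[i], covs[i])
        logq[k] = a + b * np.log(h + c_grid[i])
    lo, med, hi = np.exp(np.percentile(logq, [2.5, 50, 97.5]))
    return lo, med, hi

# Synthetic measurements from a = 3.0, b = 2.2, c = 0.3 with log-normal noise
rng = np.random.default_rng(1)
H = np.linspace(0.4, 3.0, 15)
Q = np.exp(3.0 + 2.2 * np.log(H + 0.3) + rng.normal(0.0, 0.05, H.size))
c_grid = -H.min() + np.arange(0.01, 2.0, 0.01)   # keeps H + c positive
w, means, covs = posterior_on_c_grid(H, Q, c_grid, sigma=0.05)
lo, med, hi = discharge_interval(1.5, c_grid, w, means, covs)
```

Fixing σ keeps the update for (a, b) in closed form for every c; treating σ as unknown, as the paper does, would replace the Gaussian posterior by its normal-inverse-gamma counterpart.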
Note that (3) takes into account that the cease-to-flow stage parameter −c and the measurement noise parameter σ do not depend on the other parameters. This is intuitive since −c is associated with the lowest point of the hydraulic control, and the magnitude of σ is assumed to be mainly decided by the accuracy of the discharge measurements performed in the river reach near the station. The parameters a and b are dependent and thus given a joint distribution. The parameter a is a function of hydraulic resistance, slope, channel scale and channel shape, while b is decided by channel shape (Petersen-Øverleir 2005). This study applies $\pi_0(a) \sim N(2.84, 1.38^2)$ and $\pi_0(b) \sim N(2.46, 0.92^2)$, given E[σ]. The correlation between a and b is set to −0.42. This framework is then used to define a bivariate normal prior distribution $\pi_0(a, b \mid \sigma)$. The parameters of the two prior distributions are based on the NVE archive of quality assured rating curve parameters estimated by manual or frequentist methods. $\pi_0(\sigma)$ is an inverse gamma distribution with 95% of the mass between 0.02 and 0.15. These limits are intended to reflect discharge measurements of high and poor accuracy, respectively. The parameter c is discretised with 0.01 m increments within the interval $[-H_{\min}, -H_{\min} + 100]$, where $H_{\min}$ is the lowest stage in the available stage-discharge measurements. Note that this implies that c is given a flat prior distribution over the interval.

Fig. 1 Example of an estimated rating curve (solid line) from the river Trysil at Nybergsund, Norway. Crosses are the measurements and dotted lines represent 95% CI. Axes: stage (m) and discharge (m3/s)

The posterior distribution for the rating curve parameters, formally written as

$$
\pi(\theta) = \frac{L(\theta \mid D)\, \pi_0(\theta)}{\int_{-\infty}^{\infty} L(\theta \mid D)\, \pi_0(\theta)\, d\theta}
\tag{4}
$$
where D represents the available stage-discharge measurements and L the likelihood function, can thus be derived by analytical methods. Furthermore, from π(θ) the posterior distribution for any discharge for a given stage can be derived, which in turn allows for the calculation of the corresponding 95% CI. Figure 1 shows an example of a fitted rating curve with 95% CI.

2.2 Measures for Streamflow Time-Series Quality

Primarily, one uses daily mean discharges for analysis in Norway. The Hydra 2 database stores stage values corresponding to the daily mean discharges. This means that one first uses the fine-resolution time-series of stage to derive a fine-resolution time-series of discharge using the official rating curves (which are derived by manual frequentist methods). One then computes the series of daily mean discharges, which in turn are stored as (virtual) daily mean stage values in the database by applying the rating curve once again, but in an inverted form.
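The conversion to a virtual daily mean stage can be illustrated with a short sketch, assuming a single-segment power-law curve. The parameter values and the six stage readings are made up for illustration, and the function names are hypothetical.

```python
import numpy as np

def rating(H, a, b, c):
    """Single-segment power law: Q = exp(a) * (H + c)**b."""
    return np.exp(a) * (H + c) ** b

def inverse_rating(Q, a, b, c):
    """Stage that the rating curve maps to discharge Q."""
    return (Q / np.exp(a)) ** (1.0 / b) - c

a, b, c = 3.0, 2.2, 0.3                                  # illustrative parameters
stage = np.array([0.50, 0.55, 0.70, 1.20, 0.90, 0.60])   # stages within one day (m)
q_daily = rating(stage, a, b, c).mean()                  # daily mean discharge
h_virtual = inverse_rating(q_daily, a, b, c)             # stored (virtual) daily mean stage
```

Because the curve is convex for b > 1, the virtual stage lies above the arithmetic mean of the recorded stages whenever the stage varies during the day, which is why it is stored as a separate quantity.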
From the time-series of daily mean stage values we selected the annual maximum and minimum values, and calculated the mean of each of these series. These two values are from here onwards referred to as the mean high-flow and mean low-flow stage values, and denoted $H_{HF}$ and $H_{LF}$. Using these stage values, the mean high-flow and low-flow discharges, denoted $\hat{Q}(H_{HF})$ and $\hat{Q}(H_{LF})$, were estimated using a rating curve fitted by the Bayesian method presented in Section 2.1. The associated upper and lower 95% CI, denoted $\hat{Q}^{95}_{U}(\cdot)$ and $\hat{Q}^{95}_{L}(\cdot)$, were also computed. We then calculated the relative uncertainty for high-flow and low-flow in per cent as

$$
\frac{\hat{Q}^{95}_{U}(H_{HF}) - \hat{Q}^{95}_{L}(H_{HF})}{\hat{Q}(H_{HF})} \times 100
\tag{5a}
$$

$$
\frac{\hat{Q}^{95}_{U}(H_{LF}) - \hat{Q}^{95}_{L}(H_{LF})}{\hat{Q}(H_{LF})} \times 100
\tag{5b}
$$
For the assessment of the mean-flow, the average of the 95% CI of the discharges between the 25% and 75% percentiles, $H_{0.25}$ and $H_{0.75}$, of the corresponding daily mean stage values was considered. The relative uncertainty for mean-flow is then given by:

$$
\int_{H_{0.25}}^{H_{0.75}} \frac{\hat{Q}^{95}_{U}(H) - \hat{Q}^{95}_{L}(H)}{\hat{Q}(H)} \, dH \times 100
\tag{6}
$$
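The measures in (5a), (5b) and (6) can be sketched as below. The paper does not spell out the summation used for (6); this sketch assumes one possible reading, namely an average of the relative interval width over the daily mean stage values lying between the two percentiles. The function names and the toy curve with fixed interval bounds are hypothetical.

```python
import numpy as np

def mean_extreme_stages(stage_by_year):
    """H_HF and H_LF: means of the annual maximum and minimum daily mean stages."""
    h_hf = float(np.mean([np.max(s) for s in stage_by_year]))
    h_lf = float(np.mean([np.min(s) for s in stage_by_year]))
    return h_hf, h_lf

def relative_uncertainty(q_hat, q_lower, q_upper):
    """Width of the 95% interval relative to the estimate, in per cent (5a, 5b)."""
    return (q_upper - q_lower) / q_hat * 100.0

def mean_flow_uncertainty(daily_stage, curve):
    """Discrete stand-in for (6): average relative uncertainty over the daily
    mean stages between the 25% and 75% percentiles (assumed reading).

    `curve(h)` must return (q_hat, q_lower, q_upper) for an array of stages.
    """
    h = np.asarray(daily_stage, dtype=float)
    h25, h75 = np.percentile(h, [25, 75])
    mid = h[(h >= h25) & (h <= h75)]
    return float(np.mean(relative_uncertainty(*curve(mid))))

def toy_curve(h):
    """Illustrative curve whose interval is a fixed fraction of the estimate."""
    q = np.exp(3.0) * (h + 0.3) ** 2.2
    return q, 0.9 * q, 1.15 * q

stages = np.linspace(0.5, 2.5, 365)
u_mean = mean_flow_uncertainty(stages, toy_curve)   # close to 25 for this toy curve
```

Averaging over the recorded stage values weights each part of the curve by how often it is used, whereas a plain integral over stage would weight all stages in the interquartile range equally.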
This formula was approximated by a summation. Summarising, the expressions in (5a, 5b, 6) can be interpreted as the average relative high-flow, low-flow and mean-flow uncertainty, all at the 95% level of credibility.

Using this methodology, a simple system of quality classes has been developed and is used internally by the hydrological department at NVE. The classes are as follows: relative uncertainty ranges of 0–9%, 10–19%, 20–39%, 40–79% and >80% are classified as very good, good, average, poor and very poor, respectively. It is furthermore intended that this classification system shall be routinely used to communicate data quality to external users of streamflow data from the Hydra 2 database. Clearly, the classification system provides only a rough and commercialised uncertainty measure, so that scientists and engineers can quickly draw conclusions about the applicability of a particular streamflow time-series. Advanced types of data uncertainty assessment, e.g. risk analysis or studies on the propagation of data imprecision in subsequent hydrological modelling (e.g. Vázquez et al. 2008), would require more accurate measures amenable to statistical analysis. In such cases it would be better to communicate the actual posterior distributions or the credibility intervals.

It was mentioned in Section 2.1 that a part of the stage-discharge measurement scatter might be caused by model errors and temporal channel changes due to scour, deposition and vegetation growth. Concerning the latter, changes in Norwegian rivers typically affect the stage-discharge relationship with a magnitude smaller than the discharge measurement uncertainty, so that instability may be ignored. When significant changes are encountered, one attempts to segment the data into time periods for which the control can be considered stable. This procedure is in many cases not capable of completely rectifying the systematic error caused by channel instability.
It is therefore worth mentioning that the quality assurance framework
presented in this paper could be extended to include residual tests for detecting systematic errors caused by model error and temporal channel changes.
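The quality classes of Section 2.2 can be expressed as a small lookup. The published ranges leave gaps (for example between 9% and 10%, and at exactly 80%), so this sketch assumes class boundaries at 10, 20, 40 and 80% with the lower bound inclusive; that convention and the function name are assumptions, not taken from the paper.

```python
def quality_class(rel_uncertainty_pct):
    """Map a relative uncertainty (per cent) to a quality class label.

    Boundaries at 10, 20, 40 and 80 per cent are an assumed convention;
    None stands for a series that could not be assessed (NA).
    """
    if rel_uncertainty_pct is None:
        return "NA"
    bounds = [(10.0, "very good"), (20.0, "good"), (40.0, "average"), (80.0, "poor")]
    for upper, label in bounds:
        if rel_uncertainty_pct < upper:
            return label
    return "very poor"

labels = [quality_class(u) for u in (5.0, 12.5, 39.9, 40.0, 95.0, None)]
```

Applied to the three measures of a series, the function yields one label each for low-flow, mean-flow and high-flow.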
3 Results and Discussion
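Fig. 2 Results from the quality classification of the 941 streamflow time-series taken from the Norwegian Hydra 2 database. Vertical axis: proportion of the 941 series (%). Labels recovered from the plot: relative uncertainty classes 0–9%, 10–19%, 20–39%, 40–79% and >80%, plus NA; series for low-flow, mean-flow and high-flow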
NVE is currently developing a low-flow map of Norway. Engeland et al. (2006) present details and preliminary results from the project. Streamflow data from virtually all the gauging stations selected as a base for the low-flow map were quality assured using the methodology outlined in Section 2. Data from a total of 581 stations were checked; several of these had been closed down for some time. It is believed that the selected stations are representative of the Norwegian network at the present date. Several series were impossible to quality assure, mostly as a consequence of an incomplete record of stage-discharge measurements. These series are labelled NA in Fig. 2, which shows the results. Improper segmentation and other inadequacies concerning the fitted rating curves were also frequently detected, so that new rating curves had to be constructed before carrying out quality assurance. Note that many of the stations had more than one rating curve period due to hydraulic changes. Each period had to be considered separately since it is linked to a unique rating curve. Hence there are more streamflow time-series than stations in this study.

WMO (1994) and ISO 9002 of the International Organization for Standardization recommend, respectively, that the discharge uncertainty (at the 95% confidence level) is within 5% or 8%. The results in Fig. 2 reveal that in general the Hydra 2 database fails to meet these criteria, in many cases by a wide margin. For low-flow, only just over one third of the checked periods have a relative uncertainty less than 40%. On the other hand, for mean-flow almost three-quarters of the checked periods have a relative uncertainty less than 40%.

The difference in uncertainty between low-flow and mean-flow has two explanations. Firstly, low-flow measurements are difficult to perform accurately. Hydro-acoustic instruments and current meters are known to be inaccurate in shallow and low-velocity flow situations.
Also, cross-sectional irregularities come out strongly in shallow depths, making flow area calculations less precise. Small and semi-random channel changes not detectable by a residual trend analysis might also significantly inflate the variability of the lower part of the rating curve. Secondly, and perhaps
most important, the extreme parts of the discharge range, and especially the low-flow, have historically been of little interest in Norway. The evolution of the Norwegian hydrometric network is closely linked to the development of hydro-electric power, where the main purpose was to investigate the mean-flow characteristics. However, low-flow analysis and the mitigation of droughts are currently receiving more attention in Norway due to ecological, economic and climatic issues. Flood management has also become more important recently, especially after a major flood in the river Glomma (the largest river in Norway) in 1995. In addition, stricter legislation concerning the safety of dams and hydraulic structures has been introduced over the years. These factors have resulted in more focus on flood analysis, which explains why the relative high-flow uncertainty is less than 40% for as many as half of the periods. However, taking into account that many of the time-series records are relatively short, a considerable proportion of Norwegian gauging stations lack an adequate database for flood frequency analysis.

Finally, it is important to note that in this study we apply daily mean values. Proper flood analysis often requires the application of peak values. If instantaneous stage values were to be considered instead of daily mean values, the results of this study for high-flows would have been poorer, since the instantaneous flood stage peak is often significantly higher than the daily mean value.
4 Concluding Remarks

This paper presents a Bayesian framework for assigning quality to streamflow time-series collected at gauging stations. Uncertainty due to rating curve variability and statistics from the recorded stage time-series are used to classify the quality of the streamflow time-series with regard to low-, mean- and high-flow. Results from 941 Norwegian time-series are presented. The results indicate that a large proportion of Norwegian streamflow datasets are affected by considerable uncertainty, especially for low-flow. It is not known whether the trends revealed by this study are specific to Norway, since the literature on similar studies is very scant. Based on personal communication with foreign hydrometric offices, it is believed that similar results would be found in several other developed countries.

The challenges addressed by the results in Fig. 2 can only be resolved by better hydrometric practice. The Norwegian network comprises around 500 gauging stations with only nine teams of two persons for operating them. Clearly, this is a very small number of hydrometric personnel, resulting on average in between one and two discharge measurements performed per station annually, which is a low gauging frequency. This paper presents a framework for determining an optimal way to operate a network of hydrometric gauging stations with respect to minimising rating curve uncertainty, i.e. deciding at which stations more low-flow, mean-flow or high-flow measurements are needed. It has therefore been applied as a tool for the field hydrologists at NVE. The effects have not been accurately assessed because of its recent introduction. In any case, due to the personnel constraint, it is believed that an enhancement of the quality of streamflow data in Norway within a reasonable time would require a policy decision to increase the number of field hydrologists and/or reduce the number of stations.
References

Cole RAJ, Johnston HT, Robinson DJ (2003) The use of flow duration curves as a data quality tool. Hydrol Sci J 48:939–951. doi:10.1623/hysj.48.6.939.51419
Engeland K, Beldring S, Hisdal H (2006) A comparison of low flow estimates in ungauged catchments using regional regression and the HBV-model. NVE report 01-06. Norwegian Water Resources and Energy Directorate, Norway
Hudson HR, McMillan DA, Pearson CP (1999) Quality assurance in hydrological measurements. Hydrol Sci J 44:825–834
Mosley MP, McKerchar AI (1989) Quality assurance programme for hydrometric data in New Zealand. Hydrol Sci J 24:185–202
Petersen-Øverleir A (2005) A hydraulic perspective on the power-law rating curve. NVE report 05-05. Norwegian Water Resources and Energy Directorate, Norway
Reitan T, Petersen-Øverleir A (2006) Existence of the frequentistic regression estimate of a power-law with location parameter, with applications for making discharge rating curves. Stoch Environ Res Risk Assess 20:445–453. doi:10.1007/s00477-006-0037-6
Reitan T, Petersen-Øverleir A (2008) Bayesian power-law regression with a location parameter, with applications for construction of discharge rating curves. Stoch Environ Res Risk Assess 22:351–365
Van der Keur P, Henriksen HJ, Refsgaard JC, Brugnach M, Pahl-Wostl C, Dewulf A, Buiteveld H (2008) Identification of major sources of uncertainty in current IWRM practice. Illustrated for the Rhine basin. Water Resour Manage. doi:10.1007/s11269-008-9248-6
Vázquez RF, Beven K, Feyen J (2008) GLUE based assessment on the overall predictions of a MIKE SHE application. Water Resour Manage. doi:10.1007/s11269-008-9329-6
World Meteorological Organization (WMO) (1994) Guide to hydrological practices. Fifth ed. WMO-No. 168. World Meteorological Organization, Geneva