Improving Operational Water Quality Forecasting with Ensemble Data Assimilation

Dong-Jun Seo1,*, Sunghee Kim1, Hamideh Riazi1, Changmin Shin2

1 Department of Civil Engineering, The University of Texas at Arlington, Arlington, Texas
2 Water Quality Control Center, National Institute of Environmental Research, Incheon, Korea
* Email: [email protected]

ABSTRACT

For effective control and management of water quality, real-time forecasting of key water quality variables is necessary. Such forecasting provides water managers with the predictive information necessary to take proactive actions. As in any forecasting, water quality forecasting is subject to various sources of error. Of the errors involved, those in the model initial conditions may be readily reduced, often very significantly, by adjusting or updating them in real time based on the observations available in real time. Such an operation, or data assimilation (DA), keeps the model states in line with what is being observed for improved short-range prediction and hence constitutes an integral component in any real-time environmental forecasting system. Due to the combination of a large number of state variables and sparse observations, however, DA for water quality forecasting is particularly challenging. In this paper, we describe the application and evaluation of maximum likelihood ensemble filter (MLEF)-based DA to the Hydrologic Simulation Program – Fortran (HSPF) model in support of real-time water quality forecasting at the Water Quality Control Center of the National Institute of Environmental Research in Korea.

KEYWORDS: Water quality, forecasting, data assimilation, HSPF.

INTRODUCTION

Skillful water quality forecasts are necessary for effective protection of the ecosystem and water resources. The need for long-term water quality forecasting has been well recognized (Rowe et al. 2013). Statistical models have been developed to forecast nutrient loadings and to help answer water quality-related questions based on land use change, climate change, and anthropogenic practices. To protect public health and water resources in a timely manner from rapidly occurring events such as bacterial contamination or harmful algal blooms, short-range water quality forecasting is also necessary. Due to the complexities and costs associated with real-time operations, however, short-range water quality forecasting is not widely practiced. It is also generally acknowledged that, to provide actionable information consistently, the accuracy of water quality forecasts needs to improve.

Water quality forecasting is subject to parametric and structural uncertainties in hydrologic and biochemical model dynamics, uncertainties in the forcing input, uncertainties in the model initial conditions (Beck 1987), and uncertainties associated with human control of water quantity and quality. Reducing parametric or structural uncertainties by improving calibration or model dynamics is generally expensive as they arise from limited understanding and representation of the physio-biochemical processes involved. Due to the limited predictability of atmospheric processes, reducing uncertainty in the forcing input is a large challenge and is expected to occur incrementally. Reducing uncertainties in the initial conditions (IC) via data assimilation (DA), on the other hand, has been proven very effective in improving the accuracy of short-term forecasts of many environmental variables.

The objective of this paper is to describe the application of DA to the Hydrologic Simulation Program – Fortran (HSPF) model to improve the accuracy of short-range water quality forecasts and to evaluate its performance for the Nakdong River Basin in Korea in support of real-time water quality forecasting at the Water Quality Control Center of the National Institute of Environmental Research (NIER) in Korea.

DATA ASSIMILATION

DA is an objective way of jointly utilizing the actual and model-simulated observations to obtain more accurate estimates of the model ICs. The improved accuracy in the ICs results in improved prediction over the range where the ICs have influence in the memory of the system. DA has gained great popularity in oceanography, atmospheric sciences and hydrology in recent years. The World Meteorological Organization (WMO), for example, identifies DA as an essential technique for accurate flood forecasting (WMO 1992).

Various DA techniques have been used since the 1970s to improve the accuracy of water quality forecasts by reducing uncertainties in the model ICs or parameters (Beck and Young 1976). The most popular choices for the DA techniques used in water quality forecasting have been various types of Kalman filter (Kalman 1960). The extended Kalman filter (EKF) was applied to forecast algal blooms (Mao et al. 2009) and to update the parameters of a dissolved oxygen-chlorophyll model (Pastres et al. 2003). The Kalman filter (KF) (Kalman 1960) was used in the development of a stochastic forecasting system for biochemical oxygen demand (BOD) and dissolved oxygen (DO) (Guo et al. 2003). KF and the ensemble Kalman filter (EnKF) (Evensen 1994) were used to improve predictions of groundwater flow and pollutant transport (Chang and Latif 2011, Jin and Chang 2008). Recently, the particle filter (PF) was used for prediction of suspended sediment load (Leisenring and Moradkhani 2012). Application of KFs appears much more frequently in the literature than that of PF, due presumably to the former's longer history and to the fact that, for high-dimensional problems, the latter may be computationally prohibitively expensive.

Because not all state variables are directly observed and the relationships between certain observations and model states may be highly nonlinear (e.g. streamflow and soil moisture), it is important that DA for water quality forecasting be able to handle nonlinearities in hydrologic and biochemical observation equations in addition to nonlinear model dynamics (Kim et al. 2014). KF is optimal in the second-order sense for linear dynamics and linear observation equations. EKF may be used for nonlinear dynamics, but it may not be stable if they are strongly nonlinear (Miller et al. 1994). While EnKF can handle nonlinear dynamics, it assumes linear observation equations (Evensen 2009). The maximum likelihood ensemble filter (MLEF), on the other hand, can handle nonlinearities in both model dynamics and observation equations (Zupanski 2005). When EnKF and MLEF were comparatively evaluated in assimilating streamflow, mean areal precipitation (MAP) and mean areal potential evaporation (MAPE) into a hydrologic model for updating the model soil moisture states, MLEF outperformed EnKF (Rafieeinasab et al. 2014).

The purpose of this work is to develop, evaluate and implement MLEF-HSPF, an MLEF-based DA module for the Hydrologic Simulation Program – Fortran (HSPF, Bicknell et al. 2001), for real-time water quality forecasting. The host of the resulting module is the open-architecture real-time forecast system, the Water Quality Forecast System at the National Institute of Environmental Research (WQFS-NIER), which is based on the Flood Early Warning System (FEWS, Werner et al. 2013). MLEF-HSPF is currently being implemented operationally at the Water Quality Control Center of the National Institute of Environmental Research (NIER), Korea.
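To make the linear case concrete, the following is a minimal sketch of a single Kalman analysis step for one directly observed state variable. It is not the MLEF-HSPF algorithm; the function name and all numbers are hypothetical and chosen only to illustrate how the update weighs a model-simulated value against an observation according to their error variances.

```python
def kalman_update(x_prior, var_prior, y_obs, var_obs):
    """Scalar Kalman analysis step for a directly observed state (H = 1)."""
    # The gain weighs the observation by the relative size of the two error variances.
    gain = var_prior / (var_prior + var_obs)
    x_post = x_prior + gain * (y_obs - x_prior)
    var_post = (1.0 - gain) * var_prior
    return x_post, var_post


# Hypothetical numbers: model-simulated DO of 6.0 mg/l, observed DO of 8.0 mg/l.
x_post, var_post = kalman_update(x_prior=6.0, var_prior=0.04, y_obs=8.0, var_obs=0.01)
```

Because the assumed observation error variance is smaller than the assumed model error variance, the updated state lands closer to the observation than to the prior. When the observation equation is nonlinear, this closed-form update no longer applies directly, which is the situation that motivates MLEF in the text above.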

WATER QUALITY FORECASTING AT NIER

Before describing MLEF-HSPF, we first describe the real-time water quality forecast process at NIER and the modeling and forecasting environment in which the DA module is to operate. Figure 1 shows the schematic of the overall process for short-range water quality forecasting at NIER. Note that the watershed water quality forecasts aided by MLEF-HSPF provide the Environmental Fluid Dynamics Code (EFDC) with the boundary conditions of the hydrologic and water quality state variables at specific locations. The resulting forecasts are then disseminated to the decision makers in water resources management agencies (http://wqcast.nier.go.kr:8080/).

Figure 1. Short-range water quality forecasting at NIER.

HSPF is a continuous semi-lumped model for simulation of land surface and subsurface hydrology and quality processes (Bicknell et al. 2001). A successor to the Stanford Model developed in the 1960s to model hydrologic processes continuously in time (Crawford and Burges 2004), HSPF simulates hydrology and nutrient loadings in pervious and impervious areas and in-stream processes based on in-land fluxes of individual state variables. HSPF has many state variables of which only a small subset is observed (see Table 1). Due to the semi-lumped nature of the model, however, the total number of the individual model states is generally much larger.

Table 1. HSPF state variables. Modules: PERLND (for pervious land), IMPLND (for impervious land), RCHRES (for in-stream process).

Module   Variable name   Observation data   Definition
PERLND   CEPS            No                 interception storage
PERLND   SURS            No                 surface (overland flow) storage
PERLND   UZS             No                 upper zone storage
PERLND   IFWS            No                 interflow storage
PERLND   LZS             No                 lower zone storage
PERLND   AGWS            No                 active groundwater storage
PERLND   GWVS            No                 index to groundwater slope
PERLND   SQO-NH4         No                 storage of NH4 on the surface
PERLND   SQO-NO3         No                 storage of NO3 on the surface
PERLND   SQO-PO4         No                 storage of PO4 on the surface
PERLND   SQO-BOD         No                 storage of BOD on the surface
IMPLND   RETS            No                 retention storage
IMPLND   SURS            No                 surface (overland flow) storage
IMPLND   SQO-NH4         No                 storage of NH4 on the impervious surface
IMPLND   SQO-NO3         No                 storage of NO3 on the impervious surface
IMPLND   SQO-PO4         No                 storage of PO4 on the impervious surface
IMPLND   SQO-BOD         No                 storage of BOD on the impervious surface
RCHRES   VOL             No                 volume of water in the RCHRES at end of interval
RCHRES   TW              Yes                water temperature
RCHRES   DOX             Yes                dissolved oxygen concentration
RCHRES   BOD             Yes                biochemical oxygen demand concentration
RCHRES   NO3             Yes                dissolved concentration of NO3
RCHRES   TAM             Yes                dissolved concentration of TAM (incl. NH3, NH4)
RCHRES   PO4             Yes                dissolved concentration of PO4
RCHRES   PHYTO           No                 phytoplankton concentration
RCHRES   ORP             No                 organic refractory phosphorus
RCHRES   ORN             No                 organic refractory nitrogen
RCHRES   ORC             No                 organic refractory carbon

In reality, only a small fraction of the HSPF state variables is usually observed. Hence, it is very likely that the model ICs are subject to large uncertainties. If one can prescribe more accurate ICs, one may expect improved skill in the forecasts. The purpose of DA is to obtain more accurate ICs by objectively solving an inverse problem in which the ICs are adjusted, or updated, within the bounds of the model dynamics and uncertainty modeling, such that the model-simulated observations are in line with the actual observations from the very recent past, or the assimilation window. For low-dimensional updating problems in which the number of state variables involved is very small, it may be possible for human forecasters to make expert adjustments of the model states based on visual inspection. For high-dimensional problems, however, such manual updating is not feasible. DA allows objective, automatic updating of the ICs in such problems.
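The inverse problem described above can be illustrated with a toy example. The sketch below assumes a hypothetical one-variable model in which the state decays by a fixed factor each time step, so that the simulated observation at step t is decay**t times the IC. It then finds the IC that best balances a background (prior) estimate against the observations in the assimilation window, in the weighted least-squares sense. The model, function name and numbers are all assumptions for illustration and are far simpler than HSPF.

```python
def update_initial_condition(x_background, var_background, observations, var_obs, decay):
    """Weighted least-squares estimate of x0 for the toy model x[t] = decay**t * x0."""
    # Start with the contribution of the background (prior) estimate of the IC.
    numerator = x_background / var_background
    denominator = 1.0 / var_background
    # Add the contribution of each observation in the assimilation window.
    for t, y in enumerate(observations, start=1):
        h = decay ** t  # maps the IC to the simulated observation at step t
        numerator += h * y / var_obs
        denominator += h * h / var_obs
    return numerator / denominator


# Hypothetical case: the true IC is 8.0, but the background estimate is 2.0
# and is assigned a very large error variance.
window_obs = [4.0, 2.0, 1.0]  # noise-free observations at steps 1, 2 and 3
x0 = update_initial_condition(2.0, 1.0e6, window_obs, var_obs=1.0, decay=0.5)
```

With a highly uncertain background, the updated IC is pulled almost entirely toward the value implied by the observations. For a nonlinear model with many states, no such closed form exists and the minimization has to be carried out numerically.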

MLEF-HSPF

The DA technique used in this work for HSPF is MLEF, which can handle nonlinearity in both model dynamics and observation equations (Zupanski 2005). Based on the experience in Seo et al. (2003), Seo et al. (2009) and Lee et al. (2011, 2012), MLEF-HSPF is formulated as a fixed-lag smoother (Schweppe 1973, Li and Navon 2001) in which one updates the model ICs at the beginning of the assimilation window and the observed boundary conditions (BC) of MAP and MAPE over the assimilation window via two multiplicative adjustment factors.

To implement MLEF-HSPF as a plugin module for the real-time water quality forecasting system, two supporting programs have also been developed: the HSPF Processor and the MLEF-HSPF adapter. The HSPF Processor allows communication between the HSPF model and MLEF-HSPF. The MLEF-HSPF adapter allows communication between MLEF-HSPF and WQFS-NIER. Figure 3 illustrates how the MLEF-HSPF module interfaces with WQFS-NIER as a FEWS plugin. For the HSPF model, MLEF-HSPF executes WinHSPFLight, a component of BASINS 4.0 publicly available at the EPA website (http://water.epa.gov/scitech/datait/models/basins/index.cfm).

Figure 3. Schematic of the MLEF-HSPF module (dotted line box) as a plugin for WQFS-NIER.
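As a rough illustration of the formulation above, the following sketch shows one way the control vector might be organized, with the two multiplicative adjustment factors appended to the model states and then applied uniformly to the MAP and MAPE series over the assimilation window. The function names, the placeholder state values and the forcing values are hypothetical; the actual data structures used in MLEF-HSPF are not described in this paper.

```python
def build_control_vector(states, map_factor=1.0, mape_factor=1.0):
    """Append the two forcing adjustment factors to the list of model states."""
    return list(states) + [map_factor, mape_factor]


def apply_adjustment(map_series, mape_series, control):
    """Scale the forcings over the assimilation window by the last two control entries."""
    map_factor, mape_factor = control[-2], control[-1]
    adjusted_map = [map_factor * p for p in map_series]
    adjusted_mape = [mape_factor * e for e in mape_series]
    return adjusted_map, adjusted_mape


states = [1.0] * 28  # placeholder values standing in for 28 HSPF state variables
control = build_control_vector(states, map_factor=1.2, mape_factor=0.9)
adjusted_map, adjusted_mape = apply_adjustment([0.0, 5.0, 10.0], [0.5, 0.5, 0.5], control)
```

A factor of 1.0 leaves the observed forcing unchanged, so in this sketch the factors start at 1.0 and the DA procedure would move them away from it only when the observations call for it.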

DA in this work addresses uncertainties in the ICs and observed BCs of MAP and MAPE only. When systematic model biases exist, DA is not likely to be very effective as it may adjust the model ICs into unrealistic regions to compensate for the systematic errors. In this work, we implemented a statistical bias correction procedure in the observation equations, which attempts to remove the effects of systematic biases in model-simulated observations. In this way, the DA solution may be found within the dynamic range of the model (Kim et al. 2014).

MLEF-HSPF employs a large number of adaptable parameters to optimize performance and to maximize flexibility (see Table 2). In this work, the ensemble size was set to 9 based on sensitivity and information content analyses (Kim et al. 2014). To initiate ensemble DA, the model ICs are perturbed assuming that they are lognormally distributed with mean given by the unperturbed model ICs and standard deviation given by a fraction of the most recently updated model ICs. The observational errors were specified according to the measurement errors established by the Ministry of Environment in Korea (2011). The size of the assimilation window was set at 7 days to ensure that each assimilation cycle includes all water quality observations made once a week. For additional details, the reader is referred to Kim et al. (2014). For minimization and eigenvalue decomposition, MLEF-HSPF uses conjugate gradient minimization of the Fletcher-Reeves-Polak-Ribiere algorithm (Press et al. 1986) and LAPACK (Linear Algebra PACKage, http://www.netlib.org/lapack), which require installation of publicly available C and Fortran compilers, respectively.

Table 2. MLEF-HSPF parameter settings for the Kumho Catchment.

Description                                                                          Recommended setting
Dynamical model error in one-step transition                                         0.1
Multiplicative scaling factor to the HSPF state variables for the initial perturbation   0.01
Ensemble size                                                                        9
Option for mean daily flow observations                                              0 for Kumho
Assimilation window in hrs                                                           7 x 24
Option for the use of LAPACK solver                                                  1
Tolerance for the stoppage condition for conjugate gradient minimization             10.e+3

Number of HSPF variables to be updated: 28 state variables, 30 control variables
  Pervious (11): Interception storage; Surface (overland flow) storage; Interflow storage; Active groundwater storage; Upper zone storage; Lower zone storage; Index to groundwater slope; BOD, NH4, NO3, PO4
  Impervious (6): Retention storage; Surface (overland flow) storage; BOD, NH4, NO3, PO4
  Reach (11): Volume of water in the reach; Water temperature; Chlorophyll-a, DOX, ORP, ORN, ORC; BOD, NH4, NO3, PO4
  Multiplicative adjustment factors for MAP and MAPE (2)

Observation error variance
  Mean daily flow, (cms)^2                0.1
  Hourly flow, (cms)^2                    0.1
  Instantaneous flow, (cms)^2             1.
  Water temperature, (degrees C)^2        0.1
  NH4, (mg/l)^2, RCH only                 0.001
  NO3, (mg/l)^2                           0.1
  PO4, (mg/l)^2                           0.01
  BOD, (mg/l)^2, RCH only                 0.1
  CHL-a, (ug/l)^2, RCH only               1.
  DO, (mg/l)^2, RCH only                  0.01
  TP, (mg/l)^2, RCH only                  1.
  TN, (mg/l)^2, RCH only                  1.
  TOC, (mg/l)^2, RCH only                 1.
  Hourly precipitation, (mm)^2            1.
  Hourly PE, (mm)^2                       1.
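The initial lognormal perturbation of the ICs described above can be sketched as follows. This is an illustrative reading of that description, not the MLEF-HSPF code: the function name is hypothetical, the standard deviation is assumed to be a fixed fraction of the state value itself, and the guard for non-positive states is an added assumption. The conversion from a desired mean and standard deviation to the parameters of the underlying normal distribution is the standard one for the lognormal distribution.

```python
import math
import random


def perturb_state(value, fraction, ensemble_size, seed=None):
    """Draw lognormal ensemble members with mean `value` and std `fraction * value`."""
    if value <= 0.0:
        # A lognormal variable is strictly positive, so leave non-positive states unperturbed.
        return [value] * ensemble_size
    rng = random.Random(seed)
    # Convert the desired mean and standard deviation into the parameters
    # (mu, sigma) of the underlying normal distribution.
    sigma_sq = math.log(1.0 + fraction ** 2)
    mu = math.log(value) - 0.5 * sigma_sq
    sigma = math.sqrt(sigma_sq)
    return [rng.lognormvariate(mu, sigma) for _ in range(ensemble_size)]


# Ensemble size of 9 and scaling factor of 0.01, as listed in Table 2;
# the state value of 5.0 is a hypothetical example.
members = perturb_state(value=5.0, fraction=0.01, ensemble_size=9, seed=42)
```

Drawing from a lognormal distribution keeps every perturbed state positive, which suits storages and concentrations that cannot physically go below zero.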

EVALUATION AND RESULTS

For evaluation, MLEF-HSPF was run for the one-year period of 2008 for the Kumho Catchment in the Nakdong River Basin, Korea. In this hindcasting experiment, the observed forcings of MAP and MAPE were used instead of the forecast MAP and MAPE. This setup assumes clairvoyance and hence removes uncertainty in the future MAP and MAPE. Accordingly, comparison of DA-aided vs. DA-less predictions evaluates only the additional value of DA.

Figure 4. Kumho Catchment (2000 km2, in grey) in the Nakdong River Basin (23,817 km2, see inset) in Korea.

Comparative performance of DA-aided and DA-less predictions was assessed for streamflow and water quality variables, including water temperature (TW), dissolved oxygen (DO), biochemical oxygen demand (BOD), nitrate (NO3), phosphate (PO4) and chlorophyll a (CHL-a) at the most downstream monitoring station (circled triangle in Figure 4) where observations are available. Figure 5 shows the time series of the DA analysis (in red) for CHL-a in 2008 based on the parameter settings in Table 2. Note that the DA-aided analysis (in red) tracks the verifying observations (in green) much more closely, particularly for the all-important, large CHL-a events, than the DA-less analyses (in black and blue). Figure 6 shows the RMSE of the base simulation (left bars), bias-corrected base simulation (middle bars) and DA-aided and bias-corrected simulation (right bars) for analysis (1st 3 bars), and Day-1 (2nd 3 bars), Day-2 (3rd 3 bars) and Day-3 (4th 3 bars) predictions for BOD, CHL-a, DO, NO3, PO4, TW and instantaneous flow. Note that DA yields significant to substantial reduction in RMSE compared to the base simulation. The reduction is the largest for NO3 and PO4 at about 50%, owing largely to the bias correction component of DA. The second largest reduction in RMSE is observed in CHL-a, TW and flow at about 20% for Day-1 through 3 predictions, except for flow for Day-1 prediction.

Figure 5. DA analysis (BC-DA, red) of CHL-a for 2008. Also shown for comparison are the base simulation (BASE, black), bias-corrected base simulation (BC-Base, blue) and verifying observation (OBS, green).

Figure 6. RMSE of base simulation (left bars), bias-corrected base simulation (middle bars) and DA-aided and bias-corrected simulation (right bars) for analysis (1st 3 bars), and Day-1 (2nd 3 bars), Day-2 (3rd 3 bars) and Day-3 (4th 3 bars) predictions.
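For reference, the RMSE and the percent reduction in RMSE used in the comparison above can be computed as in the following sketch. The function names and the sample values are hypothetical and are not taken from the Kumho Catchment results.

```python
import math


def rmse(simulated, observed):
    """Root mean square error between paired simulated and observed values."""
    squared_errors = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def percent_reduction(rmse_base, rmse_da):
    """Percent reduction in RMSE of the DA-aided run relative to the base run."""
    return 100.0 * (rmse_base - rmse_da) / rmse_base


# Hypothetical verifying observations and two sets of predictions.
obs = [10.0, 12.0, 15.0, 11.0]
base = [8.0, 10.0, 11.0, 9.0]
da_aided = [9.0, 11.0, 13.0, 10.0]
reduction = percent_reduction(rmse(base, obs), rmse(da_aided, obs))
```

A positive value indicates that the DA-aided predictions have a smaller RMSE than the base simulation; a negative value would indicate that DA degraded the predictions.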

Figure 7. Comparison of DA-updated model states with the base-simulated at all reaches in the Kumho Catchment for CHL-a on Jun 10, 2008.

Figure 7 shows an example comparison of the DA-updated model states of CHL-a with the base-simulated states on June 10, 2008, at all reaches of the Kumho Catchment. The red, green and blue dots denote the base-simulated, DA-updated and observed states, respectively. The smaller the control variable number is, the more upstream the reach is. The control variable refers collectively to the state variables and the two multiplicative adjustment factors of MAP and MAPE. Note that DA was able to adjust the model states upward not only at the most downstream location but also through the upper reaches. Due to lack of observations at the interior locations, however, verification of the upstream DA results was not possible. Enhancement and verification of MLEF-HSPF are currently ongoing for multiple catchments in the Nakdong River Basin using observations both at the catchment outlet and interior locations, and the results will be reported in the near future.

CONCLUSIONS AND FUTURE RECOMMENDATIONS

An MLEF-based DA module for HSPF, referred to herein as MLEF-HSPF, has been developed and evaluated for operational implementation in WQFS-NIER in support of real-time water quality forecasting in Korea. The DA technique used is MLEF, which combines the strengths of variational assimilation (VAR) and EnKF. For evaluation, DA-aided and DA-less model simulations were compared for the Kumho Catchment in the Nakdong River Basin in Korea. The results show that MLEF-HSPF adds significant to substantial predictive skill for all observed variables except DO. Reduction in RMSE ranges from 11 to 60% for Day-1 through 3 predictions. The reduction is the largest for NO3 and PO4 at about 47 and 59%, respectively, owing largely to the bias correction component of DA. The second largest reduction is for TW at about 25%. The above results indicate that MLEF handles nonlinear observation equations well, but that correction of model bias is important for DA to be effective.

Examination of the multiplicative adjustment factors for MAP and MAPE estimated by MLEF-HSPF for the Kumho Catchment, and of HSPF-simulated and observed streamflows for other catchments in the Nakdong River Basin, indicates that large uncertainties exist in the observed forcings, which deteriorate the quality of the model ICs. It is recommended that estimation of MAP in WQFS-NIER be carefully examined for possible improvement. Given the various sources of uncertainty in the end-to-end water quality forecast process, one may not expect to consistently produce single-valued forecasts that are sufficiently skillful for proactive decision making. Work is under way to support ensemble water quality forecasting for risk-based decision making. In its support, enhancement of MLEF-HSPF to improve reliability of analysis ensembles is ongoing and the results will be reported in the near future.

ACKNOWLEDGEMENTS

This work is supported by the Water Quality Control Center of the National Institute of Environmental Research, the Republic of Korea, under the Cooperative Study Agreement between Geosystem Research Corporation, Korea, and The University of Texas at Arlington. This support is gratefully acknowledged.

REFERENCES

Beck, M.B., 1987. Water quality modeling: a review of the analysis of uncertainty. Water Resources Research 23 (8), 1393-1442.
Beck, M.B., Young, P.C., 1976. Systematic identification of DO-BOD model structure. J. Env. Eng. Div-ASCE 102, 909-927.
Bicknell, B.R., Imhoff, J.C., Kittle, J.L., Jr., Jobes, T.H., Donigian, A.S., Jr., 2001. Hydrological Simulation Program - Fortran (HSPF): User's Manual for Release 12. U.S. Environmental Protection Agency, National Exposure Research Laboratory, Athens, GA.
Chang, S.-Y., Latif, S., 2011. Use of regional covariance in data assimilation method to improve the estimation accuracy of a three dimensional contaminant transport model. World Environmental and Water Resources Congress 2011, 1118-1126.
Crawford, N.H., Burges, S.J., 2004. History of the Stanford watershed model. Water Resources IMPACT 6 (2), 3-5.
Evensen, G., 1994. Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res. 99, 10143-10162.
Evensen, G., 2009. Data Assimilation: The Ensemble Kalman Filter. Springer, 307 pp.
Guo, H.C., Liu, L., Huang, G.H., 2003. A stochastic water quality forecasting system for the Yiluo River. J. Environmental Informatics 1 (2), 18-32.
Jin, A., Chang, S.Y., 2008. Kalman filter for subsurface transport models with inaccurate parameters and unknown sources. J. Environmental Informatics 12 (1), 37-43.
Kalman, R.E., 1960. A new approach to linear filtering and prediction problems. J. Basic Eng. 82 (1), 35-45.
Kim, S., Seo, D.-J., Riazi, H., Shin, C., 2014. Improving water quality forecasting using HSPF via ensemble data assimilation. Submitted to Special Issue on Ensemble Prediction and Data Assimilation for Operational Hydrology and Water Resources Management, Journal of Hydrology.
Lee, H., Seo, D.J., Koren, V., 2011. Assimilation of streamflow and in situ soil moisture data into operational distributed hydrologic models: Effects of uncertainties in the data and initial model soil moisture states. Adv. Water Resour. 34, 1597-1615.
Lee, H., Seo, D.J., Liu, Y., Koren, V., McKee, P., Corby, R., 2012. Variational assimilation of streamflow into operational distributed hydrologic models: effect of spatiotemporal adjustment scale. Hydrol. Earth Syst. Sci. 16, 2233-2251.
Leisenring, M., Moradkhani, H., 2012. Analyzing the uncertainty of suspended sediment load prediction using sequential data assimilation. Journal of Hydrology 468, 268-282.
Li, Z., Navon, I.M., 2001. Optimality of variational data assimilation and its relationship with the Kalman filter and smoother. Q. J. Roy. Meteor. Soc. 127 (572), 661-683.
Mao, J.Q., Lee, J.H.W., Choi, K.W., 2009. The extended Kalman filter for forecast of algal bloom dynamics. Water Res. 43, 4214-4224.
Miller, R.N., Ghil, M., Gauthiez, F., 1994. Advanced data assimilation in strongly nonlinear dynamical systems. J. Atmos. Sci. 51, 1037-1056.
Pastres, R., Ciavatta, S., Solidoro, C., 2003. The Extended Kalman filter (EKF) as a tool for the assimilation of high frequency water quality data. Ecological Modelling 170, 227-235.
Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T., 1986. Numerical Recipes. Cambridge University Press, 818 pp.
Rafieeinasab, A., Seo, D.-J., Lee, H., Kim, S., 2014. Comparative evaluation of maximum likelihood ensemble filter and ensemble Kalman filter for real-time assimilation of streamflow data into operational hydrologic models. Accepted for publication in Journal of Hydrology.
Rowe, G.L., Jr., Gilliom, R.J., Woodside, M.D., 2013. Tracking and forecasting the Nation's water quality: Priorities and strategies for 2013-2023. U.S. Geological Survey Fact Sheet 2013-3008, 6 p.
Schweppe, F.C., 1973. Uncertain Dynamic Systems. Prentice-Hall, 563 pp.
Seo, D.J., Koren, V., Cajina, N., 2003. Real time variational assimilation of hydrologic and hydrometeorological data into operational hydrologic forecasting. J. Hydrometeorol. 4, 627-641.
Seo, D.J., Cajina, L., Corby, R., Howieson, T., 2009. Automatic state updating for operational streamflow forecasting via variational data assimilation. J. Hydrol. 367, 255-275.
Werner, M., Schellekens, J., Gijsbers, P., van Dijk, M., van den Akker, O., Heynert, K., 2013. The Delft-FEWS flow forecasting system. Environmental Modelling & Software 40, 65-77.
World Meteorological Organization (WMO), 1992. Simulated Real-Time Intercomparison of Hydrological Models. Operational hydrology report (OHR)-38, WMO series pp 779.
Zupanski, M., 2005. Maximum likelihood ensemble filter: theoretical aspects. Mon. Wea. Rev. 133, 1710-1720.