Calibration and Reliability in Groundwater Modelling (Proceedings of the ModelCARE 96 Conference held at Golden, Colorado, September 1996). IAHS Publ. no. 237, 1996.

From calibration on tracer test data to computation of protection zones: upscaling difficulties in a deterministic modelling framework

ALAIN DASSARGUES, SERGE BROUYERE & JOHAN DEROUANE
Laboratory of Engineering Geology, Hydrogeology and Geophysical Prospecting, University of Liège, Bat. B19, B-4000 Liège, Belgium

Abstract In order to compute groundwater protection zones accurately for the local hydrogeological conditions of each site, a methodology involving in situ tracer tests and flow-transport numerical simulations is commonly used. In the modelling approach, the parameters describing the aquifer (hydraulic conductivity, effective porosity, dispersivities, etc.) are chosen as "equivalent" and depth-averaged values on Representative Elementary Volumes (REV). The difficulty consists in finding a good agreement between the heterogeneous reality of the aquifer and its representation using this REV concept. In each case, calibration of the flow model on measured piezometric maps and calibration of the transport model on observed breakthrough curves (for each tracer) allow the deduction of local parameters of the aquifer. This determination has an advantage over classical analytical interpretations, as the aquifer heterogeneity is taken into account for flow and transport parameters. The main problem then consists in obtaining meaningful parameters at the scale of the volume concerned by the pollutant transfer times that define the protection zones (e.g. 50 days). An upscaling procedure is often necessary, and it strongly influences the expected reliability of the protection zone computations. Results are shown from two case studies, where a manual upscaling technique is chosen, based on deterministic interpretations of the geological information.

INTRODUCTION

One of the biggest challenges in groundwater modelling is describing the effects of spatial variability on the values of the parameters to be entered in models. Quantifying aquifer heterogeneity is critical to understand the movement of contaminants, to study the possibility of their removal, to delineate protection zones around wells or water catchments, etc.
In practical cases, how can we adequately characterize the variability of aquifer properties in order to take it into account in numerical simulations of groundwater flow and transport? Despite important theoretical and practical progress in recent years, it is still very difficult to assess this variability statistically on the basis of information derived from common geological environments. As mentioned by Anderson (1995), "numerous
theoretical papers have been published based on a stochastic description of aquifer heterogeneity (Neuman, 1982; Sudicky & Huyakorn, 1991; Yeh, 1992) but the central question of whether the stochastic method, which treats aquifer heterogeneity as a random field, is applicable to real aquifers under field conditions, has not been definitively answered". Moreover, in fractured media, the question is more complex (Wang, 1991) and very few field data sets are suitable for this kind of study. Even in non-fractured media, in practical situations, the hydrogeologist is often faced with the following dilemma:
(a) If many and varied geological and hydrogeological data are available in the studied domain, a very detailed geological interpretation is possible with a reasonable but unquantified error. The measured parameters can then be correlated or extrapolated consistently with this geological interpretation. A statistical approach can be tried to describe the aquifer heterogeneity, but it must include a large amount of "soft" data from the geology in order to condition the system (conditional simulations, indicator geostatistics, etc.).
(b) If only a few data are available, neither a statistical approach nor an accurate geological interpretation can be made.

Up to now, and bearing in mind the need to obtain practical results in terms of effective transfer time of contaminants, a complete methodology has been proposed to water suppliers for studying protection zones around pumping wells (Dassargues, 1995).
This methodology includes the following steps:
(a) a characterization of the geological and hydrogeological conditions based on a complete set of data related to geology, hydrogeology, hydrology, morphostructural geology and shallow geophysical prospecting;
(b) experimental tracer tests with artificial tracers;
(c) modelling of the groundwater flow conditions with calibration on the measured piezometric maps with and without pumping;
(d) modelling the transport of a dissolved contaminant with calibration on the measured breakthrough curves;
(e) simulations of contaminant transport using the calibrated hydrodispersive parameters, with contaminant injections at different places in order to compute the contaminant arrival time at the pumping well; and
(f) delineation of the protection zones on the basis of the computed times, in accordance with the local regulations.

This complete methodology is applied entirely in a deterministic framework. According to the local regulations in Belgium, two main protection zones are to be defined on the basis of the contaminant travel time in the saturated zone: "zone IIa" corresponding to 1 day, and "zone IIb" corresponding to 50 days. For determination of the 1-day isochrone lines, no major extrapolation of the groundwater flow and transport parameters is needed: we are working at the same scale as during the calibration of the model on the data of the in situ tracer and pumping tests. The main problems arise when upscaling of the flow and transport parameters is needed for determination of the 50-day isochrone lines. The upscaling procedure chosen here is based on the effective integration of all the information obtained in the study area, by interpretation of lithological, morphostructural, geophysical and hydrological data.
This is a purely deterministic upscaling procedure: it assumes that the zones (at the scale of the chosen REV) where preferential flow paths occur can be detected deterministically, and with sufficient spatial resolution, by interpretation of all the geological, geophysical and hydrological surveys. Two case studies are described below, where such a deterministic upscaling was applied. The studied pumping sites are located in an alluvial aquifer and in a fissured limestone aquifer, respectively.

DETERMINISTIC UPSCALING IN AN ALLUVIAL AQUIFER

Lithological, geophysical and hydrological data

Four pumping wells, located in the alluvial plain of the River Meuse downstream of the city of Liège (point 1 in Fig. 1), provide about 8000 m³ day⁻¹ of drinking water. In addition to seven existing piezometers, 10 new boreholes were drilled for the purpose of the protection zone study. The lithological information provided by boreholes, added to data from the interpretation of many penetration tests (CPT) and from shallow geophysical surveys (electric and seismic sounding methods), has led to an accurate definition of (a) the setting and lithology of the geological layers, (b) the geometrical configuration of the aquifer, and (c) the heterogeneity and the limits of its vertical and lateral extension (Fig. 2). The alluvial plain of the River Meuse is characterized by a fluvial sedimentation composed of gravels (average thickness of about 7 m) mixed in a sandy, silty or clayey matrix. High spatial variations in the proportion and composition of the matrix have been detected in the geophysical results and confirmed by the drilling logs. The lateral as well as vertical heterogeneity of the loose fluvial deposits reflects the geomorphological evolution of the course of the River Meuse in the studied area (Calembert, 1964). The shale and sandstone bedrock of Primary age is characteristic of the substratum of the River Meuse valley in that region, and can be considered as the impervious bottom of the alluvial aquifer (Fig. 2). The water in the unconfined aquifer flows in a northerly direction, with a 0.075% average gradient outside the direct pumping catchment area.

Fig. 1 Location of the studied sites in Belgium (map legend: Carboniferous limestones).

Fig. 2 One of the transverse cross-sections in the alluvial deposits, in the River Meuse valley (legend: fill, colluvium, gravel, Carboniferous shales and sandstone).

Local parameters deduced from measurements and model calibrations

The spatial variability of the hydraulic conductivity values in the gravel sediments has been previously studied at a regional scale (Dassargues, 1992). More locally, around the pumping wells, the first values of the hydrodynamic parameters can be obtained from classical interpretation of several pumping tests completed in each production well and piezometer: transmissivity values range from 1 × 10⁻⁴ to 2 × 10⁻¹ m² s⁻¹, from zones with a high clay content to zones where the gravels are well sorted. An averaged storage coefficient of 0.10 has been estimated analytically using the Theis method. Using the Dupuit solution in homogeneous and steady-state conditions, an averaged radius of influence for the wells has been estimated at 500 m, with extreme values ranging from 230 to 810 m. Given the strong assumptions under which the Theis and Dupuit expressions are valid, all these values can be considered as first estimates only.

To obtain the local hydrodispersive parameters, five different tracers (lithium, iodide, uranine, rhodamine WT, naphtionate) have been injected "instantaneously" in six different piezometers to study the contaminant transport in the saturated zone of the gravel aquifer (Derouane & Dassargues, 1994). The distances between injection points and pumping wells ranged from 27 to 115 m. Some of the experimental breakthrough curves are given in Fig. 3.

A 2D groundwater flow and transport model covering an area of 4 km² with 3000 triangular finite elements has been constructed, with element sides ranging from 200 m to less than 2 m. The finite element code uses the "Streamline Upwind Petrov-Galerkin" (SUPG) method, combined with an implicit time integration scheme. The chosen spatial distribution of the calibrated permeability values was strongly influenced by the results from interpretation of the pumping tests. During this trial-and-error calibration, all the features and information obtained from interpretation of the geological, geophysical and hydrological surveys are deterministically taken into account. The calibrated transmissivity values are given in Table 1 and Fig. 4.

For the calibration of the contaminant transport model, advection, hydrodynamic dispersion and molecular diffusion were considered. For each simulated injection, a computed breakthrough curve was obtained.
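To make the role of the fitted parameters concrete, the following minimal sketch evaluates a breakthrough curve from the classical 1D analytical solution for an instantaneous injection in uniform flow. It is not the 2D SUPG finite element model used in the study: the uniform velocity, the 50 m distance, the unit injected mass and the function names are all assumptions chosen for illustration. Only the dispersivity values (0.01 and 0.95 m) and the molecular diffusion coefficient (1 × 10⁻⁹ m² s⁻¹) are taken from the text.

```python
import math


def concentration_1d(x_m, t_s, velocity_m_s, alpha_l_m,
                     d_m=1e-9, mass_per_area=1.0, n_e=0.05):
    """Concentration at distance x_m and time t_s after an instantaneous
    injection in uniform 1D flow (illustrative stand-in for the 2D model).

    velocity_m_s  : effective (pore) velocity, hypothetical value
    alpha_l_m     : longitudinal dispersivity (m)
    d_m           : molecular diffusion coefficient (m2/s)
    mass_per_area : injected mass per unit cross-section (assumed unit value)
    n_e           : effective porosity
    """
    if t_s <= 0.0:
        return 0.0
    # Longitudinal dispersion coefficient: mechanical dispersion + diffusion.
    d_l = alpha_l_m * velocity_m_s + d_m
    spread = 4.0 * d_l * t_s
    return (mass_per_area / (n_e * math.sqrt(math.pi * spread))
            * math.exp(-(x_m - velocity_m_s * t_s) ** 2 / spread))


def peak_of_curve(x_m, times_s, **params):
    """Return (time, concentration) of the highest sampled point of the curve."""
    best_t = max(times_s, key=lambda t: concentration_1d(x_m, t, **params))
    return best_t, concentration_1d(x_m, best_t, **params)


if __name__ == "__main__":
    times = [1000.0 * k for k in range(1, 1001)]  # 1000 s steps, about 11.6 days
    for alpha_l in (0.01, 0.95):                  # range reported in the text
        t_peak, c_peak = peak_of_curve(50.0, times,
                                       velocity_m_s=1e-4, alpha_l_m=alpha_l)
        print(f"alpha_L = {alpha_l} m: peak at {t_peak / 86400.0:.2f} days, "
              f"relative peak concentration {c_peak:.3g}")
```

In such a sketch the effective porosity controls the pore velocity (and hence the arrival time) while the dispersivity controls the spread and peak height, which is why these two parameters are the ones adjusted when fitting a computed curve to a measured one.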
The shape and the characteristics of each computed breakthrough curve were fitted by trial and error to the corresponding experimental curve, so that the values and spatial distribution of the major parameters (effective porosity and longitudinal dispersivity) could be assessed. The main results are shown in Table 1. The asymmetrical shape of each breakthrough curve (Fig. 3), corresponding to late arrivals of pollutant, cannot be fitted completely by the model. The spatial variability detected in the alluvial deposits (laterally and/or vertically) fully justifies invoking heterogeneity to explain the late arrivals of tracers. However, for the vertical variability, the limited number of specific data does not allow a reliable interpretation of the heterogeneity in the vertical direction. At this stage of the study, the introduction of local lateral heterogeneities into the model has nevertheless led to fair results.

Fig. 3 Measured and computed (fitted) breakthrough curves. Panels: iodide, Pz6 → P6; naphtionate, P5 → P4; rhodamine WT, Pz9 → P1; iodide, P3 → P1. Each panel shows the simulated curve and the experimental points, with concentration (ppm or ppb) plotted against time (days).

Upscaling the parameters and delineation of the protection zones

At the end of the groundwater flow and pollutant transport calibration stages, the model can be considered as the best representation of reality at the current investigation stage. Bearing in mind the above-mentioned hypotheses, it can be used for predictive studies of situations resulting from various stresses: influence of an increase in pumping rate, evaluation of critical flow rates, means of intervention in case of local pollution (optimization of recovery wells, pumping rate, duration, analysis of residence times and transfer velocity of the pollutant, etc.). Moreover, a good assessment of the protection zones around each production well can be provided by the model. The effective velocity field of the contaminant can be considered as reliable in the experimentation area. Indeed, for each tracer test, the hydrodispersive parameters have been calibrated only in the sub-area concerned by that particular tracer test. Nevertheless, for the delineation of the protection zone IIb (corresponding to a transfer time of 50 days), larger volumes of porous medium are concerned, as longer pollutant migration distances (in comparison with those of the tracer tests) have to be considered. A problem of representative scale arises (Jensen et al., 1993) and, in the current state of the study, we have preferred to upscale the parameters by choosing
the values mainly as a function of the geological knowledge about the site. In doing so, we have extrapolated the unchanged parameters to the whole modelled area, as shown for the transmissivity values in Fig. 4. This choice can be justified by the fact that the tracer tests provided dispersivity values ranging from 0.01 to 0.95 m. Physically, this means that local dispersion (mostly at the pore and grain scale) is predominant in this porous medium. Dispersion at a larger scale can then be taken into account explicitly in the model by representing the heterogeneity, as far as possible, in the permeability values. This way of treating the heterogeneity accounts for the scale effect not through increasing dispersivity values (Gelhar et al., 1992) but through the distinction of different permeability and effective porosity values. Ideally, long-lasting tracer tests should be carried out from piezometers situated at greater distances from the pumping wells in order to justify this method a posteriori. Additionally, in this case study, hydrodynamic dispersion and molecular diffusion turn out to be very weak, so that only the dominant advective process has been considered for the computation of the 50-day isochrone, with extrapolated values of effective porosity ranging from 0.047 to 0.06. The computed isochrone lines are shown in Fig. 4. Since effective porosity is usually not affected by scale effects, the computed protection zones are expected to be reasonably accurate.

Table 1 Hydrodispersive parameters in zones close to the piezometers or wells where tracer tests were performed and computed breakthrough curves were fitted.

Zones near...     T (m² s⁻¹)    ne      aL (m)   aT (m)   Dm (m² s⁻¹)
P6, Pz5 and P6    8 × 10⁻²      0.048   0.01     0.003    1 × 10⁻⁹
P3                8 × 10⁻²      0.072   0.01     0.003    1 × 10⁻⁹
P5 and P4         4 × 10⁻²      0.056   0.95     0.22     1 × 10⁻⁹
...               ...           0.059   0.04     0.01     1 × 10⁻⁹
Pz9               8 × 10⁻²      0.047   0.60     0.20     1 × 10⁻⁹
Pz8               4 × 10⁻²      0.082   0.90     0.25     1 × 10⁻⁹
Pz10              8 × 10...     ...     ...      ...      ...

Fig. 4 Map of the transmissivity values as a result of the calibration at a local scale (top), and map of the values chosen in the entire domain (bottom). Computed 1-day and 50-day isochrone lines around the pumping wells P1, P2, P4 and P6 (providing perimeters of protection zones).
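As a rough order-of-magnitude check on purely advective isochrones of this kind, the sketch below uses two textbook estimates: the volumetric ("calculated fixed radius") estimate for radial flow to a well, and the distance travelled under a uniform regional gradient. Neither is the finite element computation used in the study; both assume a homogeneous aquifer. The equal split of the 8000 m³ day⁻¹ between the four wells and the function names are assumptions made here for illustration, while the 7 m thickness, the 0.075% gradient, the transmissivity and the porosity values are taken from the text and Table 1.

```python
import math

SECONDS_PER_DAY = 86400.0


def fixed_radius_m(q_m3_per_day, t_days, thickness_m, n_e):
    """Radius of the aquifer cylinder whose pore water is pumped out in t_days
    (purely radial, advective flow in a homogeneous aquifer)."""
    return math.sqrt(q_m3_per_day * t_days / (math.pi * thickness_m * n_e))


def uniform_flow_distance_m(transmissivity_m2_s, thickness_m, gradient,
                            n_e, t_days):
    """Advective distance travelled in t_days under a uniform hydraulic
    gradient, away from the influence of pumping."""
    conductivity = transmissivity_m2_s / thickness_m   # K = T / b
    pore_velocity = conductivity * gradient / n_e      # v = K i / n_e
    return pore_velocity * t_days * SECONDS_PER_DAY


if __name__ == "__main__":
    q_per_well = 8000.0 / 4.0  # assumed equal split between the four wells
    for n_e in (0.047, 0.06):  # extrapolated porosity range from the text
        r1 = fixed_radius_m(q_per_well, 1.0, 7.0, n_e)
        r50 = fixed_radius_m(q_per_well, 50.0, 7.0, n_e)
        print(f"n_e = {n_e}: 1-day radius {r1:.0f} m, 50-day radius {r50:.0f} m")
    d50 = uniform_flow_distance_m(8e-2, 7.0, 0.00075, 0.048, 50.0)
    print(f"50-day distance under the regional gradient alone: {d50:.0f} m")
```

Both estimates scale inversely with effective porosity (the radius with its square root), which illustrates why the extrapolated porosity values are the controlling parameter once dispersion is neglected.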

DETERMINISTIC UPSCALING IN A FISSURED LIMESTONE AQUIFER

Lithological, geophysical and hydrological data

The case described briefly here is the protection study of three production wells drilled in a small topographical valley corresponding to the northern part of an east-west syncline in Carboniferous layers (south of Belgium; point 2 in Fig. 1). The direction of the calcareous layers is approximately east-west, with an 80 degree dip. The northern part of the studied zone can be considered as limited by sub-vertical layers characterized by much lower hydraulic conductivity values. About 25 piezometers have been drilled, and measured piezometric maps have been drawn for natural conditions (natural south-north gradient of groundwater flow) and for pumping conditions (3600 m³ day⁻¹). Results of the geomorphostructural study, confirmed by shallow seismic and electrical geophysical surveys, have provided information on the main fracture axes in the limestones. The detailed geological survey of the numerous outcrops, including mapping of all the collected data, has given information on the lithological differences between the successive limestone formations, and on bed and fracture dips and orientations.

Local parameters deduced from measurements and calibrations

A local 2D horizontal finite element model has been built, with a mesh composed of triangular elements with edges of about 100 m in the zones farthest from the pumping wells. The mesh size decreases to 5 m near the pumping wells and where strong heterogeneities have been revealed. Lateral boundary conditions of the flow model consist of prescribed-head (Dirichlet type) boundaries interpolated from the local and regional piezometric measurements, taking into account the geology and the presence of main fracture axes. Since the natural piezometric conditions are not precisely known, the model has been calibrated on pumping piezometric conditions for two different flow rates (for the three wells). In the model, the distinguished heterogeneities are related to detected fractured zones using information from the geological survey of outcrops, geophysical methods, geomorphostructural analysis and boreholes. A map of the transmissivity values is deduced from the calibration (Fig. 5). In the fitting process, particular attention has been given to the geological significance of any change in the value or distribution of the transmissivity. The main fractured axes have been explicitly distinguished and the different lithologies taken into account. An anisotropy factor had to be introduced to account for the fact that the layers appeared to have a higher hydraulic conductivity along the bedding direction than transversely. The fitted value of this anisotropy coefficient is 0.67.

Fig. 5 Map of the transmissivity values (m² s⁻¹) from calibration at a local scale near the pumping wells and from extrapolation based on geology for the larger scale.

A multi-tracer test has been performed (Meus & Bolly, 1994), and measured breakthrough curves obtained in each pumping well. Two tracers reached the pumping wells: naphtionate and uranine, injected in piezometers located at distances of about 50 and 70 m from the wells, respectively. The calibration of the model for the transport conditions has led to the fitting of the values and distributions of the effective porosity (ne) and of the longitudinal and transverse dispersivities (aL and aT). One of the main limitations of this calibration is that depth-averaged concentrations are considered in a 2D flow-transport model. An average value of 15 m has been chosen, consistent with the screened levels in the pumping wells. Some of the results of the calibration are shown in Fig. 6 in terms of breakthrough curves in the pumping wells. The deduced values for the transport parameters are as follows:
ne = 0.01, aL = 30 m and aT/aL = 0.04
ne = 0.08, aL = 8 m and aT/aL = 0.04
We have interpreted the first pair of values as corresponding to the limestone matrix (possibly microfissured), while the second is more representative of fractured or slightly karstified zones. Indeed, it seems logical to consider that a more fractured zone presents higher values of porosity with lower dispersivity, and that matrix blocks of limestone present lower values of effective porosity with a higher dispersivity due to multiple single microfractures.

Fig. 6 Measured and calibrated breakthrough curves for naphtionate in the three wells (time axis from 0 to 60 days).

Upscaling the parameters and assessment of the protection zones

In this case, it was evident that we could not neglect the dispersion component of the transport. Normally, the transport parameters fitted during the calibration on the measured breakthrough curves are only representative for a local-scale transport model. It has been decided to extrapolate the local values deterministically to the entire modelled domain. This extrapolation takes into account, as logically as possible, the knowledge that we have about the geology. In this way the strong scale effect on the dispersivity values that can be expected (e.g. Gelhar et al., 1992) is explicitly considered by a deterministic representation of the heterogeneity. The first set of transport parameters has been extrapolated to the whole supposedly unfractured domain and the second set to the main fractured zones as revealed by the geophysical and morphostructural studies. Simulations of the contamination scenario have been computed from 114 points of the meshed domain. For each of these points, a first arrival time is recorded; then, by interpolation of the results, a map with isochrone lines can be drawn (Fig. 7) and protection zones corresponding to the existing regulations can be assessed.
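The post-processing step described above (one first arrival time per injection point, then comparison with the regulatory travel times) can be sketched as follows. The text does not state how the first arrival is detected, so the concentration threshold, the linear interpolation between time steps and the function names are assumptions made for illustration; only the 1-day and 50-day limits of zones IIa and IIb come from the regulations quoted in the introduction.

```python
def first_arrival_time(times_days, concentrations, threshold):
    """Return the first time (days) at which the simulated concentration at
    the well reaches `threshold`, interpolating linearly between time steps.
    Returns None if the threshold is never reached."""
    prev_t = prev_c = None
    for t, c in zip(times_days, concentrations):
        if c >= threshold:
            if prev_t is None:
                return t  # threshold already reached at the first sample
            # prev_c < threshold <= c here, so the division is safe.
            frac = (threshold - prev_c) / (c - prev_c)
            return prev_t + frac * (t - prev_t)
        prev_t, prev_c = t, c
    return None


def protection_zone(arrival_days):
    """Classify an injection point by its first arrival time at the well."""
    if arrival_days is None:
        return "outside"
    if arrival_days <= 1.0:
        return "IIa"      # travel time of 1 day or less
    if arrival_days <= 50.0:
        return "IIb"      # travel time of 50 days or less
    return "outside"


if __name__ == "__main__":
    # Hypothetical simulated curve for one injection point.
    times = [0.0, 10.0, 20.0, 30.0]
    conc = [0.0, 0.0, 2.0, 4.0]
    arrival = first_arrival_time(times, conc, threshold=1.0)
    print(arrival, protection_zone(arrival))
```

Repeating this for every injection point yields the scattered arrival times that are then interpolated into isochrone lines; the choice of threshold directly shifts the resulting contours, so it should be reported alongside the map.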

Fig. 7 Computed 1-day and 50-day isochrone lines around the pumping wells (providing perimeters of protection zones). Scale bar: 0 to 1600 m.

CONCLUSIONS

When dealing with practical cases, it is often particularly difficult to apply geostatistically based methods to upscale the groundwater flow and transport parameters. Usually, for local situations, sufficient geological data ("soft data") are available to obtain a clear (but to a certain extent always subjective) geological interpretation. Measurements ("hard data") are added and allow a reliable assessment of the spatial variability and heterogeneity in the different layers, in order to infer or extrapolate logically the flow and transport parameters needed for the model.

According to the generally accepted definition, dispersion is the result of a statistical distribution of flow paths and velocities around local heterogeneities (at a lower scale). Microdispersion caused by flow around grains is on the order of centimetres, whereas macrodispersion caused by macroscopic heterogeneities is on the order of metres (or more). Two approaches are observed for including macrodispersion values in models. In the first approach, the heterogeneity of the modelled domain is not fully described but "lumped" into a macrodispersion term. The corresponding dispersivity coefficients are not really physically consistent, but they represent statistically the general behaviour of the contaminant around the advective mean position. The main advantage of this method is that smaller-scale heterogeneities need not be known in detail; the main problem consists in upscaling the values. In the second approach, the main detected heterogeneities are taken into account explicitly with different values of permeability and effective porosity. In that approach, the dispersivity values obtained by field investigations at a local scale do not really have to be upscaled, since they are assumed to be representative at the scale used in the model.
The methodology described here, and applied to two local situations, lies between these two possibilities, trying to combine their advantages and to avoid their disadvantages. We try to represent the heterogeneity of the domain as accurately as possible. For that part of the work, a good geological background is essential. But as there is no hope of having a detailed knowledge of the medium at a small scale, extrapolations are still needed from values measured at a local scale to larger scales, and these extrapolations are mainly guided by the geological information. The main difficulty lies in the relative definition of the "intermediate scale" at which one can expect to apply this methodology. Of course, this conceptual choice must be made and balanced as a function of the accuracy needed for the results. In the future, long-lasting tracer tests should be considered, insofar as they could help to solve this problem of the deterministic upscaling of the parameters more accurately.

Acknowledgements The work of Serge Brouyere, co-author of this publication, has been supported by a grant from the National Fund for Scientific Research of Belgium.

REFERENCES

Anderson, M. P. (1995) Characterization of geological heterogeneity. In: Second IHP/IAHS Georges Kovacs Colloq. on Subsurface Flow and Transport: the Stochastic Approach (Paris).
Calembert, L. (1964) Observations dans la plaine alluviale de la Meuse, en aval de Liège (Observations in the alluvial plain of the River Meuse, downstream of Liège). Publication du Service Géologique du Luxembourg XIV, 115-135.
Dassargues, A. (1992) Kriging and cokriging applied to data: influence on the results of a regional groundwater FEM model. In: Computational Methods in Water Resources IX, vol. 1: Numerical Methods in Water Resources, 439-448.
Dassargues, A. (1995) Applied methodology to delineate protection zones around pumping wells. J. Environ. Hydrol. 2(2), 3-10.
Derouane, J. & Dassargues, A. (1994) Modélisation mathématique appliquée à la délimitation des zones de protection: cas du site de captage de Vivegnis - plaine alluviale de la Meuse - Belgique (Modelling applied to delineation of protection zones: pumping site of Vivegnis - alluvial plain of the River Meuse - Belgium). Bull. du Centre d'Hydrogéologie de l'Université de Neuchâtel no. 13, 53-68.
Gelhar, L. W., Welty, C. & Rehfeldt, K. R. (1992) A critical review of data on field-scale dispersion in aquifers. Wat. Resour. Res. 28(7), 1955-1974.
Jensen, K. H., Bitsch, K. & Bjerg, P. L. (1993) Large-scale dispersion experiments in a sandy aquifer in Denmark: observed tracer movements, and numerical analysis. Wat. Resour. Res. 29(3), 673-696.
Meus, Ph. & Bolly, P. Y. (1994) Etude des zones de prévention à Yves-Gomezée - Phase I: Essai de traçage (Study of the protection zones at Yves-Gomezée - Phase I: Tracer test). Ecofox Report: SWDE-942 (unpublished).
Neuman, S. P. (1982) Statistical characterization of aquifer heterogeneities: an overview. Geol. Soc. Am. Spec. Pap. 189, 81-102.
Sudicky, E. A. & Huyakorn, P. S. (1991) Contamination migration in imperfectly known heterogeneous groundwater systems. Rev. Geophys. Suppl. 240-253.
Wang, J. S. Y. (1991) Flow and transport in fractured rocks. In: Contributions in Hydrology, US National Report 1987-1990, 254. AGU.
Yeh, T. C. J. (1992) Stochastic modelling of groundwater flow and solute transport in aquifers. Hydrol. Processes 6, 369.

3

Calibration: Concepts

Calibration and Reliability in Groundwater Modelling (Proceedings of the ModelCARE 96 Conference held at Golden, Colorado, September 1996). IAHS Publ. no. 237, 1996.

Conceptualization and characterization of envirochemical systems

KENNETH E. KOLM
Division of Environmental Science and Engineering, Colorado School of Mines, Golden, Colorado 80401, USA

PAUL K. M. VAN DER HEIJDE
International Ground Water Modeling Center, Colorado School of Mines, Golden, Colorado 80401, USA

Abstract Reliability and uncertainty in groundwater model predictions are tied to the correctness of the conceptualization of the simulated system. The purpose of this paper is to present an integrated, stepwise approach for the conceptualization and characterization of subsurface envirochemical systems, building on the hydrological characterization approach developed by Kolm (1993). In the context of this paper, an envirochemical system is a subsurface hydrogeological and hydrochemical system containing chemical species of concern to environmental management. The conceptualization and characterization process, which is iterative and can be used at any scale, includes: (a) problem definition and database development; (b) preliminary conceptualization; (c) anthropogenic characterization; (d) surface characterization; (e) geological, geomorphic and geochemical characterization; (f) hydrogeological characterization; (g) groundwater flow system characterization and quantification; and (h) envirochemical system characterization and quantification. This approach may be used in: (a) evaluating natural variations in groundwater flow and envirochemical systems; (b) evaluating anthropogenic stresses on groundwater flow and envirochemical systems, such as pumping for water supply, irrigation, induced infiltration, or well injection; (c) evaluating the presence and velocity of groundwater contaminants; (d) designing and selecting mathematical, geochemical, or transport models to simulate groundwater flow and envirochemical systems; (e) completing model schematization and attribution based on the problem defined, the characterized groundwater flow and envirochemical system, and the model(s) selected; and (f) designing groundwater remediation systems.

INTRODUCTION Reliability of groundwater model predictions typically depends on the correctness of the conceptual model, the availability and quality of model data and the adequateness of the predictive tools. This paper describes an integrated, stepwise method for the qualitative conceptualization and quantitative characterization of natural and anthropogenic subsurface envirochemical systems. A subsurface envirochemical system is a hydrogeological and hydrochemical system containing chemical species of concern to environmental management. It comprises the general concepts of the hydrological and hydrochemical system elements, active physical and chemical processes, sources and stresses, and the interlinkages and hierarchy of elements and processes with respect to the assessment of type, quantity, distribution and evolution of chemicals as influenced by soil, rock and water properties. The conceptualization and characterization process consists of systematic description and analysis of envirochemical system components, including: (a) problem definition and database development; (b) preliminary conceptualization; (c) anthropogenic characterization; (d) surface characterization; (e) geological, geomorphic and geochemical characterization; (f) hydrogeological characterization; (g) groundwater flow system characterization and quantification; and (h) envirochemical system characterization and quantification (Fig. 1).

Envirochemical system conceptualization and characterization is an iterative process for developing and attributing multiple working hypotheses (Fig. 1). The process starts with the development of a preliminary understanding of the envirochemical system based on project objectives, chemicals of concern and general physical and chemical principles and is followed by data collection and refinement of the understanding. Additional data collection and analysis and subsequent refinement of the conceptual models occur during the envirochemical model development and use, as required (Fig. 1). This process aims at reducing uncertainty in the formulation of alternative hypotheses derived from observation, interpretation and analysis, and in the final characterized and quantified conceptual model.

Kenneth E. Kolm & Paul K. M. van der Heijde

Conceptualization and characterization of envirochemical systems

[Flowchart: problem definition and database development; preliminary envirochemical system conceptualizations; surface, subsurface and anthropogenic characterization; hydrogeological and geochemical framework characterization; groundwater flow system characterization and quantification (where groundwater flow is involved); envirochemical system characterization and quantification; preferred envirochemical system conceptual model(s). Each characterization stage is followed by an "adequate for problem?" check that loops back through "identify data needs" and "address data needs".]

Fig. 1 Procedure for conceptualization and characterization of envirochemical systems.
This paper does not address the specific methods for characterizing envirochemical, hydrogeological and groundwater system properties, nor the quantitative uncertainty associated with specific methods of envirochemical, hydrogeological and groundwater system characterization and quantification. The approach presented in this paper can be used at any scale and for any problem related to a particular envirochemical system. The approach is independent of the manner in which the problem may be solved, or the tools used in the design of the solution, including modelling. The nature of the problem to be solved will determine the type and scale of data collected and influence the specific results of each analysis. Conceptualization and characterization are fundamental steps of envirochemical and groundwater flow system modelling. This overall process of modelling is similar to the modelling processes described by Anderson & Woessner (1992), Zheng & Bennett (1995) and Kolm et al (1996). After conceptualization and characterization are sufficiently complete to meet project objectives, the conceptual model may be translated into a mathematical model. Such a mathematical model typically consists of a set of governing equations and boundary conditions for transport simulation, or the equations describing chemical reactions with or without chemical and physical constraints. Relating such a mathematical model to a particular system requires specific values for system parameters, stresses and boundary conditions as well as rate coefficients. The conceptualization and characterization process is optimized when these inputs are identified early. The application of geochemical and transport models requires making simplifying assumptions with respect to system processes, stresses and geometry, a procedure referred to as model schematization. 
Efficient model schematization starts early on in the conceptualization and characterization process and continues into the code selection, model design or construction and model attribution and calibration phases of a modelling project. In this paper, discussion regarding model schematization is focused on aspects of importance to the conceptualization and characterization process as is typically reflected in the problem definition or project objectives. This approach may be used for project planning and data collection, but does not provide specific details of field characterization techniques. Refer to Tinsley (1979); Thornton (1983); Boulding (1991; 1993a; 1993b; 1994; 1995); and Sara (1994) for further guidance regarding field characterization techniques.

CONCEPTUALIZATION AND CHARACTERIZATION PROCESS

Problem definition and database development First, the objectives of the project and the required or anticipated level of effort are defined. At this stage, the chemicals of concern may be identified and regulatory requirements considered. Once the project objectives and constraints are defined, the appropriate facets and scale of the envirochemical system for characterization are identified. This is followed by the determination of the study site boundaries using one or more of the following considerations: (a) natural site characteristics (topography, soils, geology, hydrology, biota, chemistry); (b) current and past land use and ownership; or (c) known or suspected extent of site-related contaminants. If site boundaries are initially defined by ownership, natural site characteristics should be evaluated to determine whether the scope of at least parts of the investigation should include areas that are off-site. For example, investigations of groundwater contamination should include areas of potential sources up-gradient and potential migration paths down-gradient from a site. Data from existing sources are gathered by locating data sources and collecting and organizing relevant data into a manageable database. During this procedure data in the form of maps, tables and reports are collected from available published and unpublished sources.
Furthermore, data from published or unpublished field studies are gathered and the methods used to collect and analyse the data are noted. Also, data from published or unpublished laboratory studies are collected. For each of these types of data, the methods used to collect and analyse the data and the levels of quality assurance and quality control, as required by the project, are noted.

Preliminary conceptualization Preliminary qualitative envirochemical system conceptualization and field reconnaissance are conducted using the databases developed during the problem definition and database development step. Transforming data into a conceptualization is a rather intuitive process consisting of: (a) qualitative and quantitative data interpretation of individual data elements and grouped data within a particular data type; (b) analysis of spatial and temporal relationships between various data types; and (c) relating data types and interpreted data to elements of the specific envirochemical system (i.e. processes, structure, state and stresses). This process may be enhanced by comparison with previously conceptualized systems (i.e. role of experience). This approach results in the development of one or more initial conceptual models that will be used for further characterization and quantification. During this procedure, anthropogenic, surface and subsurface (geological, geomorphic, geochemical) features of the study area and chemical constituents, hydrogeological and geochemical framework, groundwater flow system and envirochemical system are characterized qualitatively. The resulting conceptual model of the envirochemical system includes a qualitative assessment of how chemicals enter, move through or are retained in and leave the envirochemical system. The source, transport, fate and resulting distribution of each targeted chemical (e.g. inorganic and/or organic chemical constituents, tracers and isotopes) in the envirochemical system are conceptualized at this time, or, in the case of unknown sources, source locations and strengths are hypothesized from the conceptualized transport and fate processes and the actual distribution of chemicals. The envirochemical system conceptual model(s) is(are) described and visualized using cross-sections and plan view illustrations. This envirochemical system conceptual model may be modified at any stage of quantitative characterization (Fig. 1).

Anthropogenic characterization The next step, anthropogenic characterization, includes determining type, distribution and rates of anthropogenic chemical processes and type and distribution of substances in the surface and subsurface (see Boulding, 1995, for guidance). Analysis of anthropogenic-related envirochemicals includes, but is not limited to, industrial, municipal/urban, domestic, agricultural and mining/resource related releases and uptakes. At this time, the analysis of the type, distribution and amount of the targeted chemical species will reveal if subsurface flow is an important process. If the targeted chemical species has moved, it may be necessary to conceptualize and characterize the groundwater flow system, including hydrogeological and groundwater system characterization (Fig. 2) (Kolm et al., 1996).
Surface characterization Surface characterization, including type, distribution and amount of natural chemical processes and substances at or near ground surface, is conducted (see Drever, 1982; Ritter, 1994; and Boulding, 1995, for guidance). During this procedure studies regarding vegetation-related (including plant releases and uptake), surface water-related (streams, lakes/wetlands, ocean and springs/seeps) and climate-related (snow, rain, fog) chemical exchanges with the subsurface system are conducted. The physical and chemical processes and resulting mass exchanges across the surface are characterized during this step.

Geological, geomorphological and geochemical characterization Geological, geomorphological and geochemical characterization is conducted, including determining type, distribution and amount of natural chemical processes and substances in the subsurface (see Krauskopf, 1979; Berner, 1971; Stumm & Morgan, 1981; and Boulding, 1995, for guidance). During this procedure, the petrological, mineralogical and geochemical factors and compositions are analysed with respect to spatial and temporal variations caused by geological (igneous, metamorphic and sedimentary) and geomorphic processes. This includes an analysis of the original chemical form and the evolution of solid geochemistry with time. The geomorphological processes and deposits pertinent to the characterization of envirochemical systems include weathering and paedogenesis (e.g. soil horizon texture and chemical effects), mass wasting (colluvium textures and chemistry), fluvial activity (alluvium textures and chemistry), aeolian activity (dunes texture and chemistry), glacial activity (e.g. moraines, till and outwash plain textures and chemistry), coastal activity (beach and lagoon materials texture and chemistry) and anthropogenic features (e.g. road fill, tailings piles, foundation material texture and chemistry). The geological processes pertinent to the characterization of envirochemical systems, including the sedimentary, igneous and metamorphic processes and the resulting rock materials, are evaluated and an analysis of the original chemical form and the evolution of solid geochemistry with time is performed. The spatial and temporal distribution of these geological materials is based on the principles of mineralogy, petrology, geochemistry, stratigraphy (depositional environments and lithology) and structure of the geological units. Geological maps and cross sections, subsurface investigation logs and stratigraphic columns are used, in conjunction with surface characterization, geophysical data and analysis, and geochemical data and analysis, to develop a part of the geological and geochemical framework that represents the distribution of lithological units and mineralogical and geochemical compositions as envirochemical system materials. Finally, the subsurface fluids (e.g. vadose zone fluids, groundwater and soil vapour) are characterized for envirochemical behaviour, including the reactions between subsurface fluids and matrix materials. Again, the chemical processes are transient in nature and need to be characterized on a temporal and spatial basis.

[Flowchart: conceptualization and characterization (problem definition; database development; preliminary (qualitative) conceptualizations; surface characterization; geologic and geomorphologic characterization; hydrogeologic characterization; groundwater system characterization; one or more quantified conceptualizations; preferred quantified conceptual model(s)), with "identify data needs" and "address data needs" feedback loops, running in parallel with model schematization, code selection, and model construction and calibration.]

Fig. 2 Procedure for conceptualization and characterization of groundwater systems (from Kolm et al., 1996).

Hydrogeological characterization Hydrogeological characterization consists of three phases: (a) identification and characterization of the hydrostratigraphic units; (b) identification and characterization of the hydrostructural units; and (c) combining hydrostratigraphic units and hydrostructural units into a set of hydrogeological units. Each hydrostratigraphic, hydrostructural and hydrogeological unit is defined as a discrete volume element of the subsurface geological framework. General discussion regarding groundwater processes is available in Freeze & Cherry (1979) and Fetter (1994); extensive discussion regarding the hydrogeological characterization procedure is available in Kolm (1993) and Kolm et al. (1996).

Groundwater system characterization and quantification The groundwater system is characterized and quantified by determining the type, amount, temporal variation and spatial distribution of groundwater recharge and discharge using surface, subsurface and hydrogeological analysis. Furthermore, reaction and flow paths of indicative chemical species are analysed for information regarding the groundwater flow system. The groundwater system is quantitatively defined in terms of boundary conditions, flow paths and potentiometric surfaces and groundwater system budget (see Engelen & Jones, 1986; Domenico & Schwartz, 1990; Anderson & Woessner, 1992; Kolm, 1993; Boulding, 1995; and Kolm et al., 1996, for additional guidance). Exploratory groundwater modelling of one or more conceptual models, particularly the matching of the results of numerical models with observations of heads and fluxes, may be used for the quantification of the hydrodynamics of the characterized groundwater system, for checking the groundwater flow system characterization for deficiencies (conceptual model or attributes) and for determining subsequent field sampling programmes.

Envirochemical system characterization and quantification The natural and anthropogenic surface or subsurface origins of targeted chemical constituents are characterized and quantified, including type, spatial and temporal distribution and amount. The presence, transport and fate of these chemical species, including form and spatial and temporal distribution, are characterized and quantified using the information obtained in the previous steps. At this time, relevant physical and chemical processes of the envirochemical system are mathematically described and quantitatively attributed. Processes of interest include, but are not limited to: (a) advection; (b) adsorption; (c) dispersion; (d) molecular diffusion; (e) volatilization; (f) hydrolysis; (g) oxidation/reduction; (h) chelation; (i) ion exchange; and (j) dissolution/precipitation and biotransformation (see Zheng & Bennett, 1995; Appelo & Postma, 1993; Domenico & Schwartz, 1990; Dragun, 1988; and Thornton, 1983, for additional guidance). The final result of this analysis is a characterized and quantified preferred envirochemical system model(s) (Fig. 1).
Upon completion of the conceptualization and characterization process, an adequate computational procedure is chosen, often in the form of a geochemical or transport computer code (Appelo & Postma, 1993; Zheng & Bennett, 1995). After a particular computer code has been selected, the model construction phase is entered, where code-specific aspects are addressed, followed by model calibration and sensitivity analysis, during which the conceptual model may be revisited (Fig. 1).
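As an illustration of what a mathematical description of the listed processes can look like, the following one-dimensional equation combines advection, dispersion, linear equilibrium adsorption and a first-order transformation acting on the dissolved phase only. The equation and all of its symbols are assumptions chosen for illustration; they are not taken from this paper, and a real application would select the governing equations to suit the conceptual model.

```latex
R\,\frac{\partial C}{\partial t}
  = D\,\frac{\partial^{2} C}{\partial x^{2}}
  - v\,\frac{\partial C}{\partial x}
  - \lambda\,C,
\qquad
R = 1 + \frac{\rho_b K_d}{n},
\qquad
D = \alpha_L v + D^{*}
```

Here C is the dissolved concentration, v the average linear groundwater velocity, D the hydrodynamic dispersion coefficient, \alpha_L the longitudinal dispersivity, D^{*} the effective molecular diffusion coefficient, R the retardation factor, \rho_b the bulk density, K_d the distribution coefficient, n the effective porosity and \lambda a first-order rate constant. Each symbol corresponds to a system parameter or rate coefficient that the characterization process would need to supply.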

CONCLUSIONS In many projects dealing with the distribution of hazardous chemicals in the subsurface, decisions are made without obtaining a thorough understanding of the extent, complexity and hierarchical nature of the affected system. Thus it often occurs that important elements of such a system are not studied to the extent necessary to obtain optimal solutions. To address this issue, the conceptualization and characterization process described in this paper has been developed. It takes a top-down, hierarchical approach to envirochemical system analysis, ensuring that all elements, processes and constraints of the envirochemical and hydrological systems involved are addressed and properly evaluated at a level of detail commensurate with project objectives and constraints. As this conceptualization and characterization is aimed at solving real-world problems, the nature of these problems and specific management requirements in addressing them provide important guidance at different stages of the process.

REFERENCES
Anderson, M. P. & Woessner, W. W. (1992) Applied Groundwater Modeling, Simulation of Flow and Advective Transport. Academic Press, San Diego, California, USA.
Appelo, C. A. J. & Postma, C. (1993) Geochemistry, Groundwater, and Pollution. Balkema, Rotterdam, The Netherlands.
Berner, R. A. (1971) Principles of Chemical Sedimentology. McGraw-Hill, New York, New York, USA.
Boulding, J. R. (1991) Description and sampling of contaminated soils: a field pocket guide. EPA/625/12-91/002, US Environmental Protection Agency, Washington, DC, USA.
Boulding, J. R. (1993a) Subsurface characterization and monitoring techniques: a desk reference guide. Vol. I. Solids and ground water, appendices A and B. EPA/625/R-93/003a, US Environmental Protection Agency, Washington, DC, USA.
Boulding, J. R. (1993b) Subsurface characterization and monitoring techniques: a desk reference guide. Vol. II. The vadose zone, field screening and analytical methods, appendices C and D. EPA/625/R-93/003b, US Environmental Protection Agency, Washington, DC, USA.
Boulding, J. R. (1994) Description and Sampling of Contaminated Soils: A Field Guide, revised and expanded 2nd edn. Lewis Publishers, Chelsea, Michigan, USA.
Boulding, J. R. (1995) Practical Handbook of Soil, Vadose Zone, and Groundwater Contamination: Assessment, Prevention, and Remediation. Lewis Publishers, Chelsea, Michigan, USA.
Domenico, P. A. & Schwartz, F. W. (1990) Physical and Chemical Hydrogeology. Wiley, New York, New York, USA.
Dragun, J. (1988) The Soil Chemistry of Hazardous Materials. Hazardous Materials Control Research Institute, Silver Spring, Maryland, USA.
Drever, J. I. (1982) The Geochemistry of Natural Waters. Prentice-Hall, Englewood Cliffs, New Jersey, USA.
Engelen, G. B. & Jones, G. P. (eds) (1986) Developments in the Analysis of Groundwater Flow Systems. IAHS Publ. no. 163.
Fetter, C. W. (1994) Applied Hydrogeology, 3rd edn. Macmillan, New York, New York, USA.
Freeze, R. A. & Cherry, J. A. (1979) Groundwater. Prentice-Hall, Englewood Cliffs, New Jersey, USA.
Kolm, K. E. (1993) Conceptualization and characterization of hydrologic systems. Technical Report 93-01, International Ground Water Modeling Center, Colorado School of Mines, Golden, Colorado, USA.
Kolm, K. E., Van der Heijde, P. K. M., Downey, J. S. & Gutentag, E. D. (1996) Conceptualization and characterization of ground-water flow systems. In: Subsurface Fluid-Flow (Ground-Water and Vadose Zone) Modeling (ed. by J. D. Ritchey & J. O. Rumbaugh). ASTM STP 1288, American Society for Testing and Materials, West Conshohocken, Pennsylvania, USA.
Krauskopf, K. B. (1979) Introduction to Geochemistry, 2nd edn. McGraw-Hill, New York, New York, USA.
Ritter, D. F. (1994) Process Geomorphology. Brown Publishers, Dubuque, Iowa, USA.
Sara, M. N. (1994) Standard Handbook of Site Characterization for Solid and Hazardous Waste Facilities. Lewis Publishers, Boca Raton, Florida, USA.
Stumm, W. & Morgan, J. J. (1981) Aqueous Geochemistry: An Introduction Emphasizing Chemical Equilibrium in Natural Waters, 2nd edn. Wiley, New York, New York, USA.
Thornton, I. (ed.) (1983) Applied Environmental Geochemistry. Academic Press, London, UK.
Tinsley, I. J. (1979) Chemical Concepts of Pollutant Behaviour. Wiley, New York, USA.
Zheng, C. & Bennett, G. D. (1995) Applied Contaminant Transport Modeling: Theory and Practice. Van Nostrand Reinhold, New York, New York, USA.

Calibration and Reliability in Groundwater Modelling (Proceedings of the ModelCARE 96 Conference held at Golden, Colorado, September 1996). IAHS Publ. no. 237, 1996.


Unrealistic parameter estimates in inverse modelling: a problem or a benefit for model calibration? EILEEN P. POETER Department of Geology and Geological Engineering, Colorado School of Mines, 1500 Illinois Street, Golden, Colorado 80401, USA

MARY C. HILL US Geological Survey, Water Resource Division, Box 25046, MS 413, Lakewood, Colorado 80225, USA

Abstract Estimation of unrealistic parameter values by inverse modelling is useful for constructed model discrimination. This utility is demonstrated using the three-dimensional, groundwater flow inverse model MODFLOWP to estimate parameters in a simple synthetic model where the true conditions and character of the errors are completely known. When a poorly constructed model is used, unreasonable parameter values are obtained even when using error free observations and true initial parameter values. This apparent problem is actually a benefit because it differentiates accurately and inaccurately constructed models. The problems seem obvious for a synthetic problem in which the truth is known, but are obscure when working with field data. Situations in which unrealistic parameter estimates indicate constructed model problems are illustrated in applications of inverse modelling to three field sites and to complex synthetic test cases in which it is shown that prediction accuracy also suffers when constructed models are inaccurate.

INTRODUCTION Developing conceptual models, choosing a computer code and developing model designs are the preliminary steps in a groundwater modelling project after first establishing the purpose of the project (Anderson & Woessner, 1992). These steps define the numerical models representing a given field situation, including: (a) the most important sources and sinks of water in the field system and how they are to be simulated; (b) the available data on the geohydrological system; (c) the system geometry (generally the number and type of model layers and the areal extent of these layers); (d) the spatial and temporal structure of the hydraulic properties (generally using zones of constant value or deterministic or stochastic interpolation methods); and (e) boundary condition location and type. We call the resulting models constructed models. To produce acceptable models, the best set of parameter values (and associated confidence intervals) of each constructed model need to be determined. In addition, the best calibrated models need to be identified. Thus, there are three levels of model calibration: (a) model construction; (b) parameter estimation; and (c) model discrimination. While for some constructed models the parameter estimation is unique, the overall model calibration is never unique for the complicated groundwater systems commonly


considered. In practice, multiple constructed models must be developed at the beginning of a project and, depending on the character of the incoming data and results of ongoing analyses, each model is either retained for further consideration or eliminated during the modelling process. Parameter estimation can be approached using an inverse model (Poeter & Hill, 1996); for example, nonlinear regression can be used to find the set of parameter values that provides the best fit of model results to field observations, where "best fit" is defined as minimizing the value of the sum-of-squared weighted residuals. However, parameter estimation is often accomplished by a trial-and-error approach, during which the modeller iteratively selects parameter values to improve the fit of model results to field observations using intuition about model response to changes in parameters, and knowledge of reasonable parameter ranges. The time consuming nature of intuitive parameter value adjustment limits the range of alternative constructed models that are considered and, given the lack of rigorous analysis of parameter correlations, variance/ covariance, and residuals, there is no assurance that the estimated parameter values for any model are "the best". Consequently, conclusive model discrimination is nearly impossible. This shortcoming is often so extreme that only one constructed model is considered. With inverse models used to determine parameter values that optimize the fit of the model results to the field observations for a given model configuration, the modeller is freed from tedious trial-and-error calibration involving changes in parameter values so more time can be spent addressing insightful questions about the hydrological system. 
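The "best fit" criterion described above can be sketched in a few lines. The example below is a minimal illustration, not the MODFLOWP algorithm: the forward model `simulate_heads`, the single estimated parameter (log10 of transmissivity), the observation locations, and the weights (taken here as 1/sigma^2 with sigma = 2 m) are all hypothetical choices made so that the sum-of-squared weighted residuals and a simple Gauss-Newton iteration can be shown end to end.

```python
def simulate_heads(log10_T, xs, recharge=1e-4, length=10000.0, h0=50.0):
    """Hypothetical forward model: steady 1-D flow with uniform recharge R
    between two fixed heads h0, h(x) = h0 + R*x*(L - x) / (2*T)."""
    T = 10.0 ** log10_T
    return [h0 + recharge * x * (length - x) / (2.0 * T) for x in xs]


def weighted_ssr(observed, simulated, weights):
    """Sum of squared weighted residuals: S = sum_i w_i * (y_i - y'_i)**2."""
    return sum(w * (y - s) ** 2 for y, s, w in zip(observed, simulated, weights))


def estimate_log10_T(observed, xs, weights, start, max_iter=50, tol=1e-10):
    """One-parameter Gauss-Newton iteration for the weighted least-squares fit."""
    b = start
    delta = 1e-6  # perturbation for finite-difference sensitivities
    for _ in range(max_iter):
        sim = simulate_heads(b, xs)
        up = simulate_heads(b + delta, xs)
        down = simulate_heads(b - delta, xs)
        sens = [(u - d) / (2.0 * delta) for u, d in zip(up, down)]
        num = sum(w * j * (y - s) for w, j, y, s in zip(weights, sens, observed, sim))
        den = sum(w * j * j for w, j in zip(weights, sens))
        step = num / den
        b += step
        if abs(step) < tol:
            break
    return b


xs = [1000.0 * k for k in range(1, 10)]   # hypothetical observation locations (m)
observed = simulate_heads(2.0, xs)        # error-free "truth": T = 100 m^2/day
weights = [1.0 / 2.0 ** 2] * len(xs)      # assumed w = 1/sigma^2 with sigma = 2 m
estimate = estimate_log10_T(observed, xs, weights, start=1.0)
final_ssr = weighted_ssr(observed, simulate_heads(estimate, xs), weights)
print(estimate, final_ssr)
```

Starting one order of magnitude away from the value used to generate the observations, the iteration should return to it, because the observations are error free and the constructed model matches the one that produced them. Estimating the logarithm of the parameter is a design choice made here so that the parameter stays positive.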
If a constructed model is not an adequate representation of the groundwater flow system, then the estimated parameter values are not likely to reflect field conditions, predictions made using the model are likely to be in error, and decisions based on those predictions may not produce the best, or even a reasonably accurate, result. Indicators of inaccurately constructed models also include non-random weighted residuals and poor fit to the data, but the indicator focused on in this work is unreasonable parameter values (e.g. a lower hydraulic conductivity for a sand than a clay, or a boundary flux with the wrong sign) yielding the best fit between the observed and simulated values (such as hydraulic heads, concentrations and flows). Thus, during calibration it is important to determine if the best model fit is achieved for unreasonable parameter values and whether these parameter values are well estimated with the available data. Best fitting unreasonable parameter values are more likely to be discovered using the inverse modelling approach than by trial and error, because the inverse model will determine the parameters that provide the best fit even though they may be of unreasonable absolute or relative magnitude. A modeller using the trial-and-error approach generally will not try unreasonable values (e.g. a much higher hydraulic conductivity for a silt deposit than an adjacent sandy-gravel deposit). Instead, usually unknowingly, the modeller accepts greater discrepancies in the match between field observations and model results (i.e. sacrifices the best fit between the model and the data) to maintain reasonable model parameters. The output of unreasonable values is often disillusioning to new users of inverse models, but it is actually a great benefit. At first glance it may seem that the trial-and-error approach provides the more reasonable result and so would be the preferred method. However, when the modeller sacrifices honouring data to maintain "reasonableness", important information about the system is ignored and evidence of error in the constructed model is not allowed to surface. It is our experience that well estimated, unreasonable parameter values produced by the inverse model indicate that the constructed model is incorrect. With this information and the statistics calculated by the inverse model, the modeller can call upon experience with the geohydrological systems and knowledge of reasonable parameter values to develop and test numerous constructed models, increasing the likelihood that accurate models will be used for prediction.
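A simple screening step along these lines can be sketched as follows. It checks estimated values against plausible absolute ranges and expected relative magnitudes. The function name, parameter names, ranges and values are all hypothetical, chosen only to mirror the examples in the text (a clay more conductive than a sand, a boundary flux with the wrong sign); in practice the ranges would come from the modeller's knowledge of the site.

```python
def flag_unreasonable(estimates, plausible_ranges, expected_order):
    """Return messages for estimates that fall outside a plausible range or
    that violate an expected 'smaller < larger' relation between parameters."""
    issues = []
    for name, value in estimates.items():
        low, high = plausible_ranges[name]
        if not (low <= value <= high):
            issues.append(f"{name} = {value:g} is outside [{low:g}, {high:g}]")
    for smaller, larger in expected_order:
        if estimates[smaller] >= estimates[larger]:
            issues.append(f"{smaller} is not smaller than {larger}")
    return issues


# Hypothetical best-fit values returned by an inverse model (m/day, m3/day).
estimates = {"K_clay": 3.0, "K_sand": 0.5, "boundary_inflow": -20.0}
plausible_ranges = {
    "K_clay": (1e-6, 1e-2),
    "K_sand": (1e-1, 1e2),
    "boundary_inflow": (0.0, 1e3),   # inflow is expected to be non-negative
}
expected_order = [("K_clay", "K_sand")]

issues = flag_unreasonable(estimates, plausible_ranges, expected_order)
for message in issues:
    print(message)
```

A flagged value is not rejected or clamped here. Consistent with the argument above, it is treated as a prompt to re-examine the constructed model, provided the regression statistics show that the value is well estimated.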

IMPACT OF THE CONSTRUCTED MODEL ON PARAMETER ESTIMATION Synthetic problems are useful for illustrating the impact of a poorly constructed model on the estimated parameter values because the "true" subsurface conditions are known and can be compared to the outcome of the modelling process.

Synthetic model configuration A simple basin configuration, roughly 30 X 40 km is presented in Fig. 1(a). Flow is toward the streams and northward with groundwater discharge to streams and a body of surface water occupying three cells at the north end of the basin. The perimeter of the basin is defined by the groundwater divide in the east, west, and south. The saturated (a) Hydraulic Head Distribution

•5000

0

5000

10000

Fig. 1 Simple deterministic synthetic basin configuration: (a) hydraulic head distribution; (b) hydraulic conductivity (0.1, to 1, 5 and 10 m day"1 respectively from black to lightest grey); and (c) recharge increases linearly with decreasing grey tone from 1.5 x 10"5 to 3 X 10'4 m day"1, black is zero recharge or outside the basin boundary.

280

Eileen P. Poeter & Mary C. Hill

Observation Locations

b

Base Case (Cases 1 through 4)

c

Case 5

Fig. 2 Observation locations and alternative zonations for hydraulic conductivity.

thickness of the unconfined aquifer varies from 30 to 150 m within the basin. This synthetic basin is hydraulically two-dimensional: that is, there is no variation of hydraulic character with depth. Consequently, the hydraulic conductivity distribution (analogous to hydrofacies distribution) shown on the map prevails to the depth at which bedrock occurs. Four hydrofacies are delineated with increasing mean grain size, and hydraulic conductivity (K) increases from 0.1, to 1, 5 and 10 m day"1 respectively from black to lightest grey in Fig. 1(b). Recharge is in the form of infiltration, varying from 1.5 X 10"5 m day"1 in the south to 3 X 10"4 m day"1 in the northern area (Fig. 1(c)). Recharge is zero in the regional discharge area at the north end of the basin where vegetative consumption prevents infiltration to the water table. Groundwater flow is simulated in the synthetic basin to provide true values of hydraulic heads (Fig. 1(a)). True hydraulic head and hydraulic conductivity at 16 observation locations (Fig. 2(a)) and one true groundwater discharge measurement of 0.2 m3 s"1 to the northeastern tributary are used in the inverse modelling illustration to estimate hydraulic conductivity throughout the area and recharge over the basin. Parameter estimation using alternative constructed models We use MODFLOWP (Hill, 1992) to estimate parameters and find it to be robust. In

Unrealistic parameter estimates in inverse modelling


calibrating the simple basin model to the observed heads and flows, we find that errors in field observations, in the initial estimates of parameters, or in minor variations in the zonation of geohydrological units do not cause problems in obtaining a reasonable parameter estimation solution. Problems arise when a poorly constructed model (e.g. zonation significantly different from the true zonation) is hypothesized. In such a case, unreasonable parameter values are obtained even when error-free observations and true starting values are used for the parameters. Although this may be perceived as a problem, it is actually beneficial because it renders inverse modelling an excellent tool for differentiating accurate and inaccurate constructed models.

For this simple basin model, the hydraulic conductivity values of the four hydrofacies (K1-K4) and the magnitude of the recharge R (given its relative spatial distribution) are estimated. Hydraulic conductivity distributions and estimated parameter values for the following cases are presented in Figs 2 and 3, respectively.

Case 1 - the base case: with error-free observations, a perfect constructed model, and correct "true" values of parameters as a starting point, the parameter estimation process quickly converges on the true parameter values (Figs 2(b) and 3).

Case 2 - incorrect initial estimates: with error-free observations, a perfect constructed model, and incorrect initial values of the parameters, including various combinations of parameters defined as two orders of magnitude too small or too large, the parameter estimation process quickly converges on the true parameter values (Figs 2(b) and 3).

Case 3 - error in observations: with error-laden observations having normally distributed noise with a standard deviation of 2 m on the head observations and the flow underestimated by 4%, a perfect constructed model, and true parameter values for initial estimates, the code converges readily to values near the true values.
The estimated parameter values are slightly different from the true values, reflecting parameter values that better fit the erroneous observations (Figs 2(b) and 3).

Fig. 3 Estimated parameter values for alternative models presented in Fig. 2 (horizontal axis: parameter value in m/day, with recharge × 100).


Case 4 - larger error in observations: identical to Case 3, but with a standard deviation of 4 m on head observations and a flow measurement underestimated by 7.5% (Figs 2(b) and 3).

Case 5 - minor zonation error no. 1: with error-free observations and accurate initial values for the parameters, a more discontinuous definition of the coarsest grained (high hydraulic conductivity) zone than exists in the base case (Fig. 2(c)) results in rapid convergence on parameter values close to the correct values (Fig. 3).

Case 6 - minor zonation error no. 2: when the percentage of finest grained facies is overestimated and the coarsest grained facies is connected to the hydraulic boundary on the north end of the domain (Fig. 2(d)), errors in parameter estimates begin to arise. In order to match the observed heads better, the hydraulic conductivity of the fine grained facies (K1) is overestimated, compensating for the overabundance of the facies in the model. Hydraulic conductivity is underestimated for the coarse grained facies (K4), compensating for the hydraulic connection that does not actually exist (Fig. 3).

The cases presented to this point pose some formidable problems for the parameter estimation algorithm, yet reasonable estimates are obtained without difficulty. The remaining two cases show how larger errors in the facies zonation pattern can create large errors in the estimates of parameter values.

Case 7 - major zonation error: when the percentage of fine grained facies (K1) is substantially overestimated and the percentage of coarse grained facies (K4) is underestimated and is much more discontinuous than in the base case (Fig. 2(e)), an unsatisfactory set of parameter estimates results. All of the parameters, including the recharge rate, are estimated to be nearly an order of magnitude or more lower than the base case values. The estimated K of the coarse grained facies is lower than that of the medium grained facies (Fig. 3).
Case 8 - major zonation error: sometimes the predominant geology in a small borehole is not thought to represent the geology of a 4 × 10⁶ m² flow model grid block. This variation therefore includes three locations where the geological observations are not honoured for the flow model grid block (Fig. 2(f)). It yields a more reasonable recharge rate, but overestimates the K of the three finest grained facies. The relative order of hydraulic conductivity between facies is again incorrect, with the finest grained facies exhibiting a higher hydraulic conductivity than the coarsest grained facies and a reversal in the expected K values for the two coarsest grained facies (Fig. 3).
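The effect of a zonation error on the estimates can be illustrated with a much smaller problem than the synthetic basin. The following is a minimal sketch, not the MODFLOWP set-up used above: it assumes a hypothetical one-dimensional strip aquifer with constant saturated thickness, a no-flow boundary at x = 0, a fixed head at x = L, uniform recharge, and two conductivity zones. All names and numerical values are illustrative assumptions. Heads are error free, and one discharge observation fixes the recharge, as in the basin example.

```python
import numpy as np

L = 1000.0     # strip length (m), hypothetical
B = 50.0       # constant saturated thickness (m), hypothetical
H_L = 100.0    # fixed head at x = L (m), hypothetical

def basis(x, xb):
    """Head rise above H_L per unit R/K for zone 1 (x < xb) and zone 2 (x >= xb)."""
    x = np.asarray(x, dtype=float)
    g1 = np.maximum(xb**2 - x**2, 0.0) / (2.0 * B)
    g2 = (L**2 - np.maximum(x, xb)**2) / (2.0 * B)
    return np.column_stack([g1, g2])

def simulate(x, k1, k2, recharge, xb):
    """Steady heads at x and discharge per unit width leaving at x = L."""
    heads = H_L + basis(x, xb) @ np.array([recharge / k1, recharge / k2])
    return heads, recharge * L

def estimate(x, heads, discharge, xb_assumed):
    """Least squares estimate of (K1, K2) and R for an assumed zone boundary."""
    recharge = discharge / L                      # the flow observation fixes R
    G = basis(x, xb_assumed)
    a, *_ = np.linalg.lstsq(G, heads - H_L, rcond=None)   # a = R / K per zone
    misfit = float(np.linalg.norm(G @ a - (heads - H_L)))
    return recharge / a, recharge, misfit

x_obs = np.arange(50.0, 1000.0, 100.0)            # ten observation points
true_k = np.array([1.0, 10.0])                    # m/day
true_r = 1.0e-4                                   # m/day
h_obs, q_obs = simulate(x_obs, true_k[0], true_k[1], true_r, xb=500.0)

# Correct zonation (boundary at 500 m) versus a mis-placed boundary (300 m).
k_good, r_good, misfit_good = estimate(x_obs, h_obs, q_obs, xb_assumed=500.0)
k_bad, r_bad, misfit_bad = estimate(x_obs, h_obs, q_obs, xb_assumed=300.0)

print("correct zonation:", k_good, misfit_good)
print("wrong zonation:  ", k_bad, misfit_bad)
```

With the correct boundary the true conductivities are recovered and the misfit vanishes. With the mis-placed boundary the regression still returns a best fit, but both conductivities come out well below their true values and a non-zero misfit remains, which is the kind of signal the cases above rely on.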

EXAMPLES OF MODEL CONSTRUCTION PROBLEMS IN COMPLEX SYSTEMS

The methods described above for a simple test case are powerful in the analysis of more complex systems. This has been demonstrated using complex synthetic test cases by Poeter & McKenna (1995) and in the analysis of field sites using groundwater flow


models of Otis Air Force Base (Anderman et al., 1996) and a Colorado School of Mines test site (McKenna & Poeter, 1995), and using a convective-dispersive model of the Grindsted Old Landfill in Denmark (Christiansen et al., 1995; Barbelo et al., 1996). Highlights from these analyses are summarized below.

Complex synthetic test cases

The three-dimensional test cases investigated by Poeter & McKenna (1995) using MODFLOWP demonstrate that the conclusions drawn above for the simple test case are also valid for a more complicated system characterized by three-dimensional flow and substantial heterogeneity. This further supports the value of applying inverse modelling to field problems.

Otis Air Force Base

The two-dimensional areal model of LeBlanc (1984), which was calibrated by trial and error, was recalibrated with nonlinear regression methods to test a new method of using concentration data in inverse modelling (Anderman et al., 1996). A version of MODFLOWP modified to include observations of the path and time of advective travel was used. The observed path and time of advective travel were inferred from concentration measurements of a sewage plume that was introduced into the groundwater system in the early 1940s. Other data included in the regression were hydraulic heads and net flow into the groundwater system from Ashumet Pond. Estimated parameters included a homogeneous aquifer hydraulic conductivity, the hydraulic conductivity of the bed of Ashumet Pond, flow rates along the northern model boundary and at the sewage disposal site, and a uniform areal recharge rate. Using the model construction described by LeBlanc (1984), it was found that the best fit to the advective travel, head and flow data was achieved with reasonable parameter values except for the areal recharge rate, which was about half the expected rate.
In addition, the recharge rate was estimated precisely enough that a linear 95% confidence interval constructed about the estimate did not come close to including reasonable values. This indicates that the data used are sufficient to distinguish between an accurate and an inaccurate model, and that the constructed model was significantly inaccurate in some way. A variety of potential model construction problems that might cause the unrealistic recharge rate were tested by using regression to find the best-fit parameter values in each case. One of the considered alternatives proved to be a plausible explanation of the problem; it involved the constant head boundary condition imposed along the southern boundary and the southern parts of the east and west boundaries of the model. These boundaries represent surface water bodies. The elevations of the surface water bodies were derived from topographic maps with a 10 foot contour interval, and these elevations were used as the defined head at these boundaries, as is common practice in the development of groundwater models. It was found that if these "measured" heads were consistently higher (by just 1 foot) than the surface water body levels at the time when the hydraulic heads used in the regression were measured, the estimated recharge rate would be about half of a reasonable rate. This indicates that, in systems such as this in which


surface water bodies define boundary conditions through which most or all of the water flows, and groundwater levels are only on the order of 5 feet above the surface water bodies, it is important to measure the elevation of the surface water bodies accurately at the same time as groundwater heads are measured.

Colorado School of Mines test site

McKenna & Poeter (1995) combined geological and geophysical information into a description of hydrological heterogeneity. The relationship between hydrofacies (defined using geological description of cuttings and cores, geophysical logs, and permeability measurements on core and from packer tests) and seismic velocity was defined subjectively, and discriminant analysis was used to determine the probability that a given location belonged to a given hydrofacies. The hydrofacies were described with indicator values. Multiple-indicator geostatistical realizations of hydrofacies at locations between boreholes were generated first using only hard data (data with negligible uncertainty, i.e. those with greater than 95% probability of belonging to the hydrofacies), and again with these data supplemented by soft data (data with non-negligible uncertainty, i.e. those with less than 95% probability of belonging to a hydrofacies, and seismic tomography measurements). The plausibility of each stochastic hydrofacies zonation was determined via parameter estimation using MODFLOWP. Use of soft data, coupled with elimination of realizations for which parameter estimation revealed a poor fit and/or unreasonable parameter values, resulted in narrower confidence limits on the estimated values. This sensitivity to fine-scale, geologically based zonation patterns is fortunate because it allows hydraulic head data and prior information on reasonable parameter values to be used to delineate the small-scale heterogeneity that is critical to the migration of contaminants, and it reduces the uncertainty associated with predicted flow through the site.
Grindsted Old Landfill

This study is the first to use nonlinear regression in the calibration of a three-dimensional advective-transport model (Christiansen et al., 1995). The system is characterized by a steady state flow field and layered, apparently homogeneous hydrographic units. Data included in the regression were hydraulic heads and concentrations. Estimated parameters included horizontal and vertical hydraulic conductivities for three relatively permeable layers, the vertical hydraulic conductivity of a confining unit, and the longitudinal dispersivity. Regression results using only the hydraulic head data produced unreasonable estimates of several parameters, but the confidence intervals on these parameter estimates were extremely large and included reasonable values. This indicated that the head data were insufficient to determine whether model construction error was a problem. When concentration data were also included in the regression, the residuals were unbiased, normally distributed, and randomly distributed in space, and all estimated parameter values were reasonable, indicating that model construction was reasonably accurate.
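The reasoning used in the Otis and Grindsted examples, comparing a linear confidence interval with a range of reasonable values, can be sketched as follows. This is an illustrative assumption-laden sketch rather than the MODFLOWP calculation: the function names are hypothetical, the interval uses the usual linearized covariance s²(JᵀJ)⁻¹, and a normal quantile is used in place of the Student t value that is customary for small samples.

```python
from statistics import NormalDist

import numpy as np

def linear_confidence_intervals(jacobian, residuals, estimates, level=0.95):
    """Linearized confidence intervals about least squares estimates."""
    J = np.asarray(jacobian, dtype=float)
    r = np.asarray(residuals, dtype=float)
    theta = np.asarray(estimates, dtype=float)
    n_obs, n_par = J.shape
    s2 = float(r @ r) / (n_obs - n_par)           # estimated error variance
    cov = s2 * np.linalg.inv(J.T @ J)             # linearized parameter covariance
    std_err = np.sqrt(np.diag(cov))
    # Assumption: normal quantile (large-sample approximation to Student t).
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return theta - z * std_err, theta + z * std_err

def interval_excludes(lower, upper, reasonable_min, reasonable_max):
    """True where the interval does not overlap the reasonable range at all."""
    return (upper < reasonable_min) | (lower > reasonable_max)

# Hypothetical one-parameter example: three observations, unit sensitivities.
lower, upper = linear_confidence_intervals(
    jacobian=[[1.0], [1.0], [1.0]], residuals=[1.0, -1.0, 0.0], estimates=[2.0]
)
print(lower, upper)
print(interval_excludes(lower, upper, 5.0, 10.0))
```

A narrow interval that excludes all reasonable values points to model construction error, as in the Otis recharge estimate. A very wide interval that still includes reasonable values, as in the head-only Grindsted regression, indicates only that the data cannot settle the question.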


REFERENCES

Anderman, E. R., Hill, M. C. & Poeter, E. P. (1996) Two-dimensional advective transport in groundwater flow parameter estimation. Groundwater (in press).
Anderson, M. P. & Woessner, W. W. (1992) Applied Groundwater Modeling: Simulation of Flow and Advective Transport. Academic Press.
Christiansen, H., Hill, M. C., Rosbjerg, D. & Jensen, K. H. (1995) Three-dimensional inverse modeling using heads and concentrations at a Danish landfill. In: Models for Assessing Groundwater Quality (ed. by B. J. Wagner, T. H. Illangesekare & K. H. Jensen) (Proc. Boulder Symp., Colorado, July 1995), 167-175. IAHS Publ. no. 227.
Hill, M. C. (1992) A computer program (MODFLOWP) for estimating parameters of a transient, three-dimensional groundwater flow model using nonlinear regression. USGS Open File Report 91-484.
LeBlanc, D. R. (1984) Digital model of solute transport in a plume of sewage contaminated ground water. In: Movement and fate of solutes in a plume of sewage-contaminated water. USGS Open File Report 84-475 (ed. by D. R. LeBlanc) (Cape Cod, Massachusetts).
McKenna, S. A. & Poeter, E. P. (1995) Field example of data fusion for site characterization. Wat. Resour. Res. 33(6), 3229-3240.
Poeter, E. P. & Hill, M. C. (1996) Inverse models: A necessary next step in groundwater modeling. Groundwater (in press).
Poeter, E. P. & McKenna, S. A. (1995) Reducing uncertainty associated with ground-water flow and transport predictions. Groundwater 33(6), 899-904.

Calibration and Reliability in Groundwater Modelling (Proceedings of the ModelCARE 96 Conference held at Golden, Colorado, September 1996). IAHS Publ. no. 237, 1996.


Diagnosis of structural identifiability in groundwater flow and solute transport equations

DAVID E. SPEED & DAVID P. AHLFELD
Department of Civil and Environmental Engineering, University of Connecticut, Storrs, Connecticut 06269, USA

Abstract This paper describes a diagnostic methodology which we use in conjunction with nonlinear least squares parameter estimation in order to assess the cause and determine the consequence of structurally related parameter identifiability problems. The methodology utilizes a singular value decomposition of the parameter sensitivity matrix and subsequently inter-relates the information content of individual data points, the structure of the model, the stability of the least squares solution and the statistical reliability of the estimated parameters.

INTRODUCTION

Parameter estimation by means of obtaining inverse solutions to the groundwater flow equations has been an active research topic for over 25 years (Yeh, 1986). More recently, a growing body of research has been devoted to extending these techniques to estimate solute transport model parameters. However, despite this rich and active research history, automated parameter estimation techniques are seldom used by practitioners. Aside from computational difficulties and the lack of available software, one of the principal barriers to widespread use of automated parameter estimation is the lack of clear-cut criteria governing the conditions under which a particular inverse problem is solvable. The objective of this paper is to examine parameter identifiability in terms of the algorithmic requirements and limitations of Gauss-Newton type methods, and in particular to focus on how model structure affects identifiability and the statistical validity of the estimated parameters.

A parameter is termed identifiable if it can be uniquely estimated from the data. Identifiability problems can be classified as resulting from two principal causes: data dependent causes and model dependent causes (Seber & Wild, 1989). When an identifiability problem results from the data, selection of a new set of data observation points (in space and time) or more accurately determined data will mitigate the identifiability problem. Hence, removal or minimization of data dependent causes of non-identifiability falls within the realm of experimental design. Several researchers, including Sun & Yeh (1990a,b) and Knopman & Voss (1987) in particular, have made significant progress in evaluating the role of data dependent identifiability problems for the groundwater solute transport problem. Model dependent identifiability problems, however, are a direct consequence of the structure of the model.
That model dependent identifiability problems do occur in groundwater solute transport is evidenced by the simple and obvious case of the perfect linear correlation between groundwater velocity and the linear equilibrium retardation coefficient, the result of


which is that the two parameters cannot be simultaneously estimated using data for a single solute. In this simple case the retardation coefficient essentially functions as a scaling coefficient, and the identifiability problem is obvious from a simple inspectional analysis of the governing transport equations. As more complex and, as is often the case, more realistic transport processes are added to the model, identifiability problems can be anticipated to manifest not necessarily as a complete inability to achieve a solution, but rather more subtly, in the form of decreased reliability of the estimated parameters. Consequently, we divide potential identifiability problems into two categories by degree: acute and chronic. Acute identifiability problems manifest as rank deficiency, with the consequence that the particular parameter or parameter combination simply cannot be identified. On the other hand, what we term chronic identifiability problems manifest as unstable, poorly determined parameters with high estimation variances.

Sorooshian & Gupta (1985) examined model structure dependent identifiability concerns in rainfall-runoff models. They used a measure of the degree to which individual parameter variances are affected by interdependencies in the model structure to determine the presence of identifiability problems caused by model structure. Carrera & Neuman (1986) examined model structure dependent identifiability concerns in the groundwater flow equations, using an eigenanalysis of the parameter sensitivity cross-products matrix, which they in turn related to the parameter covariance matrix.
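The acute case described above for velocity and retardation can be illustrated with a minimal sketch. It assumes a purely advective arrival time, t = R x / v, as a simplified stand-in for the full transport equations; the function names and numerical values are hypothetical. Because the two sensitivity columns are exact multiples of one another, the second singular value of the sensitivity matrix vanishes and only the ratio R / v can be estimated.

```python
import numpy as np

def arrival_time(x, velocity, retardation):
    """Advective arrival time of a retarded solute at distance x (assumed model)."""
    return retardation * np.asarray(x, dtype=float) / velocity

def sensitivity_matrix(x, velocity, retardation):
    """Columns: d(arrival time)/d(velocity) and d(arrival time)/d(retardation)."""
    x = np.asarray(x, dtype=float)
    d_dv = -retardation * x / velocity**2
    d_dr = x / velocity
    return np.column_stack([d_dv, d_dr])

x_obs = np.array([10.0, 20.0, 40.0, 80.0])        # observation distances (m)
J = sensitivity_matrix(x_obs, velocity=0.5, retardation=2.0)

# A vanishing second singular value signals rank deficiency.
singular_values = np.linalg.svd(J, compute_uv=False)
print(singular_values)
```

Adding more arrival-time observations of the same solute does not change this outcome, since every new row of the matrix has the same proportionality between its two entries.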
In this work we introduce alternative diagnostic methods which utilize a singular value decomposition of the parameter sensitivity matrix and which can be used to interrelate the information content of individual data points, the structure of the model, the stability of the least squares solution, the statistical reliability of the estimates and the identifiability of specific parameters. In the first section of this paper we outline the Gauss-Newton type equations, laying the groundwork for using the conditioning of the linearized least squares matrix as a first-order identifiability criterion. The second section introduces the singular value decomposition and shows several related computations which have application to the identifiability problem as described above.

PARAMETER IDENTIFICATION VIA NLLS

The nonlinear least squares (NLLS) parameter estimation process is predicated upon the hypothesis that the best or optimal estimate of the true field parameters is that set of estimated parameters for which the sum of squared deviations between observed and simulated data is minimized. We pose the nonlinear estimation model in the usual sense as:

$$y_i = \eta(x_i; \theta) + \varepsilon_i \qquad (i = 1, 2, \ldots, N) \qquad (1)$$

where $y_i$ is the $i$th observed response, $\eta(x_i; \theta)$ is the simulated output corresponding to the unknown parameter vector $\theta$, $x_i$ is the vector of independent variables (e.g. time and location of the $i$th observation), and $\varepsilon_i$ are the residuals, which represent the error between the simulated and observed output and which we take to be independent and normally distributed random variables with mean zero and variance $\sigma^2$. The least squares estimate is that which minimizes the least squares objective function:


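$$S(\theta) = \sum_{i=1}^{N} \left[ y_i - \eta(x_i; \theta) \right]^2 = \| y - \eta(\theta) \|_2^2 \qquad (2)$$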

where the right hand side of equation (2) uses vector notation. Computationally, minimization of the least squares objective function requires an algorithm which iteratively updates the trial parameters and computes a new set of modelled heads and concentrations for insertion in the objective function and subsequent comparison with the convergence criteria. This task can be accomplished by substituting a truncated Taylor series approximation for the real model function $\eta(\theta)$. The purpose of the substitution is to provide a local approximation of the true objective function which has a readily determined local minimum. Our focus lies with Gauss-Newton type methods, which use a first-order, or linearized, approximation of the model function. Substituting a first-order Taylor series gives:

$$S(\theta) \approx \| y - \eta(\theta_0) - J(\theta_0)(\theta - \theta_0) \|_2^2 \qquad (3)$$

where $J(\theta)$ is the parameter sensitivity matrix, which we describe below. Equation (3) is a quadratic approximation of the true objective function, $S(\theta)$. Taking the derivative of equation (4) with respect to the parameters and setting it to zero, the minimum of the quadratic model is found at:


(7)

where it is clear that small perturbations in the data are amplified when the singular values $\sigma_i$ are small. The superscript (+) in equation (7) denotes the pseudoinverse and, in particular (Golub & Van Loan, 1989), $S^+ = \mathrm{diag}(1/\sigma_1, \ldots, 1/\sigma_n, 0, \ldots, 0)$.
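The amplification of data perturbations by small singular values can be checked numerically. The following minimal sketch solves a linear least squares problem through the SVD pseudoinverse and perturbs the data along the left singular vector belonging to the smallest singular value. The matrix, the perturbation size and the function name are hypothetical, chosen only for illustration.

```python
import numpy as np

def pseudoinverse_solve(J, r, rel_tol=1e-12):
    """Minimum-norm least squares solution of J d = r via the SVD."""
    U, s, Vt = np.linalg.svd(np.asarray(J, dtype=float), full_matrices=False)
    inv_s = np.zeros_like(s)
    keep = s > rel_tol * s[0]          # singular values treated as nonzero
    inv_s[keep] = 1.0 / s[keep]        # diagonal of S+; the rest stay zero
    return Vt.T @ (inv_s * (U.T @ np.asarray(r, dtype=float)))

# Hypothetical 3 x 2 sensitivity matrix with nearly parallel columns.
J = np.array([[1.0, 1.000],
              [1.0, 1.001],
              [1.0, 0.999]])
U, s, Vt = np.linalg.svd(J, full_matrices=False)

r = J @ np.array([1.0, 2.0])           # error-free data for the step (1, 2)
eps = 1.0e-6                           # size of the data perturbation
step_clean = pseudoinverse_solve(J, r)
step_noisy = pseudoinverse_solve(J, r + eps * U[:, -1])

# Ratio of the change in the solution to the change in the data.
amplification = np.linalg.norm(step_noisy - step_clean) / eps
print(s, amplification)
```

In this sketch the amplification equals the reciprocal of the smallest singular value, so a perturbation in the sixth decimal place of the data moves the solution in roughly the third.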

COLUMN VECTOR INDEPENDENCE

The condition of column interdependence has been termed collinearity by statisticians, or simply ill-conditioning by numerical analysts. Exact dependence corresponds to rank deficiency, but even mild interdependence between the columns of the sensitivity matrix $J$ leads to instability in the least squares solution and ultimately to statistically ill-determined parameters. To diagnose this situation we note that near-linear dependencies between the sensitivity matrix column vectors can be defined relative to the usual condition for linear dependence between vectors (Datta, 1995), as follows. Near-linear dependence between the set of vectors $J = [j_1, j_2, \ldots, j_n]$ occurs when there exists a corresponding set of scalars $\{a_1, a_2, \ldots, a_n\}$, not all of which are zero, such that:

$$\sum_{k=1}^{n} a_k j_k \approx 0 \qquad (8)$$


The coefficients $a_k$ can be obtained from the singular value decomposition. Gunst & Mason (1980) have previously shown that an eigenanalysis can be used to provide the coefficients for equation (8). However, we show with reference to Gunst & Mason (1980, p. 119) that the coefficients can be obtained by a similar argument, but more accurately and directly, using the right singular vectors corresponding to small singular values of $J$. From the SVD, $U S V^T = J$, we have $J v_i = \sigma_i u_i$. A property of singular vectors is that their elements never exceed 1.0 in magnitude, so if the singular value is near zero it follows that $\sigma_i u_i \approx 0$, and so $J v_i \approx 0$. Thus we have that $J v_i = \sum_{r=1}^{n} v_{ri} j_r \approx 0$. One equation of the form of equation (8) can be written for each near-zero singular value. The coefficients $a_k$ in equation (8) are given by the elements of the corresponding right singular vector, each one of which traces back to a corresponding column in the parameter sensitivity matrix. The number of small singular values indicates the number of dependencies, and their magnitudes relative to the largest singular value indicate the importance of the dependencies. Just as the ratio of the largest to the smallest singular value provides the matrix condition number, the ratio of the largest to the $i$th singular value provides a corresponding measure termed the condition index. Thus the impact of each individual small singular value can be assessed on a scale relative to the matrix condition number.

As an example, consider a coupled groundwater flow and solute transport estimation problem involving eight active parameters, as listed in Table 1. The full eight-parameter estimation problem is nominally full rank, but the condition number is 3.92 × 10⁸, and thus the system is highly ill-conditioned. The raw parameter estimation variances were computed and are listed in Table 1, along with the true parameter values and the coefficient of variation.
The coefficient of variation is given as the ratio of the standard deviation of the $p$th parameter to the magnitude of the $p$th parameter value, or $[\mathrm{var}(\theta_p)]^{1/2} / \theta_p$, and thus, being nondimensional, provides a convenient measure of the relative uncertainty in the parameters. In this case five of the eight parameters, $K_x$, $K_y$, $\theta$, $B$ and $Q_{in}$, exhibit high variances and high raw coefficients of variation, and we deem them to be chronically unidentifiable. The remaining three parameters, $\alpha_L$, $\alpha_T$ and $C_{in}$, are shown to have desirable raw coefficients of variation and thus in principle to be identifiable. To establish the source of the high estimation variances we examine whether the model structure has resulted in interdependencies between the columns of the sensitivity matrix. The column interdependencies are found from the singular value decomposition as given in Table 2. This table shows two singular values whose magnitudes are particularly small in relation to the largest singular value, indicating two separate sources of ill-conditioning. Two equations of the form of equation (8) then need be

Table 1 Raw parameter estimation variances and coefficients of variation for the full 8 parameter set.

Parameter    True value    Raw variance    Raw coefficient of variation
K_x          24.0          7.57 × 10⁶      114.65
K_y          18.0          4.26 × 10⁶      114.65
θ            0.3           1.18 × 10³      114.65
α_L          30.0          5.98 × 10⁻²     0.0082
α_T          5.0           5.85 × 10⁻⁴     0.0048
B            20.0          4.13 × 10⁶      101.58
Q_in         2000.0        3.43 × 10⁹      29.37
C_in         100.0         6.25 × 10⁻³     0.0008
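The diagnostics described in this section (condition indices, dependency coefficients taken from right singular vectors, and coefficients of variation) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the condition index threshold and the small test matrix are hypothetical, while the two variance and value pairs are the K_x and K_y entries of Table 1.

```python
import numpy as np

def collinearity_diagnostics(J, index_threshold=1.0e3):
    """Singular values, condition indices and near-dependency coefficients of J."""
    U, s, Vt = np.linalg.svd(np.asarray(J, dtype=float), full_matrices=False)
    with np.errstate(divide="ignore"):
        condition_indices = s[0] / s          # largest / i-th singular value
    weak = np.where(condition_indices > index_threshold)[0]
    # Each row holds the coefficients a_k of one near-dependency (equation 8).
    dependencies = Vt[weak]
    return s, condition_indices, dependencies

def coefficient_of_variation(variances, values):
    """Standard deviation divided by the magnitude of the parameter value."""
    variances = np.asarray(variances, dtype=float)
    values = np.asarray(values, dtype=float)
    return np.sqrt(variances) / np.abs(values)

# Hypothetical sensitivity matrix whose third column is almost the sum of
# the first two, so that j_1 + j_2 - j_3 is approximately zero.
t = np.linspace(0.0, 1.0, 20)
J = np.column_stack([t, t**2, t + t**2 + 1.0e-8 * np.sin(37.0 * t)])
s, indices, deps = collinearity_diagnostics(J)
print(indices)
print(deps)

# K_x and K_y from Table 1: raw variances and true values.
cv = coefficient_of_variation([7.57e6, 4.26e6], [24.0, 18.0])
print(cv)
```

The single flagged right singular vector has entries of nearly equal magnitude, with the third of opposite sign to the first two, which recovers the dependency built into the test matrix. The threshold that separates "small" singular values is a judgment call and is left as a parameter here.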

[Table 2: singular value decomposition of the parameter sensitivity matrix; entries not recoverable]