Kriging and Conditional Geostatistical Simulation Based on Scale-Invariant Covariance Models

Diploma Thesis by Rolf Sidler Supervisor: Prof. Dr. Klaus Holliger

INSTITUTE OF GEOPHYSICS DEPARTMENT OF EARTH SCIENCE

October 20, 2003

Copyright Diplomarbeiten Departement Erdwissenschaften, ETH Zürich. ‘Der/Die Autor/in erklärt sich hiermit einverstanden, dass die Diplomarbeit für private und Studien-Zwecke verwendet und kopiert werden darf. Hingegen ist eine Vervielfältigung der Diplomarbeit oder die Benutzung derselben zu kommerziellen Zwecken ausdrücklich untersagt. Wenn wissenschaftliche Resultate aus der Arbeit verwendet werden, müssen diese wie in allen wissenschaftlichen Arbeiten üblich entsprechend zitiert werden.'

Copyright Diploma Thesis, Department of Earth Sciences, ETH Zürich. ‘The author hereby agrees that the diploma thesis may be copied and used for private and personal scholarly use. However, this is to emphasize that the making of multiple copies of the thesis or the use of the thesis for commercial purposes is strictly prohibited. When making use of the results found in this thesis, the normal scholarly methods of citation are to be followed.'

Unterschrift des/der Diplomierenden Signature of Diploma-Student

Contents

1 Introduction

2 Theory
  2.1 First- and Second-Order Stationarity
  2.2 Covariance Function
  2.3 von Kármán Covariance Model
  2.4 Kriging Interpolation
    2.4.1 Search Neighborhood
    2.4.2 Basic Types of Kriging
    2.4.3 Simple Kriging
    2.4.4 Ordinary Kriging

3 Description of the Implementation
  3.1 Program Options
  3.2 Overall Structure of the Code
  3.3 Description of the Individual Functions
    3.3.1 vebyk
    3.3.2 kriging3
    3.3.3 inputmatrix
    3.3.4 neighborhood
    3.3.5 buildbigc
    3.3.6 buildsmallc
    3.3.7 displacement3
    3.3.8 covcalc
    3.3.9 ordinary
    3.3.10 rotation

4 Testing of the Implementation
  4.1 Comparison with a published example
  4.2 Structural Anisotropy
  4.3 Search Neighborhood
  4.4 Cross-Validation
    4.4.1 Leaving-One-Out Cross-Validation
    4.4.2 Jackknife Cross-Validation
  4.5 Sensitivity of Kriging Estimation with Regard to Auto-Covariance Model

5 Application of Kriging to Conditional Geostatistical Simulations
  5.1 Unconditional Simulation
  5.2 Conditional Simulations of Porosity Distributions in Heterogeneous Aquifers
    5.2.1 Boise
    5.2.2 Kappelen

6 Conclusions

7 Acknowledgments

Appendix
  A Kriging
  B Boise
  C Kappelen

Abstract

Kriging is a powerful spatial interpolation technique, especially for irregularly spaced data points, and is widely used throughout the earth and environmental sciences. The estimate at an unsampled location is given as the weighted sum of the surrounding observed points. The weighting factors depend on a model of spatial correlation: they are calculated by minimizing the error variance of a given or assumed auto-covariance model for the data with regard to the spatial distribution of the observed data points. As part of this work, I have developed a flexible and user-friendly Matlab program called vebyk (value estimation by kriging), which performs ordinary kriging and can be easily adapted to other kriging methods. Extensive tests demonstrate that (i) heavily clustered data require an adaptation of the search neighborhood, (ii) kriging may cause artefacts in anti-persistent media when using the "correct" auto-covariance model, and (iii) the best performance for kriging scale-invariant media is obtained when using smoother auto-covariance models than those indicated by the observed dataset. Conversely, my results indicate that kriging is relatively insensitive to the absolute value of the correlation lengths used in the auto-covariance model as long as the structural aspect ratio is approximately correct. Finally, this kriging algorithm has been used as the basis for conditional geostatistical simulations of the porosity distribution in two heterogeneous sedimentary aquifers. The stochastic simulations were conditioned by porosity values derived from neutron porosity logs and from georadar and seismic crosshole tomography. The results indicate that conditional simulations in "hydrogeophysics" will prove to be as useful for quantitatively integrating numerous datasets of widely differing resolution, coverage and "hardness" as they have been found to be in the more established field of reservoir geophysics.

Chapter 1

Introduction

Gathering information about subsurface properties is always a difficult task, as direct access is generally not possible. Measurements are therefore more sparsely sampled than desired. Although more samples cannot be generated without additional measurements, it is possible to take advantage of characteristic properties of a dataset and to estimate values at unsampled locations that fit these characteristics. The mathematical methods to evaluate these characteristics lie in the realm of statistics, but for reliable results in the earth sciences, the characteristics have to fit not only mathematical but also geological criteria. For this reason, the method for estimating values at unsampled locations is referred to as geostatistics.

Basic classical statistical data analysis consists of data posting, computation of means and variances, scatter-plots to investigate the relationship between two variables, and histogram analysis. In the early 1950s, these techniques were found to be unsuitable for estimating disseminated ore reserves. D. G. Krige, a South African geoscientist, and H. S. Sichel, a statistician, developed a new estimation technique (Krige, 1951). The method was based on statistical properties of the investigated region. Matheron (1965) formalized the concepts of the new branch of geoscience that combines "structure and randomness" and proposed the name geostatistics for this new scientific discipline. By the early 1970s, kriging had proved to be very useful in the mining industry. With the advent of high-speed computers, geostatistics spread to other areas of earth science and has become more popular ever since.

Journel and Huijbregts (1978) summarized the state of the art of "linear" geostatistics with an emphasis on practical and operational efficiency. They also described the then novel technique of conditional simulation, which leaves the field of "linear" statistics. The goal of a conditional simulation is to model a region numerically so that the auto-covariance of the model complies with the auto-covariance of the observed data and the model data coincide with the data at sampled locations. As most conditional simulations go hand in hand with unconditional simulations, the quality of which depends strongly on computational capabilities, the technique has only recently seen wider application. Conditional simulation has proved to be essential for the planning and control of mining and hydrocarbon recovery processes as well as for the optimization of ground-water and contaminant flow simulations.

In the course of this diploma thesis I have developed a Matlab implementation of the kriging algorithm.


A major motivation for this work is that there are few available codes and that these codes are generally inflexible and/or poorly documented. Given the rapidly increasing importance of geostatistics in applied and environmental geophysics, the availability of a comprehensible, expandable and well documented code was considered to be essential. After outlining the theory in Chapter 2 and describing the implementation of the code in Chapter 3, I present various tests of the code in Chapter 4 to convey confidence in its reliability and to point to issues that have to be considered when interpolating with kriging. Conditional simulations for "hydrogeophysical" field data from Boise (Idaho, USA) and Kappelen (Bern, Switzerland) are presented in Chapter 5. The simulations were obtained by adapting stochastic models to measured data through kriging interpolation. Finally, the Appendices contain results of kriging with different ν-values and correlation lengths as well as additional realizations of the presented conditional simulations.

Chapter 2

Theory

The goal of this chapter is to provide a summary of the theory needed for understanding the implementation of the kriging technique realised as part of this work. Detailed descriptions of the theory of kriging are given, for example, by Armstrong (1998); Isaaks and Srivastava (1989); Journel and Huijbregts (1978); Kelkar and Perez (2002) and Kitanidis (1997). Interpolation with kriging is based on the spatial relationship of random, spatial variables. A random variable can take numerical values according to a certain probability distribution. For instance, the result of casting an unbiased die can be considered as a random variable that can take one of six equally probable values (Journel and Huijbregts, 1978). The spatial relationship is given by the covariance function or, equivalently, by the semi-variogram. A thorough understanding of the concept of covariance is therefore necessary for working with kriging interpolation. For this reason, this topic is covered before the methodological foundations of kriging are outlined.

2.1 First- and Second-Order Stationarity

Geostatistics tries to predict the values of a random spatial variable at unsampled locations by using values at sampled locations. For reasonable predictions some basic assumptions have to be made. An inherent problem of this approach is that no information is available for the unsampled locations, and a verification of the assumptions is not possible until these missing samples are available. Nevertheless, practice has proven geostatistics to be a powerful tool for improving data-based subsurface models, for optimizing the recovery and sustainable use of natural resources, and for assessing environmental hazards. A key assumption in geostatistics is that the data are first- and second-order stationary. First-order stationarity implies that the arithmetic mean within the considered region is constant and independent of the size of the region and of the sampling locations. This in turn implies that local means inside a region are constant and correspond to the global mean of the entire region. The mathematical definition of first-order stationarity is more general:

f[X(\vec{u})] = f[X(\vec{u} + \vec{\tau})],   (2.1)

where X is a random variable, f[...] is any function of a random variable, and \vec{u} and \vec{u} + \vec{\tau} are two locations of the random variable. For all practical intents and purposes, however, the condition of a constant arithmetic mean within a certain region is most important for ensuring first-order stationarity. Second-order stationarity requires that the relationship between two data values depends only on the distance between the two points but is otherwise independent of their absolute positions. Mathematically, this is expressed as:

f[X(\vec{u}_1), X(\vec{u}_1 + \vec{\tau})] = f[X(\vec{u}_2), X(\vec{u}_2 + \vec{\tau})].   (2.2)

2.2 Covariance Function

The covariance function describes the spatial relationship of the data points as a function of their distance vectors. The definition of the covariance function can be expressed in terms of the expected value. The expected value is the most probable value for a random data distribution and is defined as:

E[X(k)] = \int_{-\infty}^{\infty} X \, p(X) \, dX = \mu_X,   (2.3)

where X(k) is a random variable and p(X) is its probability density (Bendat and Piersol, 2000). The variance of a random variable is given by:

\sigma^2 = Var(X) = E(X^2) - [E(X)]^2,   (2.4)

where \sigma is the standard deviation. The cross-covariance function describes the relation between distance and the variance. It is defined as:

C(X, Y) = E[(X - \mu_X) \cdot (Y - \mu_Y)],   (2.5)

where \mu_X and \mu_Y are the expected values of the two random variables X and Y and therefore constants. A constant factor can be taken out of the expression for the expected value (see equation 2.3) and the above definition of the cross-covariance function can be rewritten as:

C(X, Y) = E(XY) - E(X)E(Y).   (2.6)

For the so-called auto-covariance C(X, X), the value at zero distance, or zero lag, thus corresponds to the variance (equation 2.4). With increasing lag the covariance decreases, depending on the spatial relationship of the dataset. If there is cyclicity in the dataset, the auto-covariance will mirror this cyclicity as a function of the lag. It should be noted, however, that the presence of cyclicity does, in principle, violate the assumption of second-order stationarity. The auto-covariance function can alternatively be calculated through the so-called auto-correlation function (Bendat and Piersol, 2000):

R_{XX}(\tau) = E[X(u)X(u + \tau)],   (2.7)


where u is a location and \tau is the lag to this location. The auto-correlation function differs from the covariance function only for non-zero mean values. The relation between the two is:

C_{XX}(\tau) = R_{XX}(\tau) - \mu_X^2.   (2.8)

This is important, as the Wiener-Khinchine theorem describes the relation between the auto-correlation and the spectral density function, which is often used to generate random data with a given auto-covariance function. N. Wiener and A. I. Khinchine proved independently of each other, in the USA and in the USSR, respectively, that the spectral density function S_{XX} is the Fourier transform of the correlation function (Bendat and Piersol, 2000):

S_{XX}(f) = \int_{-\infty}^{\infty} R_{XX}(\tau) e^{-j 2\pi f \tau} \, d\tau,   (2.9)

where f is the frequency, \tau the lag and j = \sqrt{-1}. The validity of this theorem rests on the condition that the integral over the absolute values of the correlation function is finite, which is indeed always the case for finite record lengths. The inverse Fourier transform of the spectral density function therefore yields the correlation function:

R_{XX}(\tau) = \int_{-\infty}^{\infty} S_{XX}(f) e^{j 2\pi f \tau} \, df.   (2.10)
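To illustrate this relation, the following Matlab sketch generates a one-dimensional random series with a prescribed exponential auto-covariance C(r) = σ² e^{-r/a} (see equation 2.14 below) by randomizing the phases of its spectral density. This is an illustrative example only, not part of vebyk; it assumes periodic boundaries and ignores the handling of the zero-frequency component, and all variable names are chosen freely.

% Illustrative sketch: 1-D random series with a prescribed exponential
% auto-covariance, generated via the spectral density (Wiener-Khinchine).
n  = 1024;                            % number of samples
a  = 20;                              % correlation length in samples
s2 = 1;                               % variance sigma^2

lag = [0:n/2, -n/2+1:-1]';            % periodic lag vector
Cxx = s2*exp(-abs(lag)/a);            % target auto-covariance C_XX(tau)

Sxx = max(real(fft(Cxx)), 0);         % discrete spectral density, cf. eq. (2.9)
phi = 2*pi*rand(n,1);                 % independent random phases
X   = sqrt(2)*real(ifft(sqrt(n*Sxx).*exp(1i*phi)));   % one realization

% The circular experimental auto-covariance of X approximately reproduces Cxx:
Cexp = real(ifft(abs(fft(X - mean(X))).^2))/n;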

An alternative way to describe the spatial relationship of a dataset is the variogram, or the semi-variogram (i.e., half the variogram). The semi-variogram is commonly used in geostatistical data analysis because, as opposed to the covariance function, it can also be calculated if the mean of a dataset is not known. The semi-variogram is therefore more convenient for analyzing the spatial relationship of an unknown dataset. It is defined as:

\gamma(\vec{\tau}) = \frac{1}{2} Var[X(\vec{u}) - X(\vec{u} + \vec{\tau})],   (2.11)

where \vec{u} is a location and \vec{\tau} is the lag to this location. For a first- and second-order stationary function, the relation between the auto-covariance function and the semi-variogram is given by (Kitanidis, 1997):

\gamma(\vec{\tau}) = C(0) - C(\vec{\tau}).   (2.12)

As the auto-covariance function starts at the variance of the data and decreases with increasing lag, the variogram starts at zero and increases with increasing lag (Figure 2.1). If the data are stationary, the semi-variogram will reach a sill where \gamma(\vec{\tau}) reaches the value of the variance. The distance \tau at which the sill is reached is called the range. When the semi-variogram does not reach a sill and the auto-covariance is therefore not defined, a "pseudo-covariance" can be defined, such that \gamma(\tau) = A - C(\tau), where A is a positive value of \gamma(\tau) at a distance \tau that lies beyond the region of interest. This does not affect the calculation of the kriging weights, but it does affect the error variance, which is a measure of the accuracy of the estimation (Journel and Huijbregts, 1978; Kelkar and Perez, 2002).

Figure 2.1: Schematic illustration of the relationship between the auto-covariance function, starting at C(0) = σ², and the corresponding semi-variogram as a function of lag.

The "experimental semi-variogram" is calculated from the given data and is therefore a discrete and often irregularly sampled function. The experimental variogram can then be approximated through a continuous parameter model. Due to the limited scale of sampled values, it is, however, often not possible to estimate the correct variogram or auto-covariance function of a region (Western and Blöschl, 1999). As we shall see, the use of the auto-covariance function instead of the semi-variogram is preferable for the purposes of kriging, because the equations are then largely identical for simple and ordinary kriging.

2.3 von Kármán Covariance Model

Theodore von Kármán introduced a novel family of auto-covariance functions to characterize the seemingly chaotic, random velocity fields observed in turbulent media (von Kármán, 1948):

C(r) = \frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)} (r/a)^{\nu} K_{\nu}(r/a),   (2.13)

where \Gamma is the gamma function, K_{\nu} is the modified Bessel function of the second kind of order \nu with 0 \le \nu \le 1, r is the lag, a is the correlation length, and \sigma^2 is the variance. The media described by this family of correlation functions are self-affine for 0 \le \nu \le 0.5 and self-similar for 0.5 \le \nu \le 1, at distances considerably shorter than the correlation length a (Klimeš, 2002). Figure 2.2 shows von Kármán correlation functions for different values of \nu. Small \nu-values characterize rapidly decaying covariances and thus highly variable media. The von Kármán auto-covariance function with \nu = 0.5 corresponds to the well-known exponential auto-covariance function:

C(r) = \sigma^2 e^{-r/a}.   (2.14)


Figure 2.2: Set of one-dimensional von Kármán covariance functions with a correlation length of a = 1.5, varying ν, and variance σ² = 1 (auto-covariance plotted versus r/a).

2.4 Kriging Interpolation

Kriging is a linear interpolation technique, in which the value \hat{X} at an unsampled location \vec{u}_0 is estimated as:

\hat{X}(\vec{u}_0) = \sum_{i=1}^{n} \lambda_i X(\vec{u}_i),   (2.15)

where X(\vec{u}_i) are the values at neighboring sampled locations and \lambda_i are the kriging weights assigned to these values. The estimated value is therefore a weighted average of the surrounding sampled values. The "crux" in kriging is to compute the kriging weights, which depend on the spatial relationship of the data. This spatial relationship is quantified by the semi-variogram or, equivalently, by the auto-covariance function, which is inferred from available data and/or constrained by complementary or a priori information. Kriging is an unbiased estimator and is therefore referred to by the acronym "BLUE", which stands for best linear unbiased estimator.

2.4.1 Search Neighborhood

The search neighborhood defines the sampled points used for the estimation of values at unsampled locations. Using all sampled values would lead to the most accurate solution. In practice, there are, however, several reasons not to do so and to choose a smaller search neighborhood. The reasons are the following:

• Computing the kriging weights involves a matrix inversion. The size of this matrix increases with the number of points in the search neighborhood. The increase in computational cost for inverting the matrix does not scale linearly with the number of sample points used. For the algorithm considered here, doubling the number of sample points results in an eightfold increase in CPU time and a fourfold increase in memory.

• Figure 2.1 shows that with increasing distance, the spatial relationship between data, as defined by the auto-covariance function, decreases. Therefore, distant points are associated with small kriging weights and hence do not necessarily improve the estimation.

• Kriging algorithms are based on the assumption of first- and second-order stationarity. This assumption is not always satisfied for the entire sampled region. By restricting the search neighborhood, local stationarity is enforced and the estimation is therefore more representative.

• Estimating a variogram using a fixed number of data points in a limited sampling region is quite tricky and succeeds in most cases only for small lags (Western and Blöschl, 1999). For this reason, using points with large lags may in fact be detrimental.

• If too many points are used, especially clustered sample points, there is a possibility that the matrix to be inverted becomes quasi-singular, thus making its inversion very problematic.

On the other hand, it is clear that the use of too few sampled points cannot provide an accurate and representative estimation. So a decision has to be made with regard to the appropriate number of points used for interpolation. In practice, this number often lies somewhere between 12 and 32 (Armstrong, 1998; Kelkar and Perez, 2002).

The search neighborhood can have a significant influence on the result, particularly for irregularly sampled data. The size, direction and shape of the search neighborhood therefore depend on the nature of the dataset considered and have to be chosen carefully. It has to be taken into account that conditions may change within a region and it may not be sufficient to estimate and test the parameters at a single location. The size of the search neighborhood also defines the size of the region where the mean is assumed to be constant. Problems can arise if many sample points are clustered or if only few sample points are close to the point to be estimated. If the number of points used for interpolation is constant and a large number of points is clustered in only one direction, sparse, far-away points lying in other directions are neglected and the estimation may lose reliability. If not enough data points are present within a reasonable distance and the samples come from farther and farther away, the validity of stationarity may no longer be given. Nevertheless, the estimation often improves if points outside the range of the covariance function are also considered, rather than using too few points for the interpolation (Isaaks and Srivastava, 1989).

2.4.2 Basic Types of Kriging

There are a number of different versions of kriging. The most important ones are: ordinary kriging, simple kriging, cokriging, indicator kriging and universal kriging. In cokriging, the estimation of a variable is not only based on its own auto-covariance function, but also on its spatial relationship to another variable. This can be useful if a variable is sparsely sampled but has a similar spatial relationship as an extensively sampled variable. Indicator kriging not only estimates a value for an unsampled location, but also provides information about the uncertainty of the estimation. The data are indicator-transformed (Kelkar and Perez, 2002) and the interpolation is applied for every threshold. Indicator kriging is therefore computationally intensive, as several kriging runs are necessary for a single estimation. Another common kriging procedure is universal kriging, in which the sample data are assumed not to be stationary, but to follow a trend. The most common kriging versions are simple kriging and ordinary kriging, which are discussed in some detail below and have been implemented in vebyk.

2.4.3 Simple Kriging

In simple kriging the mean of the kriged region is assumed to be known and constant. As this is not often the case, simple kriging is relatively rarely used. The method is sometimes used in very large mines, such as those in South Africa, where the mean of the kriged areas is known because the region has been mined for a long time. Simple kriging estimates an unsampled value \hat{X} at location \vec{u}_0 as:

\hat{X}(\vec{u}_0) = \lambda_0 + \sum_{i=1}^{n} \lambda_i X(\vec{u}_i),   (2.16)

where \lambda_0 is the regional mean and \lambda_i is the kriging weight at location \vec{u}_i with the sampled value X(\vec{u}_i). By assuming that over a large number of estimations the errors cancel each other out, we call for the condition of unbiasedness:

E[X(\vec{u}_0) - \hat{X}(\vec{u}_0)] = 0.   (2.17)

This implies that the mean error of the estimation is zero. Together with equation (2.16) this yields:

E[X(\vec{u}_0)] = \lambda_0 + \sum_{i=1}^{n} \lambda_i E[X(\vec{u}_i)],   (2.18)

where the expected values E[X(\vec{u}_0)] and E[X(\vec{u}_i)] could in principle be different from each other. Enforcing first-order stationarity, i.e. E[X(\vec{u}_0)] = E[X(\vec{u}_i)] = m, the expression for \lambda_0 is:

\lambda_0 = m \left(1 - \sum_{i=1}^{n} \lambda_i\right).   (2.19)

The error variance for the differences between the true and the estimated values is given by:

\hat{\sigma}_E^2 = Var\left[X(\vec{u}_0) - \hat{X}(\vec{u}_0)\right] = Var\left[X(\vec{u}_0) - \sum_{i=1}^{n} \lambda_i X(\vec{u}_i) - \lambda_0\right].   (2.20)

Going back to the definitions of variance (equation 2.4) and covariance (equation 2.6) it can be shown that:

Var[X - Y] = Var[X] + Var[Y] - 2C(X, Y).   (2.21)

Given that \lambda_0 is a constant, it does not influence the variance and equation 2.20 can be written as:

\hat{\sigma}_E^2 = Var[X(\vec{u}_0)] + Var\left[\sum_{i=1}^{n} \lambda_i X(\vec{u}_i)\right] - 2C\left(X(\vec{u}_0), \sum_{i=1}^{n} \lambda_i X(\vec{u}_i)\right)
                 = C(\vec{u}_0, \vec{u}_0) + \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j C(\vec{u}_i, \vec{u}_j) - 2\sum_{i=1}^{n} \lambda_i C(\vec{u}_i, \vec{u}_0).   (2.22)

To find the optimal weighting factors that minimize the error variance, we set the first derivative with respect to \lambda_i to zero:

\frac{\partial \hat{\sigma}_E^2}{\partial \lambda_i} = 0 = 2\sum_{j=1}^{n} \lambda_j C(\vec{u}_i, \vec{u}_j) - 2C(\vec{u}_i, \vec{u}_0) \quad \text{for } i = 1, ..., n.   (2.23)

Written in matrix form this is:

\begin{pmatrix} C(\vec{u}_1, \vec{u}_1) & \cdots & C(\vec{u}_1, \vec{u}_n) \\ \vdots & & \vdots \\ C(\vec{u}_n, \vec{u}_1) & \cdots & C(\vec{u}_n, \vec{u}_n) \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \end{pmatrix} = \begin{pmatrix} C(\vec{u}_0, \vec{u}_1) \\ \vdots \\ C(\vec{u}_0, \vec{u}_n) \end{pmatrix}.   (2.24)

This system of equations is now solved for the kriging weights \lambda_i, which then allows us to estimate data at the unsampled location \vec{u}_0 as:

\hat{X}(\vec{u}_0) = m\left(1 - \sum_{i=1}^{n} \lambda_i\right) + \sum_{i=1}^{n} \lambda_i X(\vec{u}_i).   (2.25)

2.4.4 Ordinary Kriging

Ordinary kriging is probably the most widely used form of kriging. As opposed to simple kriging, the mean value of the region does not have to be known in advance. This is a reasonable approach because the only way the mean could be predicted is by assuming that the means of the often sparse datasets are representative of the global means of the sampling regions, i.e. that the data obey first-order stationarity. In practice, however, the mean is often subject to lateral changes, which is accounted for by ordinary kriging.


The mathematics of ordinary kriging is quite similar to that of simple kriging. Adapting the starting equation (equation 2.16) for an unknown mean by assuming E[X(\vec{u}_0)] = E[X(\vec{u}_i)] = m(\vec{u}_0) yields:

\lambda_0 = m(\vec{u}_0)\left(1 - \sum_{i=1}^{n} \lambda_i\right).   (2.26)

The value of the regional mean \lambda_0 is not known, but it can be forced to zero, assuming the local mean m(\vec{u}_0) to be the mean of the dataset. This results in the following condition for unbiasedness:

\sum_{i=1}^{n} \lambda_i = 1.   (2.27)

The minimization of the variance is quite similar to that in simple kriging. The error variance expressed in terms of the auto-covariance (equation 2.5) is the same as in equation (2.22). For minimizing the error variance, the condition defined in equation (2.27) has to be taken into account. This can be done using the Lagrange multiplier method (Luenberger, 1984), which defines the function F as:

F = \hat{\sigma}_E^2 + 2\mu\left(\sum_{i=1}^{n} \lambda_i - 1\right)
  = C(\vec{u}_0, \vec{u}_0) + \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i \lambda_j C(\vec{u}_i, \vec{u}_j) - 2\sum_{i=1}^{n} \lambda_i C(\vec{u}_i, \vec{u}_0) + 2\mu\left(\sum_{i=1}^{n} \lambda_i - 1\right),   (2.28)

where \mu is the Lagrange parameter. Minimization of the error variance is now achieved by differentiating F with respect to \lambda_i and \mu and setting the resulting equations to zero:

\frac{\partial F}{\partial \lambda_i} = 2\sum_{j=1}^{n} \lambda_j C(\vec{u}_i, \vec{u}_j) + 2\mu - 2C(\vec{u}_i, \vec{u}_0) = 0 \quad \text{for } i = 1, ..., n   (2.29)

and

\frac{\partial F}{\partial \mu} = \sum_{i=1}^{n} \lambda_i - 1 = 0.   (2.30)

The kriging weights \lambda_i and \mu can now be calculated by solving the two above equations simultaneously. To this end, these equations are written in matrix form:

\begin{pmatrix} C(\vec{u}_1, \vec{u}_1) & \cdots & C(\vec{u}_1, \vec{u}_n) & 1 \\ \vdots & & \vdots & \vdots \\ C(\vec{u}_n, \vec{u}_1) & \cdots & C(\vec{u}_n, \vec{u}_n) & 1 \\ 1 & \cdots & 1 & 0 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \\ \mu \end{pmatrix} = \begin{pmatrix} C(\vec{u}_0, \vec{u}_1) \\ \vdots \\ C(\vec{u}_0, \vec{u}_n) \\ 1 \end{pmatrix}.   (2.31)

The corresponding estimation at the unsampled location \vec{u}_0 is then given as:

\hat{X}(\vec{u}_0) = \sum_{i=1}^{n} \lambda_i X(\vec{u}_i).   (2.32)

As already mentioned above, a major benefit of ordinary kriging is that it does not require the data to be strictly first-order stationary. The mean can therefore be laterally variable and only needs to be considered constant within the search neighborhoods of the points that we aim to estimate. The data are thus allowed to have a larger-scale trend, which is indeed often the case.
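To make equations (2.31) and (2.32) concrete, the following Matlab sketch sets up and solves the ordinary kriging system for a handful of invented sample points using the exponential covariance model of equation (2.14) with σ² = 1. It is a self-contained toy example, not an excerpt from vebyk; all names and numerical values are illustrative.

% Minimal ordinary kriging example (illustrative; not taken from vebyk).
u  = [0 0; 1 0; 0 1; 1 1; 2 2];          % sampled locations (x, y)
X  = [1.2; 1.4; 1.1; 1.5; 1.3];          % sampled values
u0 = [0.5 0.5];                          % location to be estimated
a  = 2;                                  % correlation length
covfun = @(r) exp(-r/a);                 % exponential model, eq. (2.14), sigma^2 = 1

n  = size(u,1);
dx = repmat(u(:,1),1,n) - repmat(u(:,1)',n,1);
dy = repmat(u(:,2),1,n) - repmat(u(:,2)',n,1);
C  = covfun(sqrt(dx.^2 + dy.^2));        % covariances among sampled points, eq. (2.24)
c  = covfun(sqrt(sum((u - repmat(u0,n,1)).^2, 2)));  % covariances to u0

Cok = [C, ones(n,1); ones(1,n), 0];      % augmented system of eq. (2.31)
cok = [c; 1];
sol = Cok \ cok;                         % weights and Lagrange parameter
lambda = sol(1:n);
mu     = sol(n+1);

Xhat   = lambda' * X;                    % kriging estimate, eq. (2.32)
errvar = covfun(0) - lambda'*c - mu;     % ordinary kriging error variance

Because of the constraint row, the weights sum to one up to numerical precision, and the error variance follows the usual ordinary kriging expression for the sign convention used in equation (2.31).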

Chapter 3

Description of the Implementation

This chapter documents the Matlab implementation of the kriging interpolation developed in this thesis. The code consists of a main function and several subfunctions (Figure 3.1). This classical modular approach facilitates the understanding of the code and allows later users to adapt it to their requirements by changing only individual functions. The code of the functions is straightforward and largely self-explanatory. Only standard built-in functions were used and hence no Matlab toolboxes are needed to run the current implementation. The program was tested using Matlab Version 6.1.

3.1 Program Options

The program is designed to interpolate values on a regular two-dimensional grid using ordinary kriging. By slightly modifying the code, it is also possible to use it for simple kriging (see section 3.3.9). The grid of estimated or "kriged" values is rectangular and spans the range of coordinates in the dataset of sampled values. The spacing between estimated points along the x- and y-axes can be specified individually, and the sampled values do not have to follow a spatial order. For the search neighborhood, the number of points used for interpolation can be specified. Arbitrarily oriented spatial anisotropy of the covariance function can be accounted for. The current implementation is based on the von Kármán family of covariance functions; the correlation length a and the ν-value of the von Kármán function can be specified. Finally, "handles" are provided to switch on a wait bar or a cross-validation mode for comparing interpolated values with the values at sampled locations.

3.2 Overall Structure of the Code

The implementation of the code is structured hierarchically, as shown in Figure 3.1. The main function vebyk (value estimation by kriging) manages the data input/output and administers the different steps before and after the actual kriging routine. The actual interpolation at unsampled points is performed by the kriging3 function, which gets support from several other functions.

Figure 3.1: Hierarchical structure of the functions used in the implementation of the kriging algorithm.

3.3 Description of the Individual Functions

3.3.1 vebyk

vebyk is the main function and therefore has to be given all the information used. Table 3.1 describes the parameters that have to be specified. In order to keep the code simple, vebyk does not check or echo input parameters. The calculation of distances and relative positions is based on two-dimensional Cartesian coordinates. The sampled values used as input do not have to lie on a specific grid, but can be located anywhere in the interpolated region. The first thing the function does is to calculate the coordinates of the grid for which values should be estimated. This is done for a rectangular domain defined by the minimum and maximum x- and y-coordinates of the sampled values. The spacing between points is given by the input parameter dgrid. The coordinate grid is created using the function inputmatrix. The next step is the rotation of the coordinates for cases in which the anisotropy axes are not parallel to the axes of the coordinate system. This is described in more detail in sections 3.3.10 and 4.2. Then a for loop is started, which repeats the following steps for every point for which a value has to be estimated. The estimation is performed by first calling the neighborhood function (see section 3.3.4) to select the points used for kriging. The relative coordinates of these points are passed to the kriging3 function, which calculates the kriging weights (see section 3.3.2). The weights are then multiplied with the corresponding sampled values and summed up to establish the estimate. After this has been done for every point in the grid, any previous coordinate transformation is reversed by the rotation function.

Table 3.1: Parameters for vebyk.
  Output:
    output        - grid with the estimated values
    errorvariance - error variance at the estimated locations
  Input:
    coord         - coordinates and values of the sampled points
    dgrid         - distance between grid points
    points        - number of points used for interpolation
    anisotropy    - proportion of anisotropy
    alpha         - angle between coordinate and anisotropy axes
    nu            - parameter of the von Kármán covariance function
    range         - distance of spatial correlation
    crossv        - handle to switch on the cross-validation mode
    verbose       - handle to switch on the wait bar
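The exact call signature of vebyk is not spelled out in the text, so the following usage sketch is hypothetical: it simply passes the parameters of Table 3.1 in the order listed there. The argument names, their order and the random test data are assumptions made for illustration only.

% Hypothetical vebyk call assembled from the parameters in Table 3.1.
% Argument names and order are assumptions, not the documented interface.
nobs  = 50;
x     = 100*rand(nobs,1);             % sampled x-coordinates
y     =  50*rand(nobs,1);             % sampled y-coordinates
val   = 1.4 + 0.1*randn(nobs,1);      % sampled values (e.g. porosities)
coord = [x, y, val];                  % n-by-3 input matrix

dgrid      = 1;                       % spacing of the estimation grid
points     = 16;                      % size of the search neighborhood
anisotropy = 10;                      % ratio of correlation lengths a_u/a_v
alpha      = 0;                       % rotation angle of the anisotropy axes (radians)
nu         = 0.5;                     % von Karman parameter (exponential model)
range      = 20;                      % horizontal correlation length
crossv     = 0;                       % cross-validation mode off
verbose    = 1;                       % show the wait bar

[output, errorvariance] = vebyk(coord, dgrid, points, anisotropy, alpha, ...
                                nu, range, crossv, verbose);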

3.3.2 kriging3

This function calculates the kriging weights for a specific point. The system of equations for the kriging weights is built up by calling buildbigc (section 3.3.5) and buildsmallc (section 3.3.6). Using the input parameters shown in Table 3.5, the first function constructs the matrix C containing all covariances among the sampled points, while the second function returns a vector \vec{c} containing the covariances between all sampled points and the point to estimate. This would already suffice for simple kriging. The function ordinary expands the matrix and the vector for ordinary kriging as described in section 2.4.4. To change the interpolation routine to simple kriging, the line calling the ordinary function can be commented out. The system of equations is solved by inverting the matrix C and multiplying with \vec{c}. The error variance is calculated to provide information about the reliability of the estimated value. Input and output parameters of the function are shown in Table 3.2.

Table 3.2: Parameters for kriging3.
  Output:
    lambda        - kriging weights
    errorvariance - error variance at the estimated location
  Input:
    position      - relative coordinates of the sampled points
    anisotropy    - proportion of anisotropy
    nu            - parameter of the von Kármán covariance function
    range         - horizontal correlation length

3.3.3 inputmatrix

Table 3.3: Parameters for inputmatrix.
  Output:
    input  - n×3 matrix of coordinates and values
  Input:
    matrix - rectangular matrix containing only values
    dx     - distance between values in x-direction
    dy     - distance between values in y-direction

The function inputmatrix generates a matrix with three columns containing the x- and y-coordinates in the first two columns and the corresponding parameter values in the third column. Each row in the matrix thus specifies one data point. The input to this function is an array of values defined on an evenly spaced grid. The spacings between the sampled points in the x- and y-directions are given by the input parameters of the function (Table 3.3). The function is used in vebyk to create the grid of estimation points, but it can also be used to convert a rectangular input matrix with equally spaced values to the format required by vebyk.
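As an illustration of this conversion, a minimal re-implementation might look as follows. This is a sketch of the idea, not the original inputmatrix code, and the placement of the coordinate origin is an assumption.

% Illustrative sketch of an inputmatrix-style conversion (not the original code):
% turn a rectangular grid of values into an n-by-3 matrix [x, y, value].
function input = inputmatrix_sketch(matrix, dx, dy)
  [ny, nx] = size(matrix);
  [xg, yg] = meshgrid((0:nx-1)*dx, (0:ny-1)*dy);   % grid coordinates, origin at (0,0)
  input    = [xg(:), yg(:), matrix(:)];            % one row per data point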

3.3.4 neighborhood

neighborhood is the function that chooses the points used in the estimation process. As mentioned in section 2.4.1, the search neighborhood has a considerable influence on the result of the kriging interpolation. Therefore, the search neighborhood should be adapted to the dataset used. The neighborhood function defines a simple search neighborhood by choosing a specified number of sampled points lying closest to the point to be estimated. The points are selected based solely on the distance between the sampled points and the estimation point. Anisotropy is taken into account, because the search neighborhood quickly becomes too small in one direction if there is significant structural anisotropy. As kriging is a smooth estimator, a search neighborhood that is too small may result in artefacts in the form of high-frequency oscillations (see section 4.2 and Figure 4.5). neighborhood calculates the distances between the sampled points and the location at which the estimation is made. To this end, the x- and y-axes are scaled in accordance with the structural anisotropy. It then takes the specified number of points with the smallest lags and calculates their relative coordinates. If the cross-validation handle is on, the first (closest) point is skipped in order not to use the sampled data at a sampled location (see section 4.4). Alternatively, neighborhood2, which is an implementation of a quadrant search (Isaaks and Srivastava, 1989), can be used instead of neighborhood. This is useful if the sampled points are clustered. Section 4.3 shows an example of artefacts caused by an inadequate search neighborhood. A sketch of the basic selection step is given after the parameter table below.

Table 3.4: Parameters for neighborhood.
  Output:
    values     - sampled values of the points used for kriging
    position   - relative coordinates of the sampled points
  Input:
    x          - x-coordinate of the point to estimate
    y          - y-coordinate of the point to estimate
    points     - number of points used for interpolation
    coord      - coordinates and values of all available sampled points
    crossv     - handle to switch on the cross-validation mode
    anisotropy - proportion of anisotropy
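The following Matlab sketch illustrates the simple nearest-point selection with anisotropy scaling described above; it is an illustrative re-implementation, not the original neighborhood function.

% Sketch of a simple, anisotropy-scaled search neighborhood (illustrative).
function [values, position] = neighborhood_sketch(x, y, points, coord, crossv, anisotropy)
  relx = coord(:,1) - x;                          % relative coordinates
  rely = coord(:,2) - y;
  d    = sqrt(relx.^2 + (anisotropy*rely).^2);    % anisotropy-scaled distances
  [dummy, idx] = sort(d);                         % closest points first
  if crossv
    idx = idx(2:end);                             % skip the collocated point
  end
  idx      = idx(1:points);                       % keep the requested number of points
  values   = coord(idx,3);
  position = [relx(idx), rely(idx)];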

3.3.5 buildbigc

This is a subfunction of kriging3, which calculates the matrix containing the covariance values among the sampled points (see equation 2.24). displacement3 calculates a matrix containing the distances between the sampled points. For this purpose, the x- and y-axes are scaled according to the structural anisotropy. The displacement3 function uses the relative coordinates of the search neighborhood, which are provided by the neighborhood function. The lag values of the matrix are then replaced with the corresponding covariance values by the function covcalc. As C is a symmetric matrix, only one half has to be calculated explicitly. The parameters used and calculated by buildbigc are shown in Table 3.5.

Table 3.5: Parameters for buildbigc.
  Output:
    C          - matrix with covariances between the sampled points
  Input:
    position   - relative coordinates of the sampled points
    anisotropy - proportion of anisotropy
    nu         - parameter of the von Kármán covariance function
    range      - horizontal correlation length

3.3.6 buildsmallc

Table 3.6: Parameters for buildsmallc.
  Output:
    c          - vector with covariances between the sampled points and the estimated location
  Input:
    position   - relative coordinates of the sampled points
    anisotropy - proportion of anisotropy
    nu         - parameter of the von Kármán covariance function
    range      - horizontal correlation length

buildsmallc generates a vector containing the covariance values for the lags between the data points used for interpolation and the point which is subject to interpolation. In equation 2.24 this is the vector on the right. The functionality of buildsmallc is the same as in buildbigc with the difference that only a few points have to be calculated and the computation of the lags is straightforward by using the relative coordinates. Table 3.6 shows that the input parameters are the same as for buildbigc.

3.3.7 displacement3

Table 3.7: Parameters for displacement3.
  Output:
    dist       - distances between the sampled points with respect to the anisotropy
  Input:
    position   - relative coordinates of the sampled points
    anisotropy - proportion of anisotropy


The displacement3 function is called by buildbigc to calculate the lags among the sampled points used for interpolation. The first row of the resulting matrix contains the lags between the first point and all other points, the second row contains the lags between the second point and all other points, and so on. The diagonal of the matrix contains the lags between identical points and is hence uniformly zero. As the lag from point A to B is the same as from B to A, only one half of the matrix has to be calculated explicitly. Anisotropy is taken into account in the calculation of the lags by multiplying the axis with the shorter correlation length by the anisotropy factor (correlation length in x-direction divided by correlation length in y-direction). Table 3.7 shows the input and output parameters.
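A compact way to sketch this computation in Matlab is shown below; this is an illustrative re-implementation under the assumptions just described, not the original function.

% Sketch of an anisotropy-scaled lag matrix (illustrative).
function dist = displacement3_sketch(position, anisotropy)
  px = position(:,1);
  py = anisotropy * position(:,2);   % scale the axis with the shorter correlation length
  n  = length(px);
  dx = repmat(px,1,n) - repmat(px',n,1);
  dy = repmat(py,1,n) - repmat(py',n,1);
  dist = sqrt(dx.^2 + dy.^2);        % symmetric matrix with a zero diagonal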

3.3.8 covcalc

Table 3.8: Parameters for covcalc.
  Output:
    covariance - value of the covariance function at a certain lag
  Input:
    lag        - argument of the covariance function
    nu         - parameter of the von Kármán covariance function
    a          - horizontal correlation length

This function evaluates the parametric model of the auto-covariance function. The von Kármán family of auto-covariance functions (see section 2.3), which is currently used in the covcalc routine, offers a wide range of shapes that are controlled by the parameter ν and the correlation length a. These parameters are input directly into vebyk. This covariance model can fit a wide variety of practically relevant auto-covariance functions, and therefore a wide range of datasets can be interpolated without changing the covcalc function. At zero lag, the auto-covariance function is normalized to a value of one (assuming that there is no nugget effect), as the numerical evaluation of the von Kármán function at zero lag tends to be problematic, particularly for small ν-values. The parameters of the function are described in Table 3.8.
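A minimal evaluation of equation (2.13) with this zero-lag normalization might look as follows; this is an illustrative sketch rather than the original covcalc code.

% Sketch of a covcalc-style evaluation of the von Karman model, eq. (2.13),
% normalized to one at zero lag (illustrative; the original covcalc may differ).
function covariance = covcalc_sketch(lag, nu, a)
  covariance = ones(size(lag));                     % C(0) normalized to 1
  nz = lag > 0;                                     % avoid evaluating K_nu at zero lag
  r  = lag(nz)/a;
  covariance(nz) = (r.^nu .* besselk(nu, r)) / (2^(nu-1)*gamma(nu));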

3.3.9 ordinary

Ordinary kriging does not assume the mean of a region to be given, but instead forces the kriging weights to sum up to one (see section 2.4.4). The corresponding effect on the calculation of the kriging weights is relatively small: the equation for calculating the Lagrange parameter has to be added. This is done by expanding the matrix C by one row and one column of ones and by enlarging the vector \vec{c} by one element of value one. As the Lagrange parameter is not to be included in the summation of the kriging weights, the value at the bottom right corner of the matrix is set to zero. Input and output are described in Table 3.9. If the implementation has to be changed to run with simple kriging, the call to the function ordinary has to be commented out and the line where the estimation is performed using the kriging weights must be changed to conform with equation (2.25).

Table 3.9: Parameters for ordinary.
  Output:
    C - covariance matrix with a row and a column of ones added for ordinary kriging
    c - covariance vector with a one appended for ordinary kriging
  Input:
    C - matrix with covariances between the sampled points
    c - vector with covariances between the sampled points and the estimated location
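In Matlab, the augmentation described above amounts to only a few lines. The following is a minimal sketch for a covariance matrix C and vector c as defined in section 2.4.4, not the original code.

% Sketch of the augmentation for ordinary kriging, eq. (2.31) (illustrative).
n = size(C,1);
C = [C, ones(n,1); ones(1,n), 0];   % add a row and a column of ones, zero the corner
c = [c; 1];                         % append a one to the right-hand side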

3.3.10 rotation

Table 3.10: Parameters for rotation.
  Output:
    rotcoord - coordinates rotated around the origin by the angle given by alpha
  Input:
    coord    - coordinates and values of the sampled points
    alpha    - angle between the coordinate axes and the anisotropy axes

In cases where the axes of the anisotropy ellipsoid do not coincide with the axes of the coordinate system, this can be specified in vebyk through the corresponding rotation angle. One possibility to take this into account would be to perform the corresponding adjustments for every estimation. This would take place in the displacement function, where the anisotropy parallel to the axes is also considered; the relative coordinates would have to be rotated by the specified angle. An easier way to account for the angle of anisotropy is to rotate the coordinates of all sampled points and the grid of estimated values such that the anisotropy is aligned with the coordinate axes. This is equivalent to an eigenvalue transformation. The interpolation can then be carried out as if the axes of the coordinate system coincided with those of the anisotropy ellipsoid. Afterwards, the coordinates are rotated back to their original positions. The direction of the anisotropy is expected to be the same in the whole region and therefore it does not matter which point is chosen as the center of the rotation. Here the rotation is done around the origin of the coordinate system because the equations are simplest for this case. The corresponding rotation of Cartesian coordinates is given by (Papula, 1994):

u = y \sin(\varphi) + x \cos(\varphi),
v = y \cos(\varphi) - x \sin(\varphi),   (3.1)

where x and y are the primary coordinates, u and v are the corresponding rotated coordinates, and \varphi is the rotation angle. The angle \varphi has to be measured as shown in Figure 3.2. The anisotropy ratio corresponds to the ratio of the lengths of the axes of the anisotropy ellipsoid (i.e., to the correlation lengths in the u- and v-directions) (Figure 3.2):

k_{any} = \frac{a_u}{a_v}.   (3.2)

Input parameters for this function are the rotation angle in radians and the matrix containing the coordinates to be rotated. The first two columns of the matrix are the x- and y-coordinates and the third column contains the data values.

Figure 3.2: If the directions of the axes of the anisotropy ellipsoid (u, v) differ from the directions of the axes of the coordinate system (x, y), the angle φ is measured as shown above. a_u and a_v denote the correlation lengths in the u- and v-directions.
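A rotation of this kind can be sketched in a few lines of Matlab; the following is an illustrative re-implementation of equation (3.1), not the original rotation function. Calling the same function with −α reverses the transformation.

% Sketch of a rotation-style coordinate transform, eq. (3.1) (illustrative).
function rotcoord = rotation_sketch(coord, alpha)
  x = coord(:,1);
  y = coord(:,2);
  u = y*sin(alpha) + x*cos(alpha);
  v = y*cos(alpha) - x*sin(alpha);
  rotcoord = [u, v, coord(:,3)];     % the data values in the third column are unchanged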

Chapter 4

Testing of the Implementation

4.1 Comparison with a published example

Figure 4.1: Locations of the four points P1, P2, P3, P4 with observed values and the point P0, where the estimation is performed (the points lie on a line in the order P1, P2, P0, P3, P4). After Armstrong (1998, pp. 105 ff).

Table 4.1: Comparison of the results published by Armstrong (1998) with those obtained by vebyk.
  Points    Kriging weights obtained by Armstrong (1998)    Kriging weights obtained by vebyk
  P1, P4    -0.047                                          -0.0475
  P2, P3     0.547                                           0.5475

To verify the accuracy of the implemented kriging algorithm, we first compare its output with that of a simple published example (Armstrong, 1998, pp. 105 ff). The spatial arrangement of the example is shown in Figure 4.1. The four sampled points are equally spaced on a line at 1 m intervals and the point to be estimated lies in the center. The covariance function used is given by C(τ) = 1 − |τ|^{1.5}. Unfortunately, I could not find any suitable published examples for which the von Kármán auto-covariance model could be used. It should, however, be noted that the definition of the model auto-covariance function corresponds to only one line in my code, which is essentially independent of the core of the actual kriging algorithm. Table 4.1 shows a comparison of the results published by Armstrong (1998) with those obtained with my program. The slight differences may be caused by different equation solvers or may simply represent a rounding effect, as only three digits are given by Armstrong (1998, pp. 105 ff). Overall, this comparison indicates that the core of the algorithm works correctly.

4.2 Structural Anisotropy

Structural anisotropy implies that the decay of the auto-covariance function depends on the direction in which it is measured or modeled. Anisotropy is generally assumed to be elliptical in nature, i.e., there are two perpendicular main axes along which the auto-covariance function exhibits different correlation lengths (Figure 3.2). The anisotropy factor denotes the ratio between the maximum and minimum correlation lengths (equation 3.2). The effects of structural anisotropy can be accounted for through a suitable coordinate transform. As the coordinate axes are rotated and compressed, the ellipse is transformed into a circle and kriging can then be performed as if there were no anisotropy.

In the presence of anisotropy, care has to be taken with regard to the choice of the search neighborhood. The kriging weights decrease more slowly in the direction of the long axis of the anisotropy ellipsoid (Figure 3.2). If the search neighborhood is symmetric around the estimated point, it tends to be too small in the direction of the long anisotropy axis and unnecessarily large in the other direction. This has the effect that kriging weights at the border of the search neighborhood become larger than kriging weights in the immediate vicinity of the estimated point (Figure 4.2). This is clearly an artefact since, according to the auto-covariance model, the spatial relationship decreases monotonically with increasing distance.

Figure 4.2: Kriging weights for too small a search neighborhood. A total of 16 points were used for the interpolation. The estimated point is in the center of the region. Kriging weights actually exist only at locations where a sampled point is available. This picture was obtained by interpolating a regular grid of zeros with a single sampled point with a value of one in the middle of the region. The estimation grid is sampled at half the spacing of the input dataset and is slightly shifted to avoid that the sampled data points dominate the output of the estimation. The kriging weights depend on the spatial relationship of the points amongst each other. This implies that if the sampled data points were not on a regular grid, the shape of the kriging weight function would be somewhat different.

Figure 4.3: Same as Figure 4.2, but for 36 points. The kriging weights on the border of the search neighborhood have become notably smaller. The computational effort has risen ninefold compared to that for Figure 4.2.

Figure 4.4: Same as Figures 4.2 and 4.3, but for an anisotropic search neighborhood using a total of 16 points. Larger kriging weights on the border of the search neighborhood have essentially disappeared.

If the search neighborhood is enlarged, this effect decreases, as can be seen in Figure 4.3. It can also be seen that in the direction of the short anisotropy axis far too many points are used, which unnecessarily increases the computational effort. It is therefore most appropriate to adapt the shape of the search neighborhood to that of the anisotropy ellipsoid. Figure 4.4 shows the corresponding results. In summary, if the search neighborhood is too small, the kriging weights are spatially cropped. As illustrated by Figure 4.5, this results in high-frequency oscillations in the output. Kriged regions are expected to be smooth, but as the search neighborhood gets too small, the surface becomes more "edgy" and finally high-frequency oscillations appear. This effect is not very obvious in a two-dimensional plot of a larger kriged region, but it can be clearly discerned when taking profiles across the interpolated surface (Figure 4.5).

4.3 Search Neighborhood

The use of a simple search neighborhood is a suitable practice if the sampled values are evenly or randomly spread. As kriging accounts for slight to moderate clustering of the data, it is usually not necessary to use an elaborate search neighborhood. Nevertheless, some heavily clustered datasets may cause artefacts in the interpolation. A corresponding example is the kriging interpolation of two neutron porosity logs.

Figure 4.5: High-frequency oscillations caused by too small a search neighborhood. The graph shows a slice through an interpolated region. The abscissa shows the number of the interpolated point, whereas the ordinate gives the estimated value at this point. Every fourth point is a sampled point and is marked in green in the illustration. The red curve is obtained when using a search neighborhood that is too small. For the blue curve a search neighborhood of adequate size was used.


This implies that a large area without any observed data has to be interpolated based on densely sampled, but inherently one-dimensional borehole logs. Using a simple search neighborhood causes the interpolation to use data only from the one borehole closest to the point to be estimated. Figure 4.6 shows the resulting artefacts in the middle of the interpolated region, where the two unrelated halves meet. The quadrant search method (Isaaks and Srivastava, 1989), in which points from all four quadrants are considered, is an effective tool to avoid this kind of artefact. This is illustrated in Figure 4.7.

Figure 4.6: The use of a simple search neighborhood can cause artefacts if the sampled data points are heavily clustered. This figure shows a kriging interpolation of two neighboring borehole logs. As only the nearest points are considered for interpolation, the right and left halves of the interpolated region are independent of each other. The obvious artefacts in the middle clearly demonstrate that a simple search neighborhood fails in this example.


Figure 4.7: Kriging interpolation of the same borehole data as in Figure 4.6. This time a quadrant search neighborhood was used. Considering data of all four quadrants is a straightforward and efficient way to avoid artefacts in heavily clustered data.

4.4 Cross-Validation

Cross-validation allows one to assess how well kriging works for a given dataset. This is achieved by performing the interpolation at locations where observations are available and assessing the discrepancies between the observed and estimated values. In particular, cross-validation is commonly used to test the practical validity of the model used for the auto-covariance function (Kelkar and Perez, 2002). In this study, cross-validation is primarily used to check the implementation of the kriging algorithm, as the auto-covariance function of the sampled dataset is assumed to be known in advance. There are several methods for cross-validation. The techniques used in this study are referred to as "leaving-one-out" and "jackknifing". As indicated by its name, the leaving-one-out cross-validation technique estimates a value for a sampled point by leaving out the observed point at which the estimation is made. A comparison of the resulting discrepancies allows one to assess the accuracy of the auto-covariance model used. If all the information of the sampled data is used to model the spatial relationship, as is the case for the leaving-one-out cross-validation, the estimated value is not strictly independent of the sampled value. For the more rigorous jackknifing test, the estimated value is independent of the sampled value. This is achieved by first dropping part of the sampled data and then estimating these values through kriging. Jackknifing is, however, only applicable if a sufficient number of sampled points is available.
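A generic leaving-one-out loop can be sketched as follows; estimate_at is a placeholder for any point estimator (for instance the ordinary kriging sketch in section 2.4.4), so this is an illustration of the procedure rather than the cross-validation mode built into vebyk.

% Generic leaving-one-out cross-validation loop (illustrative).
% coord is an n-by-3 matrix [x, y, value]; estimate_at is a placeholder
% function handle that krigs a value at (x, y) from the remaining data.
n   = size(coord,1);
err = zeros(n,1);
for i = 1:n
  rest   = coord([1:i-1, i+1:n], :);      % drop the i-th observation
  Xhat   = estimate_at(coord(i,1), coord(i,2), rest);
  err(i) = coord(i,3) - Xhat;             % true minus estimated value
end
plot(coord(:,3), coord(:,3) - err, '.');  % true vs. estimated (cf. Figure 4.8)
figure;
plot(coord(:,3), err, '.');               % true vs. error (cf. Figure 4.9)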

4.4.1 Leaving-One-Out Cross-Validation

Figure 4.8: Crossplot of true versus estimated values (search neighborhood of 24 points, anisotropy 10). As the absolute error is not increasing with increasing data values, the estimation is conditionally unbiased. This is an implicit characteristic of kriging.

To test the implementation, a 20 × 20 stochastic dataset with a known auto-covariance function was used. The estimated values for the sample points were calculated with the leaving-one-out cross-validation method. As proposed by Kelkar and Perez (2002), a crossplot of true versus estimated values (Figure 4.8), a crossplot of errors versus true values (Figure 4.9) and a colorplot of the errors were generated. Figure 4.8 shows the crossplot of true versus estimated values. The points are evenly spread around the 45°-line, showing that the estimate is conditionally unbiased, that is, the errors are independent of the amplitude of the observed values. This demonstrates that the implemented kriging algorithm is indeed unbiased, as required by the methodological foundations. In Figure 4.9 the corresponding estimation errors are plotted against the true values. The errors do not show any



Figure 4.9: Crossplot of true value versus the corresponding estimation error. The even distribution of the points around the zero line indicates homoscedasticity of the error variance.


Figure 4.10: Spatial distribution of errors in cross-validation as represented by a colorplot of the errors.

systematic variation in magnitude with increasing sample values, which implies that the estimation shows homoscedastic behavior. That is, the variance around the


regression line is the same for all estimated values. The colorplot of the error values shown in Figure 4.10 should reveal any systematic spatial relationships present in the error values. When the correct auto-covariance function is used, the kriging errors of the estimation should not be spatially correlated. Figure 4.10 indicates that this is indeed the case.
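For illustration, the leave-one-out loop can be sketched in MATLAB as follows; krige_point is a hypothetical wrapper around the kriging estimator (not one of the vebyk functions of Chapter 3), obs is assumed to be an N-by-3 matrix with columns [x y value], and covpar holds the assumed auto-covariance parameters.

    % Sketch of leaving-one-out cross-validation (illustrative only).
    N      = size(obs, 1);
    errors = zeros(N, 1);
    for i = 1:N
      rest      = obs([1:i-1, i+1:N], :);   % leave the i-th observation out
      est       = krige_point(rest, obs(i,1), obs(i,2), covpar);
      errors(i) = obs(i,3) - est;           % true minus estimated value
    end

The resulting error vector is the basis for the crossplots and the error map shown in Figures 4.8 to 4.10.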

4.4.2 Jackknife Cross-Validation

For jackknifing, a stochastic dataset with 200 × 200 values and a known exponential auto-covariance function was generated (Figure 4.11). An equally spaced “observed” dataset for the interpolation was then extracted from this initial model by decimating it fourfold (Figure 4.12). These observed data were then used to estimate the skipped data points. Figure 4.13 shows the result of the kriging interpolation. For comparison, the interpolation of the same input dataset was also performed using spline interpolation (Figure 4.14), a very common interpolation technique in which piecewise polynomials are fitted to the sampled data points (de Boor, 1978). Compared to the spline interpolation, kriging provides a sharper image that is much more closely related to the initial model (Figures 4.11 and 4.14). The computational effort for kriging interpolation is, however, much larger than that for spline interpolation. The differences between spline and kriging interpolation increase with increasing roughness and complexity of the initial dataset. Figure 4.15 shows a stochastic model characterized by a von Kármán auto-covariance function with ν = 0. The model estimated by kriging (Figure 4.16) is again much more closely related to the original model than the spline interpolation (Figure 4.17). A conspicuous feature of


Figure 4.11: Stochastic model characterized by a von Kármán auto-covariance function with σ² = 1, ν = 0.5, horizontal correlation length a = 100 and anisotropy k_any = 10.



Figure 4.12: Input model for jackknifing. The synthetic model (Figure 4.11) was decimated by taking every fourth point in x- and y-axis directions.


Figure 4.13: Kriging estimation of the input model shown in Figure 4.12. The sampling interval is the same as in Figure 4.11.

the kriged image is, however, that the sampled/observed points “stand out” from the interpolated background. This effect is also observed if the input dataset is not a regularly spaced grid but randomly chosen, as is the case for the input dataset of Figure 4.18. I found this observation to be universally characteristic for input data with ν-values



Figure 4.14: Spline interpolation of input model shown in Figure 4.12. The sampling interval is the same as in Figure 4.11.

smaller than 0.5. These “freckles” arise from the high variability of neighboring sampled data points. As the interpolated space is regarded as stationary, the sampled values are dispersed around a local mean. Kriging interpolation predicts the expected value for an estimated point based on the sampled points in the search neighborhood. It is therefore a prediction of a random value in consideration of its surrounding values. This is only possible if the predicted value is not completely random. Stochastic models with auto-covariance functions of the von Kármán family are so-called fractional Gaussian noise (fGn). fGn has the property that its spectral density function scales as

P \propto f^{\beta},    (4.1)

where \beta is related to \nu as

\beta = -(2\nu + 1).    (4.2)

The short-term predictability of scale-invariant stochastic data depends on the ν-value. The behavior ranges from good, linear predictability for β ≥ 3 to complete unpredictability for β < 1 (Hergarten, 2003). For β < 1 the best prediction for a value is the expected value of the dataset, i.e., the global mean. But as the actual medium is in fact very rough and complex, most sampled data points have values that differ significantly from the global mean. Kriging of such a medium therefore creates a smooth interpolated surface close to the mean of the used search neighborhoods, from which the observed points “stand out” (Figures 4.16 and 4.18). Scale-invariant stochastic data with 0 ≤ ν < 0.5 are referred to as “anti-persistent”. This means that the gradient is likely to turn negative if it has been positive before and vice versa. Conversely,


for 0.5 < ν < 1 scale-invariant stochastic data are “persistent” and the gradient is expected to remain positive if it was positive before and vice versa (Hergarten, 2003). The behavior of kriging interpolation, as of most other interpolation procedures, is inherently persistent in nature, and hence the interpolation may create artefacts if the data to be interpolated are in fact anti-persistent in nature.


Figure 4.15: Stochastic model characterized by a von Kármán covariance function with σ² = 1, ν = 0, horizontal correlation length a = 100 and anisotropy k_any = 10.



Figure 4.16: Kriging estimation of a dataset generated by taking every fourth point of the model in Figure 4.15.


Figure 4.17: Spline interpolation for the model shown in Figure 4.15.



Figure 4.18: Kriging estimation of a dataset generated by taking 2500 random points from the model shown in Figure 4.15.

4.5 Sensitivity of Kriging Estimation with Regard to Auto-Covariance Model


Figure 4.19: Relationship between the ν-value of the auto-correlation function of the initial model and the ν-value of the model auto-correlation function that leads to the smallest sum of kriging errors (equation 4.3). Stars denote the explicitly calculated values. The solid line is a quadratic interpolation for ν ≤ 0.7 and a corresponding extrapolation for ν > 0.7.

The kriging results depend on the search neighborhood and on the chosen model for the auto-covariance function. As discussed in section 2.2, the auto-covariance function often cannot be determined accurately from the available data. Therefore, it is important to know how stable and robust the kriging estimation behaves with regard to the auto-covariance model used. To this end, eleven stochastic models with known auto-covariance functions were created. The auto-covariance models are all of the von Kármán type (σ² = 1, a = 100, k_any = 10) and differ only in terms of their ν-values. The ν-values are increased in steps of 0.1 from ν = 0 for a very rough and complex model to ν = 1 for a very smooth model. Every fourth value in the x- and y-axis directions was taken as the input dataset. The dataset was then kriged using von Kármán auto-covariance functions with a wide range of ν-values and anisotropy ratios, keeping the horizontal correlation length fixed at the correct value. The sum of the absolute errors between the estimated and sampled values is taken as an indicator of the quality of the interpolation. Figure 4.19 shows the best-fitting ν-values plotted against the true ν-values of the stochastic models. The resulting “trade-off maps” for input models with ν-values of 0, 0.5 and 0.7 are shown in Figures 4.20, 4.21 and 4.22. The minimal error values are marked with a red cross. It is noticeable that the ν-values that lead to the lowest error values are all


considerably larger than the true ν-value of the auto-covariance function of the input model. This difference increases with increasing roughness of the medium. At least in the range 0 ≤ ν ≤ 0.7, the ν-values of the input model ν_true are related to the best-fitting ν-values ν_best as

\nu_{best} = -0.5\,\nu_{true}^2 + 1.1\,\nu_{true} + 0.55.    (4.3)
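For example, for an input model with ν_true = 0.5, equation 4.3 gives ν_best = −0.5·0.25 + 1.1·0.5 + 0.55 ≈ 0.98, which is roughly in line with the interpolation tests of Appendix A, where an interpolation with ν = 0.9 gave the best result for this model.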

This reflects the tendency of kriging not to preserve the auto-covariance function of the initial dataset, but to calculate the expected value at the interpolated locations. The auto-covariance function of a kriged dataset will therefore always be smoother than the auto-covariance model used. This is also expressed in the fact that the sum of the absolute errors in the trade-off maps shown in Figures 4.20, 4.21 and 4.22 decreases markedly with increasing ν-values. Appendix A shows kriging interpolations of the model presented in Figure 4.11 with different ν-values, as well as the “trade-off” maps for cross-validation with different correlation lengths.
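The grid search behind these trade-off maps can be sketched in MATLAB as follows. The sketch is illustrative only: krige_grid is a hypothetical helper that kriges the decimated observations obs onto the full grid, truth is the complete stochastic model, and the candidate parameter ranges are assumptions chosen to match the axes of the figures.

    % Sketch of the trade-off map computation (illustrative only).
    nu_vals   = 0.1:0.1:1.0;            % candidate nu-values
    kany_vals = 2:2:20;                 % candidate anisotropy ratios
    ax        = 100;                    % horizontal correlation length held fixed
    err_map   = zeros(numel(nu_vals), numel(kany_vals));
    for i = 1:numel(nu_vals)
      for j = 1:numel(kany_vals)
        covpar        = struct('nu', nu_vals(i), 'ax', ax, 'ay', ax/kany_vals(j));
        est           = krige_grid(obs, xgrid, ygrid, covpar);
        err_map(i, j) = sum(abs(est(:) - truth(:)));   % sum of absolute errors
      end
    end
    [~, imin] = min(err_map(:));        % the minimum marks the best-fitting pair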


Figure 4.20: Sum of absolute kriging errors as a function of the ν-value and the anisotropy ratio k_any. The input dataset has a von Kármán covariance function with ν = 0, a horizontal correlation length a_x = 100 and an anisotropy ratio k_any = a_x/a_y = 10. The subscripts denote the axis direction of the correlation length.


Figure 4.21: Sum of absolute kriging errors as a function of the ν-value and the anisotropy ratio k_any. The input dataset has a von Kármán covariance function with ν = 0.5, a horizontal correlation length a_x = 100 and an anisotropy ratio k_any = a_x/a_y = 10. The subscripts denote the axis direction of the correlation length.


Figure 4.22: Sum of absolute kriging errors as a function of the ν-value and the anisotropy ratio k_any. The input dataset has a von Kármán covariance function with ν = 0.7, a horizontal correlation length a_x = 100 and an anisotropy ratio k_any = a_x/a_y = 10. The subscripts denote the axis direction of the correlation length.

Chapter 5

Application of Kriging to Conditional Geostatistical Simulations

Conditional simulation is a geostatistical tool complementary to interpolation. As opposed to interpolation, which has the objective of providing estimates that are statistically as close as possible to the unknown true value at any particular location, the goal of conditional simulation is to represent the inherent spatial variability of a medium. Unlike kriging interpolation, conditional simulation therefore reproduces the second-order statistical attributes (i.e., mean value and auto-covariance function) of the considered dataset. The error variance of the simulated values is, however, not minimized, and a simulated value is not necessarily the best estimate, in a statistical sense, for a specific point. Conditional simulation is therefore not adequate, for example, to estimate the total reserves of an ore deposit, but it can adequately illustrate the inherent variability of the ore concentration in the deposit. Multiple conditional simulations allow one to draw probability maps of a region; the mean of many conditional simulation maps corresponds to a map of the expected values, which is similar to a kriging interpolation. This is required, for example, for the optimization of mining and hydrocarbon recovery processes as well as for detailed ground water flow and contaminant transport simulations (Hardy and Beier, 1994).

A conditional simulation is one of an infinite number of possible realizations of the space between sampled data points. The realization is constrained by the mean value and the auto-covariance function of the observed dataset and is conditioned by the sampled values. There are many different approaches to conditional simulation (Kelkar and Perez, 2002). A simple and effective way is to combine unconditional stochastic simulation and kriging interpolation (Goff and Jennings, 1999; Journel and Huijbregts, 1978) as follows (a sketch of these steps is given after the list):

1. Compute an unconditional simulation X_U(u).

2. Sample X_U(u) at the locations u_i (i = 1, 2, ..., N) where observed data are available and compute the differences ΔX̂(u_i) = X̂(u_i) − X_U(u_i), where X̂(u_i) denotes the observed value at u_i.

3. Perform a kriging interpolation of these differences: ΔX̂(u_i) → ΔX_I(u).


4. Add the kriged differences to the unconditional simulation: X_C(u) = X_U(u) + ΔX_I(u).

The space between the locations with observed data is smoothly interpolated through kriging, as can be seen in Figure 4.7, which shows the state of the conditional simulation after step three. The unconditional model is then superimposed onto this smooth kriged data structure.
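In MATLAB-like pseudocode the four steps reduce to a few lines. This is a minimal sketch under the assumption of hypothetical helpers: unconditional_sim stands in for the spectrum method of section 5.1, krige_grid for the kriging interpolation of Chapter 3, and obs_idx for the grid indices of the observed values Xobs.

    % Sketch of the four conditioning steps (illustrative only).
    XU  = unconditional_sim(nx, ny, covpar, seed);            % step 1 (section 5.1)
    dX  = Xobs - XU(obs_idx);                                 % step 2: differences at the data points
    dXI = krige_grid([xobs yobs dX], xgrid, ygrid, covpar);   % step 3: krige the differences
    XC  = XU + dXI;                                           % step 4: conditioned realization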

5.1 Unconditional Simulation

An unconditional simulation is a second-order stationary realization of a random variable with a given mean value and auto-covariance function. Although such realizations are possible with any auto-covariance function, the band-limited “fractal” von Kármán model (see section 2.3) has been applied with particular success in many fields (Goff and Jordan, 1988; Holliger, 1996; Holliger and Levander, 1993) and will also be used here. There exist different possibilities to obtain an unconditional simulation (Journel and Huijbregts, 1978). A straightforward and efficient way to perform an unconditional simulation is the spectrum method (Christakos, 1992). With the increasing availability of powerful desktop computers, such realizations using two-dimensional (Goff and Jennings, 1999) and three-dimensional (Chemingui, 2001) discrete Fourier transforms have become very attractive. Unconditional stochastic simulations of this type are performed by taking the inverse discrete Fourier transform (IFFT) of the amplitude spectrum A of the considered stochastic process:

X_U(\vec{u}) = \mathrm{IFFT}\left[A(\vec{k})\, e^{i 2\pi \varphi(\vec{k})}\right],    (5.1)

where \vec{k} is the wavenumber. The phase value φ is a uniformly distributed random number sampled between 0 and 1. As discussed in section 2.2, the spectral density function is the Fourier transform of the correlation function of a variable; it is thus also the square of the amplitude spectrum. Therefore, a dataset with an auto-covariance function C(\vec{r}) can be generated by computing the spectral density function of the auto-covariance function and taking its square root to obtain the amplitude spectrum in equation 5.1. For the von Kármán family of auto-covariance functions, the amplitude spectrum can be calculated analytically. The Fourier transform of the von Kármán auto-covariance function (equation 2.3), and therefore also the spectral density function in E-dimensional space, is given by (Holliger, 1996):

P_{hh}(\vec{k}) = \frac{\sigma_h^2\,(2\sqrt{\pi}\,a)^E\,\Gamma(\nu + E/2)}{\Gamma(\nu)\,(1 + \vec{k}^2 a^2)^{\nu + E/2}},    (5.2)

where a is the correlation length. For the two-dimensional case (E = 2) this yields:

P_{hh\text{-}2D}(\vec{k}) = \frac{4\pi a \nu \sigma_h^2}{\left[1 + (\vec{k}a)^2\right]^{\nu + 1}}.    (5.3)


As the final variance can be adjusted by simple scaling, the variance in equation 5.3 can be chosen as σ_h² = 1/(4πaν). The amplitude spectrum of the unconditional simulation with a von Kármán covariance function then becomes:

|A(\vec{k})| = \left[1 + (\vec{k}a)^2\right]^{-(\nu + 1)/2}.    (5.4)

An unconditional simulation can now be realized following equation 5.1. The dataset has to be normalized with regard to mean, amplitude and variance after the inverse discrete Fourier transformation. Examples of unconditional simulations based on von Kármán auto-covariance functions are shown in Figures 4.11 and 4.15.
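The following MATLAB sketch illustrates the spectrum method for a 2-D von Kármán field with the parameters of Figure 4.11. It is a simplified illustration rather than the exact recipe used for the models in this thesis: the cyclic wavenumber convention, the way the anisotropy enters and the use of real() instead of enforcing Hermitian symmetry of the spectrum are assumptions.

    % Sketch of the spectrum method (equations 5.1 and 5.4); grid spacing = 1 point.
    nx = 200; ny = 200;                       % model size in points
    nu = 0.5; ax = 100; kany = 10; ay = ax/kany;
    kx = (-nx/2:nx/2-1)/nx;                   % wavenumbers in cycles per point
    ky = (-ny/2:ny/2-1)/ny;
    [KX, KY] = meshgrid(kx, ky);
    ka2 = (2*pi*ax*KX).^2 + (2*pi*ay*KY).^2;  % anisotropic (k*a)^2
    A   = (1 + ka2).^(-(nu+1)/2);             % amplitude spectrum, equation 5.4
    phi = rand(ny, nx);                       % uniform random phase in [0,1)
    XU  = real(ifft2(ifftshift(A .* exp(1i*2*pi*phi))));
    XU  = (XU - mean(XU(:))) / std(XU(:));    % normalize mean and variance

The final two lines correspond to the normalization of mean, amplitude and variance mentioned above.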

5.2 Conditional Simulations of Porosity Distributions in Heterogeneous Aquifers

In hydrology, detailed knowledge of the spatial porosity distribution is a prerequisite for constraining the permeability structure, which in turn allows one to simulate the flow and transport properties of an aquifer. Porosity data are obtained from borehole logs, which have a high resolution in the vertical dimension but generally a very low lateral coverage. This results in a correspondingly vague interpretation of the region between the boreholes. Various geophysical methods, such as resistivity sounding, crosshole georadar and seismic tomography, can be used to improve our knowledge of the lateral porosity distribution. However, due to their limited resolution, these methods tend to produce an overly smooth picture of the subsurface. As flow and transport simulations are strongly influenced by the local variability of porosity data (Hassan, 1998), conditional simulations based on borehole logs and geophysical measurements are critical for improving such simulations.

I have produced conditional simulations for two different sites. At both sites porosity borehole logs and crosshole georadar tomography data were available. In addition, crosshole seismic data were available at one site. To perform a conditional simulation with these geophysical data, the following steps were necessary:

1. Estimate the ν-value, vertical correlation length, and variance from the porosity logs.

2. Convert the tomographic data to porosity and estimate the structural aspect ratio and its orientation.

3. “Arbitrarily” fix the vertical or horizontal correlation lengths so that the structure is scale-invariant at the considered model range.

4. Subsample the spatial porosity structure to the resolution of the tomographic data.

5. Scale the porosity field derived from the tomographic data to match the variance of the porosity logs.

6. Perform the conditional simulation of the spatial porosity structure at the desired grid spacing.

7. Repeat the previous step several times for different unconditional realizations to obtain an estimate of the bandwidth of the variability.

5.2.1 Boise

Based on crosshole georadar tomography and porosity logs of two nearby boreholes at the Boise Hydrogeophysical Research Site (BHRS), a conditional simulation of the porosity structure has been performed. The BHRS is a testing ground for geophysical and hydrological methods in an unconfined alluvial aquifer near Boise, Idaho. The crosshole georadar tomography data and the porosity logs were provided by Tronicke et al. (2003). To obtain a conditional simulation from the tomographic data and the porosity logs, the following parameters had to be defined:

ν-value
To construct an unconditional simulation, it is critical to estimate the variability of the simulated medium. This can be done on the basis of the sampled values or by using a priori information. The use of an a priori ν-value in the case of porosity simulations seems feasible, as many different analyses of porosity have shown the spatial distribution to behave uniformly as fractional Gaussian noise (fGn) with a ν-value close to zero. A straightforward method to estimate


Figure 5.1: Crosshole georadar tomography between boreholes C5 and C6 at the Boise Hydrogeophysical Research Site (BHRS). Left: velocity in m/µs; right: attenuation in 1/m. After Tronicke et al. (2003).



Figure 5.2: Porosity logs for boreholes C5 and C6 at BHRS. The sample interval of both logs is 0.06 m. After Tronicke et al. (2003).

the ν-value of a stochastic time series is to analyze the behavior of its spectral density function. The analysis is done here for the densely sampled porosity logs rather than for the sparsely sampled, smoothed tomographic data. The spectral density function P of a scale-invariant sequence scales with angular frequency ω as (Hardy and Beier, 1994; Holliger, 1996):

P \propto \omega^{\beta},    (5.5)

where β lies between −1 and −3. If the logarithm of the spectral density is plotted against the logarithm of the frequency, β denotes the slope of the linearly decaying spectral density:

\ln(P) \propto \beta \cdot \ln(\omega).    (5.6)

Figure 5.3 shows a double-logarithmic plot of the spectral density of one of the BHRS porosity logs. A linear regression of the slope indicates that β is around −0.97. The relationship between the slope of the spectral density and the ν-value is (Hardy and Beier, 1994; Holliger, 1996):

\beta = -(2\nu + 1).    (5.7)

This results in a ν-value of −0.015, which is approximated as ν = 0.


Figure 5.3: Double-logarithmic plot of the spectral density of the porosity log in borehole C5. A linear regression of the slope suggests that β is around −0.97. The medium is therefore expected to be rough and complex, with a ν-value close to zero.
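A minimal MATLAB sketch of this spectral estimate is given below. It is illustrative only: phi_log is a hypothetical variable holding the digitized porosity log, and the use of a raw periodogram with a simple least-squares fit in log-log space is an assumption, as the thesis does not specify the exact estimator.

    % Sketch of the beta and nu estimate from a porosity log (equations 5.5-5.7).
    dz   = 0.06;                               % sample interval in m (BHRS logs)
    phi0 = phi_log - mean(phi_log);            % remove the mean
    n    = numel(phi0);
    P    = abs(fft(phi0)).^2;                  % periodogram as spectral density estimate
    f    = (1:floor(n/2))'/(n*dz);             % positive frequencies
    w    = 2*pi*f;                             % angular frequencies
    p    = polyfit(log(w), log(P(2:floor(n/2)+1)), 1);
    beta = p(1);                               % slope of the log-log regression
    nu   = -(beta + 1)/2;                      % from beta = -(2*nu + 1)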

Correlation length and structural aspect ratio
The estimation of the correlation length cannot be done very accurately, for different reasons. The limited amount of available data can cause scaling effects (Western and Blöschl, 1999), and the vicinity of the boreholes may be disturbed during the drilling process or be washed out by drilling fluids. It is also possible that the backfill of the casing is measured instead of the in situ porosity of the sediments. The analysis of the auto-covariance function of the boreholes results in correlation lengths of 2.4 m, which is about 1/10 of the measurement scale and hence points to a scaling effect as discussed by Gelhar (1993) and Western and Blöschl (1999). Another way to estimate the correlation structure is to plot the tomography data and estimate the horizontal and vertical size of evident structures. Such a fit-by-eye yields a dominant scale of about 2 m in the vertical and 10 m in the horizontal direction, and therefore an anisotropy ratio k_any of 5. To achieve a dataset that is self-affine at scales on the order of the investigated region, the correlation lengths were defined to be ten times larger, with a = 100 m in the horizontal and a = 20 m in the vertical direction.

Conversion of georadar velocities to porosity
Many different conversions of georadar velocities to porosity exist. Tronicke et al. (2003) found that the conversion based on a two-component mixing model (Wharton et al., 1980) for water-saturated media provides good results for this site. Porosity Φ is therefore a function of the relative permittivity of the matrix ε_r^m, of water ε_r^w, and of the measured relative permittivity ε_r:

\Phi = \frac{\sqrt{\varepsilon_r} - \sqrt{\varepsilon_r^m}}{\sqrt{\varepsilon_r^w} - \sqrt{\varepsilon_r^m}}.    (5.8)

The values used for ε_r^m and ε_r^w are 4.6 and 80, respectively. The measured relative permittivity can be calculated from the velocity tomogram using the high-frequency approximation:

\varepsilon = \frac{1}{\mu v^2},    (5.9)

where v is the electromagnetic velocity and µ is the magnetic permeability. The latter can be assumed to be equal to the magnetic permeability of vacuum (µ₀ = 4π · 10⁻⁷ Vs A⁻¹ m⁻¹), as rocks and soils are generally non-magnetic.
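A short MATLAB sketch of this conversion is given below; v is assumed to hold the georadar velocity tomogram in m/s (the tomogram in Figure 5.1 is given in m/µs, so those values would have to be multiplied by 1e6 first).

    % Sketch of the velocity-to-porosity conversion (equations 5.8 and 5.9).
    mu0   = 4*pi*1e-7;                       % magnetic permeability of vacuum
    eps0  = 8.854e-12;                       % permittivity of vacuum
    eps_r = 1 ./ (mu0 * eps0 * v.^2);        % relative permittivity (eq. 5.9 divided by eps0)
    eps_m = 4.6;  eps_w = 80;                % matrix and water permittivities from the text
    phi   = (sqrt(eps_r) - sqrt(eps_m)) ./ (sqrt(eps_w) - sqrt(eps_m));   % eq. 5.8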

Subsampling spatial porosity structure
Although the tomographic data give the impression of a very densely sampled image of the subsurface (Figure 5.1), the actual resolution is much lower. The limit of the resolution depends systematically on the wavelength λ and the propagation distance L. The smallest feature that can be recovered is expected to be of the order of (Williamson and Worthington, 1993):

r_{min} \sim \sqrt{L\lambda}.    (5.10)

In this case, the two boreholes are about 10 meters apart. The tomographic data were recorded with a georadar system with a nominal center frequency of 250 MHz. The actually recorded center frequency, however, is around 100 MHz, and the bandwidth of the signal is about two octaves (50–200 MHz). The crucial wavelength for the tomographic resolution seems to be the shortest recorded wavelength, which is in this case approximately half a meter. The resolution also depends on the propagation distance and is therefore poorest in the center of the tomographic image, with a value of somewhat more than one meter. On the other hand, the ray coverage is much better in the center of the tomogram than near the boreholes. As a result of the inversion process, changes in the tomographic image are inherently smooth, and a regular spacing of one meter for subsampling the tomography data for the conditional simulation seems adequate. Figure 5.4 shows the locations of the sampled points to which the unconditional simulation has to be adapted.

Scale tomography data to variance of porosity logs
The assumption of second-order stationarity implies a constant variance over the entire experimental region. It is likely that the variance of the tomographic data has been reduced due to the damping and smoothing constraints used in the inversion process. In contrast, the porosity values measured in the boreholes are assumed to represent in situ porosity values and, for this reason, also the correct variance. To perform the conditional simulation, the variance of the porosity values derived from the crosshole georadar tomography has therefore been scaled to match the variance of the porosity logs.

The conditional simulation was performed for the porosity logs only (Figure 5.5) and for the porosity logs in conjunction with the porosity values derived from the crosshole georadar tomography (Figure 5.6). In both figures, a zone of very high


Figure 5.4: Unconditional simulation between boreholes C5 and C6. Locations of the sampled data are marked with circles for the porosity logs and crosses for the tomography data.

porosity above four meters depth is visible. The origin of this feature is the very high values in the porosity log of borehole C5. As these porosity values are much higher than the surrounding values and prevail only over a short depth range, there is a fair possibility that the measurement has been disturbed. At a depth of six meters, the log of borehole C5 again predicts high porosity values, which are not present in the tomography data. This is therefore again most likely a local disturbance of the porosity log. Conversely, the high porosity zones visible in borehole C6 at 11 and 13 meters seem to be real features. If only the borehole logs are used for the conditional simulations, it seems that two independent high porosity zones connect the two boreholes at 11 m and 15 m depth, whereas the tomographic information indicates that these two peaks rather belong to one large zone of high average porosity. Further realizations of the conditional simulations can be found in Appendix B.


Figure 5.5: Conditional simulation using porosity logs only.


Figure 5.6: Conditional simulation using both porosity logs and crosshole georadar tomography data. The same realization of the unconditional simulation was used as in Figure 5.5.


5.2.2 Kappelen


Figure 5.7: Crosshole georadar tomography between boreholes K4, K3, K2 and K8 at the hydrogeological test site in Kappelen. The panels show velocity (in m/ns) and attenuation (in 1/m). After Tronicke et al. (2002).


Figure 5.8: Crosshole seismic tomography between boreholes K4, K3, K2 and K8 at the hydrogeological test site in Kappelen (velocity in m/ms). Courtesy of H. Paasche (unpublished data).

The second field dataset for a conditional simulation of an aquifer porosity structure is from the hydrogeological test site in Kappelen, Canton Berne, Switzerland, where the Centre d'Hydrogéologie de l'Université de Neuchâtel (CHYN) has drilled 16 boreholes. The test site and several pump and tracer tests are described by Probst and Zojer (2001). The neutron porosity logs used for conditioning were digitized from Hacini (2002). Tronicke et al. (2003) provided the georadar crosshole tomography, from which porosity values for conditioning between the boreholes were derived. The corresponding seismic tomography data have not yet been published, but were



Figure 5.9: Neutron porosity logs of boreholes K4, K3, K2 and K8 at the Kappelen hydrogeological test site. The logs are irregularly spaced due to manual digitization. After Hacini (2002).

kindly provided by H. Paasche. The conditional simulation was obtained on a section between four boreholes lying on a straight line (K4, K3, K2 and K8). The parameters for the conditional simulation were compiled in the same way as for the BHRS (see section 5.2.1). For the spatial variability, again a ν-value of 0 was estimated. The horizontal correlation length was assumed to be 200 m with an aspect ratio k_any = 5. For the georadar crosshole tomography, the spatial resolution was estimated to be one meter. The variance of the porosity values derived from the tomography was adjusted to that of the porosity logs, which are assumed to represent in situ porosity values. The conversion of georadar tomography velocities to porosity was performed with the two-component mixture model of Wharton et al. (1980) as well as with Topp's equation (Topp et al., 1980):

\Phi = -5.3\times 10^{-2} + 2.92\times 10^{-2}\,\varepsilon_r - 5.5\times 10^{-4}\,\varepsilon_r^2 + 4.3\times 10^{-6}\,\varepsilon_r^3.    (5.11)

The results obtained with the two conversion methods are quite similar. With Topp's equation the resulting porosity values are on average 2% larger than those obtained with the two-component mixture model. Somewhat surprisingly, the porosity values derived from the georadar tomography velocities are significantly higher than the values of the neutron porosity logs: the mean porosity value derived from the georadar data is 29%, whereas the mean porosity from the logs is only 16%. It should be noted that the average porosity estimated from the crosshole georadar data is rather in line with the expected porosity of an unconsolidated alluvial aquifer than that obtained from the porosity logs (Schön, 1996). For the conversion of seismic tomography velocities to porosity values, the empirical time-average equation was used (Wyllie et al., 1958):

\Phi = \frac{\Delta t - \Delta t_m}{\Delta t_f - \Delta t_m},    (5.12)

where Δt = 1/V_p is the slowness of the P-wave, V_m = 1/Δt_m is the velocity of the rock matrix and V_f = 1/Δt_f is the velocity of the fluid that fills the pore space. Based on specialized seismic velocity tables (Schön, 1996), the velocities were assumed to be V_f = 1400 m/s for the fluid and V_m = 5200 m/s for the rock matrix. This results in porosity values with an average of 46%, which is very likely too high. Wyllie's equation is, however, known to yield too high porosity values for uncompacted formations (Schön, 1996, p. 233). A correction factor can be adopted to take the effects of compaction or of pressure and temperature into account. Unfortunately, this information is not presently available for the Kappelen test site, and an adaptation of the porosity values to the neutron logs or to the georadar crosshole tomography data does not seem to be an adequate solution.

Figure 5.10 shows a conditional simulation constrained by the neutron logs, whose porosity values are likely too low, and by porosity values derived from the seismic crosshole tomography, which are likely too high. The upper corner frequency of the seismic data was about 800 Hz and the average velocity about 2400 m/s, which indicates that a subsampling interval of 2 m is adequate. It is obvious that the two datasets do not agree, as pillows of high porosity are embedded between the generally low values surrounding the boreholes. This result indicates that the two datasets are entirely inconsistent. We also see that the more densely sampled region around the boreholes has only a local effect and does not unduly influence the more sparsely sampled regions.

Figure 5.11 shows a conditional simulation for the neutron logs only. The probed subsurface seems to be quite homogeneous, so that only few structures can be discerned. The most obvious feature is a zone of higher porosity, which is most apparent in Figure 5.12, where the georadar crosshole tomography porosity values are also used as conditioning data. At borehole K8 it is about four meters thick and its top is located at a depth of about eight meters. Toward borehole K4 its thickness is reduced to approximately two meters and its top is located at a depth of about five meters. Figure 5.14 shows a conditional simulation of the seismic crosshole tomography data only. The high porosity structure can also be seen here, even though it is not as pronounced as in the georadar data. In contrast, a high porosity zone between boreholes K4 and K3, with its top at about ten meters depth and a thickness of about four meters, is very distinct in the seismic crosshole tomography data and can also be seen in the georadar data, but is not present in the neutron logs. Overall, the datasets are not consistent, and more information is needed before a consistent interpretation of the porosity structure of this aquifer can be obtained. Appendix C shows additional realizations of the conditional simulations.
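For reference, the two conversions applied to the Kappelen data can be written in a few lines of MATLAB. This is a sketch only: eps_r and Vp are assumed to hold the georadar-derived relative permittivity and the seismic P-wave velocity (in m/s), respectively, and the fluid and matrix velocities are the values quoted above.

    % Sketch of Topp's equation (5.11) and the time-average equation (5.12).
    phi_topp   = -5.3e-2 + 2.92e-2*eps_r - 5.5e-4*eps_r.^2 + 4.3e-6*eps_r.^3;
    Vf = 1400; Vm = 5200;                                  % fluid and matrix velocities in m/s
    phi_wyllie = (1./Vp - 1/Vm) ./ (1/Vf - 1/Vm);          % slownesses: dt = 1/Vp, etc.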


Figure 5.10: Conditional simulation between boreholes K4, K3, K2 and K8 conditioned by porosity data from neutron logs and crosshole seismic tomography.


Figure 5.11: Conditional simulation between boreholes K4, K3, K2 and K8 conditioned by borehole logs only.


Figure 5.12: Conditional simulation between boreholes K4, K3, K2 and K8 conditioned by porosity data from neutron logs and crosshole georadar tomography.


Figure 5.13: Conditional simulation between boreholes K4, K3, K2 and K8 conditioned by porosity data from crosshole georadar tomography only.


Figure 5.14: Conditional simulation between boreholes K4, K3, K2 and K8 conditioned by porosity data from crosshole seismic tomography only.

Chapter 6

Conclusions

A program to perform ordinary kriging in two dimensions has been developed. Kriging interpolation cannot be used in the same way as standard interpolation procedures, but has to be adapted to the considered dataset, at least in terms of structural anisotropy (section 4.2), auto-covariance function (sections 2.2 and 4.5) and search neighborhood (sections 2.4.1 and 4.3). To achieve optimal results, the user is expected to understand the code and the theory behind it and to be able to adapt it to his purposes. The objective in developing this program has therefore been to produce well-structured, understandable and easily expandable code. A documentation of the program, called vebyk, is given in Chapter 3.

The implementation was tested on stochastic models characterized by auto-covariance functions of the von Kármán type. Somewhat surprisingly, optimal results were obtained by using larger ν-values for the interpolation than for the generation of the stochastic models. The reason for this is that the auto-covariance function of the experimental dataset is not conserved in the interpolation, and the interpolated medium is always considerably smoother than the input dataset (section 4.5). Kriging of so-called anti-persistent models with auto-covariance functions having ν-values smaller than 0.5 produced artefacts that become more pronounced with decreasing ν-values. The predictability of such stochastic models decreases with decreasing ν-values. For power spectral exponents corresponding to β < 1 the stochastic models become entirely unpredictable, and the best forecast for an unknown value is the expected value of the model. Therefore, the estimated values calculated by kriging progressively approach the expected value of the search neighborhood as the ν-value of the auto-covariance function of the stochastic model decreases. The sampled values, in contrast, tend to differ from the expected value of the search neighborhood, as the roughness of the medium increases with decreasing ν-values. This systematic discrepancy between sampled and estimated values gives rise to the observed artefacts.

In the second part of this work, the implemented kriging algorithm was applied to perform conditional simulations with geophysical data from the hydrological test sites of Boise, Idaho, USA, and Kappelen, Berne, Switzerland. The impacts of the ν-values, the correlation lengths of the simulated models and the subsampling of the spatial structure to the tomographic resolution were estimated. The conversion of tomographic velocities to porosity values was performed with Wharton's two-component mixture model and Topp's equation


for crosshole georadar data, and with Wyllie's time-average equation for crosshole seismic data. Unconditional simulations were generated by the spectrum method and were adapted to the conditioning data through kriging interpolation. Whereas the conversion of the georadar crosshole tomographic data produced good results for the BHRS, and presumably for Kappelen as well, the conversion of the seismic velocities yielded too high porosity values, due to the unknown effects of compaction, pressure and temperature. This is a well-known shortcoming of Wyllie's equation for unconsolidated sediments. A calibration to the neutron porosity logs also did not seem to be adequate. The conditional simulations show that secondary or “soft” data such as georadar or seismic crosshole tomography can be successfully incorporated in an aquifer simulation if the data are consistent with the primary information, such as neutron porosity logs, as was the case for the BHRS and, in part, also for the Kappelen site.

Conditional simulation not only allows one to simulate facies properties or to integrate secondary data, but also opens the possibility of assessing the uncertainties of the modeled aquifer structure. A conditional simulation is not a deterministic method like kriging; it offers many different stochastic solutions. It is therefore possible to establish a probability distribution rather than a single deterministic estimate, by assuming that the different realizations characterize unbiased and adequate values in a “space of uncertainty”. This also allows one to predict minimal and maximal deviations from a “best estimate” as provided by a deterministic estimation technique such as kriging. Stochastic porosity models conditioned by “hydrogeophysical” data thus allow for a better understanding of the connectivity between porous and non-porous zones in an aquifer and for improving flow simulations in comparison to a smooth deterministic model.

Chapter 7

Acknowledgments

I am very grateful to my parents for all their efforts. I thank Klaus Holliger for comments and references that guided me through this work and for the patience to correct my English. I also thank Jens Tronicke for the crosshole georadar data of the BHRS site and Hendrick Paasche for providing me with the seismic and georadar crosshole tomography data for the Kappelen site.


Appendix A

Kriging

Kriging of a stochastic model with a von Kármán auto-covariance function with variance σ² = 1, ν = 0.5, horizontal correlation length ax = 100 m and an anisotropy factor k_any = 10. The model is shown in Figure 4.11. Figures A.1 through A.3 show kriging interpolations, assuming every fourth point to be an observation, with different ν-values for the von Kármán auto-covariance function.


Figure A.1: Result of kriging interpolation for an auto-covariance function with ν = 0.2. The other parameters for the auto-covariance function are the same as those of the input model.



Figure A.2: Result of kriging interpolation for an auto-covariance function with the same parameters as for the auto-covariance function of the input model.


Figure A.3: Result of kriging interpolation for an auto-covariance function with ν = 0.9, which turned out to give the best results.


In addition to the cross-validation in section 4.5, the stochastic model with the von Kármán auto-covariance function with ν = 0.5 and a horizontal correlation length ax = 100 m has been cross-validated with von Kármán functions that have different correlation lengths. Figures A.4 through A.6 show the sum of the absolute errors, and the “x” denotes the location of the global minimum.


Figure A.4: Cross-validation with correlation length ax = 20 m.

Figure A.5: Cross-validation with correlation length ax = 50 m.


Figure A.6: Cross-validation with correlation length ax = 200 m.

Appendix B

Boise

Appendix B presents different realizations of the conditional simulations for the BHRS. The kriged version of the observed data is shown as well; it represents the dataset before the unconditional realization is added. The data values of this kriged interpolation are of qualitative character only, as the unconditional realization at the observed points is added and the mean of the region is subtracted. The kriging interpolation between the logs only is shown in Figure 4.7. The “seed” is the initialization value for the Matlab random number generator, which allows the same “random” dataset to be regenerated. This is useful for comparing conditional simulations with different input datasets. The seed for the conditional simulations used in section 5.2.1 was 41. Figures B.4 through B.6 show the impact of the magnitude of the correlation lengths on the conditional simulations. Here the horizontal and vertical correlation lengths are 10 m and 2 m, respectively, which is about an order of magnitude smaller.
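Fixing the seed is a one-line operation in MATLAB; the exact call depends on the release, so the lines below are a generic sketch rather than the vebyk code itself.

    rand('state', 41);   % seeding syntax of the MATLAB releases of that era
    % rng(41);           % equivalent call in current MATLAB releases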


Figure B.1: Conditional simulation for BHRS. Only logs were used as conditioning data. Random number seed = 41.

Figure B.2: Conditional simulation for BHRS. Only logs were used as observed data. Random number seed = 1203.


Figure B.3: Conditional simulation for BHRS. Only logs were used as observed data. Random number seed = 1275.

Figure B.4: Conditional simulation for BHRS. Only logs were used as observed data. The correlation lengths were chosen to be an order of magnitude smaller than for the simulations shown in Figures B.1 to B.3 (ax = 10 m, ay = 2 m). Random number seed = 41. Compare to Figure B.1.

Figure B.5: Conditional simulation for BHRS. Only logs were used as observed data. The correlation lengths were chosen to be an order of magnitude smaller than for the simulations shown in Figures B.1 to B.3 (ax = 10 m, ay = 2 m). Random number seed = 1203. Compare to Figure B.2.

Figure B.6: Conditional simulation for BHRS. Only logs were used as observed data. The correlation lengths were chosen to be an order of magnitude smaller than for the simulations shown in Figures B.1 to B.3 (ax = 10 m, ay = 2 m). Random number seed = 1275. Compare to Figure B.3.


Figure B.7: Kriging interpolation of the observed data, consisting of log and georadar tomography data.

Figure B.8: Conditional simulation for the BHRS site. Log and georadar tomography data were used as conditioning data. Random number seed = 41.

Figure B.9: Conditional simulation for the BHRS site. Log and georadar tomography data were used as conditioning data. Random number seed = 1203.

Figure B.10: Conditional simulation for the BHRS site. Log and georadar tomography data were used as conditioning data. Random number seed = 1275.

Appendix C

Kappelen

Appendix C presents different realizations of the conditional simulations for the Kappelen dataset. The kriged version of the observed data is shown as well; it represents the dataset before the unconditional realization is added. The data values of this interpolation are of qualitative character only, as the unconditional simulation at the observed points is added and the mean of the region is subtracted. The “seed” is the initialization value for the Matlab random number generator, which allows the same “random” dataset to be generated again. This is useful for comparing conditional simulations with different input datasets. The seed for the conditional simulations used in section 5.2.2 was 54.

Figure C.1: Kriging interpolation using logs only.


Figure C.2: Conditional simulation for Kappelen site. Only logs were used as conditioning data. Random number seed = 43.

Figure C.3: Conditional simulation for Kappelen site. Only logs were used as conditioning data. Random number seed = 1203

Figure C.4: Conditional simulation for Kappelen site. Only logs were used as conditioning data. Random number seed = 1275


Figure C.5: Kriging interpolation of the observed dataset, consisting of log and georadar tomography data.

Figure C.6: Conditional simulation for Kappelen site. Log and georadar tomography data were used as conditioning constraints. Random number seed = 43

Figure C.7: Conditional simulation for Kappelen site. Log and georadar tomography data were used as conditioning constraints. Random number seed = 1203


Figure C.8: Conditional simulation for the Kappelen site. Log and georadar tomography data were used as conditioning constraints. Random number seed = 1275.

Figure C.9: Kriging interpolation of the observed dataset consisting of log and crosshole seismic tomography data. The two datasets obviously do not coincide with each other.

Figure C.10: Kriging interpolation of the observed dataset consisting of crosshole seismic tomography data only.


Figure C.11: Conditional simulation for the Kappelen site. Only seismic tomography data were used as conditioning constraints. Random number seed = 43.

Figure C.12: Conditional simulation for the Kappelen site. Only seismic tomography data were used as conditioning constraints. Random number seed = 1203.

Figure C.13: Conditional simulation for the Kappelen site. Only seismic tomography data were used as conditioning constraints. Random number seed = 1275.

List of Figures

2.1 Covariance and semi-variogram
2.2 von Kármán covariance functions
3.1 Hierarchical structure of vebyk subfunctions
3.2 Anisotropy ellipsoid
4.1 Example arrangement
4.2 Kriging weights for too small a search neighborhood
4.3 Kriging weights for a larger search neighborhood
4.4 Kriging weights for an anisotropic search neighborhood
4.5 High-frequency oscillations caused by too small a search neighborhood
4.6 Artefacts generated by a simple search neighborhood
4.7 Kriging with a quadrant search neighborhood
4.8 Crossplot of true vs. estimated value
4.9 Crossplot of true value vs. corresponding estimation error
4.10 Spatial distribution of errors in cross-validation
4.11 Stochastic model with ν = 0.5
4.12 Subsampled input model for jackknifing
4.13 Kriging estimation of subsampled input model
4.14 Spline interpolation of model with ν = 0.5
4.15 Stochastic model with ν = 0
4.16 Kriging estimation of regularly spaced input model
4.17 Spline interpolation of model with ν = 0
4.18 Kriging estimation of randomly spaced input model
4.19 Best fitting ν-values
4.20 Sum of absolute error for ν = 0
4.21 Sum of absolute error for ν = 0.5
4.22 Sum of absolute error for ν = 0.7
5.1 Tomography of Boise, ID
5.2 Boise porosity logs
5.3 Spectral density of porosity logs
5.4 Locations of sampled data for Boise
5.5 Conditional simulation for Boise with borehole data only
5.6 Conditional simulation for Boise with boreholes and georadar porosity data
5.7 Georadar tomography of Kappelen
5.8 Seismic tomography of Kappelen
5.9 Porosity logs of Kappelen
5.10 Conditional simulation for Kappelen with borehole and seismic porosity data
5.11 Conditional simulation for Kappelen with borehole data only
5.12 Conditional simulation for Kappelen with borehole and georadar porosity data
5.13 Conditional simulation for Kappelen with georadar porosity data only
5.14 Conditional simulation for Kappelen with seismic porosity data only
A.1 Kriging with ν = 0.2
A.2 Kriging with ν = 0.5
A.3 Kriging with ν = 0.9
A.4 Kriging with different correlation length
A.5 Kriging with different correlation length
A.6 Kriging with different correlation length
B.1 Conditional simulation for BHRS logs
B.2 Conditional simulation for BHRS logs
B.3 Conditional simulation for BHRS logs
B.4 Conditional simulation for BHRS logs with short correlation length
B.5 Conditional simulation for BHRS logs with short correlation length
B.6 Conditional simulation for BHRS logs with short correlation length
B.7 Kriged observed dataset of log and georadar data at BHRS
B.8 Conditional simulation for BHRS log and georadar data
B.9 Conditional simulation for BHRS log and georadar data
B.10 Conditional simulation for BHRS log and georadar data
C.1 Kriged dataset of log only at Kappelen
C.2 Conditional simulation for Kappelen logs
C.3 Conditional simulation for Kappelen logs
C.4 Conditional simulation for Kappelen logs
C.5 Kriged observed dataset of log and georadar data at Kappelen
C.6 Conditional simulation for Kappelen log and georadar data
C.7 Conditional simulation for Kappelen log and georadar data
C.8 Conditional simulation for Kappelen log and georadar data
C.9 Kriged observed dataset of log and seismic data at Kappelen
C.10 Kriged observed dataset at Kappelen of seismic data only
C.11 Conditional simulation for Kappelen with seismic data only
C.12 Conditional simulation for Kappelen with seismic data only
C.13 Conditional simulation for Kappelen with seismic data only

Bibliography

Armstrong, M., Basic Linear Geostatistics, Springer, 1998.
Bendat, J. S. and Piersol, A. G., Random Data, John Wiley & Sons, third edn., 2000.
Chemingui, N., Modeling 3-D anisotropic fractal media, Tech. Rep. 80, Stanford Exploration Project, 2001.
Christakos, G., Random Field Models in Earth Sciences, Academic Press Inc., 1992.
de Boor, C., A Practical Guide to Splines, Springer, 1978.
Gelhar, L. W., Stochastic Subsurface Hydrology, Prentice Hall, 1993.
Goff, J. A. and Jennings, J. W., Jr., Improvement of Fourier-based unconditional and conditional simulations for band-limited fractal (von Kármán) statistical models, Mathematical Geology, 31, 627–649, 1999.
Goff, J. A. and Jordan, T. H., Stochastic modeling of seafloor morphology: Inversion of Sea Beam data for second-order statistics, Journal of Geophysical Research, 93, 13589–13608, 1988.
Hacini, Y., Contribution à l'étude géophysique et hydrogéologique du site test de Kappelen (BE) à l'aide des diagraphies, Master's thesis, Université de Lausanne, 2002.
Hardy, H. H. and Beier, R. A., Fractals in Reservoir Engineering, World Scientific Publishing Co., 1994.
Hassan, A. E., Significance of porosity variability to transport in heterogeneous porous media, Water Resources Research, 34, 2249–2259, 1998.
Hergarten, S., Self-Organized Criticality in Earth Systems, Springer, 2003.
Holliger, K., Upper crustal seismic velocity heterogeneity as derived from a variety of P-wave sonic logs, Geophysical Journal International, 125, 813–829, 1996.
Holliger, K. and Levander, A. R., Stochastic modeling of the reflective lower crust: Petrophysical and geological evidence from the Ivrea zone (northern Italy), Journal of Geophysical Research, 98, 11967–11980, 1993.
Isaaks, E. H. and Srivastava, R. M., An Introduction to Applied Geostatistics, Oxford University Press, 1989.
Journel, A. G. and Huijbregts, C. J., Mining Geostatistics, Centre de Géostatistique Fontainebleau, France, 1978.
Kelkar, M. and Perez, G., Applied Geostatistics for Reservoir Characterization, Society of Petroleum Engineers, Richardson, Texas, 2002.
Kitanidis, P. K., Introduction to Geostatistics, Cambridge University Press, 1997.
Klimeš, L., Correlation functions of random media, Pure and Applied Geophysics, 159, 1811–1831, 2002.
Krige, D. G., A statistical approach to some basic mine valuation problems on the Witwatersrand, Journal of the Chemical, Metallurgical and Mining Society of South Africa, 52, 119–139, 1951.
Luenberger, D. G., Linear and Nonlinear Programming, Addison-Wesley, second edn., 1984.
Matheron, G., Les Variables Régionalisées et leur Estimation, Masson et Cie, 1965.
Papula, L., Mathematische Formelsammlung für Ingenieure und Naturwissenschaftler, Vieweg, 1994.
Probst, M. and Zojer, H., Tracer studies in the unsaturated zone and groundwater (investigations 1996–2001), Beiträge zur Hydrogeologie, 52, 3–232, 2001.
Schön, J. H., Physical Properties of Rocks: Fundamentals and Principles of Petrophysics, Pergamon, 1996.
Topp, G. C., Davis, J. L., and Annan, A. P., Electromagnetic determination of soil water content: Measurements in coaxial transmission lines, Water Resources Research, 16, 574–582, 1980.
Tronicke, J., Paasche, H., Holliger, K., and Green, A., Combining crosshole georadar velocity and attenuation tomography for site characterization: A case study in an unconsolidated aquifer, in 9th International Conference on Ground Penetrating Radar, edited by S. K. Koppenjan and H. Lee, vol. 4758, pp. 170–175, Proceedings of SPIE, 2002.
Tronicke, J., Holliger, K., Barrash, W., and Knoll, M. D., Multivariate analysis of crosshole georadar velocity and attenuation tomograms for aquifer zonation, submitted to Water Resources Research, 2003.
von Kármán, T., Progress in the statistical theory of turbulence, Journal of Marine Research, 7, 252–264, 1948.
Western, A. W. and Blöschl, G., On the spatial scaling of soil moisture, Journal of Hydrology, 217, 203–224, 1999.
Wharton, R. P., Rau, R. N., and Best, D. L., Electromagnetic propagation logging: Advances in technique and interpretation, in SPE 9267, American Institute of Mining, Metallurgical, and Petroleum Engineers, 1980.
Williamson, P. R. and Worthington, M. H., Resolution limits in ray tomography due to wave behavior: Numerical experiments, Geophysics, 58, 727–735, 1993.
Wyllie, M. R. J., Gregory, A. R., and Gardner, G. H. F., An experimental investigation of factors affecting elastic wave velocities in porous media, Geophysics, 23, 459–493, 1958.
