P1: FLF Mathematical Geology [mg]

PL098-17

September 5, 2000

12:47

Style file version June 30, 1999

Mathematical Geology, Vol. 32, No. 8, 2000

Geostatistical Simulation of Regionalized Pore-Size Distributions Using Min/Max Autocorrelation Factors¹

A. J. Desbarats² and R. Dimitrakopoulos³

In many fields of the Earth Sciences, one is interested in the distribution of particle or void sizes within samples. Like many other geological attributes, size distributions exhibit spatial variability, and it is convenient to view them within a geostatistical framework, as regionalized functions or curves. Since they rarely conform to simple parametric models, size distributions are best characterized using their raw spectrum as determined experimentally in the form of a series of abundance measures corresponding to a series of discrete size classes. However, the number of classes may be large and the class abundances may be highly cross-correlated. In order to model the spatial variations of discretized size distributions using current geostatistical simulation methods, it is necessary to reduce the number of variables considered and to render them uncorrelated among one another. This is achieved using a principal components-based approach known as Min/Max Autocorrelation Factors (MAF). For a two-structure linear model of coregionalization, the approach has the attractive feature of producing orthogonal factors ranked in order of increasing spatial correlation. Factors consisting largely of noise and exhibiting pure nugget-effect correlation structures are isolated in the lower rankings, and these need not be simulated. The factors to be simulated are those capturing most of the spatial correlation in the data, and they are isolated in the highest rankings. Following a review of MAF theory, the approach is applied to the modeling of pore-size distributions in partially welded tuff. Results of the case study confirm the usefulness of the MAF approach for the simulation of large numbers of coregionalized variables.

KEY WORDS: principal component analysis; coregionalization.

¹ Received 1 June 1999; accepted 2 October 1999.
² Geological Survey of Canada, 601 Booth St., Ottawa, ON K1A 0E8, Canada. e-mail: desbarat@NRCan.gc.ca
³ W. H. Bryan Mining Geology Research Centre, University of Queensland, Brisbane, Qld 4072, Australia.

© 2000 International Association for Mathematical Geology. 0882-8121/00/1100-0919$18.00/1

INTRODUCTION

In many fields of the Earth Sciences, one is interested in characterizing the size and relative abundance of mineral particles or voids within a sample. Measurements of size distributions are usually expressed as weight or volume fractions corresponding to a suite of size classes discretizing the range of values within
the sample. In sedimentology, the distribution of grain sizes in a sediment can be used to characterize depositional environment (Krumbein, 1934; Bagnold and Barndorff-Nielsen, 1980) or hydraulic properties (Alyamani and Sen, 1993). The distribution of pore sizes may be used to estimate capillary properties (Wardlaw and Taylor, 1976; Basan and others, 1997) or to predict the diffusion properties of radionuclides in fractured media (Agterberg, Katsube, and Lew, 1984). Reserve estimates in diamond deposits are based on the measurement of stone-size distributions (Sichel, 1973; Kleingeld and Lantuejoul, 1993). Other mining applications include liberation curves (Manieh, 1984; Yim, 1984; Smith, 1993), which characterize ore texture through the separation between economic and gangue minerals achieved for different crushed particle sizes. Because size distribution curves reflect sedimentological, mineralogical, or petrophysical characteristics of a geological medium, they may be expected to show a degree of spatial continuity imparted by the common physical or chemical processes that have shaped the medium at sampled locations. Therefore, by analogy with the concept of regionalized variable, we may think of particle-size distributions as regionalized functions or curves (Goulard and Voltz, 1993) and model them within a geostatistical framework (Journel and Huijbregts, 1978). The goal of this study is to develop a geostatistical approach for simulating the in situ spatial variations of size distributions in geological media. Such an approach ought to be useful in a wide range of Earth science applications. Traditionally, particle-size distributions have been represented by parametric models such as the Lognormal (Krumbein, 1934), Hyperbolic (Bader, 1970), and Fractal (Smith, 1993; Perfect, 1997). 
However, these models do not adequately describe the bi- or multimodal distributions frequently encountered in practice without resorting to complex algorithms for fitting mixed populations (Agterberg, Katsube, and Lew, 1984). An increasingly popular view is that size distributions are best characterized not by a small number of model parameters but by their entire spectrum as provided directly by the complete suite of measured size class frequencies (Full, Ehrlich, and Kennedy, 1984). However, depending on the application, the number of size classes may be quite large, ranging between 10 and 40. Furthermore, the class frequencies may exhibit correlation among one another so that significant information redundancy may be present in the data. Therein lie the two main challenges faced in the development of a geostatistical simulation approach for size distributions: dimensional reduction (reducing the number of variables) and decorrelation (obtaining variables that can be simulated independently). For the present discussion, it is significant that discretized size distribution data are very similar in nature to multispectral remote sensing data where each sample location (map pixel) is associated with a spectrum of energy reflectances discretized by a large number of channels or “bands.” In both sedimentology (Klovan, 1966; Mather, 1972; Pirkle and others, 1984; Davis, 1986) and remote sensing (Chavez and Kwarteng, 1989; Dwivedi and

Ravi-Sankar, 1992), the multivariate technique known as principal component analysis (PCA) (Joreskog, Klovan, and Reyment, 1976; Davis, 1986) has been used to obtain reduced sets of transformed variables with enhanced interpretability and lower noise content. Based on an eigenvector–eigenvalue decomposition of the variance–covariance matrix between variables, a sequence of uncorrelated transformed variables, or principal components, is determined so as to account for successively smaller fractions of the total variance within the data. Since most of this variability is usually explained by the first few principal components, a more compact representation of the data is achieved. However, as frequently pointed out (Switzer and Green, 1984; Goovaerts, 1993; Wackernagel, 1995), the naive application of PCA to the (lag zero) variance–covariance matrix of regionalized variables ignores spatial correlation within the data. Only when the multivariate spatial data can be represented by an “Intrinsic” model of coregionalization (Journel and Huijbregts, 1978) does spectral decomposition of the variance–covariance matrix yield principal components uncorrelated at all lags (Goovaerts, 1993; Wackernagel, 1995). While multivariate spatial data can seldom be described rigorously using the simple Intrinsic model, it is often assumed in practice. Principal component analysis can then be used as a crude way of decorrelating multivariate data so that kriging or simulation can be performed on quasi-independent variables, thereby eliminating the need for cumbersome techniques involving cross-correlations (Borgman and Frahme, 1976; Luster, 1985; Suro-Perez and Journel, 1991; Desbarats, 1995, 1997). In general, multivariate correlation depends on spatial scale and a more realistic representation of multivariate spatial data is given by the “Linear” model of coregionalization (Journel and Huijbregts, 1978; Goovaerts, 1993; Wackernagel, 1995). 
This model assumes that the variance–covariance matrix can be decomposed into a sum of covariance matrices where each is associated with one of a sequence of nested correlation structures at different spatial scales. Unfortunately, for this coregionalization model, principal components are no longer uncorrelated at all lags (Wackernagel, 1995). Several authors have tried to address this problem using methods based on the extraction of principal components from auto/cross-covariance matrices at nonzero lags (Grunsky and Agterberg, 1988; Tercan, 1999). In a comparison of several decomposition techniques based on simulated data, Tercan (1999) showed that the two-lag Cholesky–Spectral approach is particularly effective for decorrelating variables. This paper reexamines a somewhat neglected two-lag decomposition approach developed by Switzer and Green (1984) for the processing of multispectral remote sensing imagery. This approach has features that make it particularly suitable for addressing the simultaneous problems of dimensional reduction and variable decorrelation inherent in the simulation of regionalized size spectra. Following a review of the theory, the approach is illustrated with real data in the simulation of regionalized pore-size distributions.
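The loss of orthogonality under a Linear model can be checked numerically. The sketch below is only an illustration: the 2 × 2 matrices `B0` and `B1`, the correlation values, and the helper `pc_covariance` are made-up assumptions, not quantities from the case study. It rotates the variables with the eigenvectors of the lag-zero matrix and prints the cross-covariance of the two principal components at lag zero and at a nonzero lag, for a Linear and for an Intrinsic model.

```python
import numpy as np

# Hypothetical two-structure Linear model: nugget noise (B0) plus correlated signal (B1).
B0 = np.array([[1.0, 0.0], [0.0, 0.2]])
B1 = np.array([[1.0, 0.8], [0.8, 1.0]])
B = B0 + B1                      # lag-zero variance-covariance matrix
rho0_h, rho1_h = 0.0, 0.5        # assumed noise and signal correlations at some lag h

# Naive PCA uses the eigenvectors of the lag-zero matrix only.
_, Q = np.linalg.eigh(B)

def pc_covariance(C):
    """Covariance matrix of the principal components, given a covariance matrix C of Z."""
    return Q.T @ C @ Q

linear_lag_h = pc_covariance(rho0_h * B0 + rho1_h * B1)   # Linear model at lag h
intrinsic_lag_h = pc_covariance(rho1_h * B)               # Intrinsic model at lag h

print("lag 0 cross-covariance:          ", pc_covariance(B)[0, 1])
print("Linear model, lag h:             ", linear_lag_h[0, 1])
print("Intrinsic model, lag h:          ", intrinsic_lag_h[0, 1])
```

With these numbers the components are uncorrelated at lag zero and under the Intrinsic model, but retain a clearly nonzero cross-covariance at lag h under the Linear model.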

MINIMUM/MAXIMUM AUTOCORRELATION FACTORS

In order to address shortcomings in the processing of remote sensing data using spatial filtering and standard principal components, Switzer and Green (1984) developed the procedure known as Minimum/Maximum Autocorrelation Factors or MAF (Berman, 1985; Green and others, 1988; Wackernagel, Petitgas, and Touffait, 1989). It is a principal components-based approach incorporating global spatial statistics of the data in addition to the nonspatial statistics of the lag-zero variance–covariance matrix. This section provides a brief review of the MAF procedure set in a geostatistical context.

The MAF are defined as $p$ orthogonal linear combinations $Y_i(x) = a_i^T Z(x)$, $i = 1, \ldots, p$, of the original multivariate observation vector $Z(x) = (Z_1(x), \ldots, Z_p(x))^T$. Each transform $Y_i(x)$ is determined so as to exhibit greater spatial correlation than any of the previously determined transforms $Y_j(x)$ while remaining orthogonal to these transforms. Following Switzer and Green (1984), let $\rho_i(\Delta)$ be the spatial correlation between $Y_i(x)$ and $Y_i(x+h)$ for a short lag $h = \Delta$. Then, we seek the vectors of coefficients $a_i$, $i = 1, \ldots, p$, such that

$$
\begin{aligned}
\rho_1(\Delta) &= \mathrm{corr}\left(a_1^T Z(x),\, a_1^T Z(x+\Delta)\right) = \min_a \mathrm{corr}\left(a^T Z(x),\, a^T Z(x+\Delta)\right) \\
\rho_i(\Delta) &= \mathrm{corr}\left(a_i^T Z(x),\, a_i^T Z(x+\Delta)\right) = \min_a \mathrm{corr}\left(a^T Z(x),\, a^T Z(x+\Delta)\right) \\
\rho_p(\Delta) &= \mathrm{corr}\left(a_p^T Z(x),\, a_p^T Z(x+\Delta)\right) = \max_a \mathrm{corr}\left(a^T Z(x),\, a^T Z(x+\Delta)\right)
\end{aligned}
\tag{1}
$$

where the minimization over the $a_i$ in (1) is subject to the orthogonality constraint

$$
\mathrm{corr}\left(a_i^T Z(x),\, a_j^T Z(x)\right) = 0 \quad \text{for } j < i
\tag{2}
$$

It can be shown (Switzer and Green, 1984; Berman, 1985) that the vectors of coefficients $a_i$ yielding the MAF ranking of transforms are obtained as the left-hand eigenvectors of the nonsymmetric matrix $B_\Delta B^{-1}$, where

$$
\begin{aligned}
B_\Delta &= \mathrm{cov}\left[(Z(x) - Z(x+\Delta)),\, (Z(x) - Z(x+\Delta))\right] = 2\Gamma_Z(\Delta) \\
B &= \mathrm{cov}\left[Z(x),\, Z(x)\right]
\end{aligned}
\tag{3}
$$

$B_\Delta$ is the covariance matrix of lag $\Delta$ differences, $B$ is the variance–covariance matrix of $Z(x)$, and $\Gamma_Z(\Delta)$ is the lag $\Delta$ variogram matrix (Wackernagel, 1995). A formal proof of the MAF procedure can be found in Berman (1985).

Consider a multivariate, $p$-dimensional, stationary spatial random function $Z(x) = [Z_1(x), \ldots, Z_p(x)]^T$ such that $Z(x) = S(x) + N(x)$, where $S(x)$ and
$N(x)$ are uncorrelated signal and noise components, respectively. The variance–covariance matrices of $N(x)$, $S(x)$, and $Z(x)$ are then

$$
\begin{aligned}
\mathrm{Cov}[N(x), N(x)] &= C_N(0) = B_0 \\
\mathrm{Cov}[S(x), S(x)] &= C_S(0) = B_1 \\
\mathrm{Cov}[Z(x), Z(x)] &= C_Z(0) = B_0 + B_1 = B
\end{aligned}
\tag{4}
$$

Assuming that the spatial correlation structures of $S(x)$ and $N(x)$ can be represented by Intrinsic coregionalization models, $Z(x)$ itself is then represented by a two-structure Linear model of coregionalization (Goovaerts, 1993). The spatial covariance matrices are written

$$
\begin{aligned}
\mathrm{Cov}[N(x), N(x+h)] &= C_N(h) = B_0\,\rho_0(h) \\
\mathrm{Cov}[S(x), S(x+h)] &= C_S(h) = B_1\,\rho_1(h) \\
\mathrm{Cov}[Z(x), Z(x+h)] &= C_Z(h) = B_0\,\rho_0(h) + B_1\,\rho_1(h)
\end{aligned}
\tag{5}
$$

where $\rho_0(h)$ and $\rho_1(h)$ are scalar spatial correlation functions such that $\rho_1(h) > \rho_0(h)$ for all $h$. For regionalized variables composed of signal and noise, it is reasonable to assume that the signal will exhibit greater spatial correlation than the noise. In particular, if $N(x)$ has a nugget-effect correlation structure, then $\rho_0(h) = \delta(h)$, where $\delta(h) = 1$ if $h = 0$ and $\delta(h) = 0$ otherwise. From these definitions, the variance–covariance matrix of the difference $(Z(x) - Z(x+\Delta))$ for some short lag $\Delta$ is found to be

$$
\mathrm{Cov}\left[(Z(x) - Z(x+\Delta)),\, (Z(x) - Z(x+\Delta))\right] = 2\Gamma_Z(\Delta) = 2(1 - \rho_1(\Delta))B + 2(\rho_1(\Delta) - \rho_0(\Delta))B_0
\tag{6}
$$

where $\Gamma_Z(\Delta)$ is the semivariogram matrix at lag $\Delta$. Multiplying (6) by $B^{-1}$, the product can be expressed in terms of the matrix of its left-hand eigenvectors $A^T$ and the diagonal matrix of its eigenvalues $\Lambda$:

$$
A^T \Gamma_Z(\Delta) B^{-1} = (1 - \rho_1(\Delta))A^T + (\rho_1(\Delta) - \rho_0(\Delta))A^T B_0 B^{-1} = \frac{\Lambda}{2} A^T
\tag{7}
$$

Orthogonal eigenvectors form the columns of $A$ (the rows of $A^T$) and the corresponding eigenvalues, in order of decreasing magnitude, form the diagonal elements of $\Lambda$. Furthermore, it is assumed that the matrix of eigenvectors $A^T$ has been normalized so that
$A^T B A = I$. Then, postmultiplying (7) by the matrix product $BA$ yields

$$
(1 - \rho_1(\Delta))I + (\rho_1(\Delta) - \rho_0(\Delta))A^T B_0 A = \frac{\Lambda}{2}
\tag{8}
$$

From (7) and (8) it can be seen that the eigenvectors of $2\Gamma_Z(\Delta)B^{-1}$ are the same as those of $B_0 B^{-1}$ and are independent of lag $\Delta$. Consider next the multivariate spatial random function $Y(x)$ obtained from the transform $Y(x) = A^T Z(x)$. It can easily be shown that the variance–covariance matrix of $Y(x)$ is equal to the identity matrix $I$ and that the spatial correlation matrix of $Y(x)$ at lag $\Delta$ is

$$
\mathrm{Corr}[Y(x), Y(x+\Delta)] = \rho_1(\Delta)I - (\rho_1(\Delta) - \rho_0(\Delta))A^T B_0 A
\tag{9}
$$

However, combining (8) and (9) shows that

$$
\mathrm{Corr}[Y(x), Y(x+\Delta)] = I - \frac{\Lambda}{2}
\tag{10}
$$

The matrix of eigenvalues $\Lambda$ is diagonal; hence the elements $Y_i(x)$ of $Y(x)$ are orthogonal at lag $\Delta$ as well as at lag 0, regardless of the coregionalization model. For the two-structure Linear coregionalization model for $Z(x)$ assumed here, it is readily shown that orthogonality is ensured at all lags. Furthermore, because the eigenvalues $\lambda_i$ within $\Lambda$ decrease with increasing index $i$, (10) shows that the $Y_i(x)$ are ranked in order of increasing spatial correlation. These are the properties that are sought for the MAF; thus, the MAF transform is defined by $Y(x) = A^T Z(x)$, where $A^T$ is the matrix of normalized eigenvectors of $2\Gamma_Z(\Delta)B^{-1}$.

Switzer and Green (1984) describe a computational approach for obtaining the matrix of eigenvectors $A^T$ and the matrix of eigenvalues $\Lambda$ in a way that avoids spectral decomposition of a nonsymmetric matrix. Their approach involves the following steps:

1. Calculate the spectral decomposition of the symmetric matrix $B = HDH^T$ into a matrix of orthonormal eigenvectors $H$ and a diagonal matrix of eigenvalues $D$.
2. Calculate the transformed variables $V(x) = W^T Z(x)$, where $W = HD^{-1/2}$ so that $W^T B W = I$.
3. Calculate the covariance matrix of the lag $\Delta$ difference vector $(V(x) - V(x+\Delta))$. For a two-structure Linear coregionalization model, this matrix is given by

$$
\mathrm{Cov}\left[(V(x) - V(x+\Delta)),\, (V(x) - V(x+\Delta))\right] = 2(1 - \rho_1(\Delta))I + 2(\rho_1(\Delta) - \rho_0(\Delta))W^T B_0 W
\tag{11}
$$
4. Calculate the spectral decomposition of the covariance matrix of $(V(x) - V(x+\Delta))$ into a matrix of orthonormal eigenvectors $C$ and a diagonal matrix of eigenvalues $\Lambda$:

$$
(1 - \rho_1(\Delta))I + (\rho_1(\Delta) - \rho_0(\Delta))W^T B_0 W = C\,\frac{\Lambda}{2}\,C^T
\tag{12}
$$

Rearranging this equation yields

$$
(1 - \rho_1(\Delta))I + (\rho_1(\Delta) - \rho_0(\Delta))C^T W^T B_0 W C = \frac{\Lambda}{2}
\tag{13}
$$

which is identical to (8) above if $A = WC$. Thus, the desired transform matrix $A^T$ is given by $C^T D^{-1/2} H^T$, where $A^T B A = I$ since $H^T H = C^T C = I$.

It is interesting to examine the results of the MAF procedure when other coregionalization models for $Z(x)$ are considered. If $B_0$, the variance–covariance matrix of $N(x)$, is diagonal with identical elements, then from (7) it can be seen that the MAF procedure amounts to the naive extraction of the eigenvectors of $B$ (which are the same as those of $B^{-1}$) (Switzer and Green, 1984; Goovaerts, 1993). As pointed out in Switzer and Green (1984), this may explain the frequent success of naive principal components for ordering image quality in remote sensing data sets when noise is not cross-correlated among channels. If $Z(x)$ itself can be represented by an Intrinsic coregionalization model, then $B_0 = bB$ and $B_1 = (1 - b)B$, where $b$ is a constant between 0 and 1. It follows from (6) that $2\Gamma_Z(\Delta)B^{-1}$ is proportional to the identity matrix, in which case a naive principal component transform is all that is required in order to decorrelate $Z(x)$ at all lags.

Min/Max Autocorrelation Factors have several features that make them particularly attractive in the conditional simulation of multivariate spatial random functions. If $N(x)$ has a pure nugget-effect structure, then MAF calculation from the eigenvectors of $2\Gamma_Z(\Delta)B^{-1}$ for a small lag $\Delta$ will isolate the nugget-effect structure in a small number of transformed variables $Y_i(x)$ with low index $i$. Since it is inefficient to model nugget-effect structures using standard conditional simulation routines such as SGSIM (Deutsch and Journel, 1992), this provides an opportunity to screen a few variables immediately. Since the MAF procedure ranks transformed variables by the strength of their spatial correlation, it is easier to select a subset of variables for full conditional simulation. When $Z(x)$ can be represented by a two-structure Linear model of coregionalization, the MAF procedure rigorously decorrelates the transformed variables at all lags so that conditional simulations of these variables can be performed independently. If the coregionalization model for $Z(x)$ is more complex, then at least the transformed variables can be largely decorrelated over shorter lags corresponding to the search radius for conditioning data in the simulation routine.
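The four-step procedure described earlier in this section can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the function names `lag_difference_covariance` and `maf_transform` are hypothetical, the pair search assumes samples along a single line, and the matrices `B0` and `B1` are synthetic. The check at the end builds the lag-difference covariance from equation (6) and compares the lag correlation of the factors with equation (10).

```python
import numpy as np

def lag_difference_covariance(coord, Z, lag_min, lag_max):
    """Estimate the covariance of lag differences (twice the variogram matrix)
    from all sample pairs along one line with separation in [lag_min, lag_max].
    Z has one row per sample and one column per variable."""
    coord = np.asarray(coord, dtype=float)
    Z = np.asarray(Z, dtype=float)
    i, j = np.triu_indices(len(coord), k=1)          # all distinct pairs
    sep = np.abs(coord[i] - coord[j])
    keep = (sep >= lag_min) & (sep <= lag_max)
    D = Z[i[keep]] - Z[j[keep]]                      # lag differences, one row per pair
    return D.T @ D / keep.sum()

def maf_transform(B, B_delta):
    """Return (A_T, lam) such that A_T @ B @ A_T.T = I and
    A_T @ B_delta @ A_T.T = diag(lam), with lam in decreasing order."""
    d, H = np.linalg.eigh(B)                         # step 1: B = H D H^T
    W = H / np.sqrt(d)                               # step 2: W = H D^(-1/2)
    M = W.T @ B_delta @ W                            # step 3: difference covariance of V
    lam, C = np.linalg.eigh((M + M.T) / 2)           # step 4: M = C Lambda C^T
    order = np.argsort(lam)[::-1]                    # decreasing eigenvalues
    lam, C = lam[order], C[:, order]
    return (W @ C).T, lam                            # A = W C

# Synthetic check with a made-up two-structure Linear model (not the tuff data).
rng = np.random.default_rng(0)
p = 5
G0, G1 = rng.normal(size=(p, p)), rng.normal(size=(p, p))
B0 = G0 @ G0.T + np.eye(p)                           # noise sill matrix
B1 = G1 @ G1.T + np.eye(p)                           # signal sill matrix
B = B0 + B1
rho0, rho1 = 0.0, 0.8                                # assumed correlations at lag Delta
B_delta = 2 * (1 - rho1) * B + 2 * (rho1 - rho0) * B0    # equation (6)

A_T, lam = maf_transform(B, B_delta)
lag_corr = A_T @ (rho0 * B0 + rho1 * B1) @ A_T.T     # correlation matrix of Y at lag Delta
print(np.round(lam / 2, 3))
print(np.allclose(lag_corr, np.eye(p) - np.diag(lam) / 2))   # compare with equation (10)
```

In practice `B` would be the sample covariance matrix of the data and `B_delta` the output of a pair-based estimator such as `lag_difference_covariance`; the symmetrization of `M` only guards against rounding error.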

While the MAF transform procedure may be useful prior to the simulation of any multivariate spatial random field, it should be particularly suitable for the simulation of size distributions or spectra discretized into classes or “channels.” By the nature of the variables and the measurement methods, both signal and noise components are likely to be highly cross-correlated. In the remainder of this paper, the MAF transform is demonstrated through an application to the conditional simulation of regionalized pore-size distributions.

SIMULATION OF REGIONALIZED PORE-SIZE DISTRIBUTIONS AT THE APACHE LEAP TUFF SITE

Thick tuff deposits in the unsaturated zone are being considered as host rocks for the disposal of high-level radioactive waste. Hydraulic and other petrophysical properties of these rocks are critical to radionuclide containment and they must be thoroughly characterized before repository performance assessments can proceed. In order to investigate the natural spatial variability of unsaturated flow parameters and to develop appropriate sampling methods, extensive field measurements have been conducted at the Apache Leap Tuff site in central Arizona (Vogt, 1988; Rasmussen and others, 1990). Core samples of massive tuff were obtained from nine inclined drill holes drilled on three parallel sections, 5 m apart (Fig. 1). A full suite of hydraulic properties was determined on 105 samples taken at approximately 3 m intervals along the holes. Pore-size distributions were obtained using the mercury intrusion technique (Wardlaw and Taylor, 1976; Case, Ghiglieri, and Rennie, 1987) whereby cumulative intrusion volume is measured for a series of pressure increments corresponding to a decreasing series of equivalent pore diameters. The data used here are derived from Tables A5a and A5b of Rasmussen and others (1990).
However, the 32 incremental intrusion volumes of the original measurement suite for each sample have been recombined into 15 values corresponding to broader pore-size classes. The upper bound of the smallest class is 0.0077 µm whereas the lower bound of the largest class is 6.30 µm. Classes are essentially the same for each sample. Earlier studies (Vogt, 1988; Rasmussen and others, 1993) have shown that the distribution of pore sizes is bimodal with modes corresponding to pore diameters of 0.07 and 3 µm. A typical pore-size distribution, discretized into 15 classes, is shown in Figure 2. The y axis of this figure represents the pore volume fraction or porosity associated with a given pore-diameter class interval. The goal of this case study is to illustrate the MAF approach in the conditional simulation of regionalized pore-size distributions such as the one shown here.

Figure 1. Configuration of boreholes and sample locations at the Apache Leap site.

Figure 2. Typical discretized pore-size distribution measured by mercury intrusion on a sample of massive partially welded tuff.

Normal-Score Transformation

A preliminary Normal-score transformation is performed on the 15 variables representing the pore volume fractions associated with each pore-diameter class. Being based on a rank ordering of sample values, this transformation decreases the influence of outliers and makes the estimation of covariance matrices and experimental variograms in subsequent steps of the simulation process more robust. Since it is a nonlinear transformation, it does have an effect on the MAF transformation performed next.

MAF Transformation

The transformation matrix $A^T$ yielding the Min/Max Autocorrelation Factors and the corresponding diagonal matrix $\Lambda/2$ of eigenvalues are calculated using the algorithm described in the previous section. Perhaps the most delicate point in this procedure is the estimation of the lag $\Delta$ variogram matrix for $V(x)$ in step 3. In many geostatistical studies, it is difficult to obtain a reliable value for the experimental variogram at short lags because of the small number of samples available or because of a large sample spacing. Here, the average sample spacing
is 3 m, although there are a small number of more closely spaced samples at the top of each hole that yield 18 pairs at an average lag of 1 m. In order to obtain a more reliable experimental variogram matrix, all lags between 0.8 and 3.5 m are pooled to yield a total of 147 pairs. Furthermore, variograms are calculated in the downhole direction only.

Another problem that may be encountered at this stage concerns the number of variables that is used to discretize the size distribution. Originally, 30 variables were retained. However, they were found to be very highly cross-correlated and to exhibit very poor spatial correlation. This made the two spectral decompositions in the MAF calculation somewhat unstable, resulting in some negative eigenvalues. The problem was resolved by grouping the original variables into 15 broader size classes, thereby increasing the signal-to-noise ratio in each “band” while decreasing the cross-correlation among “bands.”

Figure 3 shows the eigenvalues in the matrix $\Lambda/2$ plotted against their corresponding index. The lowest ranked MAF have eigenvalues greater than 1, which, from (10), would seem to indicate that they possess a negative (hole-effect) spatial correlation at lag $\Delta$, although this is probably an artifact. The highest ranked MAF has a very low eigenvalue, indicating very strong spatial correlation at lag $\Delta$.

Figure 3. MAF eigenvalues plotted against rank index.

Each MAF $Y_i(x)$ is calculated by premultiplication of the vector of observations $Z(x)$ by a vector of loadings $a_i^T$ from the rows of matrix $A^T$. The loadings used to calculate each MAF can be displayed and compared using “profile” icons
as shown in Figure 4. This figure shows numbered plots corresponding to each MAF. The individual plots represent graphs of factor loadings against index of size class, in order of decreasing size class, from left to right. The figure shows that the loadings for MAF 15, which has the greatest spatial continuity, tend to be more even whereas the loadings for the lesser correlated MAF are more erratic. There is also a tendency for loadings of adjacent size classes to be of opposite sign. This is probably due to significant cross-correlation among size classes. Finally, the lowest ranked MAF tend to have large negative loadings associated with the coarsest size class. This is probably a reflection of the large noise component due to measurement error for this class.

Figure 4. Profile icons of factor loadings for the 15 MAF variables. In each icon, factor loadings are ordered from left to right according to decreasing pore-size class.

It is interesting to compare the loadings associated with the MAF transform procedure and those associated with a conventional principal component transformation. Using the same representation mode as in Figure 4, the loadings in each row of the matrix of normalized eigenvectors $W^T$ (step 2, above) are shown in Figure 5. Here, however, the numbering refers to the ordering of principal components, from highest to lowest eigenvalue. Inasmuch as lowest ranked principal components are often found to exhibit the greatest spatial correlation, their loadings should be compared with those of the highest ranked MAFs. Indeed, loadings associated with the first few principal components are seen to be the most uniform, a tendency also observed in MAF 15. Conversely, the highest ranked principal components exhibit the greatest variability in loadings as observed in the lower ranked
MAFs. Despite these similarities, the MAF transformation clearly yields factors different from those of PCA.

Figure 5. Profile icons of factor loadings for the 15 normalized principal components. In each icon, factor loadings are ordered from left to right according to decreasing pore-size class.

Dimensional Reduction and Decorrelation

Figure 3 does not reveal any obvious index threshold by which noise-dominated MAF can be screened, although those MAF with eigenvalues greater than 1 are likely candidates. Experimental variograms are calculated on all 15 MAF in order to identify those with any interpretable spatial correlation structure. On that basis, only the six highest ranked MAF are retained for further analysis. The nine lower ranked MAF are viewed as pure noise and can be modeled separately using random Normal deviates. This represents a significant dimensional reduction for the multivariate simulation problem.

Figure 6 shows the experimental variograms calculated on the six highest ranked MAF transformed variables along with the corresponding fitted models. The first point on the experimental variograms, at a lag of 1 m, is calculated using only 18 pairs and is unreliable. The second point, at a lag of 3 m, is obtained from 99 pairs. It is recalled that the lag $\Delta$ variogram matrix required in the MAF transform was calculated based on all sample pairs with lags less than 3.5 m. This explains slight inconsistencies between the MAF variogram values at short lags shown here and those given by the corresponding eigenvalues in matrix $\Lambda/2$. The general improvement in spatial correlation with increasing MAF rank is clearly apparent in this figure, although MAF 12 appears somewhat out
of place and deserving a higher rank. The degree of spatial continuity shown by MAF 15 is quite remarkable for this data set and provides compelling evidence of the ability of the MAF transform to isolate factors with the highest signal-to-noise ratios.

Figure 6. Experimental variograms (solid) and corresponding models (dashed) for the 6 highest ranked MAF variables.

The experimental variograms are fitted using different combinations of a nugget effect structure and two nested transition structures. The proportions of each structure in the different models are given in Table 1. As discussed in Desbarats (1997), the first structure is represented by an isotropic exponential model with a range parameter of 3 m. This structure is related to postemplacement processes
in the tuff. The second structure is represented by an exponential model with a range parameter of 20 m in the bedding plane of the tuff deposits, which dips 15° to the East. Perpendicular to the bedding plane, the range parameter is 10 m. This structure relates to original depositional features of the tuff.

Table 1. Proportions of the Different Variogram Structures for the Models in Figure 6

          Nugget    Structure 1    Structure 2
MAF 10     0.10        0.90           0.00
MAF 11     0.10        0.60           0.30
MAF 12     0.10        0.30           0.60
MAF 13     0.05        0.85           0.10
MAF 14     0.05        0.50           0.45
MAF 15     0.00        0.00           1.00

A selection of typical experimental cross-variograms between MAF variables is shown in Figure 7. Although the MAF transform procedure rigorously decorrelates the factors at lags 0 and $\Delta$, small cross-variogram values are observed in some of the plots here because the sample pairs used to calculate these experimental cross-variograms differ from the set used to calculate the variogram matrix in the MAF transformation. Overall, the decorrelation of factors achieved by the MAF procedure is quite satisfactory. Values up to 0.2 in the experimental cross-variograms at longer lags are probably related to the small number of sample pairs and to the fluctuation variance linked to variogram estimation in a finite field (Journel and Huijbregts, 1978).
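A decorrelation check of this kind can be sketched as follows. The function `cross_variogram` is a hypothetical helper, and the example assumes two factors sampled at regular spacing along a single line, which is simpler than the borehole geometry of the case study. The series `y_a` and `y_b` are synthetic white noise standing in for a pair of MAF.

```python
import numpy as np

def cross_variogram(yi, yj, lag_steps):
    """Experimental cross-variogram of two factors sampled at regular spacing
    along one line; lag_steps is the lag expressed in sample intervals."""
    yi = np.asarray(yi, dtype=float)
    yj = np.asarray(yj, dtype=float)
    di = yi[lag_steps:] - yi[:-lag_steps]            # increments of factor i
    dj = yj[lag_steps:] - yj[:-lag_steps]            # increments of factor j
    return 0.5 * np.mean(di * dj)

# Two independent unit-variance series (synthetic, for illustration only).
rng = np.random.default_rng(1)
y_a, y_b = rng.normal(size=2000), rng.normal(size=2000)
for k in (1, 2, 5):
    # cross-variogram, then the direct variogram of y_a for comparison
    print(k, round(cross_variogram(y_a, y_b, k), 3), round(cross_variogram(y_a, y_a, k), 3))
```

For well-decorrelated factors of unit variance, the cross-variogram values should stay close to zero relative to the direct variograms; with few pairs, as in the case study, larger fluctuations are to be expected.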

Conditional Simulation and Back-Transformation

Conditional simulations are performed independently on the 6 retained MAF variables using the sequential Gaussian method as implemented in program SGSIM of the GSLIB software library (Deutsch and Journel, 1992). The simulation domain is the 30 × 30 m cross-sectional area containing the “Y” series of drill holes shown in Figure 1. The domain is discretized by a 100 × 100 grid. Because this field is not large compared to the range of the second variogram transition structure, ergodic fluctuations are expected in the statistics of the simulated values. The 9 other MAF variables are simulated as uncorrelated standard Gaussian random deviates. Each of these MAF variables is rescaled so that its lag $\Delta$ variogram value is equal to the corresponding eigenvalue in matrix $\Lambda/2$. This is necessary in order to preserve the total variance in the simulated values through the back-transformation process.

The MAF back-transformation is applied first. For each grid point, the column vector of simulated MAF variables is premultiplied by $A^{-T}$ to obtain a vector of simulated Normal-score deviates. The Normal-score transformation is then
Figure 7. Selected experimental cross-variograms (solid) and corresponding models (dashed) for pairs of variables among the 6 highest ranked MAF.

inverted at each grid point to retrieve the suite of 15 simulated pore-volume fractions in each size class.
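The linear part of the back-transformation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 3 × 3 matrix A, the target variogram value of 0.9, and all function names are hypothetical, and the case study itself involves 15 variables. The sketch relies on the fact that uncorrelated deviates of variance s² have a variogram equal to s² at every non-zero lag, so scaling a standard deviate by the square root of the target value gives the required lag 1 variogram. The subsequent inversion of the Normal-score transform is not shown.

```python
import math
import random

def solve_linear(m, rhs):
    """Solve m x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(m[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def noise_factor(target_variogram, rng):
    """Draw an uncorrelated Gaussian deviate rescaled so that its variogram
    at any non-zero lag equals target_variogram."""
    return math.sqrt(target_variogram) * rng.gauss(0.0, 1.0)

def maf_back_transform(a, maf):
    """Return z = A^{-T} y for one grid point by solving A^T z = y."""
    n = len(a)
    a_t = [[a[j][i] for j in range(n)] for i in range(n)]
    return solve_linear(a_t, maf)

# Hypothetical 3-variable example; A is NOT the matrix of the case study.
A = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.3],
     [0.2, 0.0, 1.0]]
rng = random.Random(0)
y = [noise_factor(0.9, rng),  # noise factor drawn as a rescaled deviate
     0.4, -1.1]               # stand-ins for spatially simulated factors
z = maf_back_transform(A, y)  # simulated Normal-score deviates
```

Solving the linear system at each grid point avoids forming the inverse explicitly; for a full grid, one would more likely invert or factorize the transposed matrix once and reuse it.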

Check of Simulation Results

A series of checks is performed in order to assess the results of the multivariate conditional simulation based on the MAF transform procedure. Figure 8 shows a series of “QQ” plots for simulated values in selected pore-size classes. In these plots, quantiles from the distribution of simulated

Figure 8. Quantile plots comparing the distributions of original and simulated pore volume fractions for 6 selected size classes.

pore-volume fractions are plotted against corresponding quantiles from the distribution of original pore-volume fractions. Ideally, points should fall along the line of unit slope and this is generally the case here. However, for size classes 3 and 5, ergodic fluctuations amplified by the back-transformation process have caused

Figure 9. Selected cross-plots of pore volume fractions in different size classes. Original values are shown on the left and simulated values from a subset of 105 points are shown on the right.

the points to be shifted upwards slightly. For class 14, the handling of the upper tail of the distribution in the Normal-score back-transformation was problematic. Figure 9 shows cross-plots of pore-volume fractions for three different pairs of pore-size classes. The plots on the left are for the original 105 samples and the plots on the right are for a random subset of 105 points taken from the simulation grid. These plots show that the character of the bivariate relationship between pore-size classes in the same sample is well reproduced in the simulated values.
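The quantile comparison behind the QQ plots can be sketched as below. This is an illustrative helper only: the function names, the choice of 20 quantile classes, and the sample values are assumptions, and no plotting is performed. Each pair holds the original quantile (horizontal axis) and the simulated quantile (vertical axis) at the same probability level, so a positive mean offset corresponds to points shifted upwards from the line of unit slope.

```python
import statistics

def qq_pairs(original, simulated, n_quantiles=20):
    """Return (original quantile, simulated quantile) pairs evaluated at
    the same probability levels."""
    q_orig = statistics.quantiles(original, n=n_quantiles, method="inclusive")
    q_sim = statistics.quantiles(simulated, n=n_quantiles, method="inclusive")
    return list(zip(q_orig, q_sim))

def mean_shift(pairs):
    """Average vertical offset of the QQ points from the 1:1 line."""
    return sum(s - o for o, s in pairs) / len(pairs)

# Hypothetical pore-volume fractions; the simulated set is the original
# set shifted upwards by 0.01 to mimic a small systematic bias.
original = [0.010, 0.015, 0.020, 0.022, 0.030, 0.034, 0.041, 0.050]
simulated = [v + 0.01 for v in original]
pairs = qq_pairs(original, simulated)
shift = mean_shift(pairs)
```

The two samples need not have the same size, which matters here because the original data comprise 105 samples whereas the simulation grid holds 10,000 values.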

Figure 10. Selected experimental correlograms on original (dashed) and simulated (solid) pore volume fractions for 6 different size classes.

Figure 10 shows experimental correlograms calculated on the pore-volume fractions of 6 selected pore-size classes. Correlograms calculated on the 105 original and 10,000 simulated values are represented by dashed and solid lines, respectively. The experimental correlograms calculated on the original data are extremely erratic due to the small number of samples. The first point, calculated using only 18 pairs, is particularly unreliable. Allowing for this uncertainty, the correlograms calculated on simulated values are in reasonable agreement for 5 of the

Figure 11. Selected experimental cross-correlograms on original (dashed) and simulated (solid) pore volume fractions for 6 pairs of size classes.

6 pore-size classes shown. However, for class 5, the agreement is very poor, for reasons that remain unclear. Figure 11 shows selected experimental cross-correlograms calculated on the pore-volume fractions of 6 pairs of pore-size classes. As before, correlograms calculated on the 105 original and 10,000 simulated values are represented by dashed and solid lines, respectively. Again, allowing for the erratic behavior of experimental results from the original samples, it would seem that the simulated variables

Figure 12. Grey-scale image of simulated total porosity, equal to the sum of pore volume fractions over all 15 size classes.

have faithfully captured the essence of the spatial cross-correlation relationships present in the data. A final visual check of simulation results is provided by the grey-scale image shown in Figure 12. This figure shows the simulated cross-section of total porosity, which is calculated as the sum of pore-volume fractions over all 15 size classes. In agreement with observations, the lowest values are generally found near the surface. A faint band of moderately high values dips across the section from West to East as expected from the dip of the tuff layers. A cluster of high values occurs in an area near the top of the trace of borehole “Y2,” and this is consistent with observations from the hole. These high values are due to a large porosity component in small pore-size classes (12–14), and it is interesting that they are not spread very

far laterally, up and down dip. This is because simulated pore-volume fractions in these classes do not contain a large contribution from MAF variables with strong long-range correlation structures. Superimposed on the broad features of this image is the random speckling associated with a significant noise component in the original data due to measurement error.
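The correlogram and cross-correlogram checks of Figures 10 and 11 rest on a calculation of the following kind. This is a sketch under assumptions, not the program used in the study: it is omnidirectional, uses a simple distance tolerance, standardizes by the means and standard deviations of the tail and head values in each lag class, and uses each pair in both orientations. The function name and the test data are hypothetical.

```python
import math

def cross_correlogram(coords, u, v, lag, tol):
    """Experimental cross-correlogram between variables u and v for point
    pairs separated by lag +/- tol. Pass v=u for a direct correlogram.
    Returns (rho, number of distinct pairs)."""
    tails, heads = [], []
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(math.dist(coords[i], coords[j]) - lag) <= tol:
                # Use the pair in both orientations (omnidirectional)
                tails += [u[i], u[j]]
                heads += [v[j], v[i]]
    m = len(tails)
    if m == 0:
        return float("nan"), 0
    mean_t = sum(tails) / m
    mean_h = sum(heads) / m
    cov = sum((t - mean_t) * (h - mean_h) for t, h in zip(tails, heads)) / m
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in tails) / m)
    sd_h = math.sqrt(sum((h - mean_h) ** 2 for h in heads) / m)
    if sd_t == 0.0 or sd_h == 0.0:
        return float("nan"), m // 2
    return cov / (sd_t * sd_h), m // 2

# Hypothetical transect of 10 points with an alternating signal.
coords = [(float(i), 0.0) for i in range(10)]
u = [1.0 if i % 2 == 0 else -1.0 for i in range(10)]
rho1, n1 = cross_correlogram(coords, u, u, lag=1.0, tol=0.1)
rho2, n2 = cross_correlogram(coords, u, u, lag=2.0, tol=0.1)
```

Returning the pair count alongside each value makes it easy to flag poorly supported lags, such as the first point of the original-data correlograms, which rests on only 18 pairs.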

SUMMARY AND CONCLUSIONS

Like other geological attributes, the distribution of particle or pore sizes within a sample can be viewed as a regionalized phenomenon amenable to geostatistical analysis. Rather than being represented by parametric models, complex size distributions may instead be characterized by their entire discretized spectrum as provided directly by a suite of measurements for different size-class intervals. The geostatistical modeling or simulation of regionalized size distributions then becomes a multivariate problem possibly involving a large number of highly cross-correlated variables. In order to address such a problem using current geostatistical simulation tools, there is a need for dimensional reduction and decorrelation of variables. These requirements have prompted a re-examination of the MAF procedure developed by Switzer and Green (1984) for the processing of multispectral remote sensing data that are similar in nature to size distribution data. Following a review of the theory, this study presents an application of the MAF procedure to the simulation of regionalized pore-size distributions discretized by measurements in 15 size classes.

The MAF procedure is based on the eigenvalue-eigenvector decomposition of a matrix expressing cross-correlations among variables at lag 0 as well as at a small spatial lag 1. Consideration of the spatial component of cross-correlations represents the main advantage of the MAF procedure over conventional principal component analysis for the study of coregionalized variables. The MAF transform yields factors that are uncorrelated at lag 0 and lag 1; for a two-structure linear model of coregionalization, the factors are mutually uncorrelated at all lags. In theory, this represents a more thorough decorrelation of variables than can be achieved using standard principal components, and this was verified in practice in the case study.
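The decomposition summarized above can be illustrated for the simplest case of two variables, where a symmetric 2 × 2 matrix can be diagonalized in closed form. The sketch below uses the common formulation of MAF as two successive principal component rotations: whitening with the lag 0 covariance matrix, then diagonalizing the whitened lag 1 variogram matrix. It is an assumption that this matches the notation of the theory section, and the matrices cov0 and gamma1, like all function names, are hypothetical.

```python
import math

def eig_sym2(m):
    """Eigen-decomposition of a symmetric 2x2 matrix. Returns eigenvalues
    in decreasing order and a matrix whose columns are the eigenvectors."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    vals = [a * cs * cs + 2.0 * b * cs * sn + c * sn * sn,
            a * sn * sn - 2.0 * b * cs * sn + c * cs * cs]
    return vals, [[cs, -sn], [sn, cs]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(p):
    return [list(row) for row in zip(*p)]

def maf_matrix(cov0, gamma1):
    """Return (A, lam) such that A^T cov0 A = I and A^T gamma1 A = diag(lam)."""
    d, h = eig_sym2(cov0)
    # First rotation and rescaling: W = H D^{-1/2} whitens the variables
    w = [[h[i][j] / math.sqrt(d[j]) for j in range(2)] for i in range(2)]
    # Second rotation: diagonalize the lag 1 variogram of whitened variables
    lam, c = eig_sym2(matmul(transpose(w), matmul(gamma1, w)))
    return matmul(w, c), lam

# Hypothetical lag 0 covariance and lag 1 variogram matrices.
cov0 = [[1.0, 0.6], [0.6, 1.0]]
gamma1 = [[0.5, 0.1], [0.1, 0.8]]
A, lam = maf_matrix(cov0, gamma1)
```

Because the eigenvalues come out in decreasing order, the first factor has the largest lag 1 variogram value and hence the weakest spatial correlation, consistent with a ranking in order of increasing spatial correlation. A case with 15 variables would call for a general symmetric eigen-solver rather than the closed-form rotation used here.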
The MAF procedure yields factors that are ranked in order of increasing spatial correlation: the lowest ranked MAF are dominated by noise and exhibit almost pure nugget effect variogram structures. They can be screened from the spatial simulation study and modeled simply using random deviates. The highest ranked MAF contain mainly signal and show the most significant spatial correlation structure, that which must be captured in the simulations. In the case study, the first 9 factors obtained from the MAF transform were found to represent mainly noise while the 6 remaining factors exhibiting some spatial correlation structure were retained for simulation. This represents a dimensional reduction

of over 50%. While any dimensional reduction is a trade-off between efficiency and quality of simulation results, it was found here, in a series of checks, that the simulated discrete pore-size distributions faithfully reproduced the univariate, bivariate, and spatial statistics of the original data. In conclusion, the MAF transform procedure represents a very powerful tool for the analysis and simulation of coregionalized variables: it provides a more thorough decorrelation of variables than conventional principal components, and it provides a ranking of factors in order of increasing spatial correlation, thereby establishing a rational basis for dimensional reduction.

ACKNOWLEDGMENTS

This paper was written while the first author held a Research Fellowship Award from the Association of Commonwealth Universities, which allowed him to visit the W. H. Bryan Mining Geology Research Centre for the first three months of 1999. Comments by A. E. Tercan are gratefully acknowledged. Geological Survey of Canada contribution no. 1999137.

REFERENCES

Agterberg, F. P., Katsube, T. J., and Lew, S. N., 1984, Statistical analysis of granite pore size distribution data, Lac du Bonnet Batholith, eastern Manitoba: Geological Survey of Canada Paper 84-1A, p. 29–38.
Alyamani, M. S., and Sen, Z., 1993, Determination of hydraulic conductivity from complete grain-size distribution curves: Ground Water, v. 31, no. 4, p. 551–555.
Bader, H., 1970, The hyperbolic distribution of particle sizes: Jour. Geophys. Res., v. 75, no. 15, p. 2822–2830.
Bagnold, R. A., and Barndorff-Nielsen, O., 1980, The pattern of natural size distributions: Sedimentology, v. 27, p. 199–207.
Basan, P. B., Lowden, B. D., Whattler, P. R., and Attard, J. L., 1997, Pore-size data in petrophysics: A perspective on the measurement of pore geometry, in Lovell, M. A., and Harvey, P. K., eds., Developments in petrophysics: Geological Society of London, Special Publication 122, p. 47–67.
Berman, M., 1985, The statistical properties of three noise removal procedures for multichannel remotely sensed data: CSIRO Division of Mathematics and Statistics, Consulting Rep. NSW/85/31/MB9, 37 p.
Borgman, L. E., and Frahme, R. B., 1976, A case study: Multivariate properties of bentonite in northeastern Wyoming, in Guarascio, M., David, M., and Huijbregts, C., eds., Advanced Geostatistics in the Mining Industry: Proceedings of the NATO Advanced Study Institute, Rome, October 13–25, Reidel, Dordrecht, p. 381–390.
Case, C. M., Ghiglieri, D. L., and Rennie, D. P., 1987, Model-dependence of the interpretation of mercury injection porosimetry data, in Evans, D. D., and Nicholson, T. J., eds., Flow and transport through unsaturated fractured rock: Geophysical Monograph 42, American Geophysical Union, Washington, DC, p. 157–164.
Chavez, P. S., and Kwarteng, A. Y., 1989, Extracting spectral contrast in LANDSAT thematic mapper image data using selective principal component analysis: Photogrammetric Eng. and Remote Sensing, v. 55, no. 3, p. 339–348.

Davis, J. C., 1986, Statistics and data analysis in geology: John Wiley, New York, 646 p.
Deutsch, C. V., and Journel, A. G., 1992, GSLIB: Geostatistical software library and user’s guide: Oxford University Press, 340 p.
Desbarats, A. J., 1995, Upscaling capillary pressure-saturation curves in heterogeneous porous media: Water Resour. Res., v. 31, no. 2, p. 281–288.
Desbarats, A. J., 1997, Geostatistical modeling of unsaturated flow parameters at the Apache Leap Tuff site, in Baafi, E. Y., and Schofield, N. A., eds., Geostatistics Wollongong ’96, vol. 1, p. 621–633.
Dwivedi, R. S., and Ravi-Sankar, T., 1992, Principal component analysis of LANDSAT MSS data for delineation of terrain features: Int. Jour. of Remote Sensing, v. 13, no. 12, p. 2309–2318.
Full, W. F., Ehrlich, R., and Kennedy, S., 1984, Optimal configuration and information content of sets of frequency distributions: Jour. of Sed. Petrology, v. 54, no. 1, p. 117–126.
Goovaerts, P., 1993, Spatial orthogonality of the principal components computed from coregionalized variables: Math. Geol., v. 25, no. 3, p. 281–302.
Goulard, M., and Voltz, M., 1993, Geostatistical interpolation of curves: A case study in soil science, in Soares, A., ed., Geostatistics Troia ’92: vol. 2, p. 805–816, Kluwer, Dordrecht.
Green, A. A., Berman, M., Switzer, P., and Craig, M. D., 1988, A transform for ordering multispectral data in terms of image quality with implications for noise removal: IEEE Trans. Geosc. Rem. Sens., v. 26, no. 1, p. 65–74.
Grunsky, E. C., and Agterberg, F. P., 1988, Spatial and multivariate analysis of geochemical data from metavolcanic rock in the Ben Nevis area, Ontario: Math. Geol., v. 20, no. 7, p. 825–862.
Joreskog, K. G., Klovan, J. E., and Reyment, R. A., 1976, Geological factor analysis: Methods in Geomathematics 1, Elsevier, Amsterdam, 178 p.
Journel, A. G., and Huijbregts, C., 1978, Mining geostatistics: Academic Press, London, 600 p.
Kleingeld, W., and Lantuejoul, C., 1993, Sampling of orebodies with a highly dispersed mineralization, in Soares, A., ed., Geostatistics Troia ’92: Kluwer, Dordrecht, vol. 2, p. 953–964.
Klovan, J. E., 1966, The use of factor analysis in determining depositional environments from grain-size distributions: J. of Sediment. Petrol., v. 36, p. 115–125.
Krumbein, W. C., 1934, Size frequency distribution of sediments: Jour. Sed. Petrology, v. 4, p. 65–77.
Luster, G. R., 1985, Raw materials for Portland cement: Applications of conditional simulation of coregionalization: unpublished doctoral dissertation, Stanford University, Stanford, CA, 532 p.
Manieh, A., 1984, Oolite liberation of oolitic iron ore, Wadi Fatima, Saudi Arabia: Int. Jour. of Mineral Processing, v. 13, no. 3, p. 187–192.
Mather, P. M., 1972, Study of factors influencing variations in size characteristics of fluvioglacial sediments: Math. Geol., v. 4, no. 3, p. 219–234.
Perfect, E., 1997, Fractal models for the fragmentation of rocks and soils: A review, in Vallejo, L. E., ed., Fractals in engineering geology: Engineering Geology, v. 48, nos. 3–4, p. 185–198.
Pirkle, E. C., Pirkle, F. L., Pirkle, W. A., and Stayert, P. R., 1984, The Yulee heavy mineral sand deposits of northeastern Florida: Economic Geology, v. 79, no. 4, p. 725–737.
Rasmussen, T. C., Evans, D. D., Sheets, P. J., and Blanford, J. H., 1990, Unsaturated fractured rock characterization methods and data sets at the Apache Leap Tuff site: Report NUREG/CR-5596, 139 p., US Nuclear Regulatory Commission, Washington, DC.
Rasmussen, T. C., Evans, D. D., Sheets, P. J., and Blanford, J. H., 1993, Permeability of Apache Leap tuff: Borehole and core measurements using water and air: Water Resour. Res., v. 29, no. 7, p. 1997–2006.
Sichel, H. S., 1973, Statistical valuation of diamondiferous deposits: Jour. S-Afr. Inst. Min. and Metall., v. 73, p. 235–243.
Smith, M., 1993, The use of fractals in quantifying pyrite textures to determine ore liberation in base metal ores: GSA North-Central Section 27th annual meeting, Abstracts with programs, Geological Society of America, v. 25, no. 3, p. 82.
Suro-Perez, V., and Journel, A. G., 1991, Indicator principal component kriging: Math. Geol., v. 23, no. 5, p. 759–792.

Switzer, P., and Green, A. A., 1984, Min/Max autocorrelation factors for multivariate spatial imaging: Technical Report No. 6, Department of Statistics, Stanford University, 14 p.
Tercan, A. E., 1999, Importance of orthogonalization algorithm in modeling conditional distributions by orthogonal transformed indicator methods: Math. Geol., v. 31, no. 2, p. 155–173.
Vogt, G. T., 1988, Porosity, pore-size distribution and pore surface area of Apache Leap Tuff near Superior, Arizona using mercury intrusion: unpublished master’s thesis, Department of Hydrology and Water Resources, University of Arizona, Tucson, 130 p.
Wackernagel, H., Petitgas, P., and Touffait, Y., 1989, Overview of methods for coregionalization analysis, in Armstrong, M., ed., Geostatistics, Vol. 1: Kluwer, Dordrecht, p. 409–420.
Wackernagel, H., 1995, Multivariate geostatistics: Springer, Berlin, 256 p.
Wardlaw, N. C., and Taylor, R. P., 1976, Mercury capillary pressure curves and the interpretation of pore structure and capillary behavior in reservoir rocks: Bull. Can. Petrol. Geol., v. 24, p. 225–262.
Yim, W. W. S., 1984, Liberation studies on tin-bearing sands off North Cornwall, United Kingdom: Marine Mining, v. 5, no. 1, p. 87–99.
