Mathematical Geology, Vol. 28, No. 7, 1996

Stochastic Simulation of Categorical Variables Using a Classification Algorithm and Simulated Annealing

P. Goovaerts

An important step of reservoir characterization is the stochastic modeling of the geometry of lithofacies which control large-scale heterogeneities of petrophysical properties. Although multiple realizations are necessary to appreciate the uncertainty in the spatial distribution of facies, a common short cut consists of retaining the first realization drawn. This paper presents an alternative to this potentially hazardous selection: (1) a categorical map is generated by allocating a single facies to each grid node according to the local probabilities of occurrence of the facies, and (2) the map then is post-processed using a steepest descent-type algorithm so as to improve reproduction of spatial continuity and transition probabilities between facies. The procedure is illustrated using a synthetic dataset. A waterflood simulation shows that retaining a single realization would yield, on average, larger errors in production forecasts (water cuts and recovered oil) than the single post-processed facies map.

KEY WORDS: facies distribution, sequential indicator simulation.

INTRODUCTION

Stochastic simulation is used increasingly to generate numerical models (realizations) of the spatial distribution of petrophysical properties (Haldorsen and Damsleth, 1990). The modeling generally proceeds in two steps: the geometry of facies is simulated first, then the spatial distribution of petrophysical variables is simulated within each facies. This two-step approach makes it possible to reproduce both long-range heterogeneities as generated by facies boundaries and short-range variability specific to each facies. Petrophysical properties typically are known only at a few well locations, and so there is a large uncertainty attached to their distribution within the reservoir. A visual and quantitative measure of spatial uncertainty is provided by generating many realizations that all reasonably match the same sample statistics (histogram, variogram) and conditioning data. Running a flow simulation program then yields an estimate of the recovery values for each realization. The distribution (histogram) of flow responses corresponding to the set of input realizations allows an appreciation of the uncertainty in reservoir performance forecasts resulting from our imperfect knowledge of the distribution in space of the rock and fluid properties.

Received 15 January 1996; accepted 29 February 1996.
Université Catholique de Louvain, Unité BIOM, Place Croix du Sud, 2 Bte 16, 1348 Louvain-la-Neuve, Belgium; e-mail: [email protected]
© 1996 International Association for Mathematical Geology

In practice, we may be tempted to bypass the cumbersome generation of multiple facies maps and focus on a single realization, disregarding uncertainty. This unique categorical map may be the realization that best pleases the geologist or, more often, it is the first realization generated. In such a situation, we suggest that a better alternative would consist of using a single estimated (rather than simulated) categorical map deduced from a classification of the grid nodes. Soares (1992) developed a classification algorithm that preferentially allocates nodes to the category with largest local probability of occurrence under the constraint of reproduction of global proportions. Because local probabilities are established using kriging, the classification typically is too smooth and does not reproduce (cross) variogram models. To correct for this smoothing effect and better reproduce transition probabilities between categories, we propose to post-process the categorical map by a MAP (maximum a posteriori or steepest descent-type) algorithm.

The proposed technique is illustrated using a synthetic dataset. Fifty realizations of the facies distributions are generated conditional to facies data at 60 well locations using sequential indicator simulation; each realization then is post-processed using a steepest descent-type algorithm to improve reproduction of indicator variogram models. A single post-processed estimated categorical map is generated using the same 60 well locations.
For both situations, petrophysical properties then are simulated conditioned to the facies distributions, and the resulting permeability maps are submitted to a waterflood flow simulation. The resulting model of uncertainty in production forecasts (water cuts and recovered oil) is compared to the reference values provided by the synthetic dataset.

STOCHASTIC SIMULATION

There are many algorithms for simulating categorical variables, for example, object-based algorithms, multiple truncations of a Gaussian field, indicator-based algorithms, and simulated annealing. In this paper, the spatial distribution of facies is simulated using the following two-step procedure (Deutsch and Journel, 1992a, p. 159):

• an initial image is generated using sequential indicator simulation (SIS),
• the SIS realization then is post-processed with a MAP (steepest descent-type) algorithm so as to better reproduce the indicator direct and cross-variogram models.


Sequential Indicator Simulation

Consider the simulation of the spatial distribution of $K$ mutually exclusive categories $s_k$ at $N$ grid nodes $\mathbf{u}_j$ conditional to the dataset $\{s(\mathbf{u}_\alpha), \alpha = 1, \ldots, n\}$. Sequential indicator simulation proceeds as follows:

• Transform each categorical datum $s(\mathbf{u}_\alpha)$ into a vector of $K$ indicator data defined as:

$$
i(\mathbf{u}_\alpha; s_k) =
\begin{cases}
1 & \text{if } s(\mathbf{u}_\alpha) = s_k \\
0 & \text{otherwise}
\end{cases}
\qquad k = 1, \ldots, K
$$

• Define a random path visiting only once each node to be simulated.
• At each node $\mathbf{u}$:
  (1) Determine the local probability of occurrence of each category $s_k$, $[p(\mathbf{u}; s_k \mid (n))]^*$, using indicator kriging. The conditioning information $(n)$ consists of neighboring original indicator data and previously simulated indicator values.
  (2) Correct these probabilities so as to ensure that they are valued within $[0, 1]$ and sum up to one.
  (3) Define any ordering of the $K$ categories and build a cdf-type function by adding the corresponding probabilities of occurrence, for example:

$$
[F(\mathbf{u}; s_{k'} \mid (n))]^* = \sum_{k=1}^{k'} [p(\mathbf{u}; s_k \mid (n))]^* \qquad k' = 1, \ldots, K
$$

  (4) Draw a random number $p$ uniformly distributed in $[0, 1]$. The simulated category at location $\mathbf{u}$ is the category that corresponds to the probability interval including the probability $p$: $s^{(l)}(\mathbf{u}) = s_{k'}$ such that

$$
p \in \left( [F(\mathbf{u}; s_{k'-1} \mid (n))]^*, \; [F(\mathbf{u}; s_{k'} \mid (n))]^* \right]
$$

  (5) Add that simulated value $s^{(l)}(\mathbf{u})$ to the conditioning dataset.
  (6) Proceed to the next node along the random path, and repeat the five previous steps.

Repeat the entire sequential procedure with a different random path to generate another realization $\{s^{(l')}(\mathbf{u}_j), j = 1, \ldots, N\}$, $l' \neq l$.

This algorithm does not make full use of the information available in that it ignores transition probabilities between any two categories $s_k$ and $s_{k'}$, as provided by indicator cross-variograms $\gamma_I(\mathbf{h}; s_k, s_{k'})$. Such information can be accounted for using the cokriging formalism. Practical implementation of cokriging, however, is faced with the joint modeling of indicator (cross) variograms and the computational cost of solving large and possibly unstable cokriging systems. An alternative consists of post-processing the SIS realizations with simulated annealing so as to achieve a better reproduction of direct and cross-variograms (Murray, 1992; Deutsch and Journel, 1992b; Goovaerts and Journel, 1996).

Post-processing with Simulated Annealing
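Because the post-processing described in this subsection starts from an SIS image, it may help to see the drawing steps (2) to (4) of the sequential procedure in code first. The following Python sketch is illustrative only: the function names, the clip-and-rescale correction, and the probability values are assumptions that do not come from the paper, and indicator kriging itself is not implemented.

```python
import numpy as np

def correct_probabilities(p):
    """Step (2): force estimates into [0, 1] and rescale them to sum to one.

    Assumption: a simple clip-and-rescale correction; the paper does not
    spell out which correction is used.
    """
    p = np.clip(np.asarray(p, dtype=float), 0.0, 1.0)
    total = p.sum()
    if total == 0.0:
        # Assumption: fall back to equal probabilities if everything was clipped.
        return np.full(p.size, 1.0 / p.size)
    return p / total

def draw_category(p, rng):
    """Steps (3)-(4): build the cdf-type function and draw one category index."""
    cdf = np.cumsum(correct_probabilities(p))
    u = rng.random()
    # First index k with cdf[k] >= u, i.e. u falls in (cdf[k-1], cdf[k]].
    k = int(np.searchsorted(cdf, u, side="left"))
    # Guard against round-off leaving the last cdf value slightly below one.
    return min(k, cdf.size - 1)

rng = np.random.default_rng(0)
kriged = [0.55, -0.05, 0.30, 0.25]  # hypothetical kriging output at one node
print(correct_probabilities(kriged))
draws = [draw_category(kriged, rng) for _ in range(10000)]
print(np.bincount(draws, minlength=4) / len(draws))
```

In a full simulation, each drawn category would be added to the conditioning data before moving to the next node of the random path, as in steps (5) and (6).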

Simulated annealing is a generic name for a family of optimization algorithms based on the principle of stochastic relaxation (Geman and Geman, 1984; Farmer, 1988; Deutsch and Cockerham, 1994). An initial (seed) image is modified gradually so as to match constraints such as reproduction of a target histogram or variogram. The first step is to define an objective function that measures deviations between the target and current statistics of the realization, here the deviations between the target indicator variogram models and the current variograms. Reproduction of variogram models generally is limited to a specified number $C$ of lags and can be controlled using the following objective function:

$$
O(i) = \sum_{c=1}^{C} \sum_{k=1}^{K} \sum_{k'=1}^{K} \frac{\left[ \gamma_I(\mathbf{h}_c; s_k, s_{k'}) - \gamma_I^{(i)}(\mathbf{h}_c; s_k, s_{k'}) \right]^2}{\left[ \gamma_I(\mathbf{h}_c; s_k, s_{k'}) \right]^2} \qquad (1)
$$

where $\gamma_I^{(i)}(\mathbf{h}_c; s_k, s_{k'})$ is the indicator cross-variogram value between categories $s_k$ and $s_{k'}$ at lag $\mathbf{h}_c$ calculated from the realization at the $i$th perturbation, and $\gamma_I(\mathbf{h}_c; s_k, s_{k'})$ is the corresponding target value. The division by the square of the variogram model at each lag gives more weight to reproduction of the model near the origin.

The next step is to modify systematically the SIS realization so as to diminish the value of the function (1). Because the SIS realization already reproduces indicator direct variograms, it should not be randomized completely by accepting too many unfavorable perturbations at the beginning of the post-processing. The MAP (maximum a posteriori) algorithm (Doyen, Guidish, and de Buyl, 1989) accepts only the perturbations that lower the value of the objective function, which speeds the convergence by disallowing any randomization of the initial image. The algorithm proceeds as follows:

(1) Compute the value of the objective function for the initial realization.
(2) For a specified number of iterations:
  • Define a random path that visits all unsampled grid nodes.
  • At each node $\mathbf{u}$, consider all $K$ possible categories and compute the corresponding $K$ objective function values.
  • Assign the node $\mathbf{u}$ to the category $s_k$ associated with the smallest value of the objective function.
  • Proceed to the next node $\mathbf{u}$ along the random path.

Srivastava (1992) proposed to use as convergence criterion the percentage of changes at each iteration, that is, the proportion of grid nodes at iteration $i$ that have a value different from that at iteration $i - 1$. An advantage of simulated annealing over sequential indicator simulation is that reproduction of indicator cross-variograms does not call for solving cokriging systems, hence for fitting a linear model of coregionalization to the set of direct and cross-indicator variograms (Goovaerts, 1993).

POST-PROCESSING A UNIQUE ESTIMATED CATEGORICAL MAP

Instead of starting with many possible realizations, simulated annealing now is used to post-process a (unique) estimated categorical map deduced from a classification based on the vectors of $K$ local probabilities $[p(\mathbf{u}; s_k \mid (n))]^*$. Let $\{[p(\mathbf{u}_j; s_k \mid (n))]^*; k = 1, \ldots, K, j = 1, \ldots, N\}$ be the set of probability vectors provided by indicator kriging at the $N$ grid nodes. Soares (1992) developed a classification algorithm that preferentially allocates nodes to the category with the largest conditional probability of occurrence under the constraint of reproduction of the $K$ global proportions (marginal probabilities) $p_k$. The classification algorithm is dynamic in the sense that the allocation rule changes as the classification progresses, and it proceeds as follows:

(1) For each category $s_k$, the $N$ grid nodes are ranked according to decreasing conditional probabilities $[p(\mathbf{u}; s_k \mid (n))]^*$.
(2) The $n_k$ nodes ($n_k = N p_k$) with the largest conditional probabilities $[p(\mathbf{u}; s_k \mid (n))]^*$ are allocated to the category $s_k$.
(3) If a location $\mathbf{u}$ is allocated to two or more categories, for example, $s_k$ and $s_{k'}$, it is assigned to the category $s_k$ with the largest conditional probability of occurrence, that is, $[p(\mathbf{u}; s_k \mid (n))]^* > [p(\mathbf{u}; s_{k'} \mid (n))]^*$.
The $(n_{k'} + 1)$th node with the largest conditional probability $[p(\mathbf{u}; s_{k'} \mid (n))]^*$ then is allocated to category $s_{k'}$ so that the global proportion $p_{k'}$ is reproduced. The procedure is repeated until each grid node belongs to a single category.

Unlike sequential indicator simulation, the $K$ conditional probabilities are established using only the original data $s(\mathbf{u}_\alpha)$, and the category is not drawn at random but is selected according to a classification criterion. Because the conditional probabilities are established using kriging, the classification typically is too smooth and does not reproduce transition probabilities between categories. These features can be imparted to the classification by post-processing the categorical map using simulated annealing and the objective function (1).

CASE STUDY

Throughout this paper, the categorical map shown in Figure 1 is considered as the reference exhaustive distribution of four facies in a 2D section of a reservoir. The reference dataset comprises 50 × 50 categorical indicators on a regular square grid. This reference categorical map was generated by truncating the continuous variable in true.dat of the GSLIB dataset (Deutsch and Journel, 1992a, p. 35). Sixty locations were drawn at random and form our sample dataset. Because of the sparsity of sampling, the sample proportions of categories 2 and 3 deviate significantly from the exhaustive proportions. Information about the spatial continuity and transition probabilities between facies is provided by the set of standardized direct and cross-variograms calculated from the exhaustive dataset (see Fig. 1, right graphs). Alternatively, these transition statistics could have been inferred from outcrop data. Instead of using the rather restrictive linear model of coregionalization, the indicator variograms were modeled independently of one another because simulated annealing does not involve any cokriging.
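The classification of the previous section and the MAP post-processing with objective function (1) can be sketched together on a toy grid. This is a minimal Python sketch under stated assumptions, not the author's implementation: the classification is a greedy variant of the rule of Soares (1992) that scans node-category pairs by decreasing probability; variograms are computed only along the two grid axes; terms with a zero target value are left unscaled to avoid division by zero; and the reference image, the probabilities, and all function names are hypothetical.

```python
import numpy as np

def classify_with_proportions(prob, proportions):
    """Greedy allocation: highest probabilities first, under per-category quotas."""
    n_nodes, n_cat = prob.shape
    raw = np.asarray(proportions, dtype=float) * n_nodes
    quota = np.floor(raw).astype(int)
    # Nodes lost to rounding go to the categories with the largest remainders.
    for k in np.argsort(raw - quota)[::-1][: max(n_nodes - quota.sum(), 0)]:
        quota[k] += 1
    labels = np.full(n_nodes, -1)
    for flat in np.argsort(prob, axis=None)[::-1]:
        node, k = divmod(int(flat), n_cat)
        if labels[node] < 0 and quota[k] > 0:
            labels[node] = k
            quota[k] -= 1
    return labels

def indicator_variograms(labels, n_cat, lags):
    """Indicator direct and cross-variograms, shape (len(lags), K, K)."""
    ind = (labels[..., None] == np.arange(n_cat)).astype(float)
    out = np.empty((len(lags), n_cat, n_cat))
    for c, h in enumerate(lags):
        # Indicator increments for all pairs separated by h along x and along y.
        d = np.concatenate([(ind[:, h:] - ind[:, :-h]).reshape(-1, n_cat),
                            (ind[h:, :] - ind[:-h, :]).reshape(-1, n_cat)])
        out[c] = d.T @ d / (2.0 * len(d))
    return out

def objective(labels, target, lags):
    """Objective function (1); zero targets are not rescaled (assumption)."""
    current = indicator_variograms(labels, target.shape[1], lags)
    denom = np.where(target == 0.0, 1.0, target ** 2)
    return float(np.sum((target - current) ** 2 / denom))

def map_postprocess(labels, target, lags, frozen, rng, max_iter=5, tol=0.01):
    """Sweep the free nodes; accept only changes that lower the objective."""
    labels = labels.copy()
    n_cat = target.shape[1]
    free = np.flatnonzero(~frozen.ravel())
    for _ in range(max_iter):
        changed = 0
        for idx in rng.permutation(free):
            node = np.unravel_index(idx, labels.shape)
            old = labels[node]
            scores = np.empty(n_cat)
            for k in range(n_cat):
                labels[node] = k
                scores[k] = objective(labels, target, lags)
            best = int(np.argmin(scores))
            labels[node] = best if scores[best] < scores[old] else old
            changed += int(labels[node] != old)
        if changed / labels.size < tol:  # proportion of changes per iteration
            break
    return labels

rng = np.random.default_rng(1)
ny, nx, K = 12, 12, 3
lags = [1, 2, 3]
# Hypothetical reference image: three vertical bands of equal width.
reference = np.repeat(np.arange(K), nx // K)[None, :].repeat(ny, axis=0)
target = indicator_variograms(reference, K, lags)
prob = rng.dirichlet(np.ones(K), size=ny * nx)  # hypothetical local probabilities
seed = classify_with_proportions(prob, [1 / 3, 1 / 3, 1 / 3]).reshape(ny, nx)
frozen = np.zeros((ny, nx), dtype=bool)  # no conditioning data in this toy case
result = map_postprocess(seed, target, lags, frozen, rng)
print(objective(seed, target, lags), objective(result, target, lags))
```

Recomputing the full objective for every candidate category keeps the sketch short but is costly; a practical implementation would update only the pairs that involve the perturbed node.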

Generation of Facies Maps

Fifty realizations of the spatial distribution of facies were generated using sequential indicator simulation and the information of Figure 1. Figure 2 (left middle graph) shows the first SIS realization. The corresponding experimental standardized indicator variograms are displayed by black dots in Figure 3. Apart from the second category, the direct variograms are poorly reproduced. The cross-variograms, hence the nesting of categories, are not reproduced since they are not accounted for in the simulation procedure.

The 50 SIS realizations were post-processed using the MAP algorithm and the objective function (1). For each realization, the post-processing was stopped after six iterations, when the proportion of changes per iteration was determined to be less than 1% of the total number of grid nodes. Post-processing yields a considerable improvement of the original SIS image: the different classes are now nested (see Fig. 2, right middle graph); this is confirmed by the better reproduction of direct and cross-indicator variograms (see Fig. 3, open circles). There also is a better reproduction of global proportions of the different facies. Note that one could have achieved a perfect reproduction of target proportions by including an additional component in the objective function (Goovaerts, 1994). The problem is the determination of such target proportions. In this example, identifying target proportions to sample values would have led to an underrepresentation ($p_2^* = 25\% < p_2 = 30\%$) of the second category and an overrepresentation of the third category ($p_3^* = 25\% > p_3 = 20\%$).

Figure 1. Reference categorical map and sample information available to task of image reconstruction: 60 sample locations and exhaustive standardized direct and cross-variograms with models fitted.

Figure 2. Reference categorical map, categorical maps generated by sequential indicator simulation (first realization) and algorithm of Soares before and after post-processing.

Figure 2 (left bottom graph) shows the categorical map generated using the algorithm of Soares and conditional probabilities provided by ordinary indicator kriging. The corresponding standardized indicator variograms are displayed by black dots in Figure 4. The smoothness of the kriged probability maps leads to a smooth categorical map with nugget effects that are too low. In addition, the cross-variograms are not reproduced because transition probabilities between
categories are not accounted for by indicator kriging. As for the SIS realization, post-processing improves reproduction of the spatial continuity and nesting of categories (see Fig. 2, right bottom graph, and Fig. 4, open circles).

Figure 3. Reproduction of standardized indicator variograms for first SIS realization before (black dots) and after (open circles) post-processing. Target models are shown as continuous curves. Panels: direct variograms for categories 1 to 4 and cross-variograms for categories 1-2, 1-3, 1-4, 2-3, 2-4, and 3-4; horizontal axis: distance, grid units.

Impact on Flow Simulation

To assess the performances of the two approaches when predicting flow properties, all categorical maps (the reference map, 50 post-processed SIS realizations, and the post-processed classification) were transformed into permeability maps using a two-step procedure:
(1) the permeability values for each facies are simulated using a nonconditional sequential Gaussian algorithm (Deutsch and Journel, 1992a, p. 141-143) and the following normal scores variogram models:

$$
\begin{aligned}
\text{Facies 1:} \quad & \gamma(\mathbf{h}) = 0.1 + 0.9\,\mathrm{Sph}(|\mathbf{h}|/20), \quad \bar{Y} = 300 \text{ md} \\
\text{Facies 2:} \quad & \gamma(\mathbf{h}) = 0.1 + 0.9\,\mathrm{Sph}(|\mathbf{h}|/10), \quad \bar{Y} = 100 \text{ md} \\
\text{Facies 3:} \quad & \gamma(\mathbf{h}) = 0.1 + 0.9\,\mathrm{Sph}(|\mathbf{h}|/10), \quad \bar{Y} = 25 \text{ md} \\
\text{Facies 4:} \quad & \gamma(\mathbf{h}) = 0.1 + 0.9\,\mathrm{Sph}(|\mathbf{h}|/5), \quad \bar{Y} = 1 \text{ md}
\end{aligned}
$$

where Sph is the spherical variogram model, and $\bar{Y}$ is the mean of permeability values within the facies. For all facies the distribution of permeability values was assumed lognormal with a coefficient of variation of 1.5.
(2) each node $\mathbf{u}$ in facies $s_k$ is associated with the permeability value simulated at that location in the realization corresponding to the $k$th facies.

Such an approach amounts to assuming independence of permeability values from one facies to another, but allows for correlation of permeability values across two different bodies of the same facies located close together. Figure 5 shows the reference permeability map derived from the reference facies map of Figure 1. Note the W-SW streak of low permeability values corresponding to facies 3 and 4.

Figure 4. Reproduction of standardized indicator variograms for categorical map generated by algorithm of Soares before (black dots) and after (open circles) post-processing. Target models are shown as continuous curves. Panels: direct variograms for categories 1 to 4 and cross-variograms for categories 1-2, 1-3, 1-4, 2-3, 2-4, and 3-4; horizontal axis: distance, grid units.

Figure 5. Locations of injector and producer on reference permeability map derived from reference facies map of Figure 1.

The effective permeability of each image was computed in the EW and NS directions using the pressure solver flowsim (Deutsch and Journel, 1992b). A waterflood simulator (Eclipse, 1991) then was applied to each permeability map, the injector and producer being located in the lower left and upper right corners, respectively (Fig. 5). The flow is thus perpendicular to the W-SW channel of low permeability values. The fractional flow of oil vs. time was recorded, and three flow characteristics were retrieved: times to reach 5% or 95% water cut, and the time to recover 50% of the oil.

Table 1 (2nd column) lists the values of effective permeability and the flow responses obtained for the reference permeability map of Figure 5. The absolute deviations between these reference values and predicted values were computed for the classification, the first SIS realization, and on average over the 50 SIS realizations. Whatever the reservoir property, the post-processed classification yields the smallest prediction errors. The poor scores of the first SIS realization are the result of bad luck: most of the next 49 SIS realizations perform better
as reflected by the smaller deviation values obtained on average over all SIS realizations (see Table 1, last column). Yet, the average deviations between reference and predicted values are larger than those provided by the classification. In this example, if the reservoir performance is to be predicted from a single facies map, one is better off starting with a post-processed classification.

Table 1. Reference Values of Reservoir Properties and Average Absolute Prediction Errors Obtained for Classification, First SIS Realization, and 50 SIS Realizations

                                            Aver. |reference values - predicted values|
Properties              Reference values    Classification    SIS realization #1    50 SIS realizations
K_eff (md)              38.6                0.7               1.7                   2.9
K_eff (md)              39.3                0.1               6.5                   4.5
Water cut 5% (year)     3.6                 1.5               43.9                  3.4
Water cut 95% (year)    43.4                13.4              161                   37.6
Recovery 50% (year)     39.2                1.0               232                   43.3
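The two-step transformation of facies maps into permeability maps used in this section can be illustrated with a short Python sketch. Several simplifications are assumptions made only for illustration: white noise stands in for the spatially correlated sequential Gaussian fields, the back-transform is the analytic lognormal one for a given mean and coefficient of variation, the facies map is random, and the function names are hypothetical. Only the facies means (300, 100, 25, and 1 md) and the coefficient of variation of 1.5 come from the text.

```python
import numpy as np

def lognormal_from_normal_scores(y, mean, cv):
    """Map standard normal scores to lognormal values with a given mean and CV."""
    sigma2 = np.log(1.0 + cv ** 2)        # variance of log-permeability
    mu = np.log(mean) - 0.5 * sigma2      # mean of log-permeability
    return np.exp(mu + np.sqrt(sigma2) * y)

def assemble_permeability(facies, perm_by_facies):
    """Step (2): at each node keep the value simulated for that node's facies.

    facies: (ny, nx) integer map with values 0..K-1
    perm_by_facies: (K, ny, nx) stack, one permeability field per facies
    """
    return np.take_along_axis(perm_by_facies, facies[None, ...], axis=0)[0]

rng = np.random.default_rng(0)
means = [300.0, 100.0, 25.0, 1.0]            # md, one mean per facies
facies = rng.integers(0, 4, size=(50, 50))   # hypothetical facies map
# Assumption: uncorrelated normal scores replace the sequential Gaussian fields.
fields = np.stack([
    lognormal_from_normal_scores(rng.standard_normal((50, 50)), m, 1.5)
    for m in means
])
perm = assemble_permeability(facies, fields)
print(perm.shape, perm.min(), perm.max())
```

Because one field is simulated per facies over the whole grid, two separate bodies of the same facies that lie close together draw correlated values from the same field, which is the behavior described in the text.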

CONCLUSIONS

The post-processing of SIS realizations using the MAP algorithm makes it possible to reproduce spatial continuity and transition probabilities between facies without requiring the fitting of a linear model of coregionalization to the set of indicator variograms. Although all realizations share the same target statistics, they may yield different predictions of reservoir performance. The poor prediction scores of the first SIS realization emphasize the risk of retaining a single realization in any simulation procedure. In this case study, a post-processed classification of grid nodes represents, on average, a better alternative to the hazardous selection of a single realization. The generation of a single categorical map prevents, however, any assessment of the uncertainty in the spatial distribution of facies.

ACKNOWLEDGMENTS

This work was done while the author was with the Stanford Center for Reservoir Forecasting.

REFERENCES

Deutsch, C. V., and Cockerham, P., 1994, Practical considerations in the application of simulated annealing to stochastic simulation: Math. Geology, v. 26, no. 1, p. 67-82.
Deutsch, C. V., and Journel, A. G., 1992a, GSLIB: Geostatistical Software Library and user's guide: Oxford Univ. Press, New York, 340 p.
Deutsch, C. V., and Journel, A. G., 1992b, Annealing techniques applied to the integration of geological and engineering data: Stanford Center for Reservoir Forecasting, Stanford University, unpubl. Ann. Rept. No. 5, 120 p.
Doyen, P., Guidish, T., and de Buyl, M., 1989, Seismic discrimination of lithology in sand/shale reservoirs: a Bayesian approach: Proc. 59th annual SEG meeting, Dallas.
ECLIPSE 100 Reference Manual, 1991, Intera ECL Petroleum Technologies, Highlands Farm, Greys Road, Henley-on-Thames, Oxfordshire, England, 646 p.
Farmer, C., 1988, The generation of stochastic fields of reservoir parameters with specified geostatistical distributions, in Edwards, S., and King, P., eds., Mathematics in oil production: Clarendon Press, Oxford, p. 235-252.
Geman, S., and Geman, D., 1984, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images: IEEE Trans. Pattern Anal. Machine Intell. PAMI, v. 6, no. 6, p. 721-741.
Goovaerts, P., 1993, Comparison of coIK, IK, and mIK performances for modeling conditional probabilities of categorical variables, in Dimitrakopoulos, R., ed., Geostatistics for the next century: Kluwer Acad. Publ., Dordrecht, The Netherlands, p. 18-29.
Goovaerts, P., 1994, Prediction and stochastic modelling of facies types using classification algorithms and simulated annealing: Stanford Center for Reservoir Forecasting, Stanford University, unpubl. Ann. Rept. No. 7, 74 p.
Goovaerts, P., and Journel, A. G., 1996, Accounting for local probabilities in stochastic modeling of facies data: SPE Paper Number 29230, preprint.
Haldorsen, H., and Damsleth, E., 1990, Stochastic modeling: Jour. Petroleum Technology, v. 42, no. 4, p. 404-412.
Murray, C. J., 1992, Indicator simulation of petrophysical rock types, in Soares, A., ed., Geostatistics Tróia '92, Quantitative geology and geostatistics: Kluwer Acad. Publ., Dordrecht, The Netherlands, p. 399-411.
Soares, A., 1992, Geostatistical estimation of multi-phase structures: Math. Geology, v. 24, no. 2, p. 149-160.
Srivastava, R. M., 1992, Iterative models for spatial simulation: Stanford Center for Reservoir Forecasting, Stanford University, unpubl. Ann. Rept. No. 5, 24 p.
