Integration of seismic and well-log data using statistical and neural network methods

Y. Zee Ma1, Ernest Gomez1, and Barbara Luneau1

Abstract

In the last two to three decades, the use of seismic attributes for reservoir characterization and modeling has grown exponentially. Now, a dozen or more attributes are often extracted from seismic data to predict reservoir properties. Meanwhile, the growing trend of acquiring more wireline logs provides more and more data to describe reservoir properties. Both statistical methods and artificial neural networks (ANNs) are often used to extract information and make predictions. Although statistical methods and ANNs provide powerful tools for geoscience data integration, they also have pitfalls. We present applications of principal component analysis (PCA) and ANNs to facies classification and porosity prediction. We also show the uses and limitations of these methods and the importance of integrating geologic and petrophysical knowledge.

Introduction

While heuristic extraction of information from data has been practiced for centuries, increasing use of statistical and artificial intelligence methods in recent decades has dramatically improved knowledge extraction. Meanwhile, vast amounts of data are being generated in petroleum geoscience, and statistical learning and data mining have become increasingly important to deal with big data. Ever-increasing computation power has enabled storage of large-scale data and has provided capabilities to process big data. One common problem in big-data analytics is to integrate a large number of input variables to predict the output or calibrate them to the target variable. This is sometimes termed the curse of (high) dimensionality (COD), although COD may also imply a large number of observations. Both principal component analysis (PCA) and artificial neural networks (ANNs) can be used to overcome the COD (Everitt and Dunn, 2001; Avseth et al., 2005; Ma et al., 2014). Statistical and neural network methods have been proposed for discriminating categorical variables as well as for predicting continuous variables (Kettenring, 2006; Chithra Chakra et al., 2013; Ma and Gomez, 2015). There has been a tendency to use neural networks to replace statistical methods for data integration because of their ease of use (Chopra and Marfurt, 2007; Ma, 2011). In fact, although neural network methods can be powerful in integrating various data, they can also generate artifacts. This problem can be serious because users generally cannot interrogate the neural network process and the intermediate results. Data integration and predictions using ANNs with little use of subject-matter knowledge often fail to accurately classify the lithofacies or to generate realistic predictions (Ma and Gomez, 2015; Ma et al., 2015).
In this article, we first present examples of lithofacies and seismic facies classification, and we show uses and limitations of statistical and neural network methods in classifications using seismic attributes and wireline logs. We then present predictions of continuous variables using neural network by combining a number of other continuous reservoir variables and integration of the ANN’s result by geostatistical methods to improve reservoir-property mappings.

1 Schlumberger.

THE LEADING EDGE

Integration of well logs for lithofacies clustering using PCA and ANNs

Traditionally, lithofacies have been classified by applying cutoffs to one or two logs; the cutoffs are defined heuristically and can be subjective. For example, cutoffs on the gamma ray (GR) log are commonly used to separate sandy channel facies from shaley floodplain facies. However, the GR values of different lithofacies typically overlap (Ma et al., 2014). As a result, a single log generally cannot discriminate lithofacies satisfactorily, because accurate separation with cutoffs requires that the property values of the different clusters do not overlap. Figure 1a shows a histogram of GR, which is a bimodal distribution. The logarithm of the resistivity log shows a similar profile (Figure 1b). Geoscientists sometimes interpret two clusters when seeing a bimodal distribution. In fact, a bimodal distribution can convey more than two clusters (McLachlan and Peel, 2000; Ma et al., 2014). In this fluvial reservoir, three facies are present: channel, crevasse/splay, and floodplain. One mode is hidden in the GR and resistivity histograms (Figures 1a and 1b). The bivariate histogram of GR and resistivity is a trimodal distribution, although one of the modes is not clearly pronounced (Figure 1c). To discriminate the three clusters associated with the three modes, a simple PCA was performed on the GR and resistivity logs. Cutoffs were defined on the first principal component (PC), as shown in Figure 1d, because the first PC aligns with the three modes in the bivariate distribution (compare Figures 1c and 1e). The two cutoffs on the first PC define three clusters that represent three depositional facies: channel, crevasse, and floodplain deposits (Figure 1f). Each facies is associated with a quasi-normal distribution for both the GR and the logarithm of resistivity (Figures 1g and 1h).
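As a hedged illustration of the PCA step described above, the following Python sketch computes the first PC of two standardized logs and applies two cutoffs to it. The synthetic data, the cluster centers, the function names, and the cutoff values are all assumptions made for this example and are not taken from the study. The sign of a PC is arbitrary, so which end of PC1 corresponds to channel or floodplain has to be checked against the logs.

```python
import math
import random
from statistics import mean, pstdev


def standardize(values):
    """Scale one log to zero mean and unit variance so neither log dominates."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def first_pc(x, y):
    """Return the unit weights of PC1 and the PC1 scores for two standardized logs."""
    n = len(x)
    cxx = sum(a * a for a in x) / n
    cyy = sum(b * b for b in y) / n
    cxy = sum(a * b for a, b in zip(x, y)) / n
    # Closed-form major-axis angle of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    weights = (math.cos(theta), math.sin(theta))
    scores = [weights[0] * a + weights[1] * b for a, b in zip(x, y)]
    return weights, scores


def classify(scores, low_cut, high_cut):
    """Two cutoffs on PC1 give three classes: 0 (low), 1 (middle), 2 (high)."""
    return [0 if s < low_cut else 1 if s < high_cut else 2 for s in scores]


# Synthetic stand-in for the logs: three overlapping clusters (illustrative only).
random.seed(0)
centers = [(40.0, 1.6), (75.0, 1.1), (110.0, 0.7)]  # hypothetical (GR, log10 resistivity)
gr, log_rt = [], []
for gr_c, rt_c in centers:
    for _ in range(300):
        gr.append(random.gauss(gr_c, 12.0))
        log_rt.append(random.gauss(rt_c, 0.15))

weights, pc1 = first_pc(standardize(gr), standardize(log_rt))
facies = classify(pc1, -0.6, 0.6)  # hypothetical cutoffs read off a PC1 histogram
```

Because each PC1 score is a weighted sum of both logs, a cutoff on PC1 corresponds to an oblique line in the GR-resistivity crossplot rather than a vertical or horizontal one.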
Obviously, individual logs do not enable separating these facies accurately because of the overlaps, and the benefit of leveraging complementary information from two logs is self-evident. Note also that using cutoffs on a PC is not the same as using cutoffs on the original logs because a PC is a combination of the original input logs. More than two logs can also be integrated by statistical or ANN methods for lithofacies clustering in either conventional or unconventional formations (Wang and Carr, 2012; Ma et al., 2015), which enables the use of complementary information from different logs.

http://dx.doi.org/10.1190/tle36030064.1.

March 2017

Special Section: Data analytics and machine learning

Figure 1. Lithofacies clustering using PCA from GR and resistivity logs. (a) Histogram of GR log. (b) Histogram of the logarithm of resistivity log. (c) 2D probability density function based on GR and logarithm of resistivity. (d) Histogram of the first PC of PCA from GR and logarithm of resistivity. (e) Crossplot of GR and logarithm of resistivity overlain by the first PC. (f) Crossplot of GR and logarithm of resistivity overlain by the classified facies using PCA. (g) Component GR histograms by the three clustered facies (channel, crevasse, and floodplain). (h) Component histograms of the logarithm of resistivity for the three facies. (i) Crossplot of GR and logarithm of resistivity overlain by the classified facies using ANNs. (j) Histogram and cumulative histogram of the first PC with the cutoffs associated with the facies proportions: 33:12:55 (in percentage) for channel, crevasse, and floodplain. The cutoff of 0.15 on the first PC (PC1) separates the floodplain and crevasse, and the cutoff of 0.8 on PC1 separates the channel from crevasse.

One problem is how to handle irrelevant information. PCA is often an effective way to deal with this problem. In the above example, all the information related to the facies is in the first PC; the second PC essentially has no information for facies. When ANNs are used without the preprocessing of PCA, or with both PCs as inputs, the clustered facies are not as good as the previous result from the first PC (Figure 1i). This is because the ANN cannot discriminate between relevant and irrelevant information and uses both of them in the classification. In Figure 1i, the two separation lines of the three facies are not perpendicular to the first PC and are correlated to both the first and second PCs. On the other hand, applying cutoffs on the first PC implies ignoring the second PC because the first and second PCs are orthogonal. In other cases, however, both the first and second PCs (or more PCs when PCA is applied to more than two variables) may contain information. Then it is possible to combine PCs, with appropriate weights, to create a composite PC that discriminates the lithofacies (Ma and Gomez, 2015).

Notice also that another deficiency of ANNs is the tendency to generate similar proportions for the clusters unless the clusters are highly distinctive. In this example, the facies clustered by ANNs have a relative proportion of 33:24:43 for floodplain, crevasse/splay, and channel. However, the analog data from the regional geology and neighboring fields all suggest a much higher proportion of floodplain and a lower proportion of crevasse/splay. ANNs over-predicted the crevasse/splay proportion and under-predicted the floodplain proportion, as they cannot integrate the analog information. When core data are available, a supervised ANN can mitigate this problem if the core data have no sampling bias. In practice, core data are often biased, and supervised ANNs can go astray as a result. On the other hand, PCA enables the incorporation of analog information, and a simple cumulative histogram of the discriminant PC or composite PC enables the generation of the target proportions. The definition of a relative proportion of 53:12:35 for floodplain, crevasse/splay, and channel for this example is shown in Figure 1j. Examples of well-log signatures for core facies can be found in Ma et al. (2015b).
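The cumulative-histogram step can be sketched as follows: given target proportions (for example, from analog data), the cutoffs are the matching quantiles of the discriminant PC. This is a minimal sketch, not the authors' implementation; the function names and the 50:20:30 proportions are hypothetical, and the evenly spaced scores merely stand in for real PC1 values.

```python
from bisect import bisect_right


def cutoffs_from_proportions(scores, proportions):
    """Return cutoffs so that classes, ordered from low to high score,
    receive approximately the requested proportions."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    ordered = sorted(scores)
    n = len(ordered)
    cuts, cumulative = [], 0.0
    for p in proportions[:-1]:
        cumulative += p
        # Read the score at this cumulative frequency (empirical quantile).
        idx = min(n - 1, round(cumulative * n))
        cuts.append(ordered[idx])
    return cuts


def assign_classes(scores, cuts):
    """Class index = number of cutoffs at or below the score."""
    return [bisect_right(cuts, s) for s in scores]


# Hypothetical usage: 200 evenly spaced values standing in for PC1 scores.
scores = [i / 100.0 for i in range(-100, 100)]
cuts = cutoffs_from_proportions(scores, [0.5, 0.2, 0.3])
labels = assign_classes(scores, cuts)
```

Unlike an unsupervised classifier, this construction honors whatever proportions are supplied, which is how analog information can be imposed on the clustering.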


Integration of seismic attributes for clustering seismic facies using neural network

The use of seismic attributes for reservoir characterization and modeling has gained much popularity. A large number of seismic attributes are often extracted from seismic data to predict reservoir properties. ANNs are often the method of choice. However, without a thorough understanding of the physical problem, using ANNs can generate hard-to-interpret features or even artifacts. Moreover, although a large number of attributes may be available in an application, using too many attributes can make ANNs go astray. Selection of appropriate attributes is very important, and this requires thorough data analysis with a rigorous screening of attributes.

Next we present an example of mapping the environments of deposition (EODs) for a stratigraphic interval using ANNs with seismic attributes, including average amplitude, local variance, P-impedance, S-impedance, and fracture likelihood (Figures 2a–2e). Initially, only the amplitude and local variance were used, and the mapped EODs have many undesired features (Figure 3a), mainly as a result of local features in the amplitude data. As pointed out in the previous example, ANNs do not know what the relevant or irrelevant information is and thus generated many small lagoonal-facies clusters within the basinal deposition and some basinal-facies clusters within the lagoonal deposition. By adding P-impedance, the clustered facies have no basinal facies in the lagoonal deposition and fewer lagoonal facies in the basinal deposition, but the lagoonal facies still exist between the basinal and reefal depositions (Figure 3b).

Figure 2. Maps of seismic attributes (averages within a stratigraphic zone, performed in depth) and a geometric attribute. (a) Amplitude. (b) Local variance. (c) P-impedance. (d) S-impedance. (e) Fracture likelihood. (f) Relative distance to the coastline (general orientation geometry index). All the maps have approximate dimensions: 10 km (x axis) by 6 km (y axis).

By using more attributes, such as S-impedance and fracture likelihood, the clustered facies results can hardly be improved, as shown in Figure 3c. ANNs with more seismic attributes were also used, but it was practically impossible to get a satisfactory EOD delineation because of the generation of artifacts (each run generates artifacts in different places, but every run produces some artifacts). This challenging problem is often termed the shoulder effect; i.e., there are abundant lagoonal facies between the basinal and reefal EODs. The shoulder effect is a common problem in using seismic data to generate geologic maps. ANNs initially could not overcome this problem using just common seismic attributes. By introducing a geometric attribute, the relative distance to the coastline, the mapped EODs by ANNs were improved dramatically: the shoulder effect is completely eliminated, and all three depositions are well defined (Figure 3d).

Figure 3. Map views of EODs. (a) EODs from ANNs classification using amplitude and local variance. (b) EODs from ANNs classification using amplitude, local variance, and P-impedance. (c) EODs from ANNs classification using amplitude, local variance, P-impedance, S-impedance, and fracture likelihood. (d) EODs classification using amplitude, local variance, and relative distance to the coastline. All the maps have approximate dimensions: 10 km (x axis) by 6 km (y axis).

Figure 4 explains why using only basic seismic attributes could not overcome the shoulder effect, but adding the relative distance to the coastline enabled an accurate classification. The classification of the EODs by amplitude and variance is shown in Figure 4a; the clustered lagoonal facies from the shoulder effect could not be separated because the P-impedance values are similar for the lagoonal facies and misclassified lagoonal facies (Figure 4c). On the other hand, the relative distance values in the “shoulder” are similar to the basinal EOD (Figure 4d), which explains why adding that additional attribute in the ANNs eliminated the shoulder effect in the classification (compare Figures 4a and 4b, and 4c and 4d).

Figure 4. Amplitude-variance crossplots. (a) Overlain by the EOD of ANNs using amplitude and variance as inputs. The lagoonal facies in the circle are the shoulder effect created by ANNs. (b) Overlain by the EOD of ANNs using amplitude, variance, and relative distance to the coastline as inputs. The shoulder effect is not present because those classified lagoonal facies are now classified as basinal facies. (c) Overlain by P-impedance, which explains why adding P-impedance did not mitigate the shoulder effect (the impedance values for the “shoulder” are more similar to the impedance values of the lagoonal EOD). (d) Overlain by relative distance that shows similar values in the “shoulder” as in the basinal EOD.

In short, this classification example has shown that definition and selection of input attributes for ANNs are very important. Without diligent screening and selection of the input data, ANNs will not generate good results, no matter how many attributes are used in ANNs and how much the ANN’s training parameters are tuned.

Prediction of continuous properties integrating well logs and seismic attributes

Both PCA and neural networks can be used to integrate a number of seismic attributes to predict a continuous variable, such as porosity, permeability, or hydrocarbon production. When well data are available, these methods can be supervised to make the prediction. However, the integrated results obtained using only neural networks or PCA may not be easy to interpret and are sometimes inconsistent with other data. Figure 5 shows results of integrating three seismic attributes to map the porosity using supervised ANNs (well-log porosity data from eight wells were used in the supervised ANNs). The porosity map by ANNs is very smooth and has negative porosity values. It reduced the overall porosity quite dramatically, and high porosity values in the well logs were not predicted (Figures 5a and 5b).

Integrating the neural network’s result with well-log data using geostatistical methods can mitigate the biased estimation by ANNs in the porosity mapping. Specifically, cokriging and collocated cosimulation (CoCoSim) can be used to integrate the ANN prediction with the well-log data. This workflow can be very useful, especially because most commercially available modeling and mapping software allows only one secondary variable as the constraint in using cokriging or CoCoSim. Supervised ANNs enable integration of many seismic attributes into one composite attribute, and subsequently, this composite attribute can be used to constrain the geostatistical mapping. Figure 5c shows the collocated cosimulation of porosity that integrates the well-log porosity data and the ANN-predicted porosity while using the previous EOD delineation as a constraint. Figure 5d shows the histogram comparisons of the original well-log porosity and the collocated cosimulated porosity maps. Notice that the porosity map by CoCoSim has porosity values similar to the original well-log porosity (Figure 5d). The main reason for this improved porosity mapping is as follows: when the ANN prediction is used as a secondary variable in CoCoSim, its absolute values do not impact the mapping result. The porosity values in the CoCoSim map are mainly impacted by well-log porosities and the EOD constraint (Figure 3d), and the ANN prediction only impacted the relative highs and lows in the spatial distribution, not the frequency distribution (histogram).

Figure 5. (a) Porosity map predicted using supervised ANNs with porosity data from eight wells and three seismic attributes (including amplitude, local variance, and P-impedance; see Figure 2). (b) Histogram comparison for well-log porosity (black) and ANNs prediction (red). (c) Cosimulation of porosity (map) by using EOD and ANNs prediction constraints. (d) Histogram comparison for well-log porosity (black), cosimulation using EOD, and ANNs prediction (red). Maps in (a) and (c) have approximate dimensions of 10 km (x axis) by 6 km (y axis).

Note the overall good histogram match between the mapped porosity and the well-log porosity. Yet, some differences in the very low and very high porosity ranges of the histogram comparison are observed. These are actually a result of correcting the sampling bias because more well-log data are available in the high-porosity reefal deposition than in the basinal and lagoonal depositions.
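One possible way to see why the absolute values of an ANN prediction need not affect a cosimulated map is sketched below. It assumes that the secondary variable enters the cosimulation through a rank-based normal-score transform, which is a common choice in Gaussian simulation workflows but is an assumption here, not something the article states. The function name and the porosity values are hypothetical.

```python
from statistics import NormalDist


def normal_scores(values):
    """Rank-based normal-score transform (ties are broken by input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # Plotting position (rank + 0.5) / n keeps probabilities inside (0, 1).
        scores[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return scores


# Hypothetical ANN output at six locations, including a negative porosity.
ann_porosity = [-0.02, 0.01, 0.03, 0.05, 0.04, 0.08]
# The same spatial pattern after an arbitrary shift and stretch.
rescaled = [0.10 + 2.0 * v for v in ann_porosity]

scores_original = normal_scores(ann_porosity)
scores_rescaled = normal_scores(rescaled)
```

Under this assumption, both versions of the secondary variable give identical normal scores, so only the ordering of highs and lows is passed on, while the histogram of the result is controlled by the primary (well-log) data.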

Conclusions

The uses of PCA and ANNs for seismic facies and lithofacies clustering from seismic attributes and wireline logs were presented. ANNs are powerful tools for reservoir characterization as they can model nonlinear relationships between a predictor and the response variable. Both categorical and continuous reservoir properties can be predicted using ANNs. However, our examples and experience have shown that ANNs also have some deficiencies, such as lack of capability for interrogating the ANN’s process and difficulty in tuning the training parameters. On the other hand, PCA is an analytical method. Our examples have shown how to relate PCA’s components to geologic and petrophysical interpretations and how to select or combine principal component(s) for improving lithofacies classifications. We have also demonstrated the importance of integrating ANNs with the physical model and statistical methods. Combining ANNs with PCA and geostatistical methods in predicting reservoir properties has proved to be useful. In our examples PCA was used as a preprocessing method, which enabled the selection of an appropriate component for predictions.

Acknowledgments

The authors thank Schlumberger for permission to publish this work and Yasin Hajizadeh and David Marquez for their reviews and comments on the manuscript. All the data analysis using PCA and ANNs was carried out on Petrel* (Schlumberger’s trademark).

Corresponding author: [email protected]

References

Avseth, P., T. Mukerji, and G. Mavko, 2005, Quantitative seismic interpretation: Applying rock physics tools to reduce interpretation risk: Cambridge University Press.

Chithra Chakra, N., K.-Y. Song, M. M. Gupta, and D. N. Saraf, 2013, An innovative neural network forecast of cumulative oil production from a petroleum reservoir employing higher-order neural networks (HONNs): Journal of Petroleum Science and Engineering, 106, 18–33, http://dx.doi.org/10.1016/j.petrol.2013.03.004.

Chopra, S., and K. J. Marfurt, 2007, Seismic attributes for prospect identification and reservoir characterization, SEG Geophysical Developments Series No. 11: SEG.

Everitt, B. S., and G. Dunn, 2001, Applied multivariate data analysis, 2nd ed.: Arnold.

Kettenring, J. R., 2006, The practice of clustering analysis: Journal of Classification, 23, no. 1, 3–30, http://dx.doi.org/10.1007/s00357-006-0002-6.

Ma, Y. Z., 2011, Lithofacies clustering using principal component analysis and neural network: applications to wireline logs: Mathematical Geosciences, 43, no. 4, 401–419, http://dx.doi.org/10.1007/s11004-011-9335-8.

Ma, Y. Z., and E. Gomez, 2015, Uses and abuses in applying neural networks for predicting reservoir properties: Journal of Petroleum Science and Engineering, 133, 66–75, http://dx.doi.org/10.1016/j.petrol.2015.05.006.

Ma, Y. Z., W. R. Moore, E. Gomez, B. Luneau, P. Kaufman, O. Gurpinar, and D. Handwerger, 2015, Wireline log signatures of organic matters and lithofacies classifications for shale and tight carbonate reservoirs, in Y. Z. Ma and S. Holditch, eds., Handbook of Unconventional Resource: Elsevier, 151–171.

Ma, Y. Z., W. R. Moore, E. Gomez, W. J. Clark, and Y. Zhang, 2015b, Tight gas sandstone reservoirs, Part 1: Overview and lithofacies, in Y. Z. Ma and S. Holditch, eds., Handbook of Unconventional Resource: Elsevier, 405–427.

Ma, Y. Z., H. Wang, J. Sitchler, O. Gurpinar, E. Gomez, and Y. Wang, 2014, Mixture decomposition and lithofacies clustering from wireline logs: Journal of Applied Geophysics, 102, 10–20, http://dx.doi.org/10.1016/j.jappgeo.2013.12.011.

McLachlan, G. J., and D. Peel, 2000, Finite mixture models: Wiley.

Wang, G., and T. R. Carr, 2012, Marcellus shale lithofacies prediction by multiclass neural network classification in the Appalachian basin: Mathematical Geosciences, 44, no. 8, 975–1004, http://dx.doi.org/10.1007/s11004-012-9421-6.

