Modular Neural Networks for Seismic Tomography

D. Barráez*, S. Garcia-Salicetti**, B. Dorizzi**, M. Padrón*,**, E. Ramos*
*Universidad Central de Venezuela, Caracas, Venezuela
**Institut National des Télécommunications, Dépt. EPH, France

Abstract

We propose in this paper a modular approach to the problem of traveltime inversion, or seismic tomography. The problem consists of inferring the velocity of wave propagation in the subsurface, after an explosion has been produced at the surface, from the traveltimes of the resulting waves, which are recorded by several receivers at the surface. In the present work, we consider synthetically generated data obtained from a particular "Earth-model": a multilayered medium in which each layer is homogeneous, that is, the seismic wave propagation velocity in each layer is constant, and each layer has a different thickness. On these synthetic data, we compare a Multilayer Perceptron (MLP) to a modular neural architecture. We show that the modular approach is better suited to the stated inversion problem, and we study the experimental conditions under which the potential of this approach is best exploited.

1. Introduction

Seismic methods are widely used in oil exploration and detection. Such methods [1] consist in generating a set of seismic waves at the surface of the Earth with a source, usually an explosion or a vibration; this set of waves propagates through the subsurface. The waves are partially reflected and partially refracted at the interfaces between different geological layers. A set of receivers at the surface of the Earth makes it possible to determine the time spent by a wave between the source and a specific receiver, called the wave's traveltime. The problem of traveltime inversion, or seismic tomography, consists of inferring some structural characteristics of the subsurface from traveltimes. In the present work, these characteristics are the propagation velocity of the seismic waves and the depth associated with each geological layer.

In recent years, Neural Networks (NNs) have become a useful tool in different fields of Geophysics [2]. In seismic tomography, Multilayer Perceptrons (MLPs) have been used for velocity inversion only [3,4]. In cross-well tomography, "counter-propagation NNs" with a hidden cluster layer were compared to genetic algorithms [5]. Also in cross-well tomography, a NN was used to compute iteratively the best least-squares fit to the measured traveltime data [6]. This approach is quite different from the others, since the NN computes the Earth-model that best explains the data (the traveltimes). Indeed, in the other works mentioned [3,4], the NN learns to invert the function that yields velocities from traveltimes, on data resulting from several randomly generated Earth-models. In those works, different Earth-model structures and ray-tracing algorithms were used.

The structure of our Earth-model is similar to that of Röth and Tarantola [3,4], but it is more complex and more realistic, since we consider that each geological layer has a variable thickness. We also consider Earth-models in which the generated velocities are uncorrelated. Therefore, in our framework, not only are the velocities approximated by the NN, as was the case in [3,4]; the depth of each layer is also inverted by the neural architecture. This makes the problem much more difficult. For instance, consider one layer and one receiver, that is, only one arrival time; if the depth varies, there are infinitely many pairs (v, p), where v is the velocity and p the depth, that are solutions of the traveltime inversion problem. Placing a second receiver at the surface introduces a restriction on the inversion problem that reduces the number of possible solutions.
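Indeed, for a single flat homogeneous layer of thickness p and velocity v, with the source and one receiver at the surface separated by an offset x, the reflection traveltime is

t = \frac{2}{v} \sqrt{ p^{2} + \left( \frac{x}{2} \right)^{2} },

so every pair (v, p) satisfying this single equation for the observed t is an admissible solution, and a second receiver at a different offset contributes a second equation of the same form, which is precisely the restriction just mentioned.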


As it is not possible to know a priori how many receivers are needed to obtain a unique solution, the problem is ill-posed. Given this framework, the important point is to reach a good approximation of a particular solution.

We first evaluate the ability of an MLP to approximate the multivariate function that yields the velocities and depths of each geological layer (outputs) from all the available traveltimes (inputs). We then propose another strategy for inversion, founded on a modular approach; indeed, the multivariate function mentioned above is rather difficult to approximate. This is why we think that the right strategy in our inversion framework is to divide the original problem into sub-problems of reduced complexity, and to solve each of them with a neural network of smaller size. A natural way to split the original problem into smaller inversion problems is to perform the inversion geological layer by geological layer, with a different input at each step. Such an approach has not been explored before for seismic tomography.

Our paper is structured as follows: in Section 2, the Earth-models that we consider are presented, as well as the data generation; in Section 3, we detail the neural architectures (MLP and modular); in Section 4, experimental results are given; and conclusions are stated in Section 5.

2. Generation of the Earth-models

The Earth-models under study have 7 layers over a half-space, and all layers have a different thickness. Indeed, the depth p_i of the interface between layer i and layer i+1 is randomly generated as follows: p_1 = 200 m + X_1 and p_i = p_{i-1} + X_i for i = 2,…,7, where X_1 is a uniform random variable on [-100 m, 100 m] and the X_i, for i = 2,…,7, are independent and identically distributed random variables with a uniform distribution on [50 m, 350 m]. On average, the thickness of each layer is therefore 200 m.

Figure 1. Earth-model with its source, M layers and N receivers on the surface.

On the other hand, each layer of the Earth-model is homogeneous, that is, the seismic wave propagation velocity in each layer is constant. In other words, the velocities do not vary horizontally; they vary only vertically from one layer to the next. The wave propagation velocity v_i in layer i is randomly generated so as to obtain uncorrelated propagation velocities. The generation scheme is the following: v_1 = 1500 m/s + Y_1 and v_i = 1500 + 190(i-1) + Y_i (in m/s) for i = 2,…,7, where Y_1 is a uniform random variable on [-350, 350] m/s and the Y_i, for i = 2,…,7, are independent and identically distributed random variables with a uniform distribution on [-350, 350] m/s.

Figure 1 shows the Earth-model with the waves' source and a first receiver placed 140 m away from it, as well as the remaining receivers placed every 90 m. As shown in Figure 1, the source emits waves that propagate in the subsurface, and N receivers at the surface record the information of the waves' reflections generated at the interfaces between the different geological layers. Each receiver records 7 traveltimes, one per interface between subsequent layers; therefore, 7N traveltimes are associated with an Earth-model. The traveltimes of an Earth-model are computed with a ray-tracing algorithm [1,7] that describes the path of a wave through the different layers of the subsurface, from the moment it leaves the source until its arrival (after several reflections) at the surface.
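A minimal Python sketch of this generation scheme is given below, following the depth and velocity formulas as we read them. The ray tracing of [1,7] is not reproduced here; traveltimes are approximated with a simple RMS-velocity (hyperbolic moveout) formula, which is only a stand-in for the actual algorithm and is valid for small offsets.

```python
import numpy as np

rng = np.random.default_rng(0)

N_LAYERS = 7          # layers over the half-space
N_RECEIVERS = 20      # 20, 30 or 40 in the paper
FIRST_OFFSET = 140.0  # first receiver, in metres from the source
SPACING = 90.0        # spacing between subsequent receivers, in metres


def generate_earth_model():
    """Draw interface depths (m) and layer velocities (m/s) as in Section 2."""
    depths = np.empty(N_LAYERS)
    depths[0] = 200.0 + rng.uniform(-100.0, 100.0)        # p_1 = 200 m + X_1
    depths[1:] = rng.uniform(50.0, 350.0, N_LAYERS - 1)   # thicknesses X_2..X_7
    depths = np.cumsum(depths)                            # p_i = p_{i-1} + X_i

    velocities = 1500.0 + 190.0 * np.arange(N_LAYERS) \
                 + rng.uniform(-350.0, 350.0, N_LAYERS)   # uncorrelated velocities
    return depths, velocities


def traveltimes(depths, velocities):
    """Reflection traveltimes at each receiver for each interface.

    Stand-in for the ray tracing of [1,7]: a straight-ray / RMS-velocity
    (hyperbolic moveout) approximation.
    """
    offsets = FIRST_OFFSET + SPACING * np.arange(N_RECEIVERS)
    thick = np.diff(np.concatenate(([0.0], depths)))       # layer thicknesses
    dt_vertical = thick / velocities                        # one-way vertical times
    t0 = 2.0 * np.cumsum(dt_vertical)                       # zero-offset two-way times
    v_rms = np.sqrt(np.cumsum(velocities**2 * dt_vertical) / np.cumsum(dt_vertical))
    # t(x)^2 ~ t0^2 + x^2 / v_rms^2
    return np.sqrt(t0[:, None]**2 + (offsets[None, :] / v_rms[:, None])**2)


depths, velocities = generate_earth_model()
times = traveltimes(depths, velocities)   # shape (7, N_RECEIVERS) -> 7N inputs
```

Each Earth-model thus yields a 7 x N array of traveltimes, flattened into the 7N inputs used by the architectures below.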

3. Neural Architectures

3.1 The Multilayer Perceptron

We consider a Multilayer Perceptron (MLP) in which the traveltimes of a given Earth-model are the inputs and the velocities and depths associated with each geological layer are the outputs. With N receivers at the surface, the architecture of the MLP is the following: an input layer of 7N units, one per traveltime; one hidden layer; and an output layer of 14 units, 7 for the velocities that we want to approximate and 7 for the depths. The cost function is the classical quadratic error:

E(W) = \sum_{\alpha} \sum_{i=1}^{7} \left( v_{i,\alpha}^{des} - v_{i,\alpha}^{obs} \right)^{2} + \sum_{\alpha} \sum_{i=1}^{7} \left( p_{i,\alpha}^{des} - p_{i,\alpha}^{obs} \right)^{2}

where α indexes the current example of the training database, i denotes a given geological layer, v_{i,α}^{des} is the target velocity for layer i, v_{i,α}^{obs} is the velocity estimate given by the neural network, p_{i,α}^{des} is the target depth for layer i, and p_{i,α}^{obs} is the depth estimate given by the neural network.
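A minimal sketch of such an MLP follows, assuming one hidden layer of tanh units trained by plain gradient descent on the quadratic cost; the hidden layer size, activation, learning rate and training algorithm are not reported in the paper and are therefore assumptions.

```python
import numpy as np

class MLP:
    """One-hidden-layer perceptron trained on the quadratic cost E(W).

    Sizes follow Section 3.1: 7*N traveltime inputs, 14 outputs
    (7 velocities + 7 depths). Hidden size and learning rate are assumed.
    """

    def __init__(self, n_in, n_hidden=50, n_out=14, lr=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, n_out))
        self.b2 = np.zeros(n_out)
        self.lr = lr

    def forward(self, X):
        self.h = np.tanh(X @ self.W1 + self.b1)    # hidden activations
        return self.h @ self.W2 + self.b2           # linear outputs (v, p)

    def train_step(self, X, Y):
        """One full-batch step; gradients of E/2 (the 1/2 only rescales lr)."""
        out = self.forward(X)
        err = out - Y                                # (batch, 14)
        grad_W2 = self.h.T @ err
        grad_b2 = err.sum(axis=0)
        dh = (err @ self.W2.T) * (1.0 - self.h**2)   # tanh derivative
        grad_W1 = X.T @ dh
        grad_b1 = dh.sum(axis=0)
        for p, g in ((self.W1, grad_W1), (self.b1, grad_b1),
                     (self.W2, grad_W2), (self.b2, grad_b2)):
            p -= self.lr * g
        return (err**2).sum()                        # E(W) on this batch
```

In practice the traveltimes and the (velocity, depth) targets would be rescaled to comparable ranges before training; the sketch assumes this preprocessing has been done.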

3.2 The Modular Architecture

We now consider as a single pattern X(k) the traveltimes of the waves that were reflected in a specific geological layer k; in other words, X(k) is a vector X(k) = (t_{k1},…,t_{kN}), where t_{ki} denotes the wave traveltime recorded by the i-th receiver when the reflection occurs in layer k. Therefore, X(k) codes the information concerning geological layer k as a reflector.


The modular architecture is made of a set of neural networks organized sequentially: the first network is trained to approximate the propagation velocity and the depth of the first geological layer when X(1) is given as input (the traveltimes corresponding to reflections that occurred in layer 1). The second network is trained to approximate the propagation velocity and the depth of the second geological layer when receiving as input X(1), X(2), and the approximate wave propagation velocity and depth of layer 1 computed by the previous network; and so forth. We can thus summarize this architecture as follows: the neural network corresponding to layer i (denoted NNi in Figure 2) receives as input the traveltimes corresponding to reflections in the current layer, that is X(i), and in the previous geological layers, X(1),…,X(i-1), as well as the velocities V1,…,Vi-1 and depths P1,…,Pi-1 approximated by the previous networks (see Figure 2).

Figure 2. The modular architecture (NN1: X(1) -> V1, P1; NNi: X(1),…,X(i), V1,…,Vi-1, P1,…,Pi-1 -> Vi, Pi; …; NN7: X(1),…,X(7), V1,…,V6, P1,…,P6 -> V7, P7).
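A sketch of the corresponding sequential training is given below, reusing the MLP class sketched in Section 3.1. Feeding each module's own estimates to the next module follows the description above; the per-module hidden size and the number of epochs are assumptions, not reported settings.

```python
import numpy as np

N_LAYERS = 7

def build_module_input(times, predictions, k):
    """Input of module k (1-based): X(1),...,X(k) plus the (v, p) estimates
    of the previous modules. `times` has shape (batch, 7, n_receivers);
    `predictions` is a list of (batch, 2) arrays from modules 1..k-1."""
    parts = [times[:, :k, :].reshape(len(times), -1)]
    parts.extend(predictions)              # V1,P1, ..., Vk-1,Pk-1
    return np.concatenate(parts, axis=1)

def train_modular(times, targets, epochs=200):
    """Train the 7 modules one after the other.

    `targets` has shape (batch, 7, 2), holding (velocity, depth) per layer.
    Assumes the MLP class from the Section 3.1 sketch is in scope and that
    times and targets have been normalized beforehand.
    """
    modules, predictions = [], []
    for k in range(1, N_LAYERS + 1):
        X = build_module_input(times, predictions, k)
        net = MLP(n_in=X.shape[1], n_hidden=30, n_out=2)
        for _ in range(epochs):
            net.train_step(X, targets[:, k - 1, :])
        modules.append(net)
        predictions.append(net.forward(X))  # feed estimates to the next module
    return modules
```

At inference time the same chaining applies: module k receives the traveltimes X(1),…,X(k) together with the estimates produced by modules 1,…,k-1.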

4. Experimental Results

The MLP and modular architectures were both trained on different sets of data, that is, considering a different number of receivers at the surface. Indeed, the more receivers we have, the more information is available. This study was carried out to measure the influence of the number of receivers on the quality of the inversion performed by the neural architectures. Three cases were studied: first, 20 receivers were used to collect traveltimes at the surface, then 30, and finally 40. In the first case, 4500 Earth-models were used; in the second case, 9000; and in the last case, 12000. Of the Earth-models generated, 60% were used for training, 10% for validation, and 30% for generalization.

We have considered two indicators to measure the quality of the estimation. The empirical correlation coefficient for the propagation velocity in layer i is given by:

E_i = \frac{ \sum_{j=1}^{n} \left( v_j^{obs} - \overline{v}^{obs} \right) \left( v_j^{des} - \overline{v}^{des} \right) }{ n \, \sigma_i^{obs} \, \sigma_i^{des} }

where i is the current layer, n is the number of Earth-models, v_j^{obs} is the wave propagation velocity in layer i for Earth-model j, v_j^{des} is the desired velocity in Earth-model j, \overline{v}^{obs} is the mean velocity in layer i, \overline{v}^{des} is the mean desired velocity, \sigma_i^{obs} is the standard deviation obtained for the velocity in layer i, and \sigma_i^{des} is the desired standard deviation. We also measured the mean relative error per geological layer:

F_i = \frac{1}{n} \sum_{j=1}^{n} \frac{ \left| v_j^{obs} - v_j^{des} \right| }{ v_j^{des} }
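Both indicators can be computed per layer as in the sketch below; the 1/n factor reflects our reading of "mean relative error", and the standard deviations are the population ones implied by the n in the denominator of E_i.

```python
import numpy as np

def correlation_and_relative_error(v_obs, v_des):
    """Per-layer indicators of Section 4.

    v_obs, v_des: arrays of shape (n_models, 7) with the estimated and
    desired velocities (or depths) per layer. Returns (E, F), each of
    shape (7,): empirical correlation coefficient and mean relative error.
    """
    n = v_obs.shape[0]
    mean_obs = v_obs.mean(axis=0)
    mean_des = v_des.mean(axis=0)
    sigma_obs = v_obs.std(axis=0)        # population standard deviations
    sigma_des = v_des.std(axis=0)
    E = ((v_obs - mean_obs) * (v_des - mean_des)).sum(axis=0) / (n * sigma_obs * sigma_des)
    F = (np.abs(v_obs - v_des) / v_des).mean(axis=0)
    return E, F
```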

Layer (i)   Velocity (MLP)   Velocity (Modular)   Depth (MLP)   Depth (Modular)
1           0.990            1                    0.999         1
2           0.853            0.996                0.974         0.999
3           0.700            0.959                0.918         0.995
4           0.482            0.818                0.912         0.986
5           0.336            0.732                0.967         0.982
6           0.307            0.198                0.964         0.980
7           0.230            0.172                0.962         0.980

Table 1. Correlation coefficients with 20 receivers.

We first notice in Table 1 (correlation coefficients with 20 receivers) that the modular architecture gives better approximations than the MLP for the depths. For the velocities, the same is observed up to layer 5. Indeed, results are strongly degraded in the deeper layers for both architectures; the difference is that the degradation is rather gradual for the MLP but abrupt for the modular architecture after layer 5. Although the potential of the modular architecture appears in this first experiment, its drawback is also clear: it accumulates errors from one neural module to the next. Moreover, both architectures give poor approximations for the deeper layers because there are not enough receivers in this experiment, as discussed before.


Indeed, receivers impose restrictions on the inversion problem that reduce the number of possible solutions (v, p). When there are too many solutions, the neural architectures can only approximate an average solution, which is not satisfactory. This is why we carried out two other experiments with more receivers.

Layer (i)   Velocity (MLP)   Velocity (Modular)   Depth (MLP)   Depth (Modular)
1           0.991            1                    0.990         1
2           0.955            0.999                0.991         0.999
3           0.918            0.975                0.989         0.999
4           0.889            0.967                0.988         0.998
5           0.882            0.950                0.990         0.997
6           0.827            0.951                0.988         0.996
7           0.760            0.917                0.990         0.997

Table 2. Correlation coefficients with 30 receivers.

Layer (i)   Velocity (MLP)   Velocity (Modular)   Depth (MLP)   Depth (Modular)
1           0.0058           0.0002               0.0324        0.0006
2           0.0267           0.0092               0.0620        0.0031
3           0.0326           0.0135               0.0632        0.0038
4           0.0342           0.0139               0.0662        0.0039
5           0.0328           0.0153               0.0651        0.0182
6           0.0358           0.1282               0.0669        0.0304
7           0.0398           0.1568               0.0719        0.0390

Table 3. Mean relative errors with 30 receivers.

Comparing Table 2 (correlation coefficients with 30 receivers) to Table 1, we notice a clear improvement in the approximations of both architectures (MLP and modular). Moreover, the results obtained with 30 receivers by the modular architecture are much better than those given by the MLP. Indeed, with the modular approach, the approximations made by each neural module are only slightly degraded as depth increases (all the empirical correlation coefficients are higher than 0.9). This confirms that the modular approach is appropriate for traveltime inversion: solving the problem in an incremental way guides the search for a solution through the restrictions introduced as inputs to each neural module. Table 3 (mean relative errors with 30 receivers) shows that, for the MLP, the error stabilizes at an almost constant value after the third layer. This is not true for the modular architecture, whose errors grow with depth. Results with 40 receivers are not presented here because they bring only a very slight improvement. This suggests that, for an Earth-model with 7 layers, the information collected by 30 receivers is enough to perform an inversion of good quality with the modular neural architecture.

5. Conclusion

We studied in this paper an original modular approach to seismic tomography on synthetic data. The Earth-models used to generate these data (the waves' traveltimes) have 7 layers of different thickness and velocities that are uncorrelated from one layer to the next. We showed that the modular architecture outperforms the MLP and that the number of receivers at the surface is a crucial parameter for reaching a good quality of inversion. In other words, the modular architecture is able to produce good approximations that do not degrade in the deeper layers, provided there are enough receivers. The study also showed that, beyond a critical number of receivers, the improvement in inversion quality is not significant.

Acknowledgements

This work was partially supported by the French-Venezuelan Action ECOS-Nord N° V99M01 (N° 99000267), CDCH-UCV grant N° 03.11.4258.98, and Vicerectorado Académico UCV.

6. References

[1] M. Lavergne, Seismic Methods, Éditions Technip, Paris, 1986.
[2] M. Van der Baan and C. Jutten, "Neural Networks in Geophysical Applications", Geophysics, SEG, Vol. 65, No. 4, 2000, pp. 1032-1047.
[3] G. Röth, Application des réseaux de neurones aux problèmes inverses sismiques, PhD thesis, Institut de Physique du Globe de Paris, Université de Paris 7, Paris, 1993.
[4] G. Röth and A. Tarantola, "Neural networks and inversion of seismic data", Journal of Geophysical Research, Vol. 99, No. B4, 1994, pp. 6753-6768.
[5] S. Kumar Nath, S. Chakraborty, S. Kumar Singh, and N. Ganguly, "Velocity inversion in cross-hole seismic tomography by counter-propagation neural network, genetic algorithm and evolutionary programming techniques", Geophys. J. Int., Vol. 138, 1999, pp. 108-124.
[6] M. Ning, W. Yanping, H. Zhengyi, and B. Zongdi, "An Iterative Algorithm Using a Neural Network for Nonlinear Traveltime Tomography", Proceedings of ICSP'96, 1996, pp. 130-136.
[7] J. Claerbout, Fundamentals of Geophysical Data Processing with Applications to Petroleum Prospecting, McGraw-Hill, New York, 1976.
