Geophysical Journal International Geophys. J. Int. (2017) 209, 606–622 Advance Access publication 2017 February 1 GJI Marine geosciences and applied geophysics

doi: 10.1093/gji/ggx038

Elastic full waveform inversion based on mode decomposition: the approach and mechanism T.F. Wang and J.B. Cheng State Key Laboratory of Marine Geology, Tongji University, Shanghai, China. E-mail: [email protected]

Accepted 2017 January 31. Received 2016 December 3; in original form 2016 July 26

SUMMARY
Elastic full waveform inversion (EFWI) aims to reduce the misfit between recorded and modelled multicomponent seismic data for deducing a detailed model of elastic parameters in the subsurface. Because the explicit computation and inversion of the Hessian matrix is extremely resource intensive, a gradient-based (rather than Hessian-based) minimization is generally applied for large-scale applications. However, the multiparameter trade-off effects cause cross-talks in the computed gradients and thus severely affect the convergence and the quality of the inverted model. Recently, preconditioning the gradients based on elastic wave mode decomposition has been suggested for mitigating the parameter trade-offs in the EFWI process. In this paper, we propose a mode decomposition (MD)-based EFWI approach in which the preconditioned gradients are obtained through the cross-correlation of the forward and decomposed adjoint wavefields in the time domain. Based on the decomposed Fréchet derivatives, we explain the mechanism of this approach through analyses of Hessian and resolution matrices and comparison with the Gauss–Newton gradients. Numerical examples of a simple fluid-saturated model and the Marmousi-II model demonstrate that the MD-based preconditioned conjugate-gradient approach can mitigate the trade-off between the P- and S-wave velocities and achieve fast convergence without any Hessian-involved calculations.

Key words: Waveform inversion; Inverse theory; Elasticity and anelasticity; Wave propagation.

© The Authors 2017. Published by Oxford University Press on behalf of The Royal Astronomical Society.

1 INTRODUCTION
Elastic full waveform inversion (EFWI), in the manner of Tarantola (1986) and Mora (1987), provides the ability to invert the multicomponent data for elastic parameters of the subsurface, for example, P- and S-wave velocities and density. Despite the computational requirements of EFWI, it has been applied to real data (Crase et al. 1992; Djikpesse & Tarantola 1999; Sears et al. 2008, 2010; Prieux et al. 2013b; Vigh et al. 2014). The implementations can be described in the time domain (Shipp & Singh 2002), the frequency domain (Pratt et al. 1998; Brossier et al. 2009), or a hybrid method that uses time-domain forward modelling while casting the inverse problem in the frequency domain (Nihei & Li 2007; Sirgue et al. 2008). However, involving several parameters increases the nonlinearity of the EFWI process and introduces parameter trade-offs (or cross-talks), caused by the coupling effects of the different physical parameters (Forgues & Lambare 1997). In marine environments, the nonlinearity of the EFWI of streamer or multicomponent ocean-bottom seismic data becomes even more serious because of the limited amount of P-to-S conversion in the case of a soft seabed (Sears et al. 2008). Many researchers have noted that parametrization is of great importance for mitigating the parameter trade-offs. The choice of parametrization can be made through investigating the scattering radiation patterns (Wu & Aki 1985; Tarantola 1986; Plessix & Cao 2011; Gholami et al. 2013) or the Hessian operator (Operto et al. 2013; Innanen 2014). Another popular solution to the nonlinearity problem is to use multiscale approaches by decoupling the high and low wavenumbers of velocity variation (Symes & Carazzone 1991; Bunks et al. 1995; Clement et al. 2001; Shin & Cha 2008; de Hoop et al. 2012; Xu et al. 2012; Biondi & Almomin 2013; Ma & Hale 2013; Alkhalifah 2015; Warner & Guasch 2016). Considerable efforts are required to extend these approaches from acoustic to elastic media. As Tarantola (1986) suggested, one should first reconstruct the parameters that have a primary influence on the data followed by the parameters with a secondary influence. Sears et al. (2008) introduced a successful hierarchical strategy using different subsets of ocean bottom cable (OBC) data with temporal windowing. Operto et al. (2013) discussed strategies to reduce the nonlinearity and to control the trade-off between parameters by hierarchically selecting the data components to invert and the parameter classes to update. Xu & McMechan (2014) proposed a multistep-length gradient scheme to improve the accuracy of density model updating.

A strategy to mitigate the trade-offs is to account for the Hessian operator within the optimization scheme. The diagonal blocks of the inverse Hessian act as scaling filters to address the geometrical spreading and band-limited effects, whereas the off-diagonal blocks have the effect of suppressing trade-offs (Pratt et al. 1998; Fichtner & Trampert 2011; Operto et al. 2013; Innanen 2014; Pan et al. 2016). To avoid the unfeasible calculation of the Hessian and its inverse, many studies aim to find a good approximation of the inverse Hessian, for example, using the pseudo-Hessian (Shin et al. 2001; Choi & Shin 2008) or constructing the Hessian kernel iteratively using a quasi-Newton method, such as the limited-memory BFGS (L-BFGS) method (Nocedal & Wright 2006; Brossier et al. 2009). Sheen et al. (2006) proposed a time-domain Gauss–Newton (GN) EFWI approach based on the reciprocity principle and the convolution theorem to calculate the approximate Hessian. Bae et al. (2012) developed a frequency-domain acoustic-elastic coupled waveform inversion based on the GN conjugate gradient method.

Pratt et al. (1998) explained the physical meaning of the gradient and Hessian operators using the partial-derivative wavefield caused by a ‘virtual source’. The overlap of partial-derivative wavefields at certain ranges of scattering angles is responsible for the parameter trade-off problems. Therefore, Wang et al. (2015a) proposed mitigating the trade-offs by decomposing the elastic wavefield into P- and S-wave data subsets and selectively exploiting them in different stages of a hierarchical EFWI workflow. Based on numerical investigations of the objective functions using the decoupled P- and S-wave data misfits, Ren & Liu (2016) demonstrated that wavefield separation can reduce the nonlinearity of EFWI. In addition, mode decoupling has become a key step in elastic reverse time migration (ERTM) for obtaining physically interpretable images, for example, Yan & Sava (2008) and Wang et al. (2016).
In this paper, rather than empirically preconditioning the gradients by decomposing both forward and adjoint wavefields and the seismograms, for example, Ren & Liu (2016), an efficient preconditioning that only decomposes the adjoint wavefields is proposed and demonstrated with more physical insights. The reconstruction of density is a very challenging problem (Virieux & Operto 2009). To avoid complications, we focus on reconstruction of P- and S-wave velocities in the main context and only investigate the potential of the proposed approach for three-parameter inversion including density in the discussion. First, based on the elastic Born approximation, we analyse the radiation patterns with respect to P- and S-wave velocity perturbations and derive the decomposed Fréchet derivatives or Jacobian matrices. Then, we investigate the contributions of the decoupled P- and S-wave components and present the cross-term approximations for gradient calculation. Using the adjoint-state method (Plessix 2006), the preconditioned gradients are efficiently calculated through zero-lag cross-correlation of the forward and the decomposed adjoint wavefields. To obtain more understanding of the mechanism, we investigate the MD-based Hessian and resolution matrices, and we compare the preconditioned gradients between MD-based and GN-based approaches. After that, we demonstrate the MD-based EFWI approach on a fluid-saturated sandstone model and the Marmousi-II model (Martin et al. 2006) using OBC geometry. Finally, we give a brief discussion about the density inversion before drawing some conclusions.


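2 THE FORWARD PROBLEM
The subsurface of the Earth can be considered as elastic media under the elastodynamic assumption. Seismic wave propagation in such media is governed by the wave equation

$$\rho \frac{\partial^2 u_i}{\partial t^2} - \frac{\partial}{\partial x_j}\left(c_{ijkl}\frac{\partial u_k}{\partial x_l}\right) = f_i, \tag{1}$$

where $u_i$ and $f_i$ are the $i$th component of the particle displacement vector and the body force, respectively; $\rho$ is the density; and $c_{ijkl}$ is the component of the stiffness tensor. All indices change from 1 to 3, and Einstein's summation convention over repeated indices is implied. For the isotropic case, the stiffness tensor satisfies:

$$c_{ijkl} = \lambda\,\delta_{ij}\delta_{kl} + \mu\,(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}), \tag{2}$$

where $\lambda$ and $\mu$ are the Lamé parameters and $\delta_{ij}$ is Kronecker's symbol. According to the Born approximation, a perturbation at $\mathbf{x}$ in the stiffness, $\delta c_{ijkl}(\mathbf{x})$, will generate a perturbed wavefield $\delta\mathbf{u}$, that is,

$$\rho \frac{\partial^2 \delta u_i}{\partial t^2} - \frac{\partial}{\partial x_j}\left(c^0_{ijkl}\frac{\partial \delta u_k}{\partial x_l}\right) = \frac{\partial}{\partial x_j}\left(\delta c_{ijkl}\frac{\partial u^0_k}{\partial x_l}\right), \tag{3}$$

in which the background wavefield $\mathbf{u}^0$ satisfies eq. (1) in the background media with stiffness $c^0_{ijkl}$. It implies that the perturbed wavefields result from the secondary source caused by $\delta c_{ijkl}(\mathbf{x})$ and the incident wavefields. For convenience, the majority of the derivations in this paper are expressed in the frequency domain, although most of the operations are in the time domain. Using the representation and divergence theorems (Kamath & Tsvankin 2016), the perturbed wavefields satisfy:

$$\delta u_n(\mathbf{r},\omega) = -\int_{\Omega(\mathbf{x})} \frac{\partial u^0_k(\mathbf{x},\omega)}{\partial x_l}\,\frac{\partial G_{ni}(\mathbf{r},\mathbf{x},\omega)}{\partial x_j}\,\delta c_{ijkl}(\mathbf{x})\,\mathrm{d}\Omega(\mathbf{x}), \quad (n = 1, 2, 3), \tag{4}$$

where $\omega$ denotes the frequency, $G_{ni}(\mathbf{r},\mathbf{x},\omega)$ is the background elastodynamic Green's function denoting the displacement at location $\mathbf{r}$ in the $n$ direction due to a unit point force at $\mathbf{x}$ in the $i$ direction, and $\Omega(\mathbf{x})$ is the integration volume. Note that eq. (4) only considers the first-order scattering. Representing the stiffness perturbation with velocity perturbations, that is,

$$\delta c_{ijkl} = \frac{\partial c_{ijkl}}{\partial V_p}\,\delta V_p + \frac{\partial c_{ijkl}}{\partial V_s}\,\delta V_s, \tag{5}$$

and

$$\frac{\partial c_{ijkl}}{\partial V_p} = 2\rho V_p\,\delta_{ij}\delta_{kl}, \qquad \frac{\partial c_{ijkl}}{\partial V_s} = 2\rho V_s\,(-2\delta_{ij}\delta_{kl} + \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}), \tag{6}$$

we rewrite eq. (4) in a velocity parametrization as

$$\delta u_n(\mathbf{r},\omega) = \int_{\Omega(\mathbf{x})} \left[ J_{n,V_p}(\mathbf{r},\mathbf{x},\omega)\,\delta V_p(\mathbf{x}) + J_{n,V_s}(\mathbf{r},\mathbf{x},\omega)\,\delta V_s(\mathbf{x}) \right] \mathrm{d}\Omega(\mathbf{x}), \tag{7}$$

with

$$J_{n,M}(\mathbf{r},\mathbf{x},\omega) = T_{ij,M}(\mathbf{x},\omega)\,G_{ni,j}(\mathbf{r},\mathbf{x},\omega), \qquad M \in \{V_p, V_s\}, \tag{8}$$

and

$$T_{ij,M}(\mathbf{x},\omega) = \frac{\partial c_{ijkl}}{\partial M}\,\frac{\partial u^0_k(\mathbf{x},\omega)}{\partial x_l}, \qquad G_{ni,j}(\mathbf{r},\mathbf{x},\omega) = \frac{\partial G_{ni}(\mathbf{r},\mathbf{x},\omega)}{\partial x_j}. \tag{9}$$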
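The stiffness derivatives in eq. (6) can be checked numerically. The sketch below is an illustration only: it assumes the standard isotropic relations $\lambda = \rho(V_p^2 - 2V_s^2)$ and $\mu = \rho V_s^2$ (not written out explicitly above), and the function names `stiffness`, `dc_dvp` and `dc_dvs` are hypothetical. It compares eq. (6) with central finite differences of eq. (2).

```python
import numpy as np

D = np.eye(3)  # Kronecker delta
DD_IJKL = np.einsum("ij,kl->ijkl", D, D)                       # delta_ij delta_kl
DD_SYM = np.einsum("ik,jl->ijkl", D, D) + np.einsum("il,jk->ijkl", D, D)

def stiffness(rho, vp, vs):
    """Isotropic c_ijkl of eq. (2); assumes lambda = rho*(vp^2 - 2 vs^2), mu = rho*vs^2."""
    lam = rho * (vp**2 - 2.0 * vs**2)
    mu = rho * vs**2
    return lam * DD_IJKL + mu * DD_SYM

def dc_dvp(rho, vp):
    """First relation of eq. (6)."""
    return 2.0 * rho * vp * DD_IJKL

def dc_dvs(rho, vs):
    """Second relation of eq. (6)."""
    return 2.0 * rho * vs * (-2.0 * DD_IJKL + DD_SYM)

# Central finite differences (illustrative values in SI units).
rho, vp, vs, h = 2000.0, 3000.0, 1500.0, 1.0
fd_vp = (stiffness(rho, vp + h, vs) - stiffness(rho, vp - h, vs)) / (2.0 * h)
fd_vs = (stiffness(rho, vp, vs + h) - stiffness(rho, vp, vs - h)) / (2.0 * h)
print(np.allclose(fd_vp, dc_dvp(rho, vp), rtol=1e-6, atol=1e-3))
print(np.allclose(fd_vs, dc_dvs(rho, vs), rtol=1e-6, atol=1e-3))
```

Because $c_{ijkl}$ is quadratic in both velocities, the central difference agrees with eq. (6) up to round-off for any step `h`.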

In eqs (8) and (9), $J_{n,M}$ represents the partial derivative wavefields (Pratt et al. 1998) with respect to the velocity perturbations, $T_{ij,M}$ represents the secondary (traction) source induced by the background wavefield $\mathbf{u}^0$ due to the velocity perturbation $\delta V_p$ (or $\delta V_s$), and $G_{ni,j}$ are the spatial derivatives of the Green's function. Given the model grid and receiver numbers $L$ and $K$, we can rewrite eq. (8) in a discrete form:

$$J_{n,M}(\mathbf{x}_l,\mathbf{r}_k) = \left(\mathbf{T}_M(\mathbf{x}_l) : \mathbf{G}'(\mathbf{x}_l,\mathbf{r}_k)\right)_n, \qquad k = 1, 2, \ldots, K;\; l = 1, 2, \ldots, L, \tag{10}$$

with the double dot denoting the following operation:

$$(\mathbf{T}_M : \mathbf{G}')_n = \sum_{i,j=1}^{3} T_{ij,M}\,G_{ni,j}. \tag{11}$$

For simplicity, we omit the index $n$ to represent the multicomponent data. Therefore, we express eq. (7) as a matrix multiplication:

$$\begin{pmatrix}\mathbf{J}_{V_p} & \mathbf{J}_{V_s}\end{pmatrix}\begin{pmatrix}\delta\mathbf{V}_p \\ \delta\mathbf{V}_s\end{pmatrix} = \delta\mathbf{u}, \tag{12}$$

where $\mathbf{J}_{V_p}$ and $\mathbf{J}_{V_s}$ are the Fréchet derivatives or Jacobian matrices with the size of $K \times L$, $\delta\mathbf{V}_p$ and $\delta\mathbf{V}_s$ are the model parameter vectors with the length of $L$, and $\delta\mathbf{u}$ denotes the scattering elastic wavefield recorded by the receivers. For brevity, we rewrite eq. (12) as

$$\mathbf{J}\,\delta\mathbf{m} = \delta\mathbf{u}, \tag{13}$$

with $\mathbf{J} = (\mathbf{J}_{V_p}\;\mathbf{J}_{V_s})$ and $\delta\mathbf{m} = (\delta\mathbf{V}_p\;\delta\mathbf{V}_s)^T$. Eq. (12) or (13) shows that the product of the Jacobian matrix and the parameter perturbation vector corresponds to the scattering data responses, of which the superposition constitutes the entire scattering wavefields. The radiation patterns presented in Fig. 1 show that $\delta V_p$ only scatters P-waves in the PP mode, whereas $\delta V_s$ scatters both P- and S-waves in all modes. It is difficult to determine which parameter perturbation the scattered P-waves come from. Fortunately, the scattered S-waves are only induced by the $V_s$ perturbations; therefore, for the single-scattering elastic wavefields, the coupling effects only occur for the P-wave mode. The overlap of the partial derivatives with respect to different physical parameters at certain ranges of scattering angles is responsible for the parameter trade-offs in the inversion (Tarantola 1986; Operto et al. 2013). Naturally, we take mode decomposition as a tool for mitigating the coupling effects in an EFWI process.

Figure 1. The radiation patterns of Vp (red) and Vs (blue) perturbations. (a) PP, (b) PS, (c) SP and (d) SS modes.

2.1 Elastic wave mode decomposition

The elastic wavefield $\mathbf{u} = (u_x, u_y, u_z)$ can be decomposed into a P-wavefield and an S-wavefield: $\mathbf{u} = \mathbf{u}^P + \mathbf{u}^S$, with $\mathbf{u}^P = (u^P_x, u^P_y, u^P_z)$ and $\mathbf{u}^S = (u^S_x, u^S_y, u^S_z)$. For isotropic media, this decomposition is model independent and can be expressed as wavenumber-domain operations as follows (Zhang & McMechan 2010):

$$\tilde{\mathbf{U}}^P = \mathbf{k}(\mathbf{k}\cdot\tilde{\mathbf{U}}), \qquad \tilde{\mathbf{U}}^S = -\mathbf{k}\times(\mathbf{k}\times\tilde{\mathbf{U}}), \tag{14}$$

where $\mathbf{k} = (k_x, k_y, k_z)$ is the normalized wave vector and the tilde denotes the wavefields in the wavenumber domain. Accordingly, we can decompose the Fréchet derivative:

$$\mathbf{J}_M = \mathbf{J}^P_M + \mathbf{J}^S_M, \tag{15}$$

with

$$\mathbf{J}^P_M = F^{-1}\!\left(\mathbf{k}(\mathbf{k}\cdot\tilde{\mathbf{J}}_M)\right), \qquad \mathbf{J}^S_M = F^{-1}\!\left(-\mathbf{k}\times(\mathbf{k}\times\tilde{\mathbf{J}}_M)\right), \tag{16}$$

in which $F^{-1}$ denotes the inverse Fourier transform from the wavenumber to space domain and $\tilde{\mathbf{J}}_M$ is the Fréchet derivative in the wavenumber domain. Thus, we split eq. (13) into two parts, as follows:

$$\mathbf{J}^P\delta\mathbf{m} = \delta\mathbf{u}^P, \qquad \mathbf{J}^S\delta\mathbf{m} = \delta\mathbf{u}^S, \tag{17}$$

with

$$\delta\mathbf{u} = \delta\mathbf{u}^P + \delta\mathbf{u}^S, \tag{18}$$

and

$$\mathbf{J} = \mathbf{J}^P + \mathbf{J}^S, \tag{19}$$

where the $K \times 2L$ matrices $\mathbf{J}^P = (\mathbf{J}^P_{V_p}\;\mathbf{J}^P_{V_s})$ and $\mathbf{J}^S = (\mathbf{J}^S_{V_p}\;\mathbf{J}^S_{V_s})$ represent the Fréchet derivatives (or Jacobian matrices) associated with the P- and S-wave components, respectively. Eq. (17) describes the Born approximation of scattered wavefields with mode decomposition. Note that the secondary source is activated only at the location of the parameter perturbation and when the background wavefield arrives at this location. Although the secondary source (or background wavefield) may have both P- and S-wave components, we find that it is not necessary to distinguish them for the inversion (this will be demonstrated in Section 6). Therefore, the Jacobian matrices with mode decomposition satisfy:

$$\mathbf{J}^W_M(\mathbf{x}_l,\mathbf{r}_k) = \mathbf{T}_M(\mathbf{x}_l) : \mathbf{G}'^{\,W}(\mathbf{x}_l,\mathbf{r}_k), \qquad W \in \{P, S\}. \tag{20}$$
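The wavenumber-domain projection of eq. (14) can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it is restricted to 2-D $(x, z)$ fields on a periodic grid, uses the identity $-\mathbf{k}\times(\mathbf{k}\times\tilde{\mathbf{U}}) = \tilde{\mathbf{U}} - \mathbf{k}(\mathbf{k}\cdot\tilde{\mathbf{U}})$ for a unit vector $\mathbf{k}$, and the function name `decompose_ps_2d` is hypothetical.

```python
import numpy as np

def decompose_ps_2d(ux, uz, dx=1.0, dz=1.0):
    """Split a 2-D displacement field into P and S parts following eq. (14).

    ux, uz: real arrays of shape (nx, nz). Returns ((uPx, uPz), (uSx, uSz)).
    Assumes an isotropic medium and a periodic grid (plain FFT, no padding).
    """
    nx, nz = ux.shape
    kx = np.fft.fftfreq(nx, d=dx)[:, None]
    kz = np.fft.fftfreq(nz, d=dz)[None, :]
    kmag = np.sqrt(kx**2 + kz**2)
    kmag[0, 0] = 1.0                      # avoid 0/0; the mean goes to the S part
    nkx, nkz = kx / kmag, kz / kmag       # normalized wave vector

    Ux, Uz = np.fft.fft2(ux), np.fft.fft2(uz)
    proj = nkx * Ux + nkz * Uz            # k . U
    UPx, UPz = nkx * proj, nkz * proj     # P part: k (k . U)
    USx, USz = Ux - UPx, Uz - UPz         # S part: U - k (k . U)

    def to_space(field):
        return np.real(np.fft.ifft2(field))

    return (to_space(UPx), to_space(UPz)), (to_space(USx), to_space(USz))

# Plane wave polarized along its wave vector (3, 4)/5, i.e. a pure P-wave.
n = 32
x, z = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
phase = 2.0 * np.pi * (3 * x + 4 * z) / n
ux, uz = 0.6 * np.cos(phase), 0.8 * np.cos(phase)
(upx, upz), (usx, usz) = decompose_ps_2d(ux, uz)
print(np.abs(usx).max(), np.abs(usz).max())  # both close to zero
```

In an inversion the same operator would be applied to wavefield snapshots, which corresponds to the forward and inverse FFT pair per decomposed snapshot mentioned in Section 3.2.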

Eq. (20) implies that the mode decomposition operation only acts on the spatial derivatives of the scattering Green's function $\mathbf{G}'$. Although there have been some attempts to implement ‘decoupled’ propagation of P- and S-waves, for example, Ma & Zhu (2003); Cheng et al. (2016), they can obtain satisfactory mode decomposition only when the medium is sufficiently smooth (Brytik et al. 2011; Wang et al. 2015b). Therefore, rather than cumbersomely propagating the decoupled wave modes, we extract the single-mode vector fields using eq. (16) from the propagated elastic wavefields.

3 THE INVERSE PROBLEM
The inverse problem corresponding to eq. (13) is to find an optimal model that can interpret the recorded seismic data. It can be solved by minimizing the misfit function:

$$E = \frac{1}{2}\,\delta\mathbf{d}^{\dagger}\delta\mathbf{d}, \tag{21}$$

where $\delta\mathbf{d}$ denotes the misfit vector between observed data and calculated data with $\delta\mathbf{d} = F(\mathbf{u}_{obs} - \mathbf{u}_{cal})$; here $F$ is the sampling function at the receiver location and the superscript $\dagger$ denotes the conjugate transpose operator. The standard approach is to minimize the misfit function using a gradient- or Hessian-based algorithm. The gradient is related to the Jacobian matrix via

$$\mathbf{g} = \begin{pmatrix}\mathbf{g}_{V_p} \\ \mathbf{g}_{V_s}\end{pmatrix} = \Re\begin{pmatrix}\mathbf{J}^{\dagger}_{V_p}\delta\mathbf{d} \\ \mathbf{J}^{\dagger}_{V_s}\delta\mathbf{d}\end{pmatrix}, \tag{22}$$

where $\Re$ denotes the real part of the results. To obtain a model update, the gradient-based algorithm uses $\mathbf{g}$, whereas the Hessian-based algorithm utilizes both the gradient and the Hessian. The Hessian-based algorithms have a quadratic convergence rate but suffer from prohibitive computational costs and substantial storage requirements, even for acoustic FWI applications (Liu et al. 2015; Métivier et al. 2015). To achieve a balance between accuracy and efficiency, we solve eq. (21) iteratively with the gradient-based minimization, that is,

$$\mathbf{m}_{k+1} = \mathbf{m}_k - \alpha_k\,\mathbf{g}_k, \tag{23}$$

where $\mathbf{m}$ is the model parameter vector, $k$ is the iteration number, and $\alpha_k$ and $\mathbf{g}_k$ are the step length and gradient at the $k$th iteration, respectively. A general EFWI process requires a large number of iterations; thus, good preconditioning of the gradients accelerates convergence. The gradients in eq. (22) have no internal mechanism to determine which of the variations in the data residuals are due to $\delta V_p$ and which are due to $\delta V_s$. We split the gradients in eq. (22) into two parts in terms of P- and S-wave data residuals, that is,

$$\begin{pmatrix}\mathbf{g}^P_{V_p} \\ \mathbf{g}^P_{V_s}\end{pmatrix} = \Re\begin{pmatrix}[\mathbf{J}^P_{V_p} + \mathbf{J}^S_{V_p}]^{\dagger}\,\delta\mathbf{d}^P \\ [\mathbf{J}^P_{V_s} + \mathbf{J}^S_{V_s}]^{\dagger}\,\delta\mathbf{d}^P\end{pmatrix}, \tag{24}$$

and

$$\begin{pmatrix}\mathbf{g}^S_{V_p} \\ \mathbf{g}^S_{V_s}\end{pmatrix} = \Re\begin{pmatrix}[\mathbf{J}^P_{V_p} + \mathbf{J}^S_{V_p}]^{\dagger}\,\delta\mathbf{d}^S \\ [\mathbf{J}^P_{V_s} + \mathbf{J}^S_{V_s}]^{\dagger}\,\delta\mathbf{d}^S\end{pmatrix}, \tag{25}$$

where $\mathbf{g}^W_M$ represents the gradient of a certain physical parameter induced by the data residual of a given wave mode. Decoupling the data residuals is quite cumbersome because it is very challenging to implement P/S separation on the acquisition surface, particularly with incomplete boundary conditions and/or when the medium is heterogeneous in the near-surface. Thus, we need some strategies to avoid mode decomposition of the multicomponent seismograms. The gradient can be viewed as the zero-lag cross-correlation between the partial derivative wavefield and the residual seismogram in the time domain (Pratt et al. 1998). As shown in Fig. 2, given a point perturbation in the background media, the data residual is the misfit of the observed and the modelled seismograms. The partial derivative wavefield represents the characteristic point diffraction responses caused by the secondary source. Generally, P- and S-waves have different background velocities; thus, they have quite different kinematics (e.g. traveltimes and curvatures) in the partial-derivative wavefield and the residual seismogram. Therefore, the zero-lag cross-correlation of the same wave mode dominates the gradient calculation, whereas that of different wave modes is almost eliminated through incoherent interference. Consequently, we have the following cross-term approximations:

$$[\mathbf{J}^S_{V_p}]^{\dagger}\delta\mathbf{d}^P \approx \mathbf{0}, \quad [\mathbf{J}^S_{V_s}]^{\dagger}\delta\mathbf{d}^P \approx \mathbf{0}, \quad [\mathbf{J}^P_{V_p}]^{\dagger}\delta\mathbf{d}^S \approx \mathbf{0}, \quad [\mathbf{J}^P_{V_s}]^{\dagger}\delta\mathbf{d}^S \approx \mathbf{0}, \tag{26}$$

in which $\mathbf{0}$ is a null matrix. In addition, the radiation patterns (of a single-scattering wavefield) show that the perturbation of P-wave velocity does not generate S-wave scatterings; therefore, we also have

$$\mathbf{J}^S_{V_p} = \mathbf{0}. \tag{27}$$

This yields

$$[\mathbf{J}^S_{V_p}]^{\dagger}\delta\mathbf{d}^P = \mathbf{0}, \qquad [\mathbf{J}^S_{V_p}]^{\dagger}\delta\mathbf{d}^S = \mathbf{0}. \tag{28}$$

Figure 2. Schematic illustration of gradient calculation through zero-lag correlation between the partial derivative wavefields and the residual seismogram. Only a point perturbation is given in the background media for illustration.
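The kinematic argument behind the cross-term approximations of eq. (26) can be illustrated with two synthetic traces. The sketch below is a toy example under stated assumptions: the Ricker wavelet, the 20 Hz peak frequency and the arrival times of 0.30 s and 0.60 s are arbitrary choices made here, and `ricker` and `zero_lag_correlation` are hypothetical helper names.

```python
import numpy as np

def ricker(t, f0, t0):
    """Ricker wavelet with peak frequency f0 (Hz) centred at time t0 (s)."""
    a = (np.pi * f0 * (t - t0)) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def zero_lag_correlation(a, b, dt):
    """Zero-lag cross-correlation of two traces (simple Riemann sum)."""
    return float(np.sum(a * b) * dt)

dt = 0.001
t = np.arange(0.0, 1.0, dt)
p_event = ricker(t, 20.0, 0.30)   # stand-in for a P-mode diffraction arrival
s_event = ricker(t, 20.0, 0.60)   # same scatterer seen later in the S mode

same_mode = zero_lag_correlation(p_event, p_event, dt)
cross_mode = zero_lag_correlation(p_event, s_event, dt)
print(same_mode, cross_mode)
```

When the two arrivals are separated by much more than the wavelet duration, the cross-mode value is negligible compared with the same-mode value. In real data, events of different modes can still cross at isolated offsets, so this sketch supports the approximation only in the averaged sense described above.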

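3.1 Preconditioning the gradients based on mode decomposition
Using the relations given in eqs (26) and (28), we obtain the following approximate formulations:

$$\mathbf{J}^{\dagger}_{V_p}\delta\mathbf{d}^P \approx [\mathbf{J}^P_{V_p}]^{\dagger}\delta\mathbf{d}, \qquad \mathbf{J}^{\dagger}_{V_s}\delta\mathbf{d}^S \approx [\mathbf{J}^S_{V_s}]^{\dagger}\delta\mathbf{d}. \tag{29}$$

Accordingly, we obtain the gradients based on mode decomposition:

$$\begin{pmatrix}\mathbf{g}^P_{V_p} \\ \mathbf{g}^P_{V_s}\end{pmatrix} \approx \Re\begin{pmatrix}[\mathbf{J}^P_{V_p}]^{\dagger}\delta\mathbf{d} \\ [\mathbf{J}^P_{V_s}]^{\dagger}\delta\mathbf{d}\end{pmatrix}, \tag{30}$$

and

$$\begin{pmatrix}\mathbf{g}^S_{V_p} \\ \mathbf{g}^S_{V_s}\end{pmatrix} \approx \Re\begin{pmatrix}\mathbf{0} \\ [\mathbf{J}^S_{V_s}]^{\dagger}\delta\mathbf{d}\end{pmatrix}. \tag{31}$$

Eqs (30) and (31) imply that we can avoid decomposing the data residuals in gradient calculations through the decomposed Fréchet derivatives. We further suggest discarding the term $\mathbf{g}^P_{V_s}$ to mitigate the trade-off effects because the coupling effects only occur for the P-wave mode. Thus, we select the decomposed P-wave and S-wave Fréchet derivatives to construct the preconditioned gradients of $V_p$ and $V_s$, respectively, that is,

$$\begin{pmatrix}\hat{\mathbf{g}}_{V_p} \\ \hat{\mathbf{g}}_{V_s}\end{pmatrix} = \begin{pmatrix}\mathbf{g}^P_{V_p} \\ \mathbf{g}^S_{V_s}\end{pmatrix} \approx \Re\begin{pmatrix}[\mathbf{J}^P_{V_p}]^{\dagger}\delta\mathbf{d} \\ [\mathbf{J}^S_{V_s}]^{\dagger}\delta\mathbf{d}\end{pmatrix}. \tag{32}$$

Here, we use the hat to denote the preconditioned gradients based on mode decomposition. Eventually, the EFWI problem is iteratively solved using the MD-based approach:

$$\mathbf{m}_{k+1} = \mathbf{m}_k - \alpha_k\begin{pmatrix}\mathbf{Q}_1\hat{\mathbf{g}}_{V_p} \\ \mathbf{Q}_2\hat{\mathbf{g}}_{V_s}\end{pmatrix}_k, \tag{33}$$

in which $\mathbf{Q}_1$ and $\mathbf{Q}_2$ denote the preconditioners. The step length $\alpha_k$ is found using the parabolic fitting technique (Vigh & Starr 2008).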
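The text does not spell out the parabolic fitting used for $\alpha_k$ in eq. (33). The sketch below shows one common generic form, offered as an assumption rather than the procedure of Vigh & Starr (2008): the misfit is evaluated at three trial step lengths, a parabola is fitted through the samples, and its vertex is taken as the step. The function name `parabolic_step` and the fallback rule are hypothetical.

```python
import numpy as np

def parabolic_step(alphas, misfits):
    """Step length at the vertex of the parabola through three (alpha, E) samples.

    Falls back to the best sampled step when the fitted parabola is not convex.
    """
    alphas = np.asarray(alphas, dtype=float)
    misfits = np.asarray(misfits, dtype=float)
    a, b, _ = np.polyfit(alphas, misfits, 2)   # E(alpha) ~ a*alpha^2 + b*alpha + c
    if a <= 0.0:
        return float(alphas[int(np.argmin(misfits))])
    return float(-b / (2.0 * a))

# Toy misfit E(alpha) = (alpha - 2)^2 + 1 sampled at three trial steps.
trial_alphas = [0.0, 1.0, 3.0]
trial_misfits = [(a - 2.0) ** 2 + 1.0 for a in trial_alphas]
alpha_k = parabolic_step(trial_alphas, trial_misfits)
print(alpha_k)
```

In an EFWI loop each trial misfit costs one extra forward modelling, so the number of trial steps is usually kept small.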

3.2 Gradient calculation using the adjoint-state method
Explicitly forming the Fréchet derivatives requires performing as many forward modellings as the number of model parameters, which is impractical for real data (Virieux & Operto 2009). To compute the gradients without explicitly constructing the Jacobian matrices, we refer to the adjoint-state approach (Tromp et al. 2005; Plessix 2006). Exploiting the spatial reciprocity of Green's function, the original gradients in eq. (22) can be computed through cross-correlation of the forward-propagated wavefields and the back-propagated data residuals in the time domain:

$$g_{V_p} = -2\rho V_p \int_0^T \frac{\partial u_i}{\partial x_j}\frac{\partial \psi_k}{\partial x_l}\,\delta_{ij}\delta_{kl}\,\mathrm{d}t,$$
$$g_{V_s} = -2\rho V_s \int_0^T \frac{\partial u_i}{\partial x_j}\frac{\partial \psi_k}{\partial x_l}\,(-2\delta_{ij}\delta_{kl} + \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk})\,\mathrm{d}t, \tag{34}$$

where $u_i$ is the forward wavefield from a source and $\psi_k$ is the adjoint wavefield reconstructed from the data residuals at the receivers. Note that the first line of eq. (34) automatically applies the divergence operations to the forward and adjoint wavefields. As shown in eq. (20), mode decomposition of the Fréchet derivative is equivalently applied to the scattering Green's function. This implies that it is only necessary to decompose the adjoint wavefields to obtain the preconditioned gradients using the adjoint-state method, that is,

$$\hat{g}_{V_p} = -2\rho V_p \int_0^T \frac{\partial u_i}{\partial x_j}\frac{\partial \psi^P_k}{\partial x_l}\,\delta_{ij}\delta_{kl}\,\mathrm{d}t = -2\rho V_p \int_0^T \frac{\partial u_i}{\partial x_j}\frac{\partial \psi_k}{\partial x_l}\,\delta_{ij}\delta_{kl}\,\mathrm{d}t,$$
$$\hat{g}_{V_s} = -2\rho V_s \int_0^T \frac{\partial u_i}{\partial x_j}\frac{\partial \psi^S_k}{\partial x_l}\,(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk})\,\mathrm{d}t, \tag{35}$$

in which $\psi^P$ and $\psi^S$ are the P- and S-wave displacements in the adjoint wavefields, respectively. Due to the divergence operations implied in the calculation of $g_{V_p}$, we always have $\hat{g}_{V_p} = g_{V_p}$. Compared with eq. (34), we have dropped the term involving $-2\delta_{ij}\delta_{kl}$ in the calculation of the preconditioned gradient of S-wave velocity because the divergence of the S-wave field will be zero. Therefore, mode decomposition is automatically included in the gradient calculation for P-wave velocity, but it is explicitly required to precondition the gradient of S-wave velocity. Decomposing the adjoint wavefields introduces extra computations, which mainly include two fast Fourier transforms (FFTs) in every time step. To reduce the memory requirements and computational cost, we propose resampling the forward and adjoint wavefields along the time axis and only decomposing the resampled adjoint wavefields before the cross-correlation operations. It is possible to decompose the forward wavefields when another hierarchical strategy is considered, such as individually using S-to-S and P-to-S modes in different stages of the inversion. We will explain why it is not recommended in practice in Section 6.
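The zero-lag cross-correlations of eq. (35) can be sketched for stored wavefield snapshots. The following is a minimal 2-D illustration, not the authors' code: spatial derivatives use `np.gradient`, the time integral is a Riemann sum, the array layout `(nt, 2, nx, nz)` and the function names are assumptions, and `psi_s` is taken to be an adjoint wavefield that has already been reduced to its S-wave part.

```python
import numpy as np

def displacement_gradients(w, dx, dz):
    """Return d[i][j] = d w_i / d x_j for a field w of shape (nt, 2, nx, nz)."""
    return [[np.gradient(w[:, i], dx, axis=1), np.gradient(w[:, i], dz, axis=2)]
            for i in range(2)]

def md_gradients(u, psi, psi_s, rho, vp, vs, dx, dz, dt):
    """Preconditioned gradients of eq. (35) in 2-D.

    u: forward wavefield, psi: adjoint wavefield, psi_s: S-wave part of psi.
    rho, vp, vs: scalars or (nx, nz) arrays.
    """
    du = displacement_gradients(u, dx, dz)
    dpsi = displacement_gradients(psi, dx, dz)
    dpsi_s = displacement_gradients(psi_s, dx, dz)

    # Vp: divergence of u times divergence of psi (delta_ij delta_kl term).
    div_u = du[0][0] + du[1][1]
    div_psi = dpsi[0][0] + dpsi[1][1]
    g_vp = -2.0 * rho * vp * np.sum(div_u * div_psi, axis=0) * dt

    # Vs: (delta_ik delta_jl + delta_il delta_jk) term with the S-wave adjoint.
    kernel = np.zeros_like(div_u)
    for i in range(2):
        for j in range(2):
            kernel += du[i][j] * (dpsi_s[i][j] + dpsi_s[j][i])
    g_vs = -2.0 * rho * vs * np.sum(kernel, axis=0) * dt
    return g_vp, g_vs

# Toy check with a time-independent field u_x = x, u_z = 0.
nt, nx, nz, dx, dz, dt = 5, 6, 4, 0.5, 0.5, 0.1
u = np.zeros((nt, 2, nx, nz))
u[:, 0] = (np.arange(nx) * dx)[None, :, None]
g_vp, g_vs = md_gradients(u, u, u, rho=2.0, vp=3.0, vs=1.5, dx=dx, dz=dz, dt=dt)
print(g_vp[0, 0], g_vs[0, 0])
```

The toy field is used only to verify the index contraction; it is not a physical S-wave field. In practice the snapshots would be resampled in time before decomposition, as proposed above.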

4 THE MECHANISM TO MITIGATE THE TRADE-OFFS
We have presented the approach for preconditioning the gradients based on elastic wave mode decomposition. Nevertheless, the most direct (but expensive) approach to mitigate the parameter trade-offs is to resort to the Hessian-based minimization. To reveal the mechanism of the MD-based EFWI approach, we will investigate the Hessian and resolution matrices using the decomposed Fréchet derivatives and compare the preconditioned gradients between the GN and MD-based approaches.

4.1 The Hessian matrix and its mode-dependent components
The multiparameter Hessian is a square and symmetric matrix with a block structure. The off-diagonal blocks measure the cross-correlation of Fréchet derivative wavefields with respect to different physical parameters and act to mitigate the trade-off effects. When the problem is approximately linear, or when the data residuals are small, the full Hessian reduces to the approximate Hessian $\mathbf{H}_a$ (Pratt et al. 1998):

$$\mathbf{H}_a = \Re[\mathbf{J}^{\dagger}\mathbf{J}]. \tag{36}$$

Considering the mode decomposition of the Fréchet derivatives (see eq. 19), we can decompose $\mathbf{H}_a$ into the following four components:

$$\mathbf{H}^{PP}_a = \Re[\mathbf{J}^P]^{\dagger}[\mathbf{J}^P], \quad \mathbf{H}^{PS}_a = \Re[\mathbf{J}^P]^{\dagger}[\mathbf{J}^S], \quad \mathbf{H}^{SP}_a = \Re[\mathbf{J}^S]^{\dagger}[\mathbf{J}^P], \quad \mathbf{H}^{SS}_a = \Re[\mathbf{J}^S]^{\dagger}[\mathbf{J}^S]. \tag{37}$$

Computationally, the Hessian matrix is out of reach for large-scale applications. To evaluate the contribution of each component, we numerically illustrate them using a very small model discretized by 30 × 30 grids with an interval of 5 m in the x and z directions. A pure P-wave source is triggered centrally on the top, and the receivers are located on the four boundaries. The approximate Hessian (Fig. 3) and its components (Fig. 4) are computed explicitly using the time-domain Fréchet derivatives. The appreciable non-zero elements in the off-diagonal blocks represent the strong trade-offs between parameters. We have the following useful observations: First, the cross-term components, $\mathbf{H}^{PS}_a$ and $\mathbf{H}^{SP}_a$, have negligible contributions to $\mathbf{H}_a$. This is because the corresponding Fréchet derivatives have little coherence due to the differences in the background P- and S-wave velocities. Note that there are a very small number of non-zero elements in these cross-term components. These elements correspond to the cross-correlation of the decomposed Fréchet derivatives with very large magnitudes in the near-source zone. Thus, we can assume that $\mathbf{H}^{PS}_a \approx \mathbf{0}$ and $\mathbf{H}^{SP}_a \approx \mathbf{0}$, and we have:

$$\mathbf{H}_a \approx \mathbf{H}^{PP}_a + \mathbf{H}^{SS}_a. \tag{38}$$

Second, the off-diagonal blocks of $\mathbf{H}_a$ are almost the same as those of $\mathbf{H}^{PP}_a$, and there is only a non-zero block on the right bottom of $\mathbf{H}^{SS}_a$ because $\mathbf{J}^S = (\mathbf{0}\;\mathbf{J}^S_{V_s})$. These observations emphasize that the parameter trade-offs result from the P-wave fields rather than the S-wave fields.

Figure 3. The approximate Hessian Ha.

Figure 4. The four components of the approximate Hessian: (a) HaPP, (b) HaPS, (c) HaSP and (d) HaSS.

4.2 The model resolution matrix and its components
The qualitative knowledge of the Hessian based on the decomposed Fréchet derivatives is not sufficient for understanding the mechanism of the MD-based approach. To evaluate the contributions of P- and S-wave data to the inversion, we further investigate how mode decomposition affects the resolution matrix in the model space. The model resolution matrix is typically calculated through the Hessian matrix and its inverse (Menke 1989; Snieder & Trampert 1999). For the inverse problem related to eq. (13), we update the model using:

$$\delta\tilde{\mathbf{m}} = -\mathbf{H}_a^{-1}\mathbf{J}^{\dagger}\delta\mathbf{d}, \tag{39}$$

where $\delta\tilde{\mathbf{m}}$ is the estimated model perturbation with the total data residual. According to the Born approximation $\delta\mathbf{d} = F\delta\mathbf{u}$, then substituting eq. (13) into eq. (39) and omitting the sampling function $F$, we obtain:

$$\delta\tilde{\mathbf{m}} = \mathbf{R}\,\delta\mathbf{m}, \tag{40}$$

in which $\delta\mathbf{m}$ denotes the true model and the model resolution matrix $\mathbf{R}$ satisfies:

$$\mathbf{R} = -\mathbf{H}_a^{-g}\mathbf{H}_a. \tag{41}$$

Note that we use the pseudo-inverse (or generalized inverse) of the Hessian, $\mathbf{H}_a^{-g}$, rather than $\mathbf{H}_a^{-1}$ because the approximate Hessian is always ill-conditioned due to the limited measurements. As shown in Fig. 5, the resolution matrix acts as a filter on the true model. If the measurement is ‘perfect’, the resolution matrix will exactly be an identity matrix, that is, $\mathbf{R} = \mathbf{I}$, and thus, the model parameters are uniquely determined. In general, $\mathbf{R} \neq \mathbf{I}$; therefore, the model estimates are weighted averages of the true model parameters. The diagonal blocks of $\mathbf{R}$ indicate the intra-parameter effect and imply how the corresponding parameters are resolved, whereas the off-diagonal blocks indicate the cross-parameter effect. If the off-diagonal blocks have appreciable non-zero elements, the parameter trade-off effects cannot be neglected. If the decomposed P-wave data are used, eq. (39) becomes:

$$\delta\tilde{\mathbf{m}}^P = -\mathbf{H}_a^{-1}\mathbf{J}^{\dagger}\delta\mathbf{d}^P, \tag{42}$$

where $\delta\tilde{\mathbf{m}}^P$ represents the estimated model with P-wave data. Following the derivation from eqs (39) to (41), we obtain the resolution matrix of the P-wave data:

$$\mathbf{R}^P = -\mathbf{H}_a^{-g}\mathbf{H}^P_a, \tag{43}$$

with $\mathbf{H}^P_a = \mathbf{J}^{\dagger}\mathbf{J}^P$. Similarly, the resolution matrix of the S-wave data satisfies:

$$\mathbf{R}^S = -\mathbf{H}_a^{-g}\mathbf{H}^S_a, \tag{44}$$

with $\mathbf{H}^S_a = \mathbf{J}^{\dagger}\mathbf{J}^S$. We can prove that $\mathbf{R} = \mathbf{R}^P + \mathbf{R}^S$.

Continuing the previous small-model experiments, we can numerically illustrate the model resolution matrices $\mathbf{R}$, $\mathbf{R}^P$ and $\mathbf{R}^S$. The original resolution matrix (Fig. 6a) is band-diagonal, and the diagonal elements are positive but less than unity. This implies that the inverse of the approximate Hessian provides a very good preconditioner for resolving the perturbations of $V_p$ and $V_s$. The resolution matrices corresponding to the decomposed P- and S-wave data (Figs 6b and c) exhibit some interesting features. For example, the non-zero diagonal blocks of $\mathbf{R}^P$ and $\mathbf{R}^S$ simply constitute those of $\mathbf{R}$. Both blocks at the bottom of $\mathbf{R}^P$ are null, except for some artefacts (denoted by arrows) caused by the numerical errors in the near-source zone. The top right blocks of $\mathbf{R}^P$ and $\mathbf{R}^S$ have the same magnitudes but opposite signs; thus, they cancel out to give the corresponding elements in $\mathbf{R}$. These features show that, for this linearized inverse problem (without cycle skipping), both wave modes have contributions to resolve $V_p$, but the P-wave mode has few contributions to resolve $V_s$. The conventional gradient-based method suffers from the trade-off effects because only the P-wave mode is involved in the calculation of the gradient of P-wave velocity (see eq. 34) and the scattering P-wave data may be induced by the perturbation of S-wave velocity. Moreover, $V_s$ can be well resolved when only the S-wave data are used. The preconditioning based on mode decomposition exploits these features, and thus, it has the potential to mitigate the parameter trade-offs.

Figure 5. Schematic illustration of the resolution matrix. The resolution matrix can be divided into four blocks for the inverse problem of two physical parameters.

Figure 6. Resolution matrix and its components: (a) R, (b) RP and (c) RS.

Figure 7. Comparison among the original, GN and MD-based gradients of the first iteration: (a) $\delta V_p \neq 0$, $\delta V_s = 0$, (b) $\delta V_p = 0$, $\delta V_s \neq 0$, and (c) $\delta V_p = 10\,\delta V_s$. Note that there are three panels with two rows and seven columns. The first row corresponds to $V_p$, and the second corresponds to $V_s$. The columns from left to right are the true model $\mathbf{m} = (V_p, V_s)$; the gradients $\mathbf{g} = (g_{V_p}, g_{V_s})$, $\mathbf{g}^P = (g^P_{V_p}, g^P_{V_s})$, $\mathbf{g}^S = (g^S_{V_p}, g^S_{V_s})$; the GN gradients $\delta\tilde{\mathbf{m}} = (\delta\tilde{V}_p, \delta\tilde{V}_s)$; and the preconditioned gradients based on mode decomposition $\delta\tilde{\mathbf{m}}^P = (\delta\tilde{V}^P_p, \delta\tilde{V}^P_s)$ and $\delta\tilde{\mathbf{m}}^S = (\delta\tilde{V}^S_p, \delta\tilde{V}^S_s)$, respectively.
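The algebraic relations of Sections 4.1 and 4.2 can be checked on a toy problem. The sketch below uses small random real matrices as stand-ins for the decomposed Jacobians, so it verifies only the identities implied by eqs (19), (37), (41), (43) and (44), not the physical incoherence behind eq. (38). The sizes and variable names are assumptions, and the leading minus signs follow the equations as printed.

```python
import numpy as np

rng = np.random.default_rng(0)
K, L = 12, 3                          # toy numbers of receivers and grid points
JP = rng.standard_normal((K, 2 * L))  # stand-in for J^P = (J^P_Vp  J^P_Vs)
JS = rng.standard_normal((K, 2 * L))  # stand-in for J^S
JS[:, :L] = 0.0                       # J^S = (0  J^S_Vs), following eq. (27)
J = JP + JS                           # eq. (19)

Ha = J.T @ J                          # eq. (36) for real-valued J
H_PP, H_PS = JP.T @ JP, JP.T @ JS     # eq. (37)
H_SP, H_SS = JS.T @ JP, JS.T @ JS

Ha_pinv = np.linalg.pinv(Ha)          # generalized inverse H_a^{-g}
R = -Ha_pinv @ Ha                     # eq. (41)
R_P = -Ha_pinv @ (J.T @ JP)           # eq. (43)
R_S = -Ha_pinv @ (J.T @ JS)           # eq. (44)

print(np.allclose(Ha, H_PP + H_PS + H_SP + H_SS))  # exact four-term split
print(np.allclose(R, R_P + R_S))                   # R = R^P + R^S
print(np.allclose(H_SS[:L, :], 0.0))               # only the lower-right block of H_SS is non-zero
```

With random entries the cross terms `H_PS` and `H_SP` are not small; their near-vanishing in the text comes from the different P- and S-wave kinematics of the true Fréchet derivatives.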

4.3 Comparison with the GN gradients The GN approach addresses the trade-off problem using the inverse of the approximate Hessian to precondition the original gradients, that is, ˜ = −Ha−1 g. δm

(45)

Given the pseudo-inverse of Ha as:   D E , Ha−g = F G the GN approach actually updates the model using:  P  DgV p + EgVPs + EgVS s mk+1 = mk − αk , GgVS s k

(46)

(47)

because δVsP ≈ 0,

δVs ≈ δVsS = −GgVS s ,

(48)

(see Appendix A). Using the same model and geometry as in the previous experiments, we compare the original, GN and MD-based gradients in Fig. 7. We set three different combinations of parameter perturbations with a size of 10 × 10 m in the centre of the model. The three setups are the following: (a) δVp ≠ 0, δVs = 0; (b) δVp = 0, δVs ≠ 0; and (c) δVp = 10δVs. The background velocities are used as the initial models. Note that only scattered P-waves exist in the first setup. We observe remarkable trade-offs between the original gradients, $g_{V_p}$ and $g_{V_s}$. For a perturbation of a single physical parameter, the gradient of the other parameter provides an opposite descent direction (see Figs 7a and b). Although the P-wave data residual may carry information about δVs (see $g^P_{V_s}$ in Figs 7b and c), the inversion of Vs with P-wave data using the conventional gradient-based method suffers from the trade-offs. In the third setup, $g^P_{V_s}$ even provides an incorrect updating direction. As expected, $g^S_{V_s}$ always provides a correct direction because the S-wave data residuals are only related to δVs. Therefore, $g^P_{V_s}$ often dominates the trade-off gradients. Generally, the trade-off gradient plays a negative role for seismic data with strong P-waves, unless an appropriate preconditioning is applied to the original gradients (e.g. the GN gradients in Fig. 7).

More importantly, the last three columns in Fig. 7 numerically verify eq. (48), and illustrate the differences and similarities between the GN and MD-based gradients. Unlike conventional gradient-based approaches, the GN approach implicitly superposes three decomposed gradients, namely, $g^P_{V_p}$, $g^P_{V_s}$ and $g^S_{V_s}$, using the sub-blocks of the inverse Hessian as weights to obtain a well-preconditioned gradient for the P-wave velocity. For the S-wave velocity, the GN approach actually preconditions the gradient associated with the S-wave data using a preconditioner $\mathbf{G}$, which is almost the pseudo-inverse of $[\mathbf{J}^S_{V_s}]^{\dagger}\mathbf{J}^S_{V_s}$ (see eq. A6).

As shown in eq. (33), the MD-based gradients also require further preconditioning to accelerate the convergence. For instance, the preconditioner $\mathbf{Q}_2$ is only required to approximate $\mathbf{G}$ to address the illumination and band-limited effects of S-waves. Therefore, for Vs inversion, the MD-based approach is almost a single-parameter problem, but one that uses the decomposed S-wave fields. Thus, the preconditioning with $\mathbf{Q}_2$ will be less expensive, for example, using the single-parameter pseudo-inverse Hessian or the l-BFGS method. This type of preconditioning nearly provides a GN gradient for Vs and helps to alleviate the trade-off effects in the iterations. It provides a good way to accelerate the convergence without Hessian calculation.
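The block structure of eqs (45)–(47) can be illustrated with a small dense toy problem. The following is a minimal sketch rather than the implementation used in this paper: random matrices stand in for the Fréchet derivatives, the P- and S-wave data are assumed to occupy separate rows of the data vector, and all names (`J_P`, `J_S`, `r_P`, `r_S`, `n`, `nd`) are hypothetical. It only checks the linear-algebra identity that the GN update is the sum of a P-data part and an S-data part; the approximation $\delta V_s^P \approx 0$ in eq. (48) depends on the wave physics and is not reproduced by random matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4    # hypothetical number of model cells per parameter class (Vp, Vs)
nd = 30  # hypothetical number of data samples per wave mode

# Toy Frechet derivatives. Columns are ordered as (Vp cells, Vs cells).
# Assumption: P-wave data sense both Vp and Vs, S-wave data sense only Vs.
J_P = rng.standard_normal((nd, 2 * n))
J_S = np.hstack([np.zeros((nd, n)), rng.standard_normal((nd, n))])
J = np.vstack([J_P, J_S])

# Toy data residuals for each wave mode.
r_P = rng.standard_normal(nd)
r_S = rng.standard_normal(nd)

# Decomposed gradients: g = g^P + g^S, with g^S_Vp = 0 by construction.
g_P = J_P.T @ r_P
g_S = J_S.T @ r_S
g = g_P + g_S

# Pseudo-inverse of the approximate Hessian and its sub-blocks, eq. (46).
H_pinv = np.linalg.pinv(J.T @ J)
D, E = H_pinv[:n, :n], H_pinv[:n, n:]
F, G = H_pinv[n:, :n], H_pinv[n:, n:]

# GN-preconditioned gradient, eq. (45).
dm = -H_pinv @ g

# The same update assembled from the decomposed gradients.
dm_P = -np.concatenate([D @ g_P[:n] + E @ g_P[n:],
                        F @ g_P[:n] + G @ g_P[n:]])
dm_S = -np.concatenate([E @ g_S[n:], G @ g_S[n:]])

print(np.allclose(dm, dm_P + dm_S))
```

In this toy setting the S-data contribution to the Vs update is exactly $-\mathbf{G}g^S_{V_s}$, which is the term that a cheaper single-parameter preconditioner would have to approximate.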

5 EXAMPLES

In this section, we use two synthetic examples to demonstrate our MD-based EFWI approach. Inverted models obtained with the conventional gradient-based approach are also shown for comparison. We use a pure P-wave source to synthesize the multicomponent seismograms. Simultaneous inversion of Vp and Vs is implemented with good initial models. To avoid local minima, we perform the inversion from low to high frequencies with low-pass filtering of the data in the time domain. Thus, the inversion is divided into four stages with the frequency bands 0–2, 0–4, 0–6 and 0–10 Hz. For simplicity, both the original and MD-based gradients are further preconditioned through a depth-dependent compensation of the illumination (Köhn et al. 2012). To reduce the computational cost for the MD-based gradients, we resample the forward and adjoint wavefields with a time step of 10.0 ms prior to applying the cross-correlation. Hybrid programming with message-passing-interface (MPI) plus OpenMP is employed for both approaches using preconditioned conjugate gradient (PCG) optimization.
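The four-stage frequency continuation described above can be sketched as follows. This is an illustration under stated assumptions, not the filter used in this work: the filter type is not specified in the text, so a brick-wall FFT low-pass is swapped in for simplicity (a tapered filter would normally be preferred to limit ringing), and the function name `lowpass_fft` and the two-tone test trace are hypothetical.

```python
import numpy as np

def lowpass_fft(traces, dt, f_cut):
    """Zero all Fourier components above f_cut (Hz) along the last (time) axis."""
    nt = traces.shape[-1]
    freqs = np.fft.rfftfreq(nt, d=dt)
    spec = np.fft.rfft(traces, axis=-1)
    spec[..., freqs > f_cut] = 0.0
    return np.fft.irfft(spec, n=nt, axis=-1)

# Hypothetical trace: 8 s long, sampled at 10 ms, with a 1 Hz and an 8 Hz tone.
dt = 0.01
t = np.arange(800) * dt
low = np.sin(2.0 * np.pi * 1.0 * t)
high = np.sin(2.0 * np.pi * 8.0 * t)
data = low + high

# Upper band limits of the four inversion stages (0-2, 0-4, 0-6 and 0-10 Hz).
stage_cutoffs_hz = [2.0, 4.0, 6.0, 10.0]
staged = [lowpass_fft(data, dt, f_cut) for f_cut in stage_cutoffs_hz]

for f_cut, trace in zip(stage_cutoffs_hz, staged):
    print(f"0-{f_cut:g} Hz stage, rms amplitude = {np.sqrt(np.mean(trace**2)):.3f}")
```

In an actual multistage inversion, both the observed and the modelled data would be filtered with the same band limit in each stage, and the model obtained in one stage would serve as the initial model for the next.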

5.1 A fluid-saturated sandstone model

We define a fluid-saturated reservoir embedded in a homogeneous background of sandstone with Vp = 3.14 km s−1, Vs = 1.56 km s−1 and ρ = 2000 kg m−3. The upper part of the reservoir is gas-saturated with Vp = 2.6 km s−1 and Vs = 1.66 km s−1, whereas the lower part is water-saturated with Vp = 3.0 km s−1 and Vs = 1.66 km s−1; see Fig. 8. The velocities of the reservoirs are given according to the physical model of sandstone rock (Mavko et al. 2009). On the surface, 32 shots are triggered with a horizontal interval of 100 m, and 400 receivers are deployed with an interval of 10 m. We use the background parameters as the initial model for inversion. A maximum of 10 iterations is performed in every stage. As Fig. 8 shows, the conventional EFWI provides an acceptable reconstruction of Vp but a poor Vs model due to the strong trade-off effects. Because the gradients suffer from the trade-offs in the early stage, they result in inappropriate updates of the long-wavelength velocity components. Thus, the inversion eventually converges to a local minimum within the limited number of iterations. In contrast, both Vp and Vs are recovered very well by the MD-based inversion approach, owing to the mode decomposition-based gradients.

Figure 8. EFWI of a fluid-saturated sandstone model: the true Vp (left) and Vs (right) models (top), and the inverted models using the PCG (mid) and the MD-based (bottom) methods, respectively.

5.2 Marmousi-II model

Various EFWI methods have been applied to OBC seismic data over the past ten years. It is relatively easy to retrieve the models with a low Poisson’s ratio (in hard seabed cases) aided by the low-frequency data components (Bae et al. 2012). However, the soft seabed environment is more common in reality. In this setting, the nonlinearity and parameter trade-offs become even more serious due to the limited amount of P-to-S conversion (Sears et al. 2008; Prieux et al. 2013a). Considering the more realistic cases of soft seabed environments, we use the original P- and S-wave velocities of the Marmousi-II model (Martin et al. 2006) to test the proposed approach, as shown in Figs 9(a) and (b). The uncorrelated structure between different parameters can show the trade-off effects more clearly. To more clearly demonstrate the features of the MD-based method, we have inserted a high-velocity thin layer in the S-wave velocity model at a depth of approximately 1.5 km.

During the inversion, we calculate the forward and adjoint wavefields using the staggered-grid FD algorithm with a space interval of 5.0 m. The total recording time is 8 s with an interval of 0.5 ms. To reduce the computational cost, only 40 shots are evenly triggered at a depth of 20 m in the water, and 2800 receivers are placed at the sea bottom to simulate an OBC survey. The initial models are obtained by smoothing the true models (prior to inserting the thin layer) through the ‘smooth2’ function of Seismic Unix software with an aperture of 300 m. A multistage strategy similar to that of the previous example is used, except that we extend the number of iterations to 40 for every stage.

Fig. 10 shows that the MD-based approach provides better inverted models than the conventional approach. In the shallow part (above 1.5 km), both approaches have recovered Vp very well, although the conventional approach loses some resolution for the Vs model. The high contrasts in S-wave velocities caused by the inserted thin layer represent a great challenge for the conventional gradient-based method. As shown in Figs 10(a) and (b), remarkable trade-offs are observed in the inverted models, particularly in the left parts of the high-velocity thin layer. These trade-offs lead to incorrect positioning and misfocusing of the horizons (denoted by arrows) below the thin layer. However, the MD-based inversion successfully mitigates these trade-offs and recovers more details of the velocity models, including the structures of faults and anticlines. The vertical profiles (Fig. 11) further verify that the MD-based approach outperforms the conventional approach at all depths.

Fig. 12 displays the normalized data misfits as a function of iteration in the first two stages. We see that the MD-based method converges faster than the conventional method. In the first stage, the MD-based method achieves a higher level of data fitting and obtains more accurate model updates for the low-to-intermediate wavenumbers. In the second stage, the conventional gradient-based inversion eventually stops after 25 iterations with a very large data misfit, whereas the MD-based inversion achieves a good convergence rate and data fitting within 22 iterations. These results confirm that preconditioning the gradients through mode decomposition has successfully mitigated the parameter trade-offs and accelerated the convergence.

Table 1 lists the iteration numbers in each stage and the total computing times on an 8-node workstation with 15 cores in each node. We set the maximum iteration number as 40 for each stage. The actual number of iterations before termination varies in different stages because the iteration is terminated when an appropriate step-length cannot be determined. For the conventional method, serious trade-off effects hinder reliable reconstruction of the macromodel from the low-frequency data. This may lead to quick termination of the iterations in the high-frequency stages due to the unreasonable gradients. In contrast, the MD-based method obtains reliable macromodels in the low-frequency stages and thus appropriately performs more iterations in the high-frequency stages. That is why the MD-based method provides inverted Vp and Vs models with higher accuracy. Generally, the MD-based method requires more computational cost due to the increased iterations, but the average time of each iteration increases by only about 18 per cent.

6 DISCUSSION

6.1 Necessity of further decomposed gradients

It is possible to apply mode decomposition to both forward and adjoint wavefields to get further decomposed gradients corresponding to different mode conversions, that is, PP, PS, SP and SS. However, it is not easy to design an EFWI algorithm to individually fit


Figure 9. The SEG Marmousi-II model: (a) and (b) are true Vp and Vs models, respectively, while (c) and (d) are initial Vp and Vs models, respectively. Note the P-wave velocity anomalies related to the gas sand reservoirs and the newly inserted high-velocity thin layer in the S-wave velocity field.

these decomposed seismograms, because we cannot distinguish the incident wave modes based on P/S separation on the acquisition surface. Otherwise, one has to use single-parameter inversion with the separated single-mode seismograms. For instance, in the second stage of their four-stage EFWI scheme, Ren & Liu (2016) only invert the P-wave velocity using the separated P-wave fields when strong S-wave energy is involved, or merely update the S-wave velocity by matching the separated S-wave fields for the weak S-wave case.

Note that decomposing the forward wavefields in gradient calculation is equivalent to distinguishing the wave modes of the secondary source (or the background wavefield), which is a part of the Fréchet derivatives. So we can rewrite the linearized forward problem as

$$\mathbf{J}^{XY}\delta\mathbf{m} = \delta\mathbf{u}^{XY}, \tag{49}$$

where X and Y denote the modes of the secondary source and the scattering Green’s function, respectively, and $\delta\mathbf{u}^{XY}$ denotes the perturbed wavefield of the X-to-Y mode. Following the way to derive eq. (43), we get the corresponding resolution matrix:

$$\mathbf{R}^{XY} = \mathbf{H}_a^{-g}\,\mathbf{J}^{\dagger}\mathbf{J}^{XY}. \tag{50}$$

Fig. 13 shows the resolution matrices of the PP, PS, SP and SS modes with the same model and geometry as in Fig. 6 but using a mixed source generating both P and S components. The obvious band-diagonal effects are due to more complicated wave phenomena. We can see that $\mathbf{R}^{PP}$ behaves very similarly to $\mathbf{R}^{P}$ while $\mathbf{R}^{SP}$ is almost null. These features imply that the inverted Vp from PP data also contains cross-talks from Vs. The separated SP data contribute little to the inversion of both physical parameters. Note that the examples of Ren & Liu (2016) also show noisy and weak SP gradients (see figs 19 and 20 in their paper), even though the mixed source has produced adequate S-waves. One plausible explanation of this may be that the energy scattered by the S-to-P mode is very weak, as demonstrated in Fig. 14. Moreover, it is also not easy to distinguish the contributions of the PS and SS modes, because the non-zero elements of $\mathbf{R}^{PS}$ and $\mathbf{R}^{SS}$ have similar distributions. Therefore, mode decomposition of forward wavefields may have very limited potential to further mitigate the trade-off effects, although it leads to more computational cost. This is why we only decompose the adjoint wavefields to precondition the gradients.

Figure 10. The inverted Marmousi-II model: the inverted results using the PCG (a, b) and the MD-based methods (c, d). (a) and (c) are Vp models while (b) and (d) are Vs models.

Figure 11. The velocity profiles at 3.0 km (a, b) and 9.0 km (c, d) with the true models (black), the initial models (blue), the PCG-based (yellow) and MD-based (green) inverted models.

Figure 12. Normalized L2 norms as a function of iteration for the PCG-based (dashed) and MD-based (solid) inversions in the first (red) and second (blue) stages.

6.2 Reconstruction of density

The density perturbations scatter both P- and S-waves, although they have hardly any effect on the phases or traveltimes of both wave modes. Most of this scattered energy travels in the direction opposite to the incident wave propagation (Wu & Aki 1985; Tarantola 1986). Therefore, as a second-order parameter, density is difficult to reconstruct due to its poor sensitivity and the multiparameter trade-off effect (Tarantola 1986; Forgues & Lambare 1997). This is why many studies of EFWI considered the constant density case (Shipp & Singh 2002; Sears et al. 2008; Brossier et al. 2009). Only a few efforts obtained reasonable density reconstruction through hierarchical strategies and/or parametrization choices, for example, Jeong et al. (2012). Recently, based on the subspace method (Kennett et al. 1988), Xu & McMechan (2014) developed a multistep-length gradient-type EFWI approach to suppress crosstalk among Vp, Vs and ρ. Yang et al. (2016) proposed a biparameter acoustic FWI approach using the inverse Hessian based on the scattering-integral method to improve simultaneous estimation of Vp and ρ. Although Ren & Liu (2016) proposed a four-stage hierarchical EFWI scheme based on wavefield separation, they only apply P/S separation in the second stage to suppress the crosstalk between Vp and Vs, and then improve simultaneous three-parameter inversion using the multistep-length approach. In their overthrust examples, the inverted densities still deviate considerably from the true values. Therefore, it is necessary to further investigate the potential of gradient preconditioning based on mode decomposition once the density reconstruction is taken into account.

Using the density–velocity parametrization, the gradient of density is expressed as (Mora 1987; Köhn et al. 2012):

$$\hat{g}_{\rho} = \left(V_p^2 - 2V_s^2\right) g_{\lambda} + V_s^2\, g_{\mu} + g_{\rho}, \tag{51}$$

Table 1. The total computational costs.

Method          Iteration number                                    Time spent (hour)
                Stage 1    Stage 2    Stage 3    Stage 4
                0–2 Hz     0–4 Hz     0–6 Hz     0–10 Hz            Total      Average
Conventional    40         25         5          7                  41.4       0.538
MD-based        40         22         33         13                 68.3       0.632
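The figures in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below assumes that the "Average" column is the total time divided by the total number of iterations over the four stages; that interpretation is not stated explicitly in the text, but it reproduces the tabulated values and the increase of about 18 per cent quoted in Section 5.2.

```python
# Iteration counts per stage (stages 1 to 4) and total times in hours from Table 1.
iterations = {"Conventional": [40, 25, 5, 7], "MD-based": [40, 22, 33, 13]}
total_hours = {"Conventional": 41.4, "MD-based": 68.3}

# Assumed definition: average time per iteration = total time / total iterations.
average_hours = {
    method: total_hours[method] / sum(iterations[method]) for method in iterations
}

# Relative increase of the per-iteration cost of the MD-based method.
increase = average_hours["MD-based"] / average_hours["Conventional"] - 1.0

for method, hours in average_hours.items():
    print(f"{method}: {sum(iterations[method])} iterations, {hours:.3f} h per iteration")
print(f"Per-iteration increase: {100.0 * increase:.1f} per cent")
```

The extra cost per iteration presumably comes from decomposing the adjoint wavefields, while most of the difference in total time comes from the larger number of iterations performed by the MD-based inversion.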


Figure 13. Resolution matrices associated with different mode-conversion data using a mixed source: (a) $\mathbf{R}$, (b) $\mathbf{R}^{PP}$, (c) $\mathbf{R}^{PS}$, (d) $\mathbf{R}^{SP}$ and (e) $\mathbf{R}^{SS}$.

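where $g_{\lambda}$, $g_{\mu}$ and $g_{\rho}$ are the gradients using the density–Lamé coefficient parametrization, namely:

$$
\begin{aligned}
g_{\lambda} &= -\int_0^T \frac{\partial u_i}{\partial x_j}\,\frac{\partial \psi_k}{\partial x_l}\,\delta_{ij}\delta_{kl}\,dt,\\
g_{\mu} &= -\int_0^T \frac{\partial u_i}{\partial x_j}\,\frac{\partial \psi_k}{\partial x_l}\,\left(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}\right)dt,\\
g_{\rho} &= -\int_0^T \frac{\partial^2 u_i}{\partial t^2}\,\psi_i\,dt.
\end{aligned}
\tag{52}
$$

Figure 14. Radiation pattern of SP and SS modes with the same normalization: SP mode (red) and SS mode (blue).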

The aforementioned gradient decomposition for simultaneous estimation of Vp and Vs follows from the fact that, under the Born approximation, S-wave data are in general more sensitive to Vs perturbations. However, there is no similar logic for the density because its perturbations generate both scattered P- and S-wave energy. So we combine eqs (35) and (51) to implement simultaneous three-parameter inversion. To demonstrate this scheme, we introduce the original density model (see Fig. 15) into the previous Marmousi-II example and perform the inversion with the synthetic data using the same acquisition geometry and inversion strategies.
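Eq. (51) is the chain rule applied to $\lambda = \rho(V_p^2 - 2V_s^2)$ and $\mu = \rho V_s^2$. The following sketch checks it numerically against a central finite difference. It is an illustration only: the quadratic misfit `toy_misfit`, its reference values and the normalized units (km s−1, g cm−3) are hypothetical stand-ins for the real waveform misfit, whose Lamé-parametrization gradients would come from eq. (52).

```python
def lame_from_velocity(vp, vs, rho):
    """Lame coefficients from velocities and density."""
    lam = rho * (vp**2 - 2.0 * vs**2)
    mu = rho * vs**2
    return lam, mu

def toy_misfit(lam, mu, rho, ref=(5.0, 4.0, 1.8)):
    """Hypothetical quadratic misfit in the density-Lame parametrization."""
    return 0.5 * ((lam - ref[0])**2 + (mu - ref[1])**2 + (rho - ref[2])**2)

def toy_gradients(lam, mu, rho, ref=(5.0, 4.0, 1.8)):
    """Analytic gradients (g_lambda, g_mu, g_rho) of the toy misfit."""
    return lam - ref[0], mu - ref[1], rho - ref[2]

def density_gradient(vp, vs, g_lam, g_mu, g_rho):
    """Density gradient in the density-velocity parametrization, eq. (51)."""
    return (vp**2 - 2.0 * vs**2) * g_lam + vs**2 * g_mu + g_rho

# Water-saturated sandstone velocities from Section 5.1, with the background density.
vp, vs, rho = 3.0, 1.66, 2.0

lam, mu = lame_from_velocity(vp, vs, rho)
g_hat_rho = density_gradient(vp, vs, *toy_gradients(lam, mu, rho))

# Central finite difference with respect to rho at fixed Vp and Vs.
h = 1.0e-6
e_plus = toy_misfit(*lame_from_velocity(vp, vs, rho + h), rho + h)
e_minus = toy_misfit(*lame_from_velocity(vp, vs, rho - h), rho - h)
g_fd = (e_plus - e_minus) / (2.0 * h)

print(abs(g_hat_rho - g_fd) < 1.0e-6)
```

Because the velocities are held fixed, a density perturbation changes λ and μ as well, which is why the density gradient in this parametrization mixes all three Lamé-parametrization gradients.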

As shown in Fig. 16, due to the strong trade-off effect, conventional EFWI using PCG optimization fails to achieve reasonable reconstruction of the three parameters. On the left part with thick soft seabed, the ill-posedness of the inverse problem increases when introducing density. The strong crosstalk between Vp and Vs even leads to mispositioned structures in the inverted models. By introducing gradient preconditioning based on mode decomposition, we obtain acceptable inverted Vp and Vs models, but an inaccurate density model with footprints mainly from the Vs perturbations. Nevertheless, we observe that mode decomposition helps to mitigate the ill-posedness and enhance the resolution of the reconstructed velocity models (see Figs 17 and 18). To improve the inversion of density, we need to develop more advanced hierarchical strategies and/or use the inverse Hessian to suppress the trade-off effect more thoroughly.

Figure 15. The true (a) and initial (b) density Marmousi-II model.

Figure 16. Comparison between conventional and MD-based method with density variation: (a)–(c) are the inverted Vp, Vs and ρ with conventional method, (d)–(f) are the inverted Vp, Vs and ρ with MD-based method.

Figure 17. The velocity and density profiles at 3.0 km (a, b) with the true models (black), the initial models (blue), the PCG-based (yellow) and MD-based (green) inverted models.

Figure 18. The velocity and density profiles at 9.0 km (a, b) with the true models (black), the initial models (blue), the PCG-based (yellow) and MD-based (green) inverted models.

7 CONCLUSIONS

Multiparameter trade-offs challenge the conventional gradient-based EFWI approach, even when only considering P- and S-wave velocities. Radiation patterns show that both velocity perturbations generate perturbed P-wave fields and thus lead to the coupling effects at certain ranges of scattering angles. In contrast, the perturbed S-wave fields only result from the perturbation of S-wave velocity. Through introducing elastic wave mode decomposition, we found that the cross-correlation between the decomposed Fréchet derivatives and the data residuals of different wave modes has negligible contributions to the gradients. Applying the cross-term approximations, we have derived the MD-based gradients, which can be efficiently calculated using the adjoint-state method. Based on the decomposed Fréchet derivatives, we have investigated the components of the Hessian matrix and the resolution matrices associated with the decomposed P- and S-wave data. These investigations confirm that isolating the P-wave components in gradient calculations for S-wave velocity does help to mitigate the parameter trade-offs. Accordingly, the inversion of Vs using the MD-based approach is equivalent to a single-parameter inversion with S-wave data. It can almost always achieve a GN convergence without any Hessian-related calculation if a good preconditioner is applied to address the S-wave illumination and band-width effects. For the MD-based simultaneous inversion, an optimal updating of Vs significantly improves the inversion of Vp. The numerical examples, especially the Marmousi-II model with the soft seabed structure, verify the advantages of our MD-based EFWI method for mitigating the trade-offs and accelerating convergence.

So far, density reconstruction is still a tough problem in waveform inversion. As we observed, mode decomposition does not improve the inversion of density, but helps us to obtain reasonable reconstruction of Vp and Vs, even though introducing density obviously increases the ill-posedness of the inverse problem.

ACKNOWLEDGEMENTS

This work was supported by the National Natural Science Foundation of China (# 41474099, # 41674117) and the National Science and Technology Major Project (# ZX05027001-008). We thank Tariq Alkhalifah and Wenyong Pan for useful discussions, and acknowledge the support of the open-source packages DENISE and Madagascar. We thank the two anonymous reviewers, as well as the editor Xiaofei Chen, for their very helpful comments to improve the paper.

REFERENCES

Alkhalifah, T., 2015. Scattering-angle based filtering of the waveform inversion gradients, Geophys. J. Int., 200(1), 363–373.
Bae, H.S., Pyun, S., Chung, W., Kang, S.-G. & Shin, C., 2012. Frequency-domain acoustic-elastic coupled waveform inversion using the Gauss-Newton conjugate gradient method, Geophys. Prospect., 60, 413–432.
Biondi, B. & Almomin, A., 2013. Tomographic full-waveform inversion (TFWI) by combining FWI and wave-equation migration velocity analysis, Leading Edge, 32(9), 1074–1080.
Brossier, R., Operto, S. & Virieux, J., 2009. Seismic imaging of complex onshore structures by 2D elastic frequency-domain full-waveform inversion, Geophysics, 74(6), WCC105–WCC118.
Brytik, V., de Hoop, M.V., Smith, H.F. & Uhlmann, G., 2011. Decoupling of modes for the elastic wave equation in media of limited smoothness, in Proceeding of the Project Review, Vol. 1, pp. 193–202, Geo-Mathematical Imaging Group, Purdue University, West Lafayette, IN.
Bunks, C., Saleck, F.M., Zaleski, S. & Chavent, G., 1995. Multiscale seismic waveform inversion, Geophysics, 60(5), 1457–1473.
Cheng, J.B., Alkhalifah, T., Wu, Z.D., Zou, P. & Wang, C.L., 2016. Simulating propagation of decoupled elastic waves using low-rank approximate mixed-domain integral operators for anisotropic media, Geophysics, 81(2), T63–T77.
Choi, Y. & Shin, C., 2008. Frequency-domain elastic full waveform inversion using the new pseudo Hessian matrix: experience of elastic Marmousi 2 synthetic data, Bull. seism. Soc. Am., 98(5), 2402–2415.
Clement, F., Chavent, G. & Gomez, S., 2001. Migration-based traveltime waveform inversion of 2-D simple structures: a synthetic example, Geophysics, 66, 845–860.
Crase, E., Wideman, C., Noble, M. & Tarantola, A., 1992. Nonlinear elastic waveform inversion of land seismic reflection data, J. geophys. Res., 97(B4), 4685–4703.
de Hoop, M.V., Qiu, L. & Scherzer, O., 2012. Local analysis of inverse problems: Hölder stability and iterative reconstruction, Inverse Probl., 28(4), 045001, doi:10.1088/0266-5611/28/4/045001.
Djikpesse, H. & Tarantola, A., 1999. Multiparameter L2 norm waveform fitting: interpretation of Gulf of Mexico reflection seismograms, Geophysics, 64, 1023–1035.
Fichtner, A. & Trampert, J., 2011. Hessian kernels of seismic data functionals based upon adjoint techniques, Geophys. J. Int., 185(2), 775–798.
Forgues, E. & Lambare, G., 1997. Parameterization study for acoustic and elastic ray + Born inversion, J. Seism. Explor., 6, 253–278.
Gholami, Y., Brossier, R., Operto, S., Ribodetti, A. & Virieux, J., 2013. Which parameterization is suitable for acoustic vertical transverse isotropic full waveform inversion? Part 1: sensitivity and trade-off analysis, Geophysics, 78(2), R81–R105.
Innanen, K.A., 2014. Seismic AVO and the inverse Hessian in precritical reflection full waveform inversion, Geophys. J. Int., 199(2), 717–734.
Jeong, W., Lee, H.-Y. & Min, D.-J., 2012. Full waveform inversion strategy for density in the frequency domain, Geophys. J. Int., 188(3), 1221–1242.
Kamath, N. & Tsvankin, I., 2016. Elastic full-waveform inversion for VTI media: methodology and sensitivity analysis, Geophysics, 81(2), C53–C68.
Kennett, B.L., Sambridge, M.S. & Williamson, P.R., 1988. Subspace methods for large inverse problem with multiple parameter classes, Geophys. J., 94, 237–247.
Köhn, D., De Nil, D., Kurzmann, A., Przebindowska, A. & Bohlen, T., 2012. On the influence of model parametrization in elastic full waveform tomography, Geophys. J. Int., 191(1), 325–345.
Liu, Y., Yang, J., Chi, B. & Dong, L., 2015. An improved scattering-integral approach for frequency-domain full waveform inversion, Geophys. J. Int., 202(3), 1827–1842.
Ma, D.T. & Zhu, G.M., 2003. P- and S-wave separated elastic wave equation numerical modeling (in Chinese), Oil Geophys. Prospect., 38, 482–486.
Ma, Y. & Hale, D., 2013. Wave-equation reflection traveltime inversion with dynamic warping and full waveform inversion, Geophysics, 78(6), R223–R233.
Martin, G.S., Wiley, R. & Marfurt, K.J., 2006. Marmousi 2: an elastic upgrade for Marmousi, Leading Edge, 25(2), 156–166.
Mavko, G., Mukerji, T. & Dvorkin, J., 2009. The Rock Physics Handbook: Tools for Seismic Analysis of Porous Media, Cambridge Univ. Press.
Menke, W., 1989. Geophysical Data Analysis: Discrete Inverse Theory, Academic Press Inc.
Métivier, L., Brossier, R., Operto, S. & Virieux, J., 2015. Acoustic multiparameter FWI for the reconstruction of P-wave velocity, density and attenuation: preconditioned truncated Newton approach, SEG Technical Program Expanded Abstracts, pp. 1198–1203.
Mora, P., 1987. Nonlinear two-dimensional elastic inversion of multioffset seismic data, Geophysics, 52(9), 1211–1228.
Nihei, K.T. & Li, X., 2007. Frequency response modelling of seismic waves using finite difference time domain with phase sensitive detection (TDPSD), Geophys. J. Int., 169(3), 1069–1078.
Nocedal, J. & Wright, S., 2006. Numerical Optimization, Springer Science & Business Media.
Operto, S., Gholami, Y., Prieux, V., Ribodetti, A., Brossier, R., Metivier, L. & Virieux, J., 2013. A guided tour of multiparameter full-waveform inversion with multicomponent data: from theory to practice, Leading Edge, 32(9), 1040–1054.
Pan, W., Innanen, K.A., Margrave, G.F., Fehler, M.C., Fang, X. & Li, J., 2016. Estimation of elastic constants for HTI media using Gauss–Newton and full-Newton multiparameter full waveform inversion, Geophysics, 81(5), R275–R291.
Plessix, R.-E., 2006. A review of the adjoint-state method for computing the gradient of a functional with geophysical applications, Geophys. J. Int., 167(2), 495–503.
Plessix, R.E. & Cao, Q., 2011. A parameterization study for surface seismic full waveform inversion in an acoustic vertical transversely isotropic medium, Geophys. J. Int., 185, 539–556.
Pratt, R.G., Shin, C. & Hick, G., 1998. Gauss–Newton and full Newton methods in frequency–space seismic waveform inversion, Geophys. J. Int., 133(2), 341–362.
Prieux, V., Brossier, R., Operto, S. & Virieux, J., 2013a. Multiparameter full waveform inversion of multicomponent ocean-bottom-cable data from the Valhall field. Part 1: Imaging compressional wave speed, density and attenuation, Geophys. J. Int., 194, 1640–1664.
Prieux, V., Brossier, R., Operto, S. & Virieux, J., 2013b. Multiparameter full waveform inversion of multicomponent ocean-bottom-cable data from the Valhall field. Part 2: Imaging compressive-wave and shear-wave velocities, Geophys. J. Int., 194, 1665–1681.
Ren, Z. & Liu, Y., 2016. A hierarchical elastic full-waveform inversion scheme based on wavefield separation and the multistep-length approach, Geophysics, 81(3), R99–R123.
Sears, T., Singh, S. & Barton, P., 2008. Elastic full waveform inversion of multi-component OBC seismic data, Geophys. Prospect., 56, 843–862.
Sears, T.J., Barton, P.J. & Singh, S.C., 2010. Elastic full waveform inversion of multicomponent ocean-bottom cable seismic data: application to Alba Field, U.K. North Sea, Geophysics, 75(6), R109–R119.
Sheen, D.H., Tuncay, K., Baag, C.E. & Ortoleva, P.J., 2006. Time domain Gauss–Newton seismic waveform inversion in elastic media, Geophys. J. Int., 167, 1373–1384.
Shin, C. & Cha, Y.H., 2008. Waveform inversion in the Laplace domain, Geophys. J. Int., 173, 922–931.
Shin, C., Jang, S. & Min, D.-J., 2001. Improved amplitude preservation for prestack depth migration by inverse scattering theory, Geophys. Prospect., 49(5), 592–606.
Shipp, R.M. & Singh, S.C., 2002. Two-dimensional full wavefield inversion of wide-aperture marine seismic streamer data, Geophys. J. Int., 151(2), 325–344.
Sirgue, L., Etgen, J. & Albertin, U., 2008. 3D frequency-domain waveform inversion using time-domain finite-difference methods, 70th EAGE, Extended Abstracts, p. F022.
Snieder, R. & Trampert, J., 1999. Inverse Problems in Geophysics, Springer.
Symes, W. & Carazzone, J., 1991. Velocity inversion by differential semblance optimization, Geophysics, 56, 654–663.
Tarantola, A., 1986. A strategy for nonlinear elastic inversion of seismic reflection data, Geophysics, 51(10), 1893–1903.
Tromp, J., Tape, C. & Liu, Q., 2005. Seismic tomography, adjoint methods, time reversal and banana-doughnut kernels, Geophys. J. Int., 160(1), 195–216.
Vigh, D. & Starr, E.W., 2008. 3D prestack plane-wave, full-waveform inversion, Geophysics, 73(5), VE135–VE144.
Vigh, D., Jiao, K., Watts, D. & Sun, D., 2014. Elastic full-waveform inversion application using multicomponent measurements of seismic data collection, Geophysics, 79(2), R63–R77.
Virieux, J. & Operto, S., 2009. An overview of full-waveform inversion in exploration geophysics, Geophysics, 74(6), WCC1–WCC26.
Wang, C., Cheng, J. & Arntsen, B., 2016. Scalar and vector imaging based on wave mode decoupling for elastic reverse time migration in isotropic and transversely isotropic media, Geophysics, 81(5), S383–S398.
Wang, T., Cheng, J. & Wang, C., 2015a. Elastic wave mode decoupling for full waveform inversion, in 77th EAGE Conference and Exhibition 2015, Expanded Abstracts.
Wang, W., McMechan, G.A. & Zhang, Q., 2015b. Comparison of two algorithms for isotropic elastic P and S vector decomposition, Geophysics, 80(4), T147–T160.
Warner, M. & Guasch, L., 2016. Adaptive waveform inversion: theory, Geophysics, 81(6), R429–R445.
Wu, R. & Aki, K., 1985. Scattering characteristics of elastic waves by an elastic heterogeneity, Geophysics, 50(4), 582–595.
Xu, K. & McMechan, G.A., 2014. 2D frequency-domain elastic full-waveform inversion using time-domain modeling and a multistep-length gradient approach, Geophysics, 79(2), R41–R53.
Xu, S., Wang, D., Chen, F., Lambare, G. & Zhang, Y., 2012. Inversion on reflected seismic wave, 82nd Annual International Meeting, SEG, Expanded Abstracts, pp. 1–7.
Yan, J. & Sava, P., 2008. Isotropic angle-domain elastic reverse-time migration, Geophysics, 73(6), S229–S239.
Yang, J., Liu, Y. & Dong, L., 2016. Simultaneous estimation of velocity and density in acoustic multiparameter full-waveform inversion using an improved scattering-integral approach, Geophysics, 81(6), R399–R415.
Zhang, Q. & McMechan, G.A., 2010. 2D and 3D elastic wavefield vector decomposition in the wavenumber domain for VTI media, Geophysics, 75(3), D13–D26.

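APPENDIX A: INVESTIGATION OF THE GAUSS–NEWTON GRADIENTS

Using eqs (45) and (46), we have the preconditioned gradients of the GN approach:

$$\delta\tilde{\mathbf{m}} = \begin{pmatrix} \delta V_p \\ \delta V_s \end{pmatrix} = -\begin{pmatrix} \mathbf{D}g^P_{V_p} + \mathbf{E}\left(g^P_{V_s} + g^S_{V_s}\right) \\ \left(\mathbf{F}g^P_{V_p} + \mathbf{G}g^P_{V_s}\right) + \mathbf{G}g^S_{V_s} \end{pmatrix}. \tag{A1}$$

Note that we have used the relations $g_{V_p} = g^P_{V_p} + g^S_{V_p}$, $g_{V_s} = g^P_{V_s} + g^S_{V_s}$ and $g_{V_p} = g^P_{V_p}$ to obtain the above formulation. Naturally, we can split the preconditioned $\delta\tilde{\mathbf{m}}$ into two parts: $\delta\tilde{\mathbf{m}} = \delta\tilde{\mathbf{m}}^P + \delta\tilde{\mathbf{m}}^S$, with

$$\delta\tilde{\mathbf{m}}^P = \begin{pmatrix} \delta V_p^P \\ \delta V_s^P \end{pmatrix} = -\begin{pmatrix} \mathbf{D}g^P_{V_p} + \mathbf{E}g^P_{V_s} \\ \mathbf{F}g^P_{V_p} + \mathbf{G}g^P_{V_s} \end{pmatrix}, \tag{A2}$$

and

$$\delta\tilde{\mathbf{m}}^S = \begin{pmatrix} \delta V_p^S \\ \delta V_s^S \end{pmatrix} = -\begin{pmatrix} \mathbf{E}g^S_{V_s} \\ \mathbf{G}g^S_{V_s} \end{pmatrix}. \tag{A3}$$

Considering $\delta\tilde{\mathbf{m}}^P = \mathbf{R}^P\delta\mathbf{m}$ and $\delta\tilde{\mathbf{m}}^S = \mathbf{R}^S\delta\mathbf{m}$, we have

$$\delta V_s^P \approx 0, \qquad \delta V_s \approx \delta V_s^S = -\mathbf{G}g^S_{V_s}, \tag{A4}$$

because the bottom blocks of $\mathbf{R}^P$ are almost null (see Fig. 6b).

Because $\mathbf{H}_a^S = \mathbf{J}^{\dagger}\mathbf{J}^S \approx [\mathbf{J}^S]^{\dagger}\mathbf{J}^S$, eq. (44) becomes

$$\mathbf{R}^S \approx \mathbf{H}_a^{-g}\,[\mathbf{J}^S]^{\dagger}\mathbf{J}^S. \tag{A5}$$

Substituting eq. (46) into (A5), we obtain

$$\mathbf{R}^S \approx \begin{pmatrix} \mathbf{D} & \mathbf{E} \\ \mathbf{F} & \mathbf{G} \end{pmatrix} \begin{pmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & [\mathbf{J}^S_{V_s}]^{\dagger}\mathbf{J}^S_{V_s} \end{pmatrix} = \begin{pmatrix} \mathbf{0} & \mathbf{E}[\mathbf{J}^S_{V_s}]^{\dagger}\mathbf{J}^S_{V_s} \\ \mathbf{0} & \mathbf{G}[\mathbf{J}^S_{V_s}]^{\dagger}\mathbf{J}^S_{V_s} \end{pmatrix}, \tag{A6}$$

because $\mathbf{J}^S = (\mathbf{0}\ \ \mathbf{J}^S_{V_s})$. Note that $[\mathbf{J}^S_{V_s}]^{\dagger}\mathbf{J}^S_{V_s}$ represents the autocorrelation of the S-wave Fréchet derivative with respect to the perturbation of Vs. As shown in Fig. 5, the diagonal blocks of the resolution matrices are almost identity matrices if the observation is perfect. Thus, $\mathbf{G}$ approximately denotes the inverse of $[\mathbf{J}^S_{V_s}]^{\dagger}\mathbf{J}^S_{V_s}$. Therefore, the preconditioner $\mathbf{G}$ addresses the geometrical spreading and band-limited effects for the inversion of Vs with the S-wave data.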
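The block structure of eq. (A6) can be reproduced with a small dense example. This is a hedged sketch under simplifying assumptions: random matrices replace the true Fréchet derivatives, the P- and S-wave data are assumed to occupy separate rows so that $\mathbf{J}^{\dagger}\mathbf{J}^S = [\mathbf{J}^S]^{\dagger}\mathbf{J}^S$ holds exactly rather than approximately, and the names `J_P`, `J_S_vs`, `n` and `nd` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5    # hypothetical number of model cells per parameter class (Vp, Vs)
nd = 40  # hypothetical number of data samples per wave mode

# Toy Frechet derivatives; columns are ordered as (Vp cells, Vs cells).
J_P = rng.standard_normal((nd, 2 * n))
J_S_vs = rng.standard_normal((nd, n))
J_S = np.hstack([np.zeros((nd, n)), J_S_vs])  # J^S = (0  J^S_Vs)
J = np.vstack([J_P, J_S])

# Pseudo-inverse of the approximate Hessian and the sub-blocks used in eq. (A6).
H_pinv = np.linalg.pinv(J.T @ J)
E, G = H_pinv[:n, n:], H_pinv[n:, n:]

# Resolution matrix of the S-wave data, eq. (A5).
R_S = H_pinv @ (J_S.T @ J_S)

# Block form of eq. (A6): the left (Vp) block column vanishes.
auto = J_S_vs.T @ J_S_vs
zero = np.zeros((n, n))
R_S_blocks = np.block([[zero, E @ auto], [zero, G @ auto]])

# With full-rank toy data, the P and S resolution matrices sum to the identity.
R_P = H_pinv @ (J_P.T @ J_P)

print(np.allclose(R_S, R_S_blocks), np.allclose(R_P + R_S, np.eye(2 * n)))
```

The zero left block column states that, in this idealized setting, the S-wave data carry no information on Vp, which is the property the MD-based preconditioning relies on.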
