Lectures on Full Waveform Inversion - Part 2 Synthetic Data Applications

Daniel Köhn, Denise De Nil, Wolfgang Rabbel

June 22, 2017

1. Review of the FWT algorithm
2. Conjugate Gradient and Quasi-Newton l-BFGS
3. Simple example: A spherical low velocity anomaly
4. The CTS Test Problem
5. The Marmousi-2 model

Review of the FWT algorithm

Pure Gradient Method

[Figure: residual energy E as a function of P-wave velocity Vp and density ρ]

Gradient method:

$$m_{n+1} = m_n - \mu_n P_n \left(\frac{\partial E}{\partial m}\right)_n$$
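The update above can be sketched on the Rosenbrock function, which these slides use later as a toy residual energy E(Vp, ρ). This is a minimal illustration only: the constant step length `mu`, the identity preconditioner and all function names are assumptions, not the step-length estimation or preconditioning operator used in the lectures.

```python
def rosenbrock(m):
    """Toy residual energy E(Vp, rho) = (1 - Vp)^2 + 100 (rho - Vp^2)^2."""
    vp, rho = m
    return (1.0 - vp) ** 2 + 100.0 * (rho - vp ** 2) ** 2


def rosenbrock_gradient(m):
    """Analytic gradient dE/dm of the toy residual energy."""
    vp, rho = m
    dE_dvp = -2.0 * (1.0 - vp) - 400.0 * vp * (rho - vp ** 2)
    dE_drho = 200.0 * (rho - vp ** 2)
    return [dE_dvp, dE_drho]


def gradient_step(m, grad, mu, precond=None):
    """One update m_{n+1} = m_n - mu_n * P_n * (dE/dm)_n.

    `precond` holds the diagonal of P_n; None means P_n = identity
    (an assumption made for this sketch).
    """
    if precond is None:
        precond = [1.0] * len(m)
    return [mi - mu * pi * gi for mi, pi, gi in zip(m, precond, grad)]


def gradient_method(m0, mu=1e-3, n_iter=200):
    """Run a fixed number of gradient iterations with a constant step length."""
    m = list(m0)
    history = [rosenbrock(m)]
    for _ in range(n_iter):
        m = gradient_step(m, rosenbrock_gradient(m), mu)
        history.append(rosenbrock(m))
    return m, history


m_final, energy = gradient_method([-1.0, 1.0])
print("E at start:", energy[0], "E after 200 iterations:", energy[-1])
```

Printing the energy history shows how slowly a pure gradient method moves along the curved valley of this function.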

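Final gradients

The gradients for the Lamé parameters λ, µ and the density ρ can be written as

$$\frac{\partial E}{\partial \lambda(\mathbf{x})} = -\sum_{\text{sources}} \int dt \left(\frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y}\right)\left(\frac{\partial \Psi_x}{\partial x} + \frac{\partial \Psi_y}{\partial y}\right)$$

$$\frac{\partial E}{\partial \mu(\mathbf{x})} = -\sum_{\text{sources}} \int dt \left[\left(\frac{\partial u_x}{\partial y} + \frac{\partial u_y}{\partial x}\right)\left(\frac{\partial \Psi_x}{\partial y} + \frac{\partial \Psi_y}{\partial x}\right) + 2\left(\frac{\partial u_x}{\partial x}\frac{\partial \Psi_x}{\partial x} + \frac{\partial u_y}{\partial y}\frac{\partial \Psi_y}{\partial y}\right)\right]$$

$$\frac{\partial E}{\partial \rho(\mathbf{x})} = \sum_{\text{sources}} \int dt \left(\frac{\partial^2 u_x}{\partial t^2}\Psi_x + \frac{\partial^2 u_y}{\partial t^2}\Psi_y\right)$$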
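In discrete form, the λ gradient at one grid point is a zero-lag cross-correlation of the divergences of the forward and adjoint wavefields, summed over sources. The sketch below assumes that the divergence time series are already available as plain lists and that the time integral is approximated by a rectangle rule; the function names are hypothetical.

```python
def lambda_gradient_at_point(div_u, div_psi, dt):
    """Contribution of one source to dE/dlambda at a single grid point.

    div_u   : samples of du_x/dx + du_y/dy (forward wavefield)
    div_psi : samples of dPsi_x/dx + dPsi_y/dy (adjoint wavefield)
    dt      : time sampling interval (rectangle-rule integration, an assumption)
    """
    if len(div_u) != len(div_psi):
        raise ValueError("forward and adjoint series must have equal length")
    return -dt * sum(a * b for a, b in zip(div_u, div_psi))


def stack_over_sources(per_source_series, dt):
    """Sum the single-source contributions (the sum over sources)."""
    return sum(
        lambda_gradient_at_point(div_u, div_psi, dt)
        for div_u, div_psi in per_source_series
    )


# Two hypothetical sources with three time samples each
shots = [
    ([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]),
    ([0.5, 0.0, -0.5], [2.0, 2.0, 2.0]),
]
print(stack_over_sources(shots, dt=0.5))
```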

Conjugate Gradient and Quasi-Newton l-BFGS

Gradient method requires 200 iterations

[Figure: residual energy E as a function of P-wave velocity Vp and density ρ]

I'm not happy with the far too slow convergence speed ...

Gradient method gets stuck in a narrow valley

[Figure: residual energy E as a function of P-wave velocity Vp and density ρ]

... and then there could be cases like this.

Conjugate Gradient

- Minimization of the quadratic form by using conjugate search directions instead of the gradient (Hestenes and Stiefel, 1952)
- Extension to nonlinear objective functions (Fletcher and Reeves, 1964; Polak and Ribière, 1969)
- Details and mathematical proofs: [Nocedal and Wright, 1999]

Conjugate Gradient Algorithm

1. Calculate the steepest descent direction: $\Delta x_n = -\left(\frac{\partial E}{\partial m}\right)_n$

2. Compute $\beta_n$ according to

   Fletcher-Reeves: $\beta_n^{FR} = \dfrac{\Delta x_n^T \Delta x_n}{\Delta x_{n-1}^T \Delta x_{n-1}}$

   Polak-Ribière: $\beta_n^{PR} = \dfrac{\Delta x_n^T (\Delta x_n - \Delta x_{n-1})}{\Delta x_{n-1}^T \Delta x_{n-1}}$

   Hestenes-Stiefel: $\beta_n^{HS} = -\dfrac{\Delta x_n^T (\Delta x_n - \Delta x_{n-1})}{s_{n-1}^T (\Delta x_n - \Delta x_{n-1})}$

   Dai-Yuan: $\beta_n^{DY} = -\dfrac{\Delta x_n^T \Delta x_n}{s_{n-1}^T (\Delta x_n - \Delta x_{n-1})}$

   Popular choice: $\beta_n = \max\{0, \beta_n^{PR}\}$, which allows an automatic direction reset

3. Update conjugate direction: $s_n = \Delta x_n + \beta_n s_{n-1}$

4. Estimate step length $\mu_n$

5. Update material parameters: $m_{n+1} = m_n + \mu_n s_n$
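The five steps can be sketched in a few lines. This is a minimal illustration with the choice $\beta_n = \max\{0, \beta_n^{PR}\}$, not the implementation used for the results in these slides. The step length is estimated from a parabola through E(0), the slope at 0 and one trial step, which is an assumption made here; the toy quadratic energy and all function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def step_length(energy, m, s, slope, mu_test=1e-2):
    """Estimate mu_n from a parabola through E(0), E'(0) and E(mu_test)."""
    e0 = energy(m)
    e1 = energy([mi + mu_test * si for mi, si in zip(m, s)])
    curvature = (e1 - e0 - slope * mu_test) / mu_test ** 2
    if curvature <= 0.0:          # parabola has no minimum: keep the trial step
        return mu_test
    return -slope / (2.0 * curvature)


def conjugate_gradient(energy, gradient, m0, n_iter=100, tol=1e-8):
    m = list(m0)
    dx_prev, s = None, None
    for _ in range(n_iter):
        dx = [-g for g in gradient(m)]                 # step 1
        if math.sqrt(dot(dx, dx)) < tol:
            break
        if s is None:
            s = dx                                     # first iteration
        else:
            diff = [a - b for a, b in zip(dx, dx_prev)]
            beta = max(0.0, dot(dx, diff) / dot(dx_prev, dx_prev))  # step 2
            s = [a + beta * b for a, b in zip(dx, s)]  # step 3
            if dot(s, dx) <= 0.0:                      # not a descent direction
                s = dx
        mu = step_length(energy, m, s, -dot(dx, s))    # step 4
        m = [mi + mu * si for mi, si in zip(m, s)]     # step 5
        dx_prev = dx
    return m


def toy_energy(m):
    """Hypothetical quadratic test energy with minimum at (1, -2)."""
    return (m[0] - 1.0) ** 2 + 10.0 * (m[1] + 2.0) ** 2


def toy_gradient(m):
    return [2.0 * (m[0] - 1.0), 20.0 * (m[1] + 2.0)]


m_min = conjugate_gradient(toy_energy, toy_gradient, [0.0, 0.0])
print(m_min)
```

For a quadratic energy the parabolic fit is an exact line search, so the two-parameter toy problem is solved in two conjugate directions, whereas a pure gradient method would zig-zag.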

Quasi-Newton l-BFGS

Idea: Approximate the product of the inverse Hessian with the gradient by finite differences.

Quasi-Newton Limited Memory Broyden-Fletcher-Goldfarb-Shanno (l-BFGS) method.

The L-BFGS Algorithm

Quasi-Newton L-BFGS Method (loop 1)

The Limited-Memory Broyden-Fletcher-Goldfarb-Shanno method (see also Nocedal & Wright (1999), Brossier (2009))

At iteration step n:

1. Compute $g_n = \left(\frac{\partial E}{\partial m}\right)_n$

2. Compute and store $s_n = m_{n+1} - m_n$
   Compute and store $y_n = g_{n+1} - g_n$

3. $q = g_n$

4. for i = n-1 to n-m do
     $\rho_i = \dfrac{1}{y_i^T s_i}$
     $\alpha_i = \rho_i\, s_i^T q$
     $q = q - \alpha_i y_i$
   end for

Quasi-Newton L-BFGS Method (loop 2)

1. Compute $H_n^0 = \dfrac{y_{n-1}^T s_{n-1}}{y_{n-1}^T y_{n-1}}$

2. Compute $z = H_n^0 q$

3. for i = n-m to n-1 do
     $\beta_i = \rho_i\, y_i^T z$
     $z = z + s_i (\alpha_i - \beta_i)$
   end for

4. $H_n g_n = z$

5. Update model $m_{n+1} = m_n - \mu_n H_n g_n$
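Both loops together form the two-loop recursion. The sketch below is a minimal pure-Python version under stated assumptions: the stored pairs are passed as lists ordered from oldest to newest, the caller is responsible for keeping only the last m pairs, and the function name and the example pairs are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(g, s_hist, y_hist):
    """Return z = H_n g_n from stored pairs (s_i, y_i), oldest first."""
    q = list(g)
    rho = [1.0 / dot(y, s) for s, y in zip(s_hist, y_hist)]
    alpha = [0.0] * len(s_hist)

    # Loop 1: from the newest pair (n-1) back to the oldest (n-m)
    for i in reversed(range(len(s_hist))):
        alpha[i] = rho[i] * dot(s_hist[i], q)
        q = [qj - alpha[i] * yj for qj, yj in zip(q, y_hist[i])]

    # Initial inverse Hessian H_n^0 as a scalar scaling
    if s_hist:
        h0 = dot(y_hist[-1], s_hist[-1]) / dot(y_hist[-1], y_hist[-1])
    else:
        h0 = 1.0  # no history yet: plain gradient direction (assumption)
    z = [h0 * qj for qj in q]

    # Loop 2: from the oldest pair (n-m) to the newest (n-1)
    for i in range(len(s_hist)):
        beta = rho[i] * dot(y_hist[i], z)
        z = [zj + (alpha[i] - beta) * sj for zj, sj in zip(z, s_hist[i])]
    return z


# Pairs consistent with a quadratic energy whose Hessian is diag(2, 20)
s_pairs = [[1.0, 0.0], [0.0, 1.0]]
y_pairs = [[2.0, 0.0], [0.0, 20.0]]
print(lbfgs_direction([2.0, 20.0], s_pairs, y_pairs))
```

In this example the stored pairs span the whole two-parameter model space, so the recursion reproduces the exact inverse Hessian applied to the gradient. The model update then follows as $m_{n+1} = m_n - \mu_n z$.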

Iterations required on the residual energy E (one figure per method, showing E as a function of P-wave velocity Vp and density ρ):

- Gradient method (200 iterations)
- Conjugate Gradient (30 iterations)
- Quasi-Newton l-BFGS (20 iterations)


Problems related to local non-linear optimization

Uni-modal objective function (1 minimum)

[Figure: residual energy E as a function of P-wave velocity Vp and density ρ]

$E = (1 - V_p)^2 + 100\,(\rho - V_p^2)^2$ (Rosenbrock, 1960)

Multi-modal objective function (multiple minima)

[Figure: residual energy E as a function of P-wave velocity Vp and density ρ]

$E = (V_p^2 + \rho - 11)^2 + (V_p + \rho^2 - 7)^2$ (Himmelblau, 1972)
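The consequence of a multi-modal objective function for a local method can be demonstrated directly: the minimum that is found depends on the starting model. The sketch below runs a plain gradient method on the function above from four starting points; the constant step length, the number of iterations and the starting points are assumptions chosen for illustration.

```python
def himmelblau(vp, rho):
    """Multi-modal toy residual energy E(Vp, rho)."""
    return (vp ** 2 + rho - 11.0) ** 2 + (vp + rho ** 2 - 7.0) ** 2


def himmelblau_gradient(vp, rho):
    a = vp ** 2 + rho - 11.0
    b = vp + rho ** 2 - 7.0
    return 4.0 * vp * a + 2.0 * b, 2.0 * a + 4.0 * rho * b


def descend(vp, rho, mu=2e-3, n_iter=5000):
    """Gradient method with a constant step length mu (an assumption)."""
    for _ in range(n_iter):
        g_vp, g_rho = himmelblau_gradient(vp, rho)
        vp, rho = vp - mu * g_vp, rho - mu * g_rho
    return vp, rho


starts = [(4.0, 4.0), (-4.0, 4.0), (-4.0, -4.0), (4.0, -4.0)]
minima = [descend(*start) for start in starts]
for start, (vp, rho) in zip(starts, minima):
    print(start, "->", (round(vp, 3), round(rho, 3)), "E =", himmelblau(vp, rho))
```

Each run reduces the residual energy to practically zero, yet the recovered models differ. In a real inversion only one of them would be the true model, which is why the starting model matters.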






Simple example: A spherical low velocity anomaly

Pressure wavefield: simple acoustic test problem

[Figure: Vp [m/s] of the true model and of the starting model; axes x [m] and y [m]]

Simple acoustic test problem: A spherical low velocity anomaly in a homogeneous full space.

Starting model

[Figure: Vp [m/s] of the true model and of the start model; axes Distance [m] and Depth [m]; annotations: Vp0 = 2000 m/s, Vp = 2000 m/s, Vp = 1700 m/s]

Simple acoustic test problem: homogeneous starting model.

Seismic sections: initial model, true model, data residuals

[Figure: three panels, Starting Model $u_y^{mod}$, True Model $u_y^{obs}$, Initial Data Residuals $\delta u_y = u_y^{mod} - u_y^{obs}$; axes trace # and time [s]]

Seismic sections of the y-component for the simple test problem: The starting model (left), the true model (center) and the data residuals (right).

Non-linear optimization of P-wave velocity model

Minimize objective function by CG for the P-wave velocity $v_p$:

$$v_p^{n+1} = v_p^n - \mu^n H^{-1} \left(\frac{\partial E}{\partial v_p}\right)^n$$

with gradient $\partial E/\partial v_p$, Hessian $H$ and step-length $\mu$.

Efficient gradient calculation by time-domain adjoint method:

$$\frac{\partial E}{\partial v_p} = -2\rho v_p \sum_{\text{sources}} \int dt \left(\frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y}\right)\left(\frac{\partial \Psi_x}{\partial x} + \frac{\partial \Psi_y}{\partial y}\right),$$

with the forward wavefield $u$ and adjoint wavefield $\Psi$, respectively.
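The prefactor $-2\rho v_p$ can be traced back to the λ gradient from the review section by the chain rule. The short derivation below assumes the acoustic relation $\lambda = \rho v_p^2$ (vanishing shear modulus) and a density that is held fixed during the differentiation; both are assumptions made here for illustration.

```latex
\begin{align}
\lambda &= \rho\, v_p^2
  \quad\Rightarrow\quad
  \frac{\partial \lambda}{\partial v_p} = 2 \rho\, v_p, \\
\frac{\partial E}{\partial v_p}
  &= \frac{\partial E}{\partial \lambda}\,\frac{\partial \lambda}{\partial v_p}
   = 2 \rho\, v_p\, \frac{\partial E}{\partial \lambda}
   = -2 \rho\, v_p \sum_{\text{sources}} \int dt
     \left(\frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y}\right)
     \left(\frac{\partial \Psi_x}{\partial x} + \frac{\partial \Psi_y}{\partial y}\right)
\end{align}
```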


Forward, adjoint and correlated wavefields (gradient) for shot 45

The effect of the preconditioning operator P

[Figure: three panels, Gradient δλ, Gradient δλ (rescale), Precond. Gradient δλ; axes x [m] and y [m]]

The effect of the preconditioning operator P. The gradient $\delta\lambda_0$ before (left) and after the application of the preconditioning operator (right). Artifacts due to low ray-coverage are more prominent in the rescaled image of the unpreconditioned gradient (center).

P-wave velocity model FWT result

[Figure: three panels, Vp [m/s] at Iteration No. 10, Vp [m/s] at Iteration No. 155, Vp [m/s] of the True Model; axes x [m] and y [m]]

Inversion results for the P-wave velocity model of the spherical low velocity anomaly after 10 (left) and 155 FWT iterations (center) compared with the true model (right).

Seismic sections: FWT result, true model, data residuals

[Figure: three panels, Final Model (Iteration 155) $u_y^{mod}$, True Model $u_y^{obs}$, Final Data Residuals $\delta u_y = u_y^{mod} - u_y^{obs}$; axes trace # and time [s]]

Seismic sections (y-component) for the inversion result (left), the true model (center) and the data residuals (right).

The CTS Test Problem

The Cross-Triangle-Square (CTS) model

The CTS model by D. De Nil and D. Köhn [Köhn et al., 2012]

[Figure: three panels, P-wave velocity [m/s], S-wave velocity [m/s], density ρ [kg/m³]; horizontal axis 1000 to 10000 m, vertical axis y [m]]

CTS model: acquisition geometry

[Figure: acquisition geometry with 100 sources and 400 receivers; axes x [m] and y [m]]

CTS model: starting model

[Figure: three panels, P-wave velocity [m/s], S-wave velocity [m/s], density ρ [kg/m³]; horizontal axis 1000 to 10000 m, vertical axis y [m]]

Influence of frequency filtering

Each of the following figures shows the inversion results for P-wave velocity [m/s], S-wave velocity [m/s] and density ρ [kg/m³]; horizontal axis 1000 to 10000 m, vertical axis y [m].

[Figure: results with no frequency filter]

[Figure: results with low pass frequency filters: 5.0-10.0 Hz]

[Figure: results with low pass frequency filters: 2.0-5.0-10.0 Hz]

Influence of the model parametrization

[Figure: three panels, Lamé parameter λ (result) [Pa], Lamé parameter µ (result) [Pa], density ρ (result) [kg/m³]; horizontal axis 1000 to 10000 m, vertical axis y [m]]

Lamé parameters, low pass frequency filters: 2.0-5.0-10.0 Hz

Influence of the model parametrization

[Figure: three panels, P-wave impedance (result) [kg/(s m²)], S-wave impedance (result) [kg/(s m²)], density ρ (result) [kg/m³]; horizontal axis 1000 to 10000 m, vertical axis y [m]]

Seismic impedances, low pass frequency filters: 2.0-5.0-10.0 Hz



The Marmousi-2 model

Marmousi-2 model [Martin et al., 2006]:

NX = 500 gridpoints × NY = 174 gridpoints → 87000 gridpoints
× 3 parameter classes (Vp, Vs, density) → 261000 model parameters

Seismic modelling and inversion codes are benchmarked on 1 node of the NEC cluster at Kiel University:

- 2 Intel Xeon E5-2670 CPUs (16 cores, clock speed 2.6 GHz)
- 128 GB DDR4 RAM

Marmousi-2 benchmarks (forward problem)

- First-arrival travel time map: RAJZEL Eikonal FD, run-time (1 core): 0.05 s
- Pressure wavefield (time = 1.922 s): DENISE time-domain FD, run-time (16 cores): 2.1 s
- 10 Hz monochromatic pressure wavefield: GERMAINE frequency-domain FD, run-time (1 core): 1.3 s

[Figures: one panel per benchmark; axes Distance [km] and Depth [km]]

Marmousi-2: acquisition geometry

[Figure: acquisition geometry; axes Distance [km] and Depth [km]]

- 100 airgun sources, 40 m below the free surface
- Source wavelet: low-pass filtered spike (fmax = 15 Hz)
- OBC with 400 multi-component receivers (x- and y-component)

Propagation of the pressure wavefield

[Figure: pressure wavefield (time = 1.351 s); axes Distance [km] and Depth [km]; the original slide links to a 30 fps wavefield movie]

Preconditioning operator

[Figure: two panels, Gradient δVp (no preconditioning) and Gradient δVp (preconditioning); axes x [km] and y [km]]

Marmousi-2 (Vp): start model and inversion stages

Each figure compares a P-wave velocity model Vp [m/s] (upper title) with the P-wave velocity of the true model; axes Distance [km] and Depth [km].

- Marmousi-2 (Vp), Start Model: P-wave velocity (Traveltime Tomography)
- Marmousi-2 (Vp), Freq. 2 Hz, 50 It.: P-wave velocity (Waveform Tomography)
- Marmousi-2 (Vp), Freq. 2-5 Hz, 75 It.
- Marmousi-2 (Vp), Freq. 2-5-10 Hz, 90 It.
- Marmousi-2 (Vp), Freq. 2-5-10-20 Hz, 70 It.

Marmousi-2 (Vs), Freq. 2-5-10-20 Hz, 70 It.

[Figure: two panels, S-wave velocity Vs [m/s] (Waveform Tomography) and S-wave velocity (true model); axes Distance [km] and Depth [km]]

Marmousi-2 (Density ρ), Freq. 2-5-10-20 Hz, 70 It.

[Figure: two panels, Density ρ [kg/m³] (Waveform Tomography) and Density (true model); axes Distance [km] and Depth [km]]

Seismic sections for shot 50

[Figure: seismic section for shot 50 (start model); axes channel # and Time [s]]

[Figure: seismic section for shot 50 (FWT result); axes channel # and Time [s]]

[Figure: seismic section for shot 50 (true model); axes channel # and Time [s]]


Evolution of the L2-Norm

[Figure: evolution of the normalized residual energy (logarithmic axis from 10^0 to 10^-2) over the iteration step No. (10 to 90); legend: 1 Hz, 2.5 Hz, 5 Hz, 10 Hz]

Influence of Hessian approximations

So far we used a simple linear scaling with depth as Hessian approximation:

$$\{H_{a1}\}^{-1} = \text{depth}$$

More sophisticated: Integrated forward wavefield + approximation of the receiver Green's function (Plessix & Mulder, 2004):

$$\{H_{a2}\}^{-1} = \left[\int dt\, |u(\mathbf{x}_s, \mathbf{x}, t)|^2 \left(\operatorname{asinh}\frac{x_r^{max} - x}{z} - \operatorname{asinh}\frac{x_r^{min} - x}{z}\right)\right]^{-1}$$

$x_r^{min}$, $x_r^{max}$ = minimum and maximum receiver positions
$\mathbf{x}_s$ = source position
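The second approximation can be evaluated point by point once the forward wavefield has been stored. The sketch below is a minimal illustration, assuming a rectangle rule for the time integral and a single source; the function names and the optional stabilization constant `eps` are hypothetical and not part of the formula above.

```python
import math


def wavefield_energy(u_trace, dt):
    """Time-integrated |u(x_s, x, t)|^2 at one grid point (rectangle rule)."""
    return dt * sum(abs(u) ** 2 for u in u_trace)


def receiver_term(x, z, xr_min, xr_max):
    """asinh((xr_max - x) / z) - asinh((xr_min - x) / z) for depth z > 0."""
    return math.asinh((xr_max - x) / z) - math.asinh((xr_min - x) / z)


def inverse_hessian_a2(energy, x, z, xr_min, xr_max, eps=0.0):
    """{H_a2}^{-1} at position (x, z).

    eps is an optional stabilization added here as an assumption; it avoids
    a division by zero where the forward wavefield carries no energy.
    """
    return 1.0 / (energy * receiver_term(x, z, xr_min, xr_max) + eps)


# Grid point in the middle of a 200 m receiver line at 100 m depth
energy = wavefield_energy([1.0, -2.0, 2.0], dt=0.5)
print(inverse_hessian_a2(energy, x=100.0, z=100.0, xr_min=0.0, xr_max=200.0))
```

Multiplying the gradient by this factor boosts regions that are weakly illuminated by the source or poorly covered by the receiver line.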

Marmousi-2 - influence of Hessian

Each of the following figures shows two panels of the S-wave velocity Vs [m/s]; axes Distance [km] and Depth [km].

- Marmousi-2 - influence of Hessian: PCG + Ha1
- Marmousi-2 - influence of Hessian: PCG + Ha2
- Marmousi-2 - influence of Hessian: l-BFGS + Ha2

References

Köhn, D., De Nil, D., Kurzmann, A., Przebindowska, A., and Bohlen, T. (2012). On the influence of model parametrization in elastic full waveform tomography. Geophysical Journal International, 191(1):325–345.

Martin, G., Wiley, R., and Marfurt, K. (2006). Marmousi2 - An elastic upgrade for Marmousi. The Leading Edge, 25:156–166.

Nocedal, J. and Wright, S. (1999). Numerical Optimization. Springer, New York.
