submitted to Geophys. J. Int.

arXiv:1601.00114v2 [physics.geo-ph] 17 Jul 2016

3-D Projected $L_1$ inversion of gravity data

Saeed Vatankhah¹, Rosemary A. Renaut² and Vahid E. Ardestani¹

¹ Institute of Geophysics, University of Tehran, Iran
² School of Mathematical and Statistical Sciences, Arizona State University, Tempe, AZ, USA.

SUMMARY

Sparse inversion of the gravity problem is considered. An $L_1$-type stabilizer reconstructs models with sharp boundaries and blocky features and can be implemented using an iteratively reweighted least squares (IRLS) stabilization. The regularized least squares problem at each iteration is solved on the projected subspace obtained by applying Golub-Kahan iterative bidiagonalization to the linear system of equations. The unbiased predictive risk estimator (UPRE) for determining the regularization parameter is extended to the solution on the projected space, but leads to an underestimation of the regularization parameter. Examination of the spectrum of the projected matrix demonstrates that the problem is mildly to moderately ill-posed, and thus it is only the dominant spectral values of the projected operator that accurately approximate the spectral values of the original operator. Using a truncated spectrum of the projected matrix in estimating the regularization parameter instead yields suitable choices for the regularization parameter. The parameter choice method obtained in this way is denoted Truncated UPRE (TUPRE). Simulations using synthetic examples with added noise show that TUPRE is efficient and provides acceptable solutions of the original problem on a significantly smaller subspace. The method is applied to gravity data from the Mobrun ore body, northeast of Noranda, Quebec, Canada. The 3-D reconstructed model is in agreement with the known drill-hole information.


Key words: Inverse theory; Numerical approximation and analysis; Gravity anomalies and Earth structure; Asia

1 INTRODUCTION

The gravity data inversion problem is the estimation of the unknown subsurface density and its geometry from a set of gravity observations measured on the surface. This is a challenging problem for several reasons. Foremost is the non-uniqueness of the problem: algebraic ambiguity results because there are fewer observations than model parameters, and, by Gauss's theorem, non-uniqueness also arises from the physics of the problem (Li & Oldenburg 1996). Further, the data are always contaminated with noise, which, combined with the ill-conditioning of the model, leads to sensitivity of the solution to the noise and to the numerical algorithm used to find the solution. Thus, the inversion of gravity data is an example of an under-determined and ill-posed problem for which a stable and geologically plausible solution is feasible only with the imposition of additional information about the model. Here we consider the minimization of a global objective function consisting of a data misfit, $\Phi(m)$, and a stabilizing regularization term, $S(m)$, with relative weighting determined by the regularization parameter $\alpha$,

$P^{\alpha}(m) = \Phi(m) + \alpha^2 S(m). \qquad (1)$

The data misfit $\Phi(m)$ measures how well an obtained model, $m$, reproduces the observed data, $d_{obs}$. For gravity data it is standard to assume that the noise in the data is uncorrelated and Gaussian, with mean $0$ and covariance matrix $C_d$, although it arises from several sources such as untreated instrumental or geologic noise. Assuming that the standard deviation of the noise in the data is known, a weighted $L_2$ measure of the error between the observed and the predicted data is used, giving

$\Phi(m) = \|W_d(Gm - d_{obs})\|_2^2. \qquad (2)$

Here, $W_d$ is a constant diagonal matrix whose $i$th element is the inverse of the standard deviation of the $i$th datum, i.e. $W_d = C_d^{-1/2}$. We note that throughout we use the standard definition of the $L_p$-norm of an arbitrary vector $x \in \mathbb{R}^n$,

$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}, \quad p \ge 1. \qquad (3)$
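As a concrete illustration of (2) and (3), the following minimal NumPy sketch evaluates the weighted misfit and the $L_p$-norm; the function names, array sizes and test values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def lp_norm(x, p):
    # L_p-norm of (3): ||x||_p = (sum_i |x_i|^p)^(1/p), p >= 1.
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

def data_misfit(G, m, d_obs, sd):
    # Weighted misfit of (2): Phi(m) = ||W_d (G m - d_obs)||_2^2, where the
    # diagonal of W_d holds the inverse standard deviation of each datum.
    r = (G @ m - d_obs) / sd
    return float(r @ r)

# Hypothetical sizes: 5 data, 8 model parameters.
rng = np.random.default_rng(0)
G = rng.standard_normal((5, 8))
m = rng.standard_normal(8)
d_obs = G @ m + 0.01 * rng.standard_normal(5)
print(data_misfit(G, m, d_obs, sd=0.01 * np.ones(5)))
print(lp_norm(m, 1), lp_norm(m, 2))
```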

There are several choices for the stabilizer, $S(m)$, depending on the type of model features one wants to recover through the inversion. A typical choice for geophysical applications is given by $S(m) = \|W(m)(m - m_{apr})\|_2^2$, in which $m_{apr}$ is an estimate of the model parameters, possibly known from a previous investigation, or taken to be zero (Li & Oldenburg 1996), and $W(m)$ is a combined matrix of the individual weighting terms, including potentially a depth weighting matrix, and model-dependent matrices that approximate low order derivative operators in each dimension (Li & Oldenburg 1996, (4)). Although this type of inversion has been used successfully in much geophysical literature, models recovered in this way are characterized by smooth features, especially blurred boundaries, which are not always consistent with real geological structures (Farquharson 2008). There are situations in which the sources are localized and separated by sharp, distinct interfaces, requiring alternative approaches. In the geophysical community several approaches have been used to identify distinct interfaces. Suppose that

$W(m) = W_{L_0}(m) = \mathrm{diag}\left( ((m - m_{apr})^2 + \epsilon^2)^{-1/2} \right), \quad 0 < \epsilon \ll 1, \qquad (4)$

so that $W(m)$, and hence (1), depend nonlinearly on $m$. The model-space iteratively reweighted least squares (IRLS) algorithm can be used to solve the problem (Bruckstein et al. 2009). At each iteration $k$, $m^{(k)}$ is obtained using the weighting matrix $W_{L_0}^{(k)}(m)$ calculated using $m^{(k-1)}$, where for any variable the superscript $(k)$ denotes that variable at iteration $k$. The iteration is initialized with $W_{L_0}^{(1)} = I$. Taking $m_{apr} = 0$ yields the compactness criterion for gravity inversion that seeks to minimize the area (or volume in 3-D) of the causative body (Last & Kubik 1983). The minimum support (MS) stabilizer uses $m_{apr} \neq 0$ and minimizes the total volume of nonzero departure of the model parameters from the given prior model (Portniaguine & Zhdanov 1999). Portniaguine & Zhdanov (1999) also use instead the gradient of the model parameters in the stabilization term via $S(m) = \|D(\nabla m)\,|\nabla m|\,\|_2^2$, where $D(\nabla m)$ is defined by $\mathrm{diag}\left( (|\nabla m|^2 + \epsilon^2)^{-1/2} \right)$, yielding the minimum gradient support (MGS) stabilizer. This constraint minimizes the volume over which the gradient of the model parameters is nonzero and thus yields models with sharp boundaries. Another possibility for the reconstruction of sparse models is to use a stabilizer which minimizes the $L_1$-norm of the model, or of the gradient, in which case it is known as the total variation (TV), of the model parameters (Farquharson & Oldenburg 1998; Farquharson 2008; Loke et al. 2003). The $L_1$-norm stabilizer permits the occurrence of large elements in the inverted model among mostly small values and can, therefore, be used to obtain models with non-smooth properties (Sun & Li 2014). Although the $L_1$-norm stabilizer has favorable properties, and yields a convex function that can be solved by linear programming algorithms, its use for the solution of large scale problems is not feasible. Here we implement the $L_1$-norm stabilizer using the IRLS algorithm as applied for the MS stabilizer, requiring just the replacement of the square root in $W_{L_0}$ in (4) by a fourth root. The IRLS algorithm is further extended for the gravity inverse problem by including the depth weighting in $W(m)$. For small-scale problems in which $S(m)$ yields a Tikhonov function for (1), the solution may be found efficiently using the generalized singular value decomposition (GSVD) or the singular value decomposition (SVD), dependent on the choice for $W(m)$.
The main advantage of using an SVD or GSVD decomposition is that these decompositions permit the expression of the regularization parameter-choice method in a computationally convenient form (Chung et al. 2008). When the size of the problem increases, the use of these decompositions is computationally impractical, and powerful algorithms are then needed to reduce memory and CPU requirements. For example, Li & Oldenburg (2003) applied the wavelet transform to compress the sensitivity matrix, and Boulanger & Chouteau (2001) used the symmetry of the gravity forward model to minimize the size of the sensitivity matrix. Data-space inversion was used by Siripunvaraporn & Egbert (2000) and Pilkington (2009), leading to a system of equations with dimension equal to the number of observations. Iterative methods offer an alternative approach through projection of the problem onto a smaller subspace, see Oldenburg et al. (1993). Here we use the iterative LSQR algorithm developed by Paige & Saunders (1982a; 1982b), which employs the Lanczos Golub-Kahan bidiagonalization algorithm to develop a smaller Krylov subspace for the solution. LSQR is analytically equivalent to applying the conjugate gradient (CG) algorithm but has more favorable analytic properties, particularly for ill-conditioned systems, Paige & Saunders (1982a), and has been widely adopted for the solution of regularized least squares problems, e.g. (Björck 1986; Hansen 1998; Hansen 2007). In this paper, the goal is to present a framework for regularization parameter estimation for the solution on the underlying Krylov subspace. The flexibility of LSQR, as compared to CG, is that solutions can be found for multiple values of the regularization parameter with comparatively little additional effort through the use of the SVD for the projected system (Oldenburg & Li 1994).

Some of the techniques widely used for estimating a suitable regularization parameter, $\alpha$, include the L-curve (LC) (Hansen 1992), generalized cross validation (GCV) (Golub et al. 1979; Marquardt 1970), the $\chi^2$-discrepancy principle (Mead & Renaut 2009; Renaut et al. 2010; Vatankhah et al. 2014c), the unbiased predictive risk estimator (UPRE) (Vogel 2002) and the Morozov discrepancy principle (MDP) (Morozov 1966). Our previous investigation of the gravity inverse problem, see Vatankhah et al. (2015), has demonstrated that the UPRE parameter-choice method is preferred, especially for high noise levels. However, effective parameter estimation techniques that are useful in the context of efficient iterative Krylov-based procedures are still needed. For example, Chung et al. (2008) showed that the GCV method generally overestimates the regularization parameter for the subspace, but a smaller regularization parameter can be obtained using a weighted GCV method. We therefore focus our discussion on the extension of the UPRE to the projected space, additionally extending the approach introduced in (Renaut et al. 2015). The new Truncated UPRE (TUPRE) method for regularization parameter estimation is introduced.


The outline of this paper is as follows. In Section 2 we review the inversion methodology and the IRLS algorithm for the $L_1$-norm stabilizer with depth weighting. Obtaining the Tikhonov solution via Golub-Kahan iterative bidiagonalization is briefly reviewed in Section 2.2, leading to the projected IRLS algorithm for the $L_1$ stabilizer with depth weighting. Section 3 is devoted to regularization parameter estimation, with the UPRE for the full and projected problems given in Section 3.1. Results for synthetic examples are illustrated in Section 4. The reconstruction of an embedded cube with high density within a homogeneous medium is used to contrast the algorithm using the SVD, Section 4.2, with the projected algorithm, Section 4.3. These results demonstrate the need for the truncated UPRE, which is introduced and applied in Section 4.4. The reconstruction of a more complex structure using the TUPRE is presented in Section 4.5. The approach is applied in Section 5 to gravity data acquired for the determination of the density distribution of the Mobrun ore body, northeast of Noranda, Quebec, Canada. Conclusions and future work follow in Section 6.

2 INVERSION METHODOLOGY

We briefly review the well-known approach for the standard 3-D inversion of gravity data. The subsurface volume is discretized using a set of cubes, in which the cell sizes are kept fixed during the inversion, and the values of the densities of the cells are the model parameters to be determined in the inversion (Boulanger & Chouteau 2001; Li & Oldenburg 1998). Then, for unknown model parameters $m$, here the density of each cell $\rho_j$, such that $m = (\rho_1, \rho_2, \ldots, \rho_n) \in \mathbb{R}^n$, and measured data $d_{obs} \in \mathbb{R}^m$, the gravity data satisfy the underdetermined linear system

$d_{obs} = Gm, \quad G \in \mathbb{R}^{m \times n}, \quad m \ll n, \quad m_j = \rho_j, \ j = 1:n. \qquad (5)$

$G$ is the matrix resulting from the discretization of the forward operator which maps from the model space to the data space. Practically,

$d_{obs} = d_{exact} + \eta, \quad \eta \in \mathbb{R}^m, \quad \eta \sim N(0, C_d), \qquad (6)$

where dexact is the unknown exact data and we use the notation η ∼ N (0, Cd ) to indicate that the error in the measurements is assumed to be Gaussian and uncorrelated with covariance matrix Cd and mean 0. The goal of the gravity inverse problem is to find a stable and geologically plausible density model m that reproduces dobs at the noise level.

2.1 $L_1$ inversion method

Solving problem (5) is challenging due to the ill-posed nature of the problem, and regularization is required to stabilize the solution. Furthermore, it is useful to have an algorithm that includes depth
weighting and prior model information in the formulation. Introducing the residual and the discrepancy from the background data,

$r = d_{obs} - G m_{apr} \quad \text{and} \quad y = m - m_{apr}, \qquad (7)$

respectively, it is immediate from (5) that $r = Gy$, and we may rewrite (1) in terms of $y$,

$P^{\alpha}(y) = \|W_d(Gy - r)\|_2^2 + \alpha^2 S(y). \qquad (8)$

Assuming minimization of (8) gives $y(\alpha)$, the model is updated by

$m(\alpha) = m_{apr} + y(\alpha). \qquad (9)$

To obtain the $L_1$ stabilization, $S(y) = \|y\|_1$, we observe, following for example (Voronin 2012; Wohlberg & Rodriguez 2007), that $|y_i| = y_i^2/\sqrt{y_i^2}$ can be approximated by $y_i^2/\sqrt{y_i^2 + \epsilon^2}$ for small and positive $\epsilon$. Thus

$\|y\|_1 \approx \sum_{i=1}^{n} \frac{y_i^2}{\sqrt{y_i^2 + \epsilon^2}} = \sum_{i=1}^{n} (W_{L_1})_{ii}^2 y_i^2 = \|W_{L_1}(y)\,y\|_2^2, \quad \text{for} \quad (W_{L_1}(y))_{ii} = \frac{1}{(y_i^2 + \epsilon^2)^{1/4}}. \qquad (10)$

An approximate solution of (8) for the $L_1$ regularizer is obtained by minimizing

$P^{\alpha}(y) = \|W_d(Gy - r)\|_2^2 + \alpha^2 \|W_{L_1}(y)\,y\|_2^2. \qquad (11)$
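The diagonal reweighting in (10) is simple to form; the short sketch below, with illustrative values, checks numerically that $\|W_{L_1}(y)y\|_2^2$ approaches $\|y\|_1$ for small $\epsilon$.

```python
import numpy as np

def wl1_diag(y, eps):
    # Diagonal of W_L1(y) in (10): (W_L1)_ii = (y_i^2 + eps^2)^(-1/4), so
    # that sum_i ((W_L1)_ii y_i)^2 = sum_i y_i^2 / sqrt(y_i^2 + eps^2) ~ ||y||_1.
    return (y ** 2 + eps ** 2) ** -0.25

y = np.array([1.0, -0.3, 0.0, 2.5])
w = wl1_diag(y, eps=1e-4)
print(np.sum((w * y) ** 2), np.sum(np.abs(y)))  # both approximately 3.8
```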

Introducing, for a general variable $x$,

$S_p(x) = \sum_{i=1}^{n} s_p(x_i) \quad \text{where} \quad s_p(x) = \frac{x^2}{(x^2 + \epsilon^2)^{(2-p)/2}}, \qquad (12)$

illustrates that the difference between the MS and $L_1$-norm stabilizers, (4) and (10), is the order of the root, obtained with $p = 0$ and $p = 1$, respectively. When $\epsilon$ is sufficiently small, (12) yields the approximation of the $L_p$-norm for $p = 2$ and $p = 1$, while the case with $p = 0$ corresponds to the compactness constraint used in Last & Kubik (1983). $S_0(x)$ does not meet the mathematical requirements to be regarded as a norm, and is commonly used to denote the number of nonzero entries in $x$. Fig. 1 demonstrates the impact of the choice of $\epsilon$ on $s_p(x)$ for $\epsilon$ = 1e-9, Fig. 1(a), and $\epsilon = 0.5$, Fig. 1(b). For larger $p$, more weight is imposed on large elements of $x$: large elements are penalized more heavily than small elements during minimization (Sun & Li 2014). Hence $L_2$ tends to discourage the occurrence of large elements in the inverted model, yielding smooth models, while $L_1$ and $L_0$ allow large elements, leading to the recovery of models with blocky features. Further, the $L_0$-norm is non-quadratic and $s_0(x)$ asymptotes to one away from $0$, regardless of the magnitude of $x$. Hence the penalty on the model parameters does not depend on their relative magnitude, only on whether or not they lie above or below a threshold dependent on $\epsilon$ (Ajo-Franklin et al. 2007). Further, $L_1$ does not necessarily lead to a very sparse solution, and thus $L_0$ is better for preserving sparsity.

Figure 1. Illustration of the different norms $s_p(x)$ for two values of the parameter $\epsilon$. (a) $\epsilon$ = 1e-9; (b) $\epsilon = 0.5$.

On the other hand, as compared to $L_1$, the disadvantage of $L_0$ is its greater dependency on the choice of $\epsilon$, so that it is less robust than $L_1$.

At each step of the IRLS algorithm, assuming that the null spaces of $W_d G$ and $W(y)$ do not intersect, we find the unique minimizer of (11), where for the moment we consider an arbitrary model-dependent matrix $W(y)$. For ease of presentation we replace $W(y)$ by $W$. Introducing $\tilde G = W_d G$ and $\tilde r = W_d r$ yields

$y(\alpha) = (\tilde G^T \tilde G + \alpha^2 W^T W)^{-1} \tilde G^T \tilde r. \qquad (13)$

When $W$ is square and diagonal, $W^T = W$. Further, when $W$ is invertible, $W^{-1} = (W^T)^{-1}$. Thus $W^T W = W^2$ and

$(\tilde G^T \tilde G + \alpha^2 W^T W) = (W W^{-1} \tilde G^T \tilde G W^{-1} W + \alpha^2 W^2) = W (W^{-1} \tilde G^T \tilde G W^{-1} + \alpha^2 I_n) W = W^T ((W^T)^{-1} \tilde G^T \tilde G W^{-1} + \alpha^2 I_n) W = W^T (\tilde{\tilde G}^T \tilde{\tilde G} + \alpha^2 I_n) W,$

with $\tilde{\tilde G} = \tilde G W^{-1}$, corresponding to variable preconditioning of $\tilde G$. Applying this identity in (13) gives

$y(\alpha) = W^{-1} (\tilde{\tilde G}^T \tilde{\tilde G} + \alpha^2 I_n)^{-1} (W^T)^{-1} \tilde G^T \tilde r = W^{-1} (\tilde{\tilde G}^T \tilde{\tilde G} + \alpha^2 I_n)^{-1} \tilde{\tilde G}^T \tilde r,$

and we obtain the standard Tikhonov form for $h(\alpha) = W y(\alpha)$,

$P^{\alpha}(h) = \|\tilde{\tilde G} h - \tilde r\|_2^2 + \alpha^2 \|h\|_2^2, \quad \text{where} \quad \tilde{\tilde G} = W_d G W(y)^{-1} \ \text{and} \ \tilde r = W_d r. \qquad (14)$

The solution

$h(\alpha) = (\tilde{\tilde G}^T \tilde{\tilde G} + \alpha^2 I_n)^{-1} \tilde{\tilde G}^T \tilde r = \tilde{\tilde G}(\alpha)\, \tilde r, \qquad (15)$

where $\tilde{\tilde G}(\alpha) = (\tilde{\tilde G}^T \tilde{\tilde G} + \alpha^2 I_n)^{-1} \tilde{\tilde G}^T$, yields the model update $m(\alpha) = m_{apr} + W^{-1} h(\alpha)$. For small-scale problems, the numerical solution of (15) can be obtained using the SVD of $\tilde{\tilde G}$, see Appendix A.


We note that for diagonal matrices of size n × n, multiplication, inversion and storage requirements are minimal, of O(n) only.
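A minimal sketch of the update (13)-(15) for a small-scale problem follows, with the diagonal matrices stored as vectors as the remark above suggests; the names and the random test matrix are illustrative only, not the authors' implementation.

```python
import numpy as np

def tikhonov_update(G, wd, w, r, alpha):
    # Form Gtt = W_d G W^{-1} and rt = W_d r, with wd and w the diagonals of
    # W_d and W, then evaluate h(alpha) of (15) through the SVD of Gtt:
    # h = sum_i sigma_i/(sigma_i^2 + alpha^2) (u_i^T rt) v_i.
    Gtt = (wd[:, None] * G) / w[None, :]
    rt = wd * r
    U, s, Vt = np.linalg.svd(Gtt, full_matrices=False)
    h = Vt.T @ ((s / (s ** 2 + alpha ** 2)) * (U.T @ rt))
    return h / w        # y = W^{-1} h, so that m = m_apr + y

# Illustrative sizes: m = 10 data, n = 30 cells.
rng = np.random.default_rng(1)
G, r = rng.standard_normal((10, 30)), rng.standard_normal(10)
y = tikhonov_update(G, np.ones(10), np.ones(30), r, alpha=1.0)
```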

2.1.1 Depth Weighting

For the gravity inversion problem we set

$W(y) = W_{L_1}(y) W_z \quad \text{where} \quad W_z = \mathrm{diag}(z_j^{-\beta}), \quad \beta > 0. \qquad (16)$

$W_z$ is the depth weighting matrix and $z_j$ is the mean depth of cell $j$. The parameter $\beta$ determines the cell weighting, see Li & Oldenburg (1998) and Boulanger & Chouteau (2001).
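For instance, for the cube discretization used later in Section 4.1 (10 layers of 50 m cells, 20 × 20 cells per layer, $\beta = 0.8$), the diagonal of $W_z$ can be built as in the sketch below; the cell ordering chosen here is an assumption for illustration.

```python
import numpy as np

# Mean cell depths for 10 layers of 50 m thickness: 25, 75, ..., 475 m,
# repeated over the 20 x 20 cells of each layer; (W_z)_jj = z_j^(-beta).
z = np.repeat(np.arange(25.0, 500.0, 50.0), 20 * 20)
wz = z ** -0.8
```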

2.1.2 The IRLS Algorithm

Application of the IRLS for the solution of the inverse problem requires the designation of a termination test to determine whether or not an acceptable solution has been reached. Two criteria are chosen to terminate the algorithm: either the solution satisfies the noise level,

$\chi^2_{Computed} = \left\| \frac{(d_{obs})_i - (Gm)_i}{\eta_i} \right\|_2^2 \le m + \sqrt{2m}, \qquad (17)$

or a maximum number of iterations, $K_{max}$, is reached. Additionally, at each step practical lower and upper bounds on the density, $[\rho_{min}, \rho_{max}]$, are imposed to recover reliable subsurface models. If at any iterative step a given density value falls outside the bounds, the value at that cell is projected back to the nearest constraint value. At iteration $k$ we use the approximation

$s_p(x) \approx s_p(x^{(k)}, x^{(k-1)}) = \frac{(x^{(k)})^2}{((x^{(k-1)})^2 + \epsilon^2)^{(2-p)/2}}, \qquad (18)$

where $x^{(k)}$ denotes the unknown model parameter. Typically, as the iterations proceed, if $x^{(k)}$ converges, the approximation becomes increasingly good. The IRLS algorithm for small scale $L_1$ inversion is summarized in Algorithm 1. Observe that this algorithm can be used immediately in conjunction with the MS stabilizer in place of the $L_1$ stabilization by replacing $W_{L_1}^{(k+1)}$ in step 13 by $W_{L_0}^{(k+1)}$ using (4). We return to the estimation of $\alpha^{(k)}$ in Algorithm 1 step 7 in Section 3.1.
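The two practical ingredients of the IRLS loop, the noise-level test (17) and the projection onto the density bounds, can be sketched as follows; variable names are illustrative assumptions.

```python
import numpy as np

def noise_level_satisfied(d_obs, G, m, sd):
    # Termination test (17): chi^2 = ||((d_obs - G m)_i / sd_i)||_2^2 is
    # compared against m_data + sqrt(2 m_data).
    m_data = d_obs.size
    chi2 = np.sum(((d_obs - G @ m) / sd) ** 2)
    return chi2 <= m_data + np.sqrt(2.0 * m_data)

def impose_bounds(m, rho_min, rho_max):
    # Project any density outside [rho_min, rho_max] back to the nearest
    # bound, as in step 10 of Algorithm 1.
    return np.clip(m, rho_min, rho_max)
```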

2.2 Projected $L_1$ inversion method

The SVD is useful for analysis and for the determination of $h(\alpha)$ for small scale problems, but is not practical when the size of the problem increases. Here we use the Golub-Kahan bidiagonalization (GKB) algorithm Bidiag 1, which is the fundamental step of the LSQR algorithm for solving damped least squares problems, given in Paige & Saunders (1982a; 1982b). The solution of the inverse problem is projected onto a smaller subspace of size $t$, as briefly reviewed here.


Algorithm 1: Iterative $L_1$ Inversion Algorithm

Input: $d_{obs}$, $m_{apr}$, $G$, $W_d$, $\epsilon > 0$, $\rho_{min}$, $\rho_{max}$, $K_{max}$, $\beta$
1: Calculate $W_z$, $\tilde G = W_d G$, and $\tilde d_{obs} = W_d d_{obs}$
2: Initialize $m^{(0)} = m_{apr}$, $W^{(1)} = W_z$, $k = 0$
3: Calculate $\tilde r^{(1)} = \tilde d_{obs} - \tilde G m^{(0)}$, $\tilde{\tilde G}^{(1)} = \tilde G (W^{(1)})^{-1}$
4: while not converged, (17) not satisfied, and $k < K_{max}$ do
5:   $k = k + 1$
6:   Find the SVD: $\tilde{\tilde G}^{(k)} = U \Sigma V^T$
7:   Use regularization parameter estimation to find $\alpha^{(k)}$
8:   Set $h^{(k)} = \sum_{i=1}^{m} \frac{\sigma_i^2}{\sigma_i^2 + (\alpha^{(k)})^2} \frac{u_i^T \tilde r^{(k)}}{\sigma_i} v_i$
9:   Set $m^{(k)} = m^{(k-1)} + (W^{(k)})^{-1} h^{(k)}$
10:  Impose constraint conditions on $m^{(k)}$ to force $\rho_{min} \le m^{(k)} \le \rho_{max}$
11:  Test convergence and exit loop if converged
12:  Calculate the residual $\tilde r^{(k+1)} = \tilde d_{obs} - \tilde G m^{(k)}$
13:  Set $W_{L_1}^{(k+1)} = \mathrm{diag}\left( ((m^{(k)} - m^{(k-1)})^2 + \epsilon^2)^{-1/4} \right)$, as in (10), and $W^{(k+1)} = W_{L_1}^{(k+1)} W_z$
14:  Calculate $\tilde{\tilde G}^{(k+1)} = \tilde G (W^{(k+1)})^{-1}$
15: end while
Output: Solution $\rho = m^{(k)}$. $K = k$.

The GKB is applied to the system of equations with system matrix $\tilde{\tilde G}$ and right-hand side vector $\tilde r$. A bidiagonal matrix $B_t \in \mathbb{R}^{(t+1) \times t}$ and matrices $H_{t+1} \in \mathbb{R}^{m \times (t+1)}$, $A_t \in \mathbb{R}^{n \times t}$ with orthonormal columns are generated such that, see (Paige & Saunders 1982a; 1982b),

$\tilde{\tilde G} A_t = H_{t+1} B_t, \quad H_{t+1} e_{t+1} = \tilde r / \|\tilde r\|_2. \qquad (19)$

Here, and throughout, we use $t$ to indicate the number of steps in the GKB, and use a $t$-dependent subscript on any variable that is obtained through this process. So, for example, $H_{t+1}$ is the matrix of size $m$ by $t+1$ which is obtained after $t$ steps, and we explicitly note that $e_{t+1}$ is the unit vector of length $t+1$ with a $1$ in the first entry. The columns of $A_t$ span the Krylov subspace $\mathcal{K}_t$ given by

$\mathcal{K}_t(\tilde{\tilde G}^T \tilde{\tilde G}, \tilde{\tilde G}^T \tilde r) = \mathrm{span}\{\tilde{\tilde G}^T \tilde r,\ (\tilde{\tilde G}^T \tilde{\tilde G}) \tilde{\tilde G}^T \tilde r,\ (\tilde{\tilde G}^T \tilde{\tilde G})^2 \tilde{\tilde G}^T \tilde r,\ \ldots,\ (\tilde{\tilde G}^T \tilde{\tilde G})^{t-1} \tilde{\tilde G}^T \tilde r\}, \qquad (20)$

and an approximate solution $h_t$ that lies in this Krylov subspace will have the form $h_t = A_t z_t$, $z_t \in \mathbb{R}^t$. The matrix $W^{-1}$, which acts as a right preconditioner for the system, is updated at each IRLS iteration $k$ (Algorithm 1 step 13); hence $\tilde{\tilde G}$ and $\tilde r$ depend on the iteration $k$, and the Krylov subspace (20) changes at each iteration $k$. Here $W$ is not used to accelerate convergence but to enforce a specific regularity condition on the solution (Gazzola & Nagy 2014).


For the ill-posed problem with system matrix $\tilde{\tilde G}$, the projected matrix $B_t$ with large enough $t$ inherits the ill-conditioning of $\tilde{\tilde G}$ (Paige & Saunders 1982a; 1982b). To write (14) in terms of the projected space, we define the residuals for the full and projected problems by, respectively, see Chung et al. (2008),

$R_{full}(h_t) = \tilde{\tilde G} h_t - \tilde r, \quad \text{and} \quad R_{proj}(z_t) = B_t z_t - \|\tilde r\|_2 e_{t+1}, \qquad (21)$

then

$R_{full}(h_t) = \tilde{\tilde G} A_t z_t - \tilde r = H_{t+1} B_t z_t - H_{t+1} \|\tilde r\|_2 e_{t+1} = H_{t+1} R_{proj}(z_t). \qquad (22)$

By the column orthogonality of $H_{t+1}$ the fidelity norm is preserved under projection, $\|R_{full}(h_t)\|_2^2 = \|R_{proj}(z_t)\|_2^2$. Similarly, by the column orthogonality of $A_t$, $\|h\|_2^2 = \|z\|_2^2$. Thus (14) is replaced by

$P^{\zeta}(z) = \|B_t z - \|\tilde r\|_2 e_{t+1}\|_2^2 + \zeta^2 \|z\|_2^2. \qquad (23)$

Here the regularization parameter $\zeta$ replaces $\alpha$ to make it explicit that, while $\zeta$ has the same role as $\alpha$ as a regularization parameter, we do not assume that the regularization required is the same on the projected and full spaces. Since the dimensions of $B_t$ are small as compared to the dimensions of $\tilde{\tilde G}$, the solution of the projected problem (23) is obtained efficiently from

$z_t(\zeta) = (B_t^T B_t + \zeta^2 I_t)^{-1} B_t^T \|\tilde r\|_2 e_{t+1} = B_t(\zeta) \|\tilde r\|_2 e_{t+1}, \qquad (24)$

where $B_t(\zeta) = (B_t^T B_t + \zeta^2 I_t)^{-1} B_t^T$ can be obtained using the SVD, see Appendix A, and the update for the global solution is immediately given by $m_t(\zeta) = m_{apr} + W^{-1} A_t z_t(\zeta)$.

Although $H_{t+1}$ and $A_t$ have orthonormal columns in exact arithmetic, Krylov methods lose orthogonality in finite precision. This means that after a relatively low number of iterations the vectors in $H_{t+1}$ and $A_t$ are no longer orthogonal, and the relationship between (14) and (23) does not hold. Here we therefore use Modified Gram-Schmidt reorthogonalization, see Hansen (2007), to maintain the column orthogonality, which is also important for replicating the dominant spectral properties of $\tilde{\tilde G}$ by $B_t$. We summarize the steps needed for the implementation of the projected $L_1$ inversion in Algorithm 2 for a specified projected subspace size $t$. We emphasize the differences between Algorithms 1 and 2: first, the size of the projected space $t$ needs to be given; then steps 6 to 8 in Algorithm 1 are replaced by steps 6 to 10 in Algorithm 2.
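A compact sketch of Bidiag 1 with modified Gram-Schmidt reorthogonalization, generating the $H_{t+1}$, $B_t$ and $A_t$ of (19), is given below; it assumes $t$ is small enough that no breakdown (a zero normalization constant) occurs, and is illustrative rather than the authors' implementation.

```python
import numpy as np

def gkb(Gtt, rt, t):
    # Golub-Kahan bidiagonalization (Bidiag 1) of (19), with modified
    # Gram-Schmidt reorthogonalization of the columns of H and A:
    # Gtt A_t = H_{t+1} B_t, H_{t+1} e_{t+1} = rt / ||rt||_2.
    m, n = Gtt.shape
    H = np.zeros((m, t + 1)); A = np.zeros((n, t)); B = np.zeros((t + 1, t))
    H[:, 0] = rt / np.linalg.norm(rt)
    for j in range(t):
        w = Gtt.T @ H[:, j]
        if j > 0:
            w -= B[j, j - 1] * A[:, j - 1]
        for i in range(j):                    # reorthogonalize against A
            w -= (A[:, i] @ w) * A[:, i]
        B[j, j] = np.linalg.norm(w); A[:, j] = w / B[j, j]
        u = Gtt @ A[:, j] - B[j, j] * H[:, j]
        for i in range(j + 1):                # reorthogonalize against H
            u -= (H[:, i] @ u) * H[:, i]
        B[j + 1, j] = np.linalg.norm(u); H[:, j + 1] = u / B[j + 1, j]
    return H, B, A

# Check of (19) on an illustrative random system:
rng = np.random.default_rng(2)
Gtt, rt = rng.standard_normal((40, 100)), rng.standard_normal(40)
H, B, A = gkb(Gtt, rt, t=10)
print(np.allclose(Gtt @ A, H @ B))            # True
```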

2.3 Practicalities of the Algorithms

Although the algorithms suggest that one explicitly needs the matrix $\tilde{\tilde G}$, both the SVD and the bidiagonalization algorithms can be modified to use $\tilde G$ and $W$ directly. It is sufficient that one can efficiently perform operations with $\tilde G$, $\tilde G^T$, and the diagonal matrices $W$.


Algorithm 2: Iterative Projected $L_1$ Inversion Algorithm Using Golub-Kahan Bidiagonalization

Input: $d_{obs}$, $m_{apr}$, $G$, $W_d$, $\epsilon > 0$, $\rho_{min}$, $\rho_{max}$, $t$, $K_{max}$, $\beta$
1: Calculate $W_z$, $\tilde G = W_d G$, and $\tilde d_{obs} = W_d d_{obs}$
2: Initialize $m^{(0)} = m_{apr}$, $W_{L_1}^{(1)} = I_n$, $W^{(1)} = W_z$, $k = 0$
3: Calculate $\tilde r^{(1)} = \tilde d_{obs} - \tilde G m^{(0)}$, $\tilde{\tilde G}^{(1)} = \tilde G (W^{(1)})^{-1}$
4: while not converged, (17) not satisfied, and $k < K_{max}$ do
5:   $k = k + 1$
6:   Find the factorization: $\tilde{\tilde G}^{(k)} A_t^{(k)} = H_{t+1}^{(k)} B_t^{(k)}$ with $H_{t+1}^{(k)} e_{t+1} = \tilde r^{(k)} / \|\tilde r^{(k)}\|_2$
7:   Find the SVD: $B_t^{(k)} = U \Gamma V^T$
8:   Use regularization parameter estimation to find $\zeta^{(k)}$
9:   Set $z_t^{(k)} = \sum_{i=1}^{t} \frac{\gamma_i^2}{\gamma_i^2 + (\zeta^{(k)})^2} \frac{u_i^T (\|\tilde r^{(k)}\|_2 e_{t+1})}{\gamma_i} v_i$
10:  Set $h_t^{(k)} = A_t^{(k)} z_t^{(k)}$
11:  Set $m^{(k)} = m^{(k-1)} + (W^{(k)})^{-1} h_t^{(k)}$
12:  Impose constraint conditions on $m^{(k)}$ to force $\rho_{min} \le m^{(k)} \le \rho_{max}$
13:  Test convergence and exit loop if converged
14:  Calculate the residual $\tilde r^{(k+1)} = \tilde d_{obs} - \tilde G m^{(k)}$
15:  Set $W_{L_1}^{(k+1)} = \mathrm{diag}\left( ((m^{(k)} - m^{(k-1)})^2 + \epsilon^2)^{-1/4} \right)$, as in (10), and $W^{(k+1)} = W_{L_1}^{(k+1)} W_z$
16:  Calculate $\tilde{\tilde G}^{(k+1)} = \tilde G (W^{(k+1)})^{-1}$
17: end while
Output: Solution $\rho = m^{(k)}$. $K = k$.
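Steps 7-11 of Algorithm 2 amount to a small projected Tikhonov solve followed by mapping back to the full space; a minimal sketch, with illustrative names, follows.

```python
import numpy as np

def projected_update(B, A, rt_norm, zeta, w, m_prev):
    # Solve (23) via the SVD of B_t (steps 7-9): z_t(zeta) =
    # (B^T B + zeta^2 I)^{-1} B^T ||rt||_2 e_{t+1}, then map back to the
    # full space (steps 10-11): m = m_prev + W^{-1} A_t z_t.
    U, g, Vt = np.linalg.svd(B, full_matrices=False)   # B is (t+1) x t
    b = np.zeros(B.shape[0]); b[0] = rt_norm           # ||rt||_2 e_{t+1}
    z = Vt.T @ ((g / (g ** 2 + zeta ** 2)) * (U.T @ b))
    return m_prev + (A @ z) / w
```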

Operations $Wx$ for arbitrary $x$ involve just a scaling of the vector $x$ for diagonal $W$, as do operations with $W^{-1}$. Further, diagonal matrices require linear storage, here $O(n)$, as compared to $\tilde G$, which uses storage $O(m \times n)$.

With respect to the total cost of the algorithms, it is clear that the costs at a given iteration differ due to the replacement of steps 6 to 8 in Algorithm 1 by steps 6 to 10 in Algorithm 2. Roughly, the full algorithm requires the SVD of a matrix of size $m \times n$, the generation of the update $h^{(k)}$ using the SVD, and the estimate of the regularization parameter. Given the SVD, and assuming that one searches for $\alpha$ over a range of $q$ logarithmically distributed estimates, with $q \ll m \ll n$, as would be expected for a large scale problem, the cost of estimating $\alpha$ is negligible in contrast to the other two steps. For $m \ll n$, the cost is dominated by terms of $O(n^3)$ for finding the SVD (Golub & van Loan 1996). In the projected algorithm the SVD step for $B_t$ and the generation of $\zeta$ are dominated by terms of $O(t^3)$. In addition, the update $h_t^{(k)}$ is a matrix-vector multiply of $O(mt)$, and generating the factorization is $O(mnt)$ (Paige & Saunders 1982a). Effectively, for $t \ll m$, the dominant term $O(mnt)$ is actually $O(mn)$ with $t$ as a scaling, as compared to the high cost $O(n^3)$. The differences between the
costs then increase dependent on the number of iterations K that are required. A more precise estimate of all costs is beyond the scope of this paper, and depends carefully on the implementation used for the SVD and the GKB factorization, both also depending on the sparsity structure of model matrix G.

2.4 Regularization

Both Algorithms 1 and 2 require the determination of a regularization parameter, $\alpha$ or $\zeta$, in steps 7 and 8, respectively. Different approaches have been considered for solving the regularized problem using iterative methods. For example, the LSQR algorithm as described in (Paige & Saunders 1982a; 1982b) uses a fixed $\zeta = \alpha$. This can be useful when solving multiple problems in which one has prior information on how the noise enters the problem and the degree of regularization that will be required. A hybrid method adjusts the parameter $\zeta$ for each projected space of size $t$, for increasing $t$ (Chung et al. 2008; Kilmer & O'Leary 2001), and assesses the convergence of the solution, but still requires a mechanism to find $\zeta_t$ for each $t$. Here we wish to find $\zeta$ optimally for a fixed projected problem of size $t$, and it is therefore important that the resulting solution appropriately regularizes the full problem, i.e. that effectively $\zeta_{opt} \approx \alpha_{opt}$, where $\zeta_{opt}$ and $\alpha_{opt}$ are the optimal regularization parameters for the projected and full problems, respectively. It is clear that the projected solution $z_t(\zeta)$ depends explicitly on the subspace size $t$. Although we will discuss the effect of choosing different $t$ on the solution, our focus here is not on using existing techniques for finding an optimal subspace size $t_{opt}$, see for example the discussions in (Hnětynková et al. 2009; Renaut et al. 2015). Instead, with our emphasis that the solution should be an appropriately regularized solution of the full problem, we examine the spectrum of $B_t$. For small $t$, the singular values of $B_t$ approximate the largest singular values of $\tilde{\tilde G}$; however, for larger $t$ the smaller singular values of $B_t$ approximate the smallest singular values of $\tilde{\tilde G}$, so that there is no immediate one-to-one alignment of the small singular values between $\tilde{\tilde G}$ and $B_t$ with increasing $t$. Thus, if the regularized projected problem is to give a good approximation of the regularized full problem, it is important to choose $t$ such that the dominant singular values of $\tilde{\tilde G}$ are well approximated by those of $B_t$, effectively capturing the dominant subspace for the solution, and that the estimate of the regularization parameter appropriately uses the spectral information. This is discussed carefully in Section 3.1.

3 REGULARIZATION PARAMETER ESTIMATION

Whether solving the full problem using the SVD, or the projected problem, equations (14) and (23) respectively, a regularization parameter $\alpha$ or $\zeta$ is required. There are many approaches in the literature for determining the regularization parameter, which are well documented, for example, in
(Hansen 1998). The application of these techniques to the projected problem is not always immediate, see for example the discussion by Chung et al. (2008) on the weighted GCV. Here we focus on the UPRE, for which our previous investigations have shown that an effective estimation of the regularization parameter is obtained (Renaut et al. 2015; Vatankhah et al. 2015; Vatankhah et al. 2014c). The derivation of the UPRE for a general Tikhonov function is given in Appendix B.

3.1 Unbiased predictive risk estimator

For the full problem we identify $\tilde{\tilde G}$ with $A$, $h$ with $x$, $\alpha$ with $\lambda$, and $\tilde r$ with $b$ in (B.1). We note that, due to the weighting by the inverse square root of the covariance matrix for the noise, the covariance matrix for the noise in $\tilde r$ is $I$. Thus the UPRE for the full problem is

$\alpha_{opt} = \arg\min_{\alpha} \{ U(\alpha) := \|(H(\alpha) - I_m)\tilde r\|_2^2 + 2\,\mathrm{trace}(H(\alpha)) - m \}. \qquad (25)$

Here $H(\alpha) = \tilde{\tilde G}\,\tilde{\tilde G}(\alpha)$. Typically $\alpha_{opt}$ is found by evaluating (25) for a range of $\alpha$, for example by the SVD, see Appendix C, with the minimum found within that range of parameter values.

For the projected problem we identify $z_t$ with $x$, $B_t$ with $A$, $\zeta$ with $\lambda$, and $\|\tilde r\|_2 e_{t+1}$ with $b$. In order to deduce the UPRE we must consider the noise distribution. We first observe that, given $\tilde r = \tilde r_{exact} + \tilde\eta$, then $\|\tilde r\|_2 e_{t+1} = H_{t+1}^T \tilde r$ consists of a deterministic and a stochastic part, $H_{t+1}^T \tilde r_{exact} + H_{t+1}^T \tilde\eta$, where for the white noise vector $\tilde\eta$ and column orthogonal $H_{t+1}$, $H_{t+1}^T \tilde\eta$ is a random vector of length $t+1$ with covariance matrix $I_{t+1}$. Thus, we immediately obtain the UPRE for the projected problem

$\zeta_{opt} = \arg\min_{\zeta} \{ U(\zeta) := \|(H(\zeta) - I_{t+1})\|\tilde r\|_2 e_{t+1}\|_2^2 + 2\,\mathrm{trace}(H(\zeta)) - (t+1) \}. \qquad (26)$

Here $H(\zeta) = B_t B_t(\zeta)$ is the influence matrix for the subspace. As for the full problem, see Appendix C, the SVD of the matrix $B_t$ can be used to find $\zeta_{opt}$. In Section 4.1 we show that in some situations (26) does not work well. A modification is then introduced that does not use the entire subspace for a given $t$, but rather uses a truncated spectrum from $B_t$ for finding the regularization parameter, thus assuring that the dominant $t_{trunc}$ singular values are appropriately regularized.
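Evaluated through the SVD, both (25) and (26) reduce to sums over filter factors; the sketch below evaluates $U$ over a range of parameters and takes the minimizer. The part of the residual norm lying outside the column space is independent of the parameter and is omitted here, which shifts $U$ by a constant without changing the minimizer; names are illustrative.

```python
import numpy as np

def upre(s, utb, lam, n_data):
    # UPRE of (25)/(26) via the SVD: with filter factors
    # f_i = s_i^2/(s_i^2 + lam^2), the parameter-dependent part is
    # sum_i (1 - f_i)^2 (u_i^T b)^2 + 2 sum_i f_i - n_data.
    f = s ** 2 / (s ** 2 + lam ** 2)
    return np.sum(((1.0 - f) * utb) ** 2) + 2.0 * np.sum(f) - n_data

def upre_minimizer(s, utb, n_data, lams):
    # Evaluate over a (logarithmic) range of parameters, as in the text,
    # and return the minimizer within that range.
    return lams[int(np.argmin([upre(s, utb, l, n_data) for l in lams]))]

# For the projected problem (26): s = singular values of B_t,
# utb = U^T (||rt||_2 e_{t+1}), and n_data = t + 1.
```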

4 SYNTHETIC EXAMPLES

4.1 Model of an embedded cube

The initial goal of the presented verification with simulated data is to contrast Algorithms 1 and 2. We use a simple small-scale model that includes a cube with density contrast 1 g cm$^{-3}$ embedded in a homogeneous background. The cube has size 200 m in each dimension and is buried at depth 50 m, Fig. 2(a). Simulation data on the surface, $d_{exact}$, are calculated over a 20 × 20 regular grid with 50 m grid spacing.

Figure 2. (a) Model of the cube in a homogeneous background; the density contrast of the cube is 1 g cm$^{-3}$. (b) Data due to the model, contaminated with noise N2.

To add noise to the data, a zero-mean Gaussian random matrix $\Theta$ of size $m \times 10$ was generated. Then, setting

$d_{obs}^c = d_{exact} + (\tau_1 (d_{exact})_i + \tau_2 \|d_{exact}\|)\, \Theta_c, \qquad (27)$

for $c = 1:10$, with noise parameter pairs $(\tau_1, \tau_2)$ for three choices, N1: $(0.01, 0.001)$, N2: $(0.02, 0.005)$ and N3: $(0.03, 0.01)$, gives 10 noisy right-hand side vectors for 3 levels of noise. Fig. 2(b) shows noise-contaminated data for one right-hand side, here $c = 7$, and for N2. This noise model is standard in the geophysics literature, see e.g. (Li & Oldenburg 1996), and incorporates the effects of both instrumental and physical noise.

For the inversion, the model region of depth 500 m is discretized into $20 \times 20 \times 10 = 4000$ cells of size 50 m in each dimension. The background model $m_{apr} = 0$ and parameters $\beta = 0.8$ and $\epsilon^2$ = 1e-9 are chosen for the inversion. Realistic upper and lower density bounds, $\rho_{max} = 1$ g cm$^{-3}$ and $\rho_{min} = 0$ g cm$^{-3}$, are specified. The iterations are terminated when, approximating (17), $\chi^2_{Computed} \le 429$, or $k = K_{max} = 50$ is attained.
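The noise model (27) is straightforward to reproduce; a sketch, with the three $(\tau_1, \tau_2)$ pairs from the text, follows.

```python
import numpy as np

def noisy_copies(d_exact, tau1, tau2, n_copies=10, seed=0):
    # (27): d_obs^c = d_exact + (tau1 (d_exact)_i + tau2 ||d_exact||) Theta_c,
    # with Theta a zero-mean Gaussian matrix of size m x n_copies.
    rng = np.random.default_rng(seed)
    Theta = rng.standard_normal((d_exact.size, n_copies))
    scale = tau1 * d_exact + tau2 * np.linalg.norm(d_exact)
    return d_exact[:, None] + scale[:, None] * Theta

# Noise pairs of the text: N1 (0.01, 0.001), N2 (0.02, 0.005), N3 (0.03, 0.01).
```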

4.2 Solution using Algorithm 1

The inversion was performed using Algorithm 1 for the 3 noise levels and all 10 right-hand side data vectors. The final iteration $K$, the final regularization parameter $\alpha^{(K)}$ and the relative error of the reconstructed model,

$RE^{(K)} = \frac{\|m_{exact} - m^{(K)}\|_2}{\|m_{exact}\|_2}, \qquad (28)$

are recorded. Table 1 gives the average and standard deviation of $\alpha^{(K)}$, $RE^{(K)}$ and $K$ over the 10 samples.


Table 1. The inversion results, for final regularization parameter $\alpha^{(K)}$, relative error $RE^{(K)}$ and number of iterations $K$, obtained by inverting the data from the cube using Algorithm 1 with $\epsilon^2$ = 1e-9; average (standard deviation) over 10 runs.

Noise   α^(1)      α^(K)           RE^(K)           K
N1      47769.1    117.5 (10.6)    0.319 (0.017)    8.2 (0.4)
N2      48623.4    56.2 (8.5)      0.388 (0.023)    6.1 (0.6)
N3      48886.2    36.2 (9.1)      0.454 (0.030)    5.8 (1.3)

It was explained by Farquharson (2004) that it is efficient if the inversion starts with a large value of the regularization parameter. This prohibits imposing excessive structure in the model at early iterations, which would otherwise require more iterations to remove artificial structure. In this paper the method introduced by Vatankhah et al. (2014b; 2015) was used to determine the initial regularization parameter, $\alpha^{(1)}$. Because the nonzero singular values $\sigma_i$ of the matrix $\tilde{\tilde G}$ are known, the initial value

$\alpha^{(1)} = (n/m)^{3.5}\, (\sigma_1 / \mathrm{mean}(\sigma_i)), \qquad (29)$

where the mean is taken over the positive $\sigma_i$, can be selected. For subsequent iterations the UPRE method is used to estimate $\alpha^{(k)}$. To illustrate the results, Figs. 3-6 provide details for right-hand side $c = 7$ and for all noise levels. Fig. 3 shows the reconstructed models, indicating that a focused image of the subsurface is possible in all cases using Algorithm 1. The constructed models have sharp and distinct interfaces within the embedded medium. The progression of the data misfit $\Phi(m)$, the regularization term $S(m)$ and the regularization parameter $\alpha^{(k)}$ with iteration $k$ is presented in Fig. 4. $\Phi(m)$ is initially large and decays quickly in the first few steps, but the decay rate decreases dramatically as $k$ increases. Fig. 5 shows the progression of the relative error, (28), as a function of $k$. In all cases there is a dramatic decrease in the relative error for small $k$, after which the error decreases slowly. The UPRE function at iteration $k = 4$ is shown in Fig. 6. Clearly, for all cases the curves have a well-defined minimum, which is important for the determination of the regularization parameter. We return to this when considering the projected case.

The role of $\epsilon$ is very important: small values lead to a sparse model that becomes increasingly smoother as $\epsilon$ increases. To determine the dependence of Algorithm 1 on other values of $\epsilon^2$, we examined the results using right-hand side data for $c = 7$ for noise N2 with $\epsilon^2 = 0.5$ and $\epsilon^2$ = 1e-15, with all other parameters chosen as before. For $\epsilon^2$ = 1e-15 the results are close to those obtained with $\epsilon^2$ = 1e-9, and are not presented here. For $\epsilon^2 = 0.5$ the results are significantly different. Fig. 3(d) shows the reconstructed model, indicating a smeared-out and fuzzy image of the original model. The maximum of the obtained density is about 0.85 g cm$^{-3}$, 85% of the imposed $\rho_{max}$. The progression of the data misfit, the regularization term and the regularization parameter is presented in Fig. 4(d).

Figure 3. The reconstructed model obtained by inverting the noise-contaminated right-hand side with c = 7 using Algorithm 1 for noise cases (a) N1; (b) N2; (c) N3 with $\epsilon^2$ = 1e-9, and (d) N2 with $\epsilon^2 = 0.5$.

The relative error function and the UPRE curve at iteration 4 are shown in Fig. 5(d) and Fig. 6(d), respectively. Clearly more iterations are needed to terminate the algorithm, $K = 31$, and at the final iteration $\alpha^{(31)} = 1619.2$ and $RE^{(K)} = 0.563$, both larger than their counterparts in the case $\epsilon^2$ = 1e-9. We found that $\epsilon$ of order $10^{-4}$ to $10^{-8}$ is appropriate for the $L_1$ inversion algorithm. Hereafter, we fix $\epsilon^2$ = 1e-9.

Figure 4. Inverting the noise-contaminated right-hand side with c = 7 using Algorithm 1. The progression of the data misfit $\Phi(m)$, the regularization term $S(m)$, and the regularization parameter $\alpha^{(k)}$ with iteration $k$ for noise cases (a) N1; (b) N2; (c) N3 with $\epsilon^2$ = 1e-9, and (d) N2 with $\epsilon^2 = 0.5$.

Figure 5. Inverting the noise-contaminated right-hand side with c = 7 using Algorithm 1. The progression of the relative error at each iteration for noise cases (a) N1; (b) N2; (c) N3 with $\epsilon^2$ = 1e-9, and (d) N2 with $\epsilon^2 = 0.5$.

4.2.1 Discussion of the degree of ill-posedness

Before presenting the results using Algorithm 2, it is useful to examine the underlying properties of the matrix $\tilde{\tilde G}$ in (14). There are various notions of the degree to which a given problem is ill-posed. We use the heuristic that if there exists $\nu$ such that $\sigma_j \approx O(j^{-\nu})$, then the system is mildly ill-posed for $0 < \nu < 1$ and moderately ill-posed for $\nu > 1$. If $\sigma_j \approx O(e^{-\nu j})$ for $\nu > 1$, the problem is said to be severely ill-posed, see e.g. (Hoffmann 1986; Huang & Jia 2016). These relationships relate to the speed with which the spectral values decay to $0$, decreasing much more quickly in the more severely ill-posed cases. Examination of the spectrum of $\tilde{\tilde G}$ immediately suggests that the problem is mildly to moderately ill-posed, and under that assumption it is not difficult to do a nonlinear fit of the $\sigma_j$ to $C j^{-\nu}$ to find $\nu$. Here we use lsqcurvefit from Matlab. The data fitting result for the case N2 is shown in Fig. 7 over 4 iterations of the IRLS for a given sample. There is a very good fit to the data in each case, and $\nu$ is clearly iteration dependent. These results are representative of the results obtained using both N1 and N3. We conclude that the gravity inversion problem as used here is mildly to moderately ill-posed, with the parameter $\nu$ changing, but not dramatically, at each iteration.
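The fit of the singular values to $C j^{-\nu}$ can be reproduced with SciPy's curve_fit, the analogue of the Matlab lsqcurvefit call used here; the synthetic decay in the example is illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

def decay_rate(s):
    # Nonlinear fit of singular values to C * j^(-nu); returns nu, so that
    # 0 < nu < 1 indicates mild and nu > 1 moderate ill-posedness.
    j = np.arange(1, s.size + 1, dtype=float)
    (C, nu), _ = curve_fit(lambda j, C, nu: C * j ** -nu, j, s, p0=(s[0], 1.0))
    return nu

s = 1.0e4 * np.arange(1, 401, dtype=float) ** -0.8   # synthetic spectrum
print(decay_rate(s))                                 # approximately 0.8
```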

Figure 6. Inverting the noise-contaminated right-hand side with c = 7 using Algorithm 1. The UPRE function at iteration 4 for noise cases (a) N1; (b) N2; (c) N3 with $\epsilon^2$ = 1e-9, and (d) N2 with $\epsilon^2 = 0.5$.

Figure 7. The original spectrum of the matrix $\tilde{\tilde G}$ and the fitted function for a case N2 at the first iteration, intermediate iterations 3 and 5, and the final iteration, here iteration 6, shown left to right and top to bottom, with the title of each panel indicating $\nu$ at that iteration. Here $\nu$ is 0.5925, 1.0215, 0.82304 and 0.92474 for iterations 1, 3, 5 and 6, respectively.

4.3 Solution using Algorithm 2

Algorithm 2 is used to reconstruct the model for 4 different values of $t$, $t = 3$, $100$, $200$ and $400$, in order to examine the impact of the size of the projected subspace on the solution and on the estimated parameter $\zeta$. The results are given in Table 2. Here, in order to compare the algorithms, the initial regularization parameter, $\zeta^{(1)}$, is set to the value that would be used on the full space. Generally, for small $t$ the estimated regularization parameter is less than its counterpart for the full case at the specific noise level. Comparing Tables 1 and 2, it is clear that with increasing $t$ the estimated $\zeta$ increases, reaching $\alpha^{(K)}$ of the full space when $t = m$. For $t = 3$, the algorithm is computationally very fast and the relative error of the reconstructed model is acceptable, but still larger than that obtained using the full model. As $t$ becomes greater than 3, and approaches $t = 200$, the results are not satisfactory. For a sample case, $t = 100$, the results are presented in Table 2: the relative error is very large and the reconstructed model is generally not acceptable. Although the results with the least noise are acceptable, they are still worse than those obtained with the other selected choices for $t$. In this case, $t = 100$, and for high noise levels, the algorithm usually terminates when it reaches $k = K_{max} = 50$, indicating that the solution does not satisfy the noise level constraint (17). For $t = 200$ the results are again acceptable, although less satisfactory than the results obtained with the full space.


Table 2. The inversion results, for final regularization parameter $\zeta^{(K)}$, relative error $RE^{(K)}$ and number of iterations $K$, obtained by inverting the data from the cube using Algorithm 2 with $\epsilon^2$ = 1e-9 and $\zeta^{(1)} = \alpha^{(1)}$ for the specific noise level as given in Table 1. In each case the average (standard deviation) over 10 runs.

Noise   t     ζ^(K)           RE^(K)           K
N1      3     75.5 (2.9)      0.427 (0.037)    13.5 (0.5)
N1      100   98.9 (12.0)     0.452 (0.043)    10.0 (0.7)
N1      200   102.2 (11.3)    0.330 (0.019)    8.8 (0.4)
N1      400   117.5 (10.6)    0.319 (0.017)    8.2 (0.4)
N2      3     46.7 (14.2)     0.472 (0.041)    7.8 (0.6)
N2      100   42.8 (10.4)     1.009 (0.184)    28.1 (10.9)
N2      200   43.8 (7.0)      0.429 (0.053)    6.7 (0.8)
N2      400   56.2 (8.5)      0.388 (0.023)    6.1 (0.6)
N3      3     25.6 (4.7)      0.493 (0.030)    6.1 (0.7)
N3      100   8.4 (13.3)      1.118 (0.108)    42.6 (15.6)
N3      200   27.2 (6.3)      0.463 (0.036)    5.5 (0.5)
N3      400   32.6 (9.1)      0.454 (0.030)    5.8 (1.3)

With increasing $t$ the results improve, until for $t = m = 400$ the results reproduce those obtained with Algorithm 1. Furthermore, in this case the smallest relative error for the projected space solution is obtained. This confirms the results in Renaut et al. (2015) that when $t$ approximates the numerical rank of the sensitivity matrix, $\zeta_{proj} = \alpha_{full}$. To illustrate the results, we show the reconstructed models using Algorithm 2 with different $t$ for right-hand side $c = 7$ for noise N2, Fig. 8. As illustrated, the reconstructed models for $t = 3$, $200$ and $400$ are acceptable, while for $t = 100$ the results are completely wrong. For some right-hand sides $c$ with $t = 100$, the reconstructed models may be much worse than shown in Fig. 8(b). The progression of the data misfit, the regularization term and the regularization parameter with iteration $k$ is presented in Fig. 9, while the relative error is shown in Fig. 10. For $t = 100$ and for high noise levels, the estimated value of $\zeta^{(k)}$ using (26) for $1 < k < K$ is usually small, corresponding to under regularization and yielding a large error in the solution. To understand why the UPRE leads to under regularization, we illustrate the UPRE curves at iteration $k = 4$ in Fig. 11. It is immediate that when using moderately sized $t$, $U(\zeta)$ may not have a well-defined minimum, because $U(\zeta)$ becomes increasingly flat. Thus the algorithm may find a minimum at a small regularization parameter, which leads to under regularization of the dominant, and more accurate, terms in the expansion. This can cause problems for moderate $t$, $t < 200$. On the other hand, as $t$ increases, e.g. for $t = 200$ and $400$, it appears that there is a unique minimum of $U(\zeta)$ and the regularization parameter found is appropriate. Unfortunately, this situation creates a conflict with the need to use $t \ll m$ for large problems.

4.4 Extending the projected UPRE by spectrum truncation

To determine the reason for the difficulty in using $U(\zeta)$ to find an optimal $\zeta$ for small to moderate $t$, we illustrate the singular values of $B_t$ and $\tilde{\tilde G}$ in Fig. 12 for the chosen values of $t$ at iteration 4, and, for the case $t = 100$, the singular values at iterations 1, 3, 5 and the final iteration 14.

Figure 8. The reconstructed model obtained by inverting the noise-contaminated right-hand side with c = 7 for noise N2 using Algorithm 2 with $\epsilon^2$ = 1e-9. (a) t = 3; (b) t = 100; (c) t = 200; and (d) t = 400.

It can be seen that for very small $t$ we can only expect to accurately capture a small number of the singular values. Fig. 12(a) shows that $t = 3$ accurately estimates the first 3 singular values, but Fig. 12(b) shows that using $t = 100$ we do not estimate the first 100 singular values accurately; rather, only about the first 80 are given accurately. The spectrum decays too rapidly, because $B_t$ captures the condition of the full system matrix. For $t = 200$ we capture about 160 to 170 correctly. In Figs. 12(e)-12(h) we look at the singular values with iteration $k$ for the problem of size $t = 100$ at iterations 1, 3, 5, and 14, as compared to the first 150 singular values of the full matrix.

Figure 9. Inverting the noise-contaminated right-hand side with c = 7 for noise N2 using Algorithm 2 with $\epsilon^2$ = 1e-9. The progression of the data misfit $\Phi(m)$, the regularization term $S(m)$, and the regularization parameter $\zeta^{(k)}$ with iteration $k$, for (a) t = 3; (b) t = 100; (c) t = 200; and (d) t = 400.

Figure 10. Inverting the noise-contaminated right-hand side with c = 7 for noise N2 using Algorithm 2 with $\epsilon^2$ = 1e-9. The progression of the relative error at each iteration for (a) t = 3; (b) t = 100; (c) t = 200; and (d) t = 400.

We see that the behavior is similar for all iterations. Thus using all the singular values from $B_t$ generates regularization parameters which are determined by the smallest singular values rather than by the dominant terms. Indeed, because $\tilde{\tilde G}$ is predominantly mildly ill-posed over all iterations, the associated matrices $B_t$ do not accurately capture a right subspace of size $t$. Specifically, Huang & Jia (2016) have shown that the LSQR iteration captures the underlying Krylov subspace of the system matrix better when that matrix is severely or moderately ill-posed, and therefore LSQR has better regularizing properties in these cases. On the other hand, for the mild cases LSQR is not sufficiently regularizing: additional regularization is needed, and the right singular subspace is contaminated by inaccurate spectral information that causes difficulty for effectively regularizing the projected problem in relation to the full problem.

Now suppose that we use $t$ steps of the GKB on the matrix $\tilde{\tilde G}$ to obtain $B_t$, but use only

$t_{trunc} = \omega t, \quad \omega < 1, \qquad (30)$

singular values of $B_t$ in estimating $\zeta$ in step 8 of Algorithm 2. Our examinations suggest taking $0.7 \le \omega < 1$, with this choice consistent across all iterations $k$.

Figure 11. Inverting the noise-contaminated right-hand side with c = 7 for noise N2 using Algorithm 2 with $\epsilon^2$ = 1e-9. The UPRE function at iteration 4 for (a) t = 3; (b) t = 100; (c) t = 200; and (d) t = 400.


Table 3. The inversion results, for final regularization parameter $\zeta^{(K)}$, relative error $RE^{(K)}$ and number of iterations $K$, obtained by inverting the data from the cube using Algorithm 2 with $\epsilon^2$ = 1e-9, using TUPRE with $t = 100$, and $\zeta^{(1)} = \alpha^{(1)}$ for the specific noise level as given in Table 1. In each case the average (standard deviation) over 10 runs.

Noise   ζ^(K)          RE^(K)           K
N1      131.1 (9.2)    0.308 (0.007)    6.7 (0.8)
N2      52.4 (6.2)     0.422 (0.049)    6.8 (0.9)
N3      30.8 (3.3)     0.483 (0.060)    6.9 (1.1)

With this choice the smallest singular values of $B_t$ are ignored in estimating $\zeta$. We denote this approach, in which step 8 in Algorithm 2 is replaced by estimates using $t_{trunc}$, the truncated UPRE (TUPRE). We comment that the approach may work equally well for alternative regularization techniques, but this is not a topic of the current investigation, neither is a detailed investigation of the choice of $\omega$. Furthermore, we note this is not a standard filtered truncated SVD for the solution: the truncation of the spectrum is used in the estimation of the regularization parameter, but all $t$ terms of the singular value expansion are used for the solution. To show the efficiency of TUPRE, we run the inversion algorithm for the case $t = 100$, for which the original results are not realistic. The results using TUPRE are given in Table 3 for noise N1, N2 and N3, and illustrated, for right-hand side $c = 7$ for noise N2, in Fig. 13. Comparing the new results with those obtained in Section 4.3 demonstrates that the TUPRE method yields a significant improvement in the solutions. Indeed, Fig. 13(c) shows the existence of a well-defined minimum for $U(\zeta)$ at iteration $k = 4$, as compared to Fig. 11(b). Further, it is now possible to obtain acceptable solutions with small $t$, as demonstrated by the reconstructed models obtained using $t = 10$, $20$, $30$ and $40$ in Fig. 14. All results are reasonable, indicating that the method can be used with high confidence even for small $t$. We note here that $t$ should be selected as small as possible, $t \ll \min(m, n)$, so that the implementation of the GKB is fast, while simultaneously the condition of the original matrix should be captured. In addition, for very small $t$, for example $t = 10$ in Fig. 14, the method chooses the smallest singular value as the regularization parameter; there is no well-defined minimum for the curve. Our analysis suggests that $t > m/20$ is a suitable choice for application in the algorithm, although smaller $t$ may be used as $m$ increases. In all our test examples we use $\omega \approx 0.7$.
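A sketch of the TUPRE parameter choice follows: only the leading $t_{trunc} = \omega t$ singular values of $B_t$ enter the UPRE evaluation, while all $t$ terms are retained when the solution itself is formed. The constant term in $U$ does not affect the minimizer and is omitted; the names and search range are illustrative assumptions.

```python
import numpy as np

def tupre_parameter(g, utb, omega=0.7, n_grid=200):
    # g: singular values of B_t (descending); utb = U^T(||rt||_2 e_{t+1}).
    # Keep only t_trunc = omega * t leading terms, as in (30), when
    # evaluating the projected UPRE (26).
    t_trunc = int(omega * g.size)
    g_tr, utb_tr = g[:t_trunc], utb[:t_trunc]
    zetas = np.logspace(np.log10(g_tr[-1]), np.log10(g_tr[0]), n_grid)
    def U(z):
        f = g_tr ** 2 / (g_tr ** 2 + z ** 2)
        return np.sum(((1.0 - f) * utb_tr) ** 2) + 2.0 * np.sum(f)
    return zetas[int(np.argmin([U(z) for z in zetas]))]
```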

4.5 Model of multiple embedded bodies

A model consisting of six bodies with various geometries, sizes, depths and densities is used to verify the capability and limitations of Algorithm 2, implemented with TUPRE, for the recovery of a larger and more complex structure. Fig. 15(a) shows a perspective view of this model.

Figure 12. The singular values of $\tilde{\tilde G}$ (blue) and $B_t$ (red) at iteration 4 using Algorithm 2 with $\epsilon^2$ = 1e-9 for the noise-contaminated right-hand side with c = 7 and noise N2, for (a) t = 3; (b) t = 100; (c) t = 200; and (d) t = 400 in plots 12(a)-12(d). In 12(e)-12(h), the singular values of $B_t$ for t = 100 and the first 150 singular values of $\tilde{\tilde G}$, at iterations 1, 3, 5 and the final iteration 14.

Figure 13. Results for right-hand side c = 7 for noise N2 using Algorithm 2 with $\epsilon^2$ = 1e-9, using TUPRE with t = 100: (a) the progression of the data misfit $\Phi(m)$, the regularization term $S(m)$, and the regularization parameter $\zeta^{(k)}$ with iteration $k$; (b) the progression of the relative error at each iteration; (c) the TUPRE function at iteration 4; and (d) the reconstructed model.

Figure 14. The reconstructed model obtained by inverting the noise-contaminated right-hand side with c = 7 for noise N2 using Algorithm 2 with $\epsilon^2$ = 1e-9, using TUPRE. (a) t = 10; (b) t = 20; (c) t = 30; and (d) t = 40.

The density and dimensions of each body are given in Table 4. Fig. 16 shows four plane-sections of the model. The surface gravity data are calculated on a 100 × 60 grid with 100 m spacing, for a data vector of length 6000. Noise is added to the exact data vector as in (27) with (τ1, τ2) = (0.02, 0.001). Fig. 15(b) shows the noise-contaminated data. The subsurface extends to a depth of 1200 m, with cells of size 100 m in each dimension, yielding unknown model parameters on 100 × 60 × 12 = 72000 cells. The inversion assumes mapr = 0, β = 0.6, ε² = 1e−9 and imposes density bounds ρmin = 0 g cm−3 and ρmax = 1 g cm−3. The iterations are terminated when, approximating (17), χ²Computed ≤ 6110, or k > Kmax = 20.
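Equation (27) appears earlier in the paper and is not reproduced here; assuming it takes the same form as the error model used for the real data in Section 5, i.e. a per-datum standard deviation τ1|(dexact)i| + τ2‖dexact‖₂, a minimal MATLAB sketch of the noise generation is:

```matlab
% Sketch of the assumed noise model (our reading of (27)): Gaussian noise with
% per-datum standard deviation tau1*|d_i| + tau2*||d||_2 added to the exact data.
tau1 = 0.02; tau2 = 0.001;
sd   = tau1*abs(dexact) + tau2*norm(dexact);   % dexact: 6000 x 1 exact data vector
dobs = dexact + sd .* randn(size(dexact));     % noise-contaminated observations
```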

Figure 15. (a) A perspective view of the model: six different bodies embedded in a homogeneous background; the darker and lighter bodies have densities 1 g cm−3 and 0.8 g cm−3, respectively. (b) The noise-contaminated gravity anomaly due to the model.

Figure 16. The original model displayed in four plane-sections. The depths of the sections are: (a) Z = 100 m; (b) Z = 300 m; (c) Z = 500 m; and (d) Z = 700 m.

The inversion is performed using Algorithm 2, but with the TUPRE parameter choice method for step 8 and ω = 0.7. The initial regularization parameter is ζ^(1) = (n/m)^3.5 (γ1/mean(γi)), for γi, i = 1 : t. We implement the code for two values of t, t = 350 and t = 50. Figs 17 and 19 show the plane-sections of the resulting models for the two inversions. The results of both inversions are close to the original model, indicating that the algorithm is generally able to give realistic results even for a very small value of t. The inversions are completed on a Core i7 CPU 3.6 GHz desktop computer in less than 30 minutes for t = 350 and 5 minutes for t = 50. As illustrated, the horizontal borders of the bodies are recovered and the depths to their tops are close to those of the original model. At intermediate depths the shapes of the anomalies are acceptably reconstructed, while deeper in the subsurface the model does not match the original as well. Anomalies 1 and 2 extend to a greater depth than in the true model, although for t = 350 the discrepancy is less significant. Using the incorrect upper density bound for anomalies 4, 5 and 6 impacts the resulting model: for anomalies 4 and 6 the maximum density 1 g cm−3 is attained, exceeding the true value, while the value for anomaly 5 is close to the true value of 0.8 g cm−3. This is a common problem for algorithms aimed at focusing the image of the subsurface, and it is important to impose realistic density limits, see Portniaguine & Zhdanov (1999). In addition, Fig. 21 shows a 3-D perspective view of the inversion results for densities greater than 0.6 g cm−3. The progression of the data misfit, the regularization term and the regularization parameter with iteration k, the TUPRE function at the final iteration, and the progression of the relative error at each iteration are presented in Fig. 18, for t = 350, and Fig. 20, for t = 50. The memory requirements for all variables for the case t = 350 are given in Appendix D.

Table 4. Assumed parameters for the model with multiple bodies.

Source number    x × y × z dimensions (m)    Depth (m)    Density (g cm−3)
1                1000 × 2500 × 400           200          1
2                2500 × 1000 × 300           100          1
3                500 × 500 × 100             100          1
4                1000 × 1000 × 600           200          0.8
5                2000 × 500 × 600            200          0.8
6                1000 × 3000 × 400           100          0.8

We now implement Algorithm 2 on the projected subspace of size t = 50 using the UPRE function (26), rather than TUPRE, to find the regularization parameter. All other parameters are the same as before. The results are illustrated in Figs 22 and 23. The recovered model is not reasonable and the relative error is very large; the iterations are terminated at Kmax = 20. This behaviour is similar to our results for the cube and demonstrates that truncation of the subspace spectrum is required in order to obtain a suitable regularization parameter.
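For concreteness, the initialization of the regularization parameter quoted above can be computed directly from the singular values of the first GKB factorization; a minimal sketch, with gamma holding γi, i = 1 : t, and m, n as in the text:

```matlab
% Sketch: initial regularization parameter zeta^(1) = (n/m)^3.5 * gamma_1/mean(gamma_i).
zeta1 = (n/m)^3.5 * (gamma(1) / mean(gamma));
```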

5 REAL DATA: MOBRUN ANOMALY, NORANDA, QUEBEC

To illustrate the relevance of the approach in a practical case, we use the residual gravity data from the well-known Mobrun ore body, northeast of Noranda, Quebec, a massive pyritic body in a Precambrian volcanic environment. We carefully digitized the data from Figure 10.1 in Grant & West (1965) and re-gridded them onto a regular grid of 74 × 62 = 4588 data in the x and y directions, respectively, with grid spacing 10 m, see Fig. 24. In this case, the densities of the pyrite and the volcanic host rock were taken to be 4.6 g cm−3 and 2.7 g cm−3, respectively (Grant & West 1965). For the data inversion we use a model with cells of width 10 m in the eastern and northern directions. In the depth dimension the first 10 layers of cells have a thickness of 5 m, while the thicknesses of the subsequent layers increase gradually to 10 m. The maximum depth of the model is 160 m. This yields a model with z-coordinates 0 : 5 : 50, 56, 63, 71, 80 : 10 : 160, and a mesh with 74 × 62 × 22 = 100936 cells.

Figure 17. For the data in Fig. 15(b): the reconstructed model using Algorithm 2 with t = 350 and the TUPRE method. The depths of the sections are: (a) Z = 100 m; (b) Z = 300 m; (c) Z = 500 m; and (d) Z = 700 m.

Figure 18. For the data in Fig. 15(b), using Algorithm 2 with t = 350 and the TUPRE method: (a) the progression of the data misfit Φ(m) (⋆), the regularization term S(m) (+), and the regularization parameter ζ^(k) (◦) with iteration k; (b) the TUPRE function at iteration 11; and (c) the progression of the relative error RE^(k) at each iteration.

Figure 19. For the data in Fig. 15(b): the reconstructed model using Algorithm 2 with t = 50 and the TUPRE method. The depths of the sections are: (a) Z = 100 m; (b) Z = 300 m; (c) Z = 500 m; and (d) Z = 700 m.

Figure 20. For the data in Fig. 15(b), using Algorithm 2 with t = 50 and the TUPRE method: (a) the progression of the data misfit Φ(m) (⋆), the regularization term S(m) (+), and the regularization parameter ζ^(k) (◦) with iteration k; (b) the TUPRE function at iteration 12; and (c) the progression of the relative error RE^(k) at each iteration.

Figure 21. For the data in Fig. 15(b): isosurfaces of the 3-D inversion results with density greater than 0.6 g cm−3, using Algorithm 2 and the TUPRE method: (a) t = 350; and (b) t = 50.

We suppose each datum has an error with standard deviation 0.03(dobs)i + 0.004‖dobs‖₂. Algorithm 2 is used with TUPRE, t = 300 and ε² = 1e−9. The inversion process terminates after 11 iterations. The cross-section of the recovered model at y = 285 m and a plane-section at z = 50 m are shown in Figs 25(a) and 25(b), respectively. Fig. 26 shows a 3-D view of the reconstructed model with a density cut-off of 4 g cm−3.
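A minimal sketch of the depth discretization and data weighting just described follows; the construction of Wd from the reciprocal standard deviations is our assumption, consistent with the whitening of the data used throughout.

```matlab
% Sketch: Mobrun mesh depths (23 interfaces -> 22 layers) and data weighting.
z  = [0:5:50, 56, 63, 71, 80:10:160];             % z-coordinates quoted in the text
sd = 0.03*abs(dobs) + 0.004*norm(dobs);           % per-datum standard deviation
Wd = spdiags(1./sd, 0, numel(dobs), numel(dobs)); % assumed diagonal weighting matrix
```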

Figure 22. For the data in Fig. 15(b): the reconstructed model using Algorithm 2 with t = 50 and the projected UPRE. The depths of the sections are: (a) Z = 100 m; (b) Z = 300 m; (c) Z = 500 m; and (d) Z = 700 m.

Figure 23. For the data in Fig. 15(b), using Algorithm 2 with t = 50 and the projected UPRE: (a) the progression of the data misfit Φ(m) (⋆), the regularization term S(m) (+), and the regularization parameter ζ^(k) (◦) with iteration k; (b) the UPRE function at iteration 20; and (c) the progression of the relative error RE^(k) at each iteration.

The depth to the top of the body is about 10 to 15 m and the body extends to a maximum depth of 110 m. The results are in good agreement with those obtained from previous investigations and with the drill-hole information, especially for the depth to the top and at intermediate depths, see Figures 10-11 and 10-24 in Grant & West (1965). For comparison with the results of other inversion algorithms, we refer to Ialongo et al. (2014), who illustrate inversion results for the gravity data and the first-order vertical derivative of the gravity (Ialongo et al. 2014, Figures 14 and 16). The algorithm is fast, requiring less than 30 minutes, and yields a model with blocky features. The progression of the data misfit, the regularization term and the regularization parameter with iteration k, and the TUPRE function at the final iteration, are shown in Figs 27(a) and 27(b).

Figure 24. The residual anomaly of the Mobrun ore body, Noranda, Quebec, Canada.

Figure 25. For the data in Fig. 24: the reconstructed model using Algorithm 2 with t = 300 and the TUPRE method. (a) Cross-section at y = 285 m; and (b) plane-section at z = 50 m.

Figure 26. For the data in Fig. 24: the 3-D view of the recovered model with a density cut-off of 4 g cm−3.

Figure 27. For the data in Fig. 24: (a) the progression of the data misfit Φ(m) (⋆), the regularization term S(m) (+), and the regularization parameter ζ^(k) (◦) with iteration k; and (b) the TUPRE function at iteration 11.

6 CONCLUSIONS

We have developed an algorithm for sparse inversion of gravity data using an L1-norm stabilizer. The algorithm has been validated using a simulated small-scale cube model and a more complex model consisting of multiple bodies. Our results show that the gravity inverse problem can be efficiently and effectively solved using the GKB projection with regularization applied on the projected space. The UPRE was extended to the projected problem and used to estimate the regularization parameter for the subspace solution. Our results demonstrate that the regularization parameters selected using the UPRE for the projected problem lead to under-regularization and unsatisfactory solutions. Based on the analysis of the spectrum of the projected matrix, we showed that, by truncation of the small singular values, the regularization parameter can be found effectively in the subspace using the UPRE method. The new method, denoted here as TUPRE, gives results using the projected subspace algorithm that are comparable with those obtained for the full space, while requiring the generation of only a small projected space. The presented algorithm is practical and efficient and has been illustrated for the inversion of gravity data for the Mobrun ore body, Quebec, Canada, where the reconstructed model extends from 10-15 m to 110 m in depth. While the examples used in the paper are of small to moderate size, and the code is implemented in MATLAB on a desktop computer, we expect that for very large problems the situation is the same and that truncation is required for parameter estimation in conjunction with the LSQR algorithm. Future work will consider the algorithm for other data sets, and additional mathematical justification of the truncated UPRE algorithm.

ACKNOWLEDGMENTS

Rosemary Renaut acknowledges the support of NSF grant DMS 1418377: “Novel Regularization for Joint Inversion of Nonlinear Problems”.

REFERENCES

Ajo-Franklin, J. B., Minsley, B. J. & Daley, T. M., 2007. Applying compactness constraints to differential traveltime tomography, Geophysics, 72(4), R67-R75.
Björck, Å., 1996. Numerical Methods for Least Squares Problems, SIAM, Philadelphia, U.S.A.
Boulanger, O. & Chouteau, M., 2001. Constraints in 3D gravity inversion, Geophysical Prospecting, 49, 265-280.
Bruckstein, A. M., Donoho, D. L. & Elad, M., 2009. From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Review, 51(1), 34-81.
Chung, J., Nagy, J. G. & O'Leary, D. P., 2008. A weighted GCV method for Lanczos hybrid regularization, Electronic Transactions on Numerical Analysis, 28, 149-167.
Farquharson, C. G., 2008. Constructing piecewise-constant models in multidimensional minimum-structure inversions, Geophysics, 73(1), K1-K9.
Farquharson, C. G. & Oldenburg, D. W., 1998. Non-linear inversion using general measures of data misfit and model structure, Geophys. J. Int., 134, 213-227.
Farquharson, C. G. & Oldenburg, D. W., 2004. A comparison of automatic techniques for estimating the regularization parameter in non-linear inverse problems, Geophys. J. Int., 156, 411-425.
Gazzola, S. & Nagy, J. G., 2014. Generalized Arnoldi-Tikhonov method for sparse reconstruction, SIAM J. Sci. Comput., 36(2), B225-B247.
Golub, G. H., Heath, M. & Wahba, G., 1979. Generalized cross-validation as a method for choosing a good ridge parameter, Technometrics, 21(2), 215-223.
Golub, G. H. & van Loan, C., 1996. Matrix Computations, 3rd edn, Johns Hopkins University Press, Baltimore.
Grant, F. S. & West, G. F., 1965. Interpretation Theory in Applied Geophysics, McGraw-Hill, New York.
Hansen, P. C., 1992. Analysis of discrete ill-posed problems by means of the L-curve, SIAM Review, 34(4), 561-580.
Hansen, P. C., 1998. Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion, SIAM, Philadelphia, U.S.A.
Hansen, P. C., 2007. Regularization Tools: a Matlab package for analysis and solution of discrete ill-posed problems, version 4.1 for Matlab 7.3, Numerical Algorithms, 46, 189-194.
Hnětynková, I., Plešinger, M. & Strakoš, Z., 2009. The regularizing effect of the Golub-Kahan iterative bidiagonalization and revealing the noise level in the data, BIT Numerical Mathematics, 49(4), 669-696.
Hoffmann, B., 1986. Regularization for Applied Inverse and Ill-posed Problems, Teubner, Stuttgart, Germany.
Huang, Y. & Jia, Z., 2016. Some results on the regularization of LSQR for large-scale discrete ill-posed problems, arXiv:1503.01864v3.
Ialongo, S., Fedi, M. & Florio, G., 2014. Invariant models in the inversion of gravity and magnetic fields and their derivatives, Journal of Applied Geophysics, 110, 51-62.
Kilmer, M. E. & O'Leary, D. P., 2001. Choosing regularization parameters in iterative methods for ill-posed problems, SIAM Journal on Matrix Analysis and Applications, 22, 1204-1221.
Last, B. J. & Kubik, K., 1983. Compact gravity inversion, Geophysics, 48, 713-721.
Li, Y. & Oldenburg, D. W., 1996. 3-D inversion of magnetic data, Geophysics, 61, 394-408.
Li, Y. & Oldenburg, D. W., 1998. 3-D inversion of gravity data, Geophysics, 63, 109-119.
Li, Y. & Oldenburg, D. W., 2003. Fast inversion of large-scale magnetic data using wavelet transforms and a logarithmic barrier method, Geophys. J. Int., 152, 251-265.
Loke, M. H., Acworth, I. & Dahlin, T., 2003. A comparison of smooth and blocky inversion methods in 2D electrical imaging surveys, Exploration Geophysics, 34, 182-187.
Marquardt, D. W., 1970. Generalized inverses, ridge regression, biased linear estimation, and nonlinear estimation, Technometrics, 12(3), 591-612.
Mead, J. L. & Renaut, R. A., 2009. A Newton root-finding algorithm for estimating the regularization parameter for solving ill-conditioned least squares problems, Inverse Problems, 25, 025002.
Morozov, V. A., 1966. On the solution of functional equations by the method of regularization, Sov. Math. Dokl., 7, 414-417.
Oldenburg, D. W. & Li, Y., 1994. Subspace linear inverse method, Inverse Problems, 10, 915-935.
Oldenburg, D. W., McGillivray, P. R. & Ellis, R. G., 1993. Generalized subspace methods for large-scale inverse problems, Geophys. J. Int., 114, 12-20.
Paige, C. C. & Saunders, M. A., 1982. LSQR: an algorithm for sparse linear equations and sparse least squares, ACM Trans. Math. Software, 8, 43-71.
Paige, C. C. & Saunders, M. A., 1982. Algorithm 583 LSQR: sparse linear equations and least squares problems, ACM Trans. Math. Software, 8, 195-209.
Pilkington, M., 2009. 3D magnetic data-space inversion with sparseness constraints, Geophysics, 74, L7-L15.
Portniaguine, O. & Zhdanov, M. S., 1999. Focusing geophysical inversion images, Geophysics, 64, 874-887.
Renaut, R. A., Hnětynková, I. & Mead, J. L., 2010. Regularization parameter estimation for large-scale Tikhonov regularization using a priori information, Computational Statistics and Data Analysis, 54(12), 3430-3445.
Renaut, R. A., Vatankhah, S. & Ardestani, V. E., 2015. Hybrid and iteratively reweighted regularization by unbiased predictive risk and weighted GCV for projected systems, arXiv:1509.00096, submitted.
Siripunvaraporn, W. & Egbert, G., 2000. An efficient data-subspace inversion method for 2-D magnetotelluric data, Geophysics, 65, 791-803.
Sun, J. & Li, Y., 2014. Adaptive Lp inversion for simultaneous recovery of both blocky and smooth features in a geophysical model, Geophys. J. Int., 197, 882-899.
Vatankhah, S., Ardestani, V. E. & Renaut, R. A., 2014b. Automatic estimation of the regularization parameter in 2-D focusing gravity inversion: application of the method to the Safo manganese mine in the northwest of Iran, Journal of Geophysics and Engineering, 11, 045001.
Vatankhah, S., Ardestani, V. E. & Renaut, R. A., 2015. Application of the χ² principle and unbiased predictive risk estimator for determining the regularization parameter in 3-D focusing gravity inversion, Geophys. J. Int., 200, 265-277.
Vatankhah, S., Renaut, R. A. & Ardestani, V. E., 2014c. Regularization parameter estimation for underdetermined problems by the χ² principle with application to 2D focusing gravity inversion, Inverse Problems, 30, 085002.
Vogel, C. R., 2002. Computational Methods for Inverse Problems, SIAM, Philadelphia, U.S.A.
Voronin, S., 2012. Regularization of Linear Systems with Sparsity Constraints with Application to Large Scale Inverse Problems, PhD thesis, Princeton University, U.S.A.
Wohlberg, B. & Rodriguez, P., 2007. An iteratively reweighted norm algorithm for minimization of total variation functionals, IEEE Signal Processing Letters, 14, 948-951.

APPENDIX A: SOLUTION USING SINGULAR VALUE DECOMPOSITION

Suppose $m^* = \min(m, n)$ and the SVD of the matrix $\tilde{\tilde{G}} \in \mathbb{R}^{m \times n}$ is given by $\tilde{\tilde{G}} = U \Sigma V^T$, where the singular values are ordered $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_{m^*} > 0$ and occur on the diagonal of $\Sigma \in \mathbb{R}^{m \times n}$, with $n - m$ zero columns (when $m < n$) or $m - n$ zero rows (when $m > n$), using the full definition of the SVD (Golub & van Loan 1996). $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal matrices with columns denoted by $u_i$ and $v_i$. Then the solution of (15) is given by

$$ h(\alpha) = \sum_{i=1}^{m^*} \frac{\sigma_i^2}{\sigma_i^2 + \alpha^2} \frac{u_i^T \tilde{r}}{\sigma_i} v_i. \qquad (A.1) $$

For the projected problem $B_t \in \mathbb{R}^{(t+1) \times t}$ (i.e. $m > n$) the expression still applies to give the solution of (24), with $\|\tilde{r}\|_2 e_{t+1}$ replacing $\tilde{r}$, $\zeta$ replacing $\alpha$, $\gamma_i$ replacing $\sigma_i$ and $m^* = t$ in (A.1).
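As an illustration, a minimal MATLAB sketch of (A.1) follows, feasible only when the matrix is small enough to factorize directly; the names Gtt and rtilde stand for $\tilde{\tilde{G}}$ and $\tilde{r}$ and are our own.

```matlab
% Sketch: Tikhonov solution h(alpha) from the SVD, equation (A.1).
[U, S, V] = svd(Gtt);
sigma = diag(S);  mstar = numel(sigma);
phi   = sigma.^2 ./ (sigma.^2 + alpha^2);          % Tikhonov filter factors
coef  = phi .* (U(:,1:mstar)' * rtilde) ./ sigma;  % filtered spectral coefficients
h     = V(:,1:mstar) * coef;                       % the solution h(alpha)
```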

APPENDIX B: THE UNBIASED PREDICTIVE RISK ESTIMATOR

We briefly describe the development of the UPRE for the Tikhonov-regularized system $Ax = b$, $A \in \mathbb{R}^{m \times n}$, as given in Vogel (2002). Here, the regularized problem is

$$ P^{\lambda}(x) = \|Ax - b\|_2^2 + \lambda^2 \|x\|_2^2, \qquad (B.1) $$

with solution

$$ x(\lambda) = (A^T A + \lambda^2 I_n)^{-1} A^T b = A(\lambda) b, \quad \text{where} \quad A(\lambda) = (A^T A + \lambda^2 I_n)^{-1} A^T. \qquad (B.2) $$

Any method used to determine an optimal $\lambda$ should minimize the error between the solution $x(\lambda)$ and the exact solution $x_{exact}$. Because the exact solution is unknown, an alternative error indicator, called the predictive error, is used (Vogel 2002):

$$ P(x(\lambda)) = Ax(\lambda) - b_{exact} = AA(\lambda)b - b_{exact} = H(\lambda)(b_{exact} + \eta) - b_{exact} = (H(\lambda) - I_m) b_{exact} + H(\lambda)\eta, \qquad (B.3) $$

where $H(\lambda) = AA(\lambda)$ is the influence matrix. The predictive error is also not computable, because $b_{exact}$ is unknown; however, it can be estimated using the full residual

$$ R(x(\lambda)) = Ax(\lambda) - b = (H(\lambda) - I_m) b = (H(\lambda) - I_m) b_{exact} + (H(\lambda) - I_m)\eta. \qquad (B.4) $$

For both (B.3) and (B.4), the first term on the right-hand side is deterministic, whereas the second is stochastic. Applying the trace lemma, e.g. (Vogel 2002, p. 98, eq. (7.5)), to both equations and using the symmetry of the influence matrix, we obtain

$$ E(\|P(x(\lambda))\|_2^2) = \|(H(\lambda) - I_m) b_{exact}\|_2^2 + \mathrm{trace}(H^T(\lambda) H(\lambda)), \qquad (B.5) $$

and

$$ E(\|R(x(\lambda))\|_2^2) = \|(H(\lambda) - I_m) b_{exact}\|_2^2 + \mathrm{trace}((H(\lambda) - I_m)^T (H(\lambda) - I_m)). \qquad (B.6) $$

Here $E(\|P(x(\lambda))\|_2^2)/m$ is the expected value of the predictive risk (Vogel 2002). We note that this derivation relies on an estimate of the noise in the right-hand side vector $b$: we suppose that $b$ is contaminated by a noise vector $\eta$ with independent identically distributed components, each with variance 1, i.e. we assume that the noise is whitened and the covariance of the noise is the identity matrix. The first terms on the right-hand sides of (B.5) and (B.6) are the same. Thus, by the linearity of the trace operator, and with $E(\|R(x(\lambda))\|_2^2) \approx \|R(x(\lambda))\|_2^2 = \|(H(\lambda) - I_m) b\|_2^2$, the UPRE estimate of the optimal parameter is

$$ \lambda_{opt} = \arg\min_{\lambda} \{ U(\lambda) := \|(H(\lambda) - I_m) b\|_2^2 + 2\,\mathrm{trace}(H(\lambda)) - m \}, \qquad (B.7) $$

and $\lambda_{opt}$ is found by evaluating (B.7) for a range of $\lambda$.
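A minimal sketch of evaluating (B.7) over a grid of candidate values, forming the influence matrix directly, is given below; this is practical only for small A, and the search range is an illustrative choice, not prescribed by the text.

```matlab
% Sketch: UPRE parameter selection by direct evaluation of (B.7).
[m, n]  = size(A);
lambdas = logspace(-4, 2, 100);                   % illustrative search range
Uval    = zeros(size(lambdas));
for j = 1:numel(lambdas)
  H = A * ((A'*A + lambdas(j)^2*eye(n)) \ A');    % influence matrix H(lambda)
  Uval(j) = norm((H - eye(m))*b)^2 + 2*trace(H) - m;
end
[~, jmin]  = min(Uval);
lambda_opt = lambdas(jmin);
```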

APPENDIX C: UPRE FUNCTION USING SVD

The UPRE function for determining $\alpha$ in the Tikhonov form (14), with system matrix $\tilde{\tilde{G}}$, is expressible using the SVD of $\tilde{\tilde{G}}$ as

$$ U(\alpha) = \sum_{i=1}^{m^*} \left( \frac{1}{\sigma_i^2 \alpha^{-2} + 1} u_i^T \tilde{r} \right)^2 + 2 \sum_{i=1}^{m^*} \frac{\sigma_i^2}{\sigma_i^2 + \alpha^2} - m. $$

In the same way, the UPRE function for the projected problem (23) is given by

$$ U(\zeta) = \sum_{i=1}^{t} \left( \frac{1}{\gamma_i^2 \zeta^{-2} + 1} u_i^T (\|\tilde{r}\|_2 e_{t+1}) \right)^2 + \sum_{i=t+1}^{t+1} \left( u_i^T (\|\tilde{r}\|_2 e_{t+1}) \right)^2 + 2 \sum_{i=1}^{t} \frac{\gamma_i^2}{\gamma_i^2 + \zeta^2} - (t+1). $$

Then, for the truncated UPRE, $t$ is replaced by $t_{trunc} < t$, so that the terms from $t_{trunc}$ to $t$ are ignored, corresponding to treating them as constant with respect to the minimization of $U(\zeta)$.
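For clarity, the truncated function that is actually minimized can be written out explicitly; this display is our restatement of the replacement $t \to t_{trunc}$, with the omitted terms collected into a constant that does not affect the minimizer:

$$ U_{trunc}(\zeta) = \sum_{i=1}^{t_{trunc}} \left( \frac{1}{\gamma_i^2 \zeta^{-2} + 1} u_i^T (\|\tilde{r}\|_2 e_{t+1}) \right)^2 + 2 \sum_{i=1}^{t_{trunc}} \frac{\gamma_i^2}{\gamma_i^2 + \zeta^2} + C, $$

where the constant $C$ collects the ignored terms, which do not depend on $\zeta$.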

APPENDIX D: LIST OF VARIABLES

A list of variables and their dimensions for the multiple-bodies example of Section 4.5 is given in Table A1. A comprehensive list of the variables used in the paper is given in Table A2, with a pointer in each case to the equation where the variable is first introduced.

Table A1. List of variables and their dimensions.

Name                    Dimension         Memory requirements (Bytes)
$G$                     6000 × 72000      3456000000
$\tilde{G}$             6000 × 72000      3456000000
$\tilde{\tilde{G}}$     6000 × 72000      3456000000
$A_t$                   72000 × 350       201600000
$B_t$                   351 × 350         14024
$H_{t+1}$               6000 × 351        16848000
$U$                     351 × 351         985608
$V$                     350 × 350         980000
$\Gamma$                351 × 350         982800
$m$                     72000 × 1         576000
$y_t(\zeta)$            72000 × 1         576000
$h_t(\zeta)$            72000 × 1         576000
$z_t(\zeta)$            350 × 1           2800
$d_{obs}$               6000 × 1          48000
$\tilde{r}$             6000 × 1          48000
$W$                     72000 × 72000     1728008 (sparse)
$W_z$                   72000 × 72000     1728008 (sparse)
$W_{L1}$                72000 × 72000     1728008 (sparse)
$W_d$                   6000 × 6000       144008 (sparse)


Table A2. List of variables.

Name                       Type              Description                                                        Equation
$k, K, K_{max}$            $\mathbb{Z}^+$    IRLS iteration index, number of iterations, maximum K              Algorithms (1), (2)
$m$                        $\mathbb{Z}^+$    Number of measurements                                             (5)
$n$                        $\mathbb{Z}^+$    Number of unknown model parameters                                 (5)
$t, t_{trunc}$             $\mathbb{Z}^+$    Size of projected Krylov space, truncation for UPRE                (20), (30)
$\alpha, \alpha_{opt}$     $\mathbb{R}$      Regularization parameter full problem, optimal                     (1), (25)
$\zeta, \zeta_{opt}$       $\mathbb{R}$      Regularization parameter projected problem, optimal                (23), (26)
$\beta, \epsilon$          $\mathbb{R}$      Weighting parameters                                               (16), (4)
$\rho_j$                   $\mathbb{R}$      Unknown density at cell j                                          (5)
$\tau_1, \tau_2$           $\mathbb{R}$      Noise parameters for simulation                                    (27)
$\omega$                   $\mathbb{R}$      Truncation parameter                                               (30)
$\sigma_i, \gamma_i$       $\mathbb{R}$      Singular values of the full, projected matrices                    Appendix A
$d_{obs}, d_{exact}$       $\mathbb{R}^m$    Observed, exact data                                               (6), (6)
$\eta$                     $\mathbb{R}^m$    Noise in the data                                                  (6)
$m, m_{apr}$               $\mathbb{R}^n$    Model parameters, estimate of model parameters                     (5), (7)
$r, \tilde{r}$             $\mathbb{R}^m$    Residual, weighted residual                                        (7), (14)
$y$                        $\mathbb{R}^n$    Model deviation                                                    (7)
$h(\alpha)$                $\mathbb{R}^n$    Tikhonov solution                                                  (15)
$z_t(\zeta)$               $\mathbb{R}^t$    Solution of projected problem                                      (24)
$e_{t+1}$                  $\mathbb{R}^{t+1}$  Unit vector of length t + 1                                      (19)
$G, \tilde{\tilde{G}}$     $\mathbb{R}^{m \times n}$  Model matrix, left- and right-preconditioned G            (5), (14)
$W_d$                      $\mathbb{R}^{m \times m}$  Data weighting: diagonal, sparse, constant                (1)
$W(\cdot)$                 $\mathbb{R}^{n \times n}$  Model weighting: diagonal, sparse, variable dependent     (16)
$W_{L0}(\cdot)$            $\mathbb{R}^{n \times n}$  Minimum support: diagonal, sparse, variable dependent     (4)
$W_{L1}(\cdot)$            $\mathbb{R}^{n \times n}$  L1 regularization matrix: diagonal, sparse, variable dependent  (10)
$W_z$                      $\mathbb{R}^{n \times n}$  Depth weighting: diagonal, sparse, constant               (16)
$A_t$                      $\mathbb{R}^{n \times t}$  GKB factorization: orthonormal columns                    (19)
$B_t$                      $\mathbb{R}^{(t+1) \times t}$  GKB factorization: bidiagonal                         (19)
$H_{t+1}$                  $\mathbb{R}^{m \times (t+1)}$  GKB factorization: orthonormal columns                (19)
$P^{\alpha}(\cdot)$        $\mathbb{R}$      Variable dependent objective function                              (1), (8), (11)
$P(z)$                     $\mathbb{R}$      Objective function for z                                           (23)
$\Phi(m)$                  $\mathbb{R}$      Data misfit                                                        (2)
$S(m)$                     $\mathbb{R}$      Stabilizing function                                               (1)
$\|x\|_p$                  $\mathbb{R}$      p-norm                                                             (3)
$RE^{(K)}$                 $\mathbb{R}$      Relative error at step K                                           (28)
$U(\alpha), U(\zeta)$      $\mathbb{R}$      Unbiased predictive risk, full and projected                       (25), (26)
