Large-Scale Non-Linear 3D Reconstruction Algorithms for Electrical Impedance Tomography of the Human Head

L. Horesh 1, M. Schweiger 2, S.R. Arridge 2 and D.S. Holder 1

1 UCL/Medical Physics, Neurophysiology, London, UK
2 UCL/Computer Science, CMIC, London, UK
Abstract— Non-linear image reconstruction methods are desirable for applications of electrical impedance tomography (EIT) such as brain or breast imaging, where the assumptions of linearity are violated. We present a novel non-linear Newton-Krylov method for solving large-scale EIT inverse problems, which has the potential advantages of improved robustness and computational efficiency over previous methods. It combines the efficiency of Krylov-subspace methods in forming an implicit Hessian inverse with the effectiveness of Newton-type search directions. The computational cost was assessed by comparing the objective function value and image error norm with respect to run-time, iteration count and memory consumption with six other non-linear methods, including Damped Gauss-Newton, Levenberg-Marquardt, Variable Metric and non-linear Conjugate Gradients, using realistic layered head models with meshes of 4K, 12K and 31K elements. For the small-scale model, Newton-type methods slightly outperformed the Newton-Krylov approach, while the other large-scale methods performed poorly. For the two larger models, the Newton-Krylov approach converged much more rapidly than the Krylov-subspace and quasi-Newton methods, and the Newton-type methods failed to converge in the time available. This approach opens a new frontier for non-linear EIT image reconstruction, as it allows accurate solutions of large-scale realistic models to be produced using modest computational resources.

Keywords— Large-scale, Newton-Krylov, Non-linear reconstruction, Monopolar sources
I. INTRODUCTION
A range of EIT applications, such as breast and stroke imaging, are characterized by large and complex admittivity changes. These changes exceed the domain of validity of the linear approximation, and therefore require a non-linear inverse solution framework. Over the years, numerous strategies for non-linear image reconstruction have been developed. However, these either employed a Newton-type strategy, which limited their application to small-scale problems, or a Krylov-subspace approach, which allowed large-scale problems to be solved but suffered from a poor convergence rate towards the solution [1-3]. Alternative scalable approaches, such as Geometric Multi-Grid (GMG) [4] or regular grid transformations, are suitable for the large-scale computational challenge. However, they require a coarser discretisation of the domain, which may impair their suitability for the geometrical complexity of real-life physiological models. Recent developments in multi-frequency image reconstruction, improvements in geometry modelling and the prospective development of anisotropic image reconstruction result in inverse problems of ever-increasing size. Thus, alternative approaches which are robust, fast and use moderate resources are required. In this study we introduce the novel Newton-Krylov approach to the inverse problem, which combines the rapid convergence of Newton-type methods with the moderate computational requirements that characterize Krylov-subspace methods. The benefits of this method are demonstrated by comparing its performance with that of commonly used methods, using realistic head models of increasing size.

II. BACKGROUND

The image reconstruction problem can be formulated as the following regularized optimization problem

$\gamma(V_\Gamma) = \arg\min_{\gamma} f(\gamma), \qquad f(\gamma) = \| F_\Gamma(\gamma) - V_\Gamma \|_w^2 + \tau\, \psi(\gamma, \eta)$   (1)

where $F_\Gamma : \mathbb{R}^k \rightarrow \mathbb{R}^m$ is the forward model operator, which maps the admittivity distribution $\gamma$ into the data space, $V_\Gamma$ stands for the given boundary voltages, $\tau$ is the regularization hyper-parameter, and $\psi(\gamma,\eta) : \mathbb{R}^k \rightarrow \mathbb{R}$ is a regularization operator. The objective function $f(\gamma)$ comprises the weighted norm of the data residual $r_w(\gamma) = F_\Gamma(\gamma) - V_\Gamma$, which measures the goodness of fit, and a regularization term which allows the imposition of prior information.
Most inversion schemes require the derivation of the gradient $g_t$ of $f$, and at times also its approximate Hessian $H_t$:

$g_t = J_t^T r_w(\gamma_t) - \tau_t \nabla\psi(\gamma_t)$   (2)

$H_t^{\lambda_t} = J_t^T J_t + \tau_t \nabla^2\psi(\gamma_t) + \lambda_t I_{k \times k}$   (3)

where $J_t := \partial V_\Gamma / \partial \gamma_j \in \mathbb{R}^{m \times k}$ is defined as the Jacobian and $\lambda_t$ is a trust region size.
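For small-scale problems these quantities can be assembled explicitly. The following is a minimal NumPy sketch of Eqs. (2)-(3); the function name and the assumption of a precomputed Jacobian, weighted residual and regularization derivatives are illustrative choices, not part of the authors' implementation.

```python
import numpy as np

def gradient_and_hessian(J, r_w, grad_psi, hess_psi, tau, lam):
    """Explicit assembly of g_t (Eq. 2) and H_t^{lambda_t} (Eq. 3).

    J        : (m, k) Jacobian dV_Gamma/dgamma
    r_w      : (m,)   weighted data residual F_Gamma(gamma) - V_Gamma
    grad_psi : (k,)   gradient of the regularization functional
    hess_psi : (k, k) Hessian of the regularization functional
    tau, lam : regularization hyper-parameter and trust-region shift
    """
    k = J.shape[1]
    g = J.T @ r_w - tau * grad_psi                   # Eq. (2)
    H = J.T @ J + tau * hess_psi + lam * np.eye(k)   # Eq. (3)
    return g, H
```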
A. Newton type methods
The solution update $\Delta\gamma$ at each step can be formulated using relations (2) and (3). The Damped Gauss-Newton variant is given by

$\Delta\gamma^{DGN}_{t+1} = \alpha_t \left( H_t^{0} \right)^{-1} g_t$   (4)

where $\alpha_t$ denotes the required step size. The trust region Levenberg-Marquardt update is

$\Delta\gamma^{LM}_{t+1} = \left( H_t^{\lambda_t} \right)^{-1} g_t$   (5)

These methods make explicit use of the 2nd order derivative of the objective function (the Hessian), and can therefore provide quadratic convergence of the residual error. Nevertheless, for large-scale problems, the explicit formulation and inversion of the Hessian are prohibitively expensive computationally.

B. Quasi-Newton approach

This approach circumvents the computational burden of inverting the Hessian matrix by approximating its inverse directly. For an approximate inverse $\lim_{i \to \infty} H_i = H^{-1}$, the update step of the BFGS variant is given by

$\Delta\gamma^{VMM}_{i+1} = H_{i+1} \left( g_{t+1} - g_t \right)$   (6)

Surprisingly, in most cases this approach is even more robust than the conventional Newton-type methods; however, the expected convergence is only super-linear.

C. Krylov-subspace method

Krylov-subspace methods generate a series of mutually conjugate search directions. The Polak-Ribiere variant of the non-linear Conjugate Gradient method update is

$\Delta\gamma^{NLCG}_{t+1} = \alpha_{t+1} \left( -g_{t+1} + \beta^{PR}_{t+1} p_t \right)$   (7)

with

$\beta^{PR}_{t+1} = \max\!\left( \dfrac{g_{t+1}^T (g_{t+1} - g_t)}{g_t^T g_t},\; 0 \right)$

Lanczos-type variants of Krylov methods are extremely memory efficient, as only the previous search direction and the local gradient need to be stored in order to form a new search direction. Nevertheless, the convergence of this 1st order approach is slow.

D. Globalization

Frequently, the derived search direction approximation is valid only in the close vicinity of the current solution, and advancing by a full step in the proposed direction might even result in solution divergence. Two strategies can be used to restore local convergence: line search and trust region. In the first, once a search direction has been determined, an inexact 1D line search is performed to find the step size $\alpha_t$. The second approach defines a trust region, which shifts the original Hessian matrix by $\lambda_t I$; the trust region control parameter is adjusted at each iteration to meet a sufficient convergence criterion.
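To make the line-search strategy concrete, here is a small backtracking (Armijo) sketch; the sufficient-decrease constant, the halving schedule and the function names are illustrative assumptions rather than the globalization scheme actually used in this work.

```python
def backtracking_line_search(f, gamma, p, g, alpha0=1.0, c=1e-4, shrink=0.5, max_trials=20):
    """Inexact 1D line search along a descent direction p.

    Shrinks the step until the Armijo sufficient-decrease condition
    f(gamma + a*p) <= f(gamma) + c * a * g^T p
    is satisfied (g^T p < 0 is assumed).
    """
    f0 = f(gamma)
    slope = float(g @ p)
    alpha = alpha0
    for _ in range(max_trials):
        if f(gamma + alpha * p) <= f0 + c * alpha * slope:
            return alpha
        alpha *= shrink       # step too long: backtrack
    return alpha              # return the smallest trial step as a fallback
```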
III. NEWTON-KRYLOV

For large-scale problems, the explicit formulation and inversion of the approximate Hessian are computationally intractable. The Newton search direction can nevertheless be derived efficiently by introducing an iterative inner loop to solve the systems given in (4) or (5). These systems can be rewritten as

$H_t^{\lambda_t} \Delta\gamma^{NK}_{t+1} = \alpha_t g_t$   (8)

The defining feature of Krylov-subspace solvers is that the coefficient system, which in this formulation is the approximate Hessian $H_t^{\lambda_t}$, is accessed only through matrix-vector product operations. Therefore, the explicit formulation of the Hessian can be replaced by an implicit one, by replacing all occurrences of the operation $H_t \gamma$ with

$H_t^{NK} \gamma = J_t^T (J_t \gamma) + \tau_t \nabla^2\psi(\gamma_t)\, \gamma + \lambda_t \gamma$   (9)

Notice that the brackets surrounding $J_t \gamma$ produce a vector, which is then multiplied by $J_t^T$, so that the immense term $J_t^T J_t$ is never formed explicitly. In this study, the Generalized Minimal Residual (GMRes) Krylov solver was used to solve the linear normal equations (8). This Krylov-subspace method is based on a modified variant of the Gram-Schmidt orthogonalization process (the Arnoldi method), and therefore requires the storage of a moderate number of internal search directions.
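The matrix-free application of Eq. (9) inside a Krylov solver can be sketched as follows, using SciPy's generic GMRES routine; the variable names, the dense regularization matrix RtR and the convergence handling are illustrative assumptions and not the UCL Super-Solver implementation.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def newton_krylov_step(J, RtR, g, tau, lam, alpha=1.0):
    """Solve H_t^{lambda_t} d = alpha * g_t (Eq. 8) without forming J^T J.

    J   : (m, k) Jacobian
    RtR : (k, k) regularization Hessian (R^T R)
    g   : (k,)   gradient g_t
    """
    k = J.shape[1]

    def hessian_matvec(v):
        # Implicit Hessian application, Eq. (9): only matrix-vector products
        return J.T @ (J @ v) + tau * (RtR @ v) + lam * v

    H = LinearOperator((k, k), matvec=hessian_matvec, dtype=float)
    d, info = gmres(H, alpha * g)       # inner Krylov (GMRes) loop
    if info != 0:
        raise RuntimeError(f"GMRes did not converge (info={info})")
    return d
```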
IV. REGULARISATION

Due to the ill-posedness of the problem, regularization is required. We consider a first-order Tikhonov prior as a quadratic regularization functional

$\psi(\gamma_t) = \| R (\gamma_t - \gamma_g) \|^2$   (10)

$\nabla\psi(\gamma_t) = R^T R (\gamma_t - \gamma_g)$

$\nabla^2\psi(\gamma_t) = R^T R$

where $R \in \mathbb{R}^{k \times k}$ denotes a Gaussian smoothing operator and $\gamma_g$ is the mean of a Gaussian prior distribution. The layered structure of the human head was treated by the construction of a regional smoothing prior, which avoided the imposition of smoothing over the interface regions between layers.
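The contribution of this prior to the gradient and Hessian in Eqs. (2)-(3) follows directly from Eq. (10). The sketch below illustrates the idea; for brevity it uses a rectangular first-order difference operator restricted to within-layer neighbours in place of the square Gaussian smoothing operator described above, so the operator construction is our illustrative assumption rather than the authors' regional prior.

```python
import numpy as np

def tikhonov_terms(gamma, gamma_g, R):
    """Value, gradient and Hessian of psi(gamma) = ||R (gamma - gamma_g)||^2 (Eq. 10)."""
    d = R @ (gamma - gamma_g)
    return d @ d, R.T @ d, R.T @ R

def regional_difference_operator(neighbours, labels, k):
    """First-order difference operator that skips element pairs whose tissue
    labels differ, so no smoothing is imposed across layer interfaces.

    neighbours : iterable of (i, j) index pairs of adjacent mesh elements
    labels     : per-element tissue label (length k)
    """
    rows = [(i, j) for (i, j) in neighbours if labels[i] == labels[j]]
    R = np.zeros((len(rows), k))
    for r, (i, j) in enumerate(rows):
        R[r, i], R[r, j] = 1.0, -1.0   # penalize within-layer differences only
    return R
```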
V. NUMERICAL TESTING SETUP

Three head models of 4K, 12K and 31K elements contained conductivity values from the published literature for head tissues at 10 Hz. 31 electrodes were placed over the head according to the extended 10-20 EEG scheme. The protocol contained 258 injection-measurement combinations. Two spherical perturbations with a radius of 1.7 cm were placed inside the brain, in the left temporal region and on the occipital lobe, and were simulated to represent ischaemia or haemorrhage. Boundary voltages were generated using the UCL Super-Solver, which employs a modified version of EIDORS 3D and the ILUPack package [5-7]. Zero-mean Gaussian noise was simulated with an amplitude of 0.01% of the maximal boundary voltages.

The following non-linear inverse solvers were implemented: Damped Gauss-Newton (DGN), Newton-Krylov Damped Gauss-Newton (NK-DGN), Levenberg-Marquardt (LM), Newton-Krylov Levenberg-Marquardt (NK-LM), Variable Metric Method (VMM, quasi-Newton) and non-linear Conjugate Gradients (NLCG). A modified Kaczmarz back-projection parameter space transformation was applied for enhanced convergence. These solvers employed a monopolar sources framework for forward modelling, inverse-based multi-level forward solver preconditioning, and a feed-forward strategy for prompt forward solution evaluations. An L-curve estimator was used for the selection of the regularization hyper-parameter.

The solvers were compared with respect to the following criteria: convergence (objective function convergence over time and iterations), robustness (solution error convergence), memory requirements (peak memory consumption) and image quality (maximal intensity and localization error).
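As mentioned above, an L-curve estimator was used to select the regularization hyper-parameter. A minimal sketch of one common corner-detection criterion is given below; the sampling of trial tau values and the maximum-curvature rule are our assumptions, not necessarily the estimator used in this study.

```python
import numpy as np

def l_curve_corner(taus, residual_norms, reg_norms):
    """Pick tau at the corner (maximum curvature) of the L-curve, i.e. the
    parametric curve (log ||r_w||, log ||R (gamma - gamma_g)||) sampled over
    a set of trial regularization hyper-parameters.
    """
    x, y = np.log(residual_norms), np.log(reg_norms)
    dx, dy = np.gradient(x), np.gradient(y)          # first derivatives along the curve
    ddx, ddy = np.gradient(dx), np.gradient(dy)      # second derivatives
    curvature = np.abs(dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5
    return taus[int(np.argmax(curvature))]
```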
VI. NUMERICAL RESULTS AND DISCUSSION

A. Convergence

For the smallest model, Damped Gauss-Newton provided the best convergence with respect to runtime and iteration count. However, for the medium-size model both standard Newton-type solvers were already intractable due to memory limitations. For both the medium-size and the large models, Newton-Krylov Damped Gauss-Newton provided the best convergence, followed by Newton-Krylov Levenberg-Marquardt. The other large-scale methods, Variable Metric and Conjugate Gradients, provided weak convergence (Fig. 1).

Fig. 1 Objective function convergence vs. runtime for the largest model (objective function vs. time [sec]; curves: NLCG, VMM, NK-LM, NK-DGN).

The runtime per iteration of the Levenberg-Marquardt variants grew gradually, as the trust region became smaller and the system correspondingly became more and more ill-conditioned. The runtime per iteration of Conjugate Gradients and Variable Metric was significantly shorter than that of the Newton-Krylov methods. Nevertheless, the overall convergence over time and per iteration was in favour of the latter. Krylov inversion runtime depends strongly on the spectral properties of the Hessian system. More advanced preconditioning schemes, such as Newton-Krylov-Schwarz (NKS) domain decomposition or Newton-Krylov Multi-Grid (NKMG) approaches, can be employed in order to accelerate this phase further [8].
B. Robustness

A comparison of the norm difference between the reconstructed images and the simulated image assessed whether the convergence behaviour observed for the objective function represented convergence towards the solution, rather than towards some local minimum. Newton-Krylov Damped Gauss-Newton provided the lowest solution error norm, which was 28-138% better than those provided by the other methods.

C. Memory requirements

Conventional Newton-type methods were manageable only for the smallest model. For the medium-size and large models, non-linear Conjugate Gradients was the most memory efficient, whereas the other methods introduced a peak memory demand that was up to 14% higher.

D. Image quality

Newton-Krylov Damped Gauss-Newton provided the sharpest images, recovering perturbation values which were up to 141% larger than those achieved by the Variable Metric method. Images acquired by Newton-Krylov inversion yielded a smaller localization error than their conventional counterparts. For the latter, the recovered perturbations were considerably shifted towards the boundary (Fig. 2).

Fig. 2 Iso-surfaces of the reconstructed image from the largest model. Left: simulated image; Center: Newton-Krylov Damped Gauss-Newton solution; Right: non-linear Conjugate Gradients solution.

VII. CONCLUSIONS

Newton-Krylov inversion demonstrated superior computational and image quality results compared with the conventional large-scale methods in common use so far. With suitable preconditioning of the normal equations, this approach offers rapid solution of large inverse problems with the use of moderate computational resources.

ACKNOWLEDGMENTS

The authors wish to thank A. Tizzard from Middlesex University and R. Schindmes from University College London for producing MRI-based finite-element meshes, E. Sherman for graphical design assistance and A. McEwan for his insightful remarks.

REFERENCES

1. L. Horesh, R. H. Bayford, R. J. Yerworth, A. Tizzard, G. M. Ahadzi, and D. S. Holder, "Beyond the linear domain - the way forward in MFEIT image reconstruction of the human head," Proceedings of ICEBI'04 - XII International Conference on Electrical Bio-Impedance joint with EIT - V Electrical Impedance Tomography, 2004, pp. 683-686, ISBN 83-917681-6-3.
2. M. Molinari, S. J. Cox, B. H. Blott, and G. J. Daniell, "Efficient non-linear 3D electrical tomography reconstruction," Proceedings of the 2nd World Congress on Industrial Process Tomography, Hannover, Germany, 2001, ISBN 0-85316-224-7.
3. N. Polydorides, W. R. Lionheart, and H. McCann, "Krylov subspace iterative techniques: on the detection of brain activity with electrical impedance tomography," IEEE Trans. Med. Imaging, vol. 21, no. 6, pp. 596-603, 2002.
4. L. Borcea, "A nonlinear multigrid for imaging electrical conductivity and permittivity at low frequency," Inverse Problems, vol. 17, no. 2, pp. 329-359, 2001.
5. L. Horesh, M. Schweiger, M. Bollhöfer, A. Douiri, D. S. Holder, and S. R. Arridge, "Multilevel preconditioning for 3D large-scale soft field medical applications modelling," 2006.
6. M. Bollhöfer, Y. Saad, and O. Schenk, "ILUPACK - preconditioning software package," Release V2.1, January 2006. Available online at http://www.math.tu-berlin.de/ilupack/.
7. N. Polydorides and W. R. Lionheart, "A Matlab toolkit for three-dimensional electrical impedance tomography: a contribution to the Electrical Impedance and Diffuse Optical Reconstruction Software project," Meas. Sci. Technol., vol. 13, no. 12, pp. 1871-1883, 2002.
8. D. A. Knoll and D. E. Keyes, "Jacobian-free Newton-Krylov methods: a survey of approaches and applications," J. Comput. Phys., vol. 193, pp. 357-397, 2004.

Author: Lior Horesh
Institute: University College London
Street: Malet Place
City: London
Country: UK
Email: [email protected]