IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 16, NO. 1, JANUARY 2007


On the Convergence of Generalized Simultaneous Iterative Reconstruction Algorithms

Jiong Wang and Yibin Zheng, Senior Member, IEEE

Abstract—In this paper, we generalize the widely used simultaneous block iterative reconstruction algorithm and show that it converges, at a linear rate, to a weighted least-squares and weighted minimum-norm reconstruction. Our theoretical result provides a much simpler proof of the convergence properties obtained by Jiang and Wang and covers a much more general class of algorithms. The frequency domain iterative reconstruction algorithm is then introduced as a special application of our theory.

Index Terms—Biomedical imaging, image reconstruction, iterative methods, signal reconstruction, tomography.

I. INTRODUCTION

Iterative reconstruction algorithms have been widely used in computational imaging applications, especially in medical imaging such as computed tomography (CT). Iterative algorithms are advantageous when there is no closed-form reconstruction, when the data are incomplete, or when a priori image and noise models must be taken into account. The main disadvantage of iterative algorithms is their heavy computational burden. However, with the advance of high-performance computing technology, this is becoming a less serious hurdle. Many practical problems can be linearized with good accuracy, and we shall limit our discussion to such image reconstruction problems. After choosing appropriate basis functions (e.g., pixels) to represent the unknown image, one can generally formulate the reconstruction problem as solving a very large linear system of equations

$$A x = b \qquad (1)$$

where $x$ is the vector of coefficients of the basis functions, $b$ is the vector of measurement data, and $A$ is the matrix representing the measurement process. The matrix $A$ is usually rank-deficient, at least numerically. This means that (1) may have infinitely many exact or least-squares solutions, and the solution must be regularized. Furthermore, because of the very large size of the problem, iterative algorithms must be used as a practical way to find the solution. There are many different iterative reconstruction algorithms. In this paper, we consider a class of generalized simultaneous block-iterative (SimBI) algorithms [4] using linear algebra formulations.

Manuscript received February 27, 2006; revised June 30, 2006. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Til Aach. The authors are with the Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA 22904 USA (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TIP.2006.887725

More significantly,

we establish their convergence properties and demonstrate the relationship between the form of the iteration and the regularization.

The algebraic reconstruction technique (ART) proposed by Gordon et al. [3] in 1970 was the first iterative algorithm used in CT. According to the classification in [4], ART is a sequential block-iterative (SeqBI) algorithm, which partitions the matrix $A$ into rows and iteratively updates the reconstruction by a row-by-row projection-backprojection procedure. In 1984, Andersen and Kak [5] introduced the simultaneous algebraic reconstruction technique (SART) as a major improvement over ART. SART belongs to the class of simultaneous block-iterative (SimBI) algorithms, which update the reconstruction by projection-backprojection of the entire measurement. Other SimBI algorithms include Cimmino's method [6], Landweber's iteration [7], the component averaging (CAV) algorithm [8], [9], and the diagonal weighting (DWE) algorithm [8], [9], all developed with different motivations. Compared to SeqBI algorithms, SimBI algorithms are less artifact-prone and exhibit no cyclic subconvergence (also called limit cycle) phenomenon, even with constant step size and inconsistent data [4], [10]–[14]. Both SeqBI and SimBI algorithms are widely used in various applications, particularly those related to CT [15]–[18].

All the SimBI algorithms mentioned above can be regarded as special cases of the general form, expressed in matrix-vector notation as

$$x^{(k+1)} = x^{(k)} + \lambda_k V A^* W \left( b - A x^{(k)} \right) \qquad (2)$$

where $A^*$ is the conjugate transpose of $A$, and $V$, $W$ are positive definite matrices whose significance will become clear later. Here $k$ is the iteration number, and $\lambda_k$ is a positive scalar called the relaxation coefficient or the step size. Oftentimes $\lambda_k$ is a constant, but it does not have to be.

The first well-known SimBI algorithm, Cimmino's method, was proposed in 1938 [6]; however, its convergence was not formally established until recently, by the independent works of Jiang and Wang [2] and Censor and Elfving [8], under different restrictions on $V$ and $W$. Jiang and Wang assumed that $V$ and $W$ are diagonal with all positive diagonal elements, while Censor and Elfving relaxed this condition to $V$, $W$ being positive definite. While [2] gave a convergence limit, neither [2] nor [8] gave the rate of convergence. In the remainder of this paper, we shall give an elegant closed-form expression for the convergence limit of the generalized SimBI algorithm (2) and establish the rate of convergence, which is linear with constant step size. Our proof of convergence is based on linear algebra formulations and is much simpler than those in [8] and [2]. Like [8], we only require that $V$, $W$ be positive definite. Furthermore, we will show that (2) converges

1057-7149/$25.00 © 2006 IEEE


to the weighted least-squares solution that, among all such solutions, minimizes a weighted norm of the difference between the solution and the initial condition. Therefore, our results can be seen as a completion and extension of the work of [2] and [8].

II. CONVERGENCE OF SIMBI ALGORITHMS

Consider first the simple case where $W$ and $V$ are both identity matrices. Equation (2) is simplified to

$$x^{(k+1)} = x^{(k)} + \lambda_k A^* \left( b - A x^{(k)} \right). \qquad (3)$$

Equation (3) is called the basic SimBI algorithm in this paper. A Hilbert space formulation of the basic SimBI was given in [19]. We shall prove that the basic SimBI algorithm converges to the limit

$$x^{(\infty)} = A^+ b + \left( I - A^+ A \right) x^{(0)} \qquad (4)$$

where $I$ is the identity matrix and $A^+$ denotes the pseudo inverse of $A$. The pseudo inverse is defined as the unique matrix satisfying the four Penrose conditions [20]

$$A A^+ A = A, \quad A^+ A A^+ = A^+, \quad \left( A A^+ \right)^* = A A^+, \quad \left( A^+ A \right)^* = A^+ A. \qquad (5)$$

The remainder of this paper will use the following three important properties of the pseudo inverse [21]: 1) $(A^*)^+ = (A^+)^*$; 2) $A^+$ and $A^*$ have the same null space; and 3) a factorization rule for the pseudo inverse of a product with a full-rank square matrix.

Before proceeding, we first state a lemma whose proof is provided in the Appendix.

Lemma 1: If for some positive number $\varepsilon$ we have $\varepsilon \le \lambda_k \le (2-\varepsilon)/\mu$ for any positive integer $k$, and $\mu > 0$, then $\prod_{j=0}^{k-1} \left( 1 - \lambda_j \mu \right) \to 0$ when $k \to \infty$.

We now have the following theorem on the convergence of the iteration in (3).

Theorem 1: Let $\mu_{\max}$ be the largest eigenvalue of the matrix $A^* A$. If $\varepsilon \le \lambda_k \le (2-\varepsilon)/\mu_{\max}$ for some $\varepsilon > 0$, then the iteration (3) converges to (4), which is the least-squares solution to $A x = b$ that minimizes the norm $\| x - x^{(0)} \|$, $x^{(0)}$ being the initial start of the iteration. Furthermore, the rate at which $\| x^{(k)} - x^{(\infty)} \|$ converges to zero is linear if $\lambda_k$ is a constant.

Proof: By linear algebra theory, $x$ is a least-squares solution of $A x = b$ if and only if $A^* A x = A^* b$. By the definition of $x^{(\infty)}$ in (4), it is easy to verify that

$$A^* A x^{(\infty)} = A^* A A^+ b = A^* \left( A A^+ \right)^* b = \left( A A^+ A \right)^* b = A^* b. \qquad (6)$$

Therefore, $x^{(\infty)}$ is a least-squares solution of $A x = b$. Now, we will show that among all least-squares solutions, $x^{(\infty)}$ minimizes $\| x - x^{(0)} \|$. Suppose $y$ is another least-squares solution, and let $z = y - x^{(\infty)}$; then $A^* A z = 0$. That is, $z$ lies in the null space of $A$. Furthermore, $z$ is orthogonal to $x^{(\infty)} - x^{(0)} = A^+ \left( b - A x^{(0)} \right)$ because

$$z^* A^+ \left( b - A x^{(0)} \right) = \left( \left( A^* \right)^+ z \right)^* \left( b - A x^{(0)} \right) = 0$$

where the last equality results from $(A^+)^* = (A^*)^+$ and the fact that $(A^*)^+$ and $A$ have the same null space. Therefore

$$\left\| y - x^{(0)} \right\|^2 = \left\| z \right\|^2 + \left\| x^{(\infty)} - x^{(0)} \right\|^2$$

which is minimized if and only if $z = 0$, i.e., $y = x^{(\infty)}$.

Next, we shall prove that $x^{(k)}$ converges to $x^{(\infty)}$. Substituting (6) into (3), we get

$$x^{(k+1)} - x^{(\infty)} = \left( I - \lambda_k A^* A \right) \left( x^{(k)} - x^{(\infty)} \right). \qquad (7)$$

Because $A^* A$ is a Hermitian matrix, by the spectral theorem it can be decomposed as

$$A^* A = \sum_i \mu_i u_i u_i^* \qquad (8)$$

where the $u_i$'s are the orthonormal eigenvectors of $A^* A$ and the $\mu_i$'s are their associated eigenvalues, with the property $\mu_i \ge 0$. By substituting (8) into (7), we get

$$x^{(k)} - x^{(\infty)} = \prod_{j=0}^{k-1} \left( I - \lambda_j A^* A \right) \left( x^{(0)} - x^{(\infty)} \right) = \sum_i \left[ \prod_{j=0}^{k-1} \left( 1 - \lambda_j \mu_i \right) \right] u_i u_i^* \left( x^{(0)} - x^{(\infty)} \right) \qquad (9)$$

where the second equality results from the principle of induction (repeatedly applying the first equality) and $u_i^* u_l = \delta_{il}$. Now, if $\mu_i = 0$, then $\prod_{j=0}^{k-1} \left( 1 - \lambda_j \mu_i \right) = 1$. Otherwise, by Lemma 1, $\prod_{j=0}^{k-1} \left( 1 - \lambda_j \mu_i \right) \to 0$ as $k \to \infty$. Therefore, we have

$$\lim_{k \to \infty} \left( x^{(k)} - x^{(\infty)} \right) = \sum_{i:\, \mu_i = 0} u_i u_i^* \left( x^{(0)} - x^{(\infty)} \right). \qquad (10)$$
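As a numerical aside (ours, not part of the original analysis), the limit (4) is easy to check. The following Python sketch runs the basic SimBI iteration (3) on a made-up rank-deficient system, with a constant step size inside the convergence interval, and compares the result with (4); the matrix construction, data, and iteration count are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((6, 6)))
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
s = np.array([2.0, 1.0, 0.5])            # three nonzero singular values (made up)
A = (U[:, :3] * s) @ Q[:3, :]            # 6x8 matrix of rank 3

b = rng.standard_normal(6)               # data, possibly inconsistent
x0 = rng.standard_normal(8)
mu_max = np.linalg.norm(A, 2) ** 2       # largest eigenvalue of A*A
lam = 1.0 / mu_max                       # constant step inside (0, 2/mu_max)

x = x0.copy()
for _ in range(2000):
    x = x + lam * A.T @ (b - A @ x)      # basic SimBI step (3)

Ap = np.linalg.pinv(A)
x_inf = Ap @ b + (np.eye(8) - Ap @ A) @ x0   # the limit (4)
print(np.allclose(x, x_inf))             # True
```

Because the initial error has no component in the null space of $A$, only the nonzero eigenvalues govern the convergence, in agreement with (9) and (10).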


In (10), the $u_i$'s are the eigenvectors whose associated eigenvalues are zero, i.e., $A u_i = 0$. Note that

$$u_i^* \left( x^{(0)} - x^{(\infty)} \right) = -u_i^* A^+ \left( b - A x^{(0)} \right) = -\left( \left( A^* \right)^+ u_i \right)^* \left( b - A x^{(0)} \right) = 0$$

because $u_i$ lies in the null space of $A$, which is also the null space of $(A^*)^+$. Therefore, the right-hand side of (10) is zero.

Finally, the convergence rate of (3) is linear if $\lambda_k = \lambda$ is a constant, because from (9)

$$\left\| x^{(k)} - x^{(\infty)} \right\| \le \rho^k \left\| x^{(0)} - x^{(\infty)} \right\|, \qquad \rho = \max \left( \left| 1 - \lambda \mu_{\min} \right|, \left| 1 - \lambda \mu_{\max} \right| \right) \qquad (11)$$

where $\mu_{\min}$ is the smallest positive eigenvalue of $A^* A$. Equation (11) shows that $\| x^{(k)} - x^{(\infty)} \| \to 0$ linearly with rate $\rho$, provided $0 < \lambda < 2/\mu_{\max}$, so that $\rho < 1$. It also follows that the rate is governed by the spread of the positive eigenvalues of $A^* A$.

We can interpret the solution of the basic SimBI algorithm in (4) as follows. The first term $A^+ b$ is the minimum-norm least-squares solution of the linear system $A x = b$. The second term $(I - A^+ A) x^{(0)}$ is the orthogonal projection of $x^{(0)}$ onto the null space of $A$, which selects the least-squares solution that has the minimum norm $\| x - x^{(0)} \|$.

Now, we are ready to generalize Theorem 1 to the iteration expressed in (2).

Theorem 2: If $\varepsilon \le \lambda_k \le (2-\varepsilon)/\tilde{\mu}_{\max}$ for some $\varepsilon > 0$, then the SimBI iteration (2) converges to

$$x^{(\infty)} = V^{1/2} \tilde{A}^+ W^{1/2} b + V^{1/2} \left( I - \tilde{A}^+ \tilde{A} \right) V^{-1/2} x^{(0)} \qquad (12)$$

where $V^{1/2}$ and $W^{1/2}$ are matrix square roots of $V$ and $W$, i.e., $V = V^{1/2} V^{1/2}$ and $W = W^{1/2} W^{1/2}$, $\tilde{A} = W^{1/2} A V^{1/2}$, and $\tilde{\mu}_{\max}$ is the largest eigenvalue of the matrix $\tilde{A}^* \tilde{A}$. Equation (12) is a solution that minimizes the $W$-weighted square error $\| b - A x \|_W^2 = (b - A x)^* W (b - A x)$, while among all such solutions minimizing the norm $\| x - x^{(0)} \|_{V^{-1}}$. Furthermore, $\| x^{(k)} - x^{(\infty)} \|$ converges to zero linearly if $\lambda_k$ is a constant.

Proof: Since both $V$ and $W$ are positive definite matrices, they can be factorized (square rooted) as $V = V^{1/2} V^{1/2}$ and $W = W^{1/2} W^{1/2}$, where both $V^{1/2}$ and $W^{1/2}$ are themselves positive definite. Then (2) can be rewritten as

$$V^{-1/2} x^{(k+1)} = V^{-1/2} x^{(k)} + \lambda_k V^{1/2} A^* W^{1/2} \left( W^{1/2} b - W^{1/2} A V^{1/2} V^{-1/2} x^{(k)} \right). \qquad (13)$$

By introducing the transformations $\tilde{x} = V^{-1/2} x$ and $\tilde{b} = W^{1/2} b$, we obtain the following iteration for $\tilde{x}$:

$$\tilde{x}^{(k+1)} = \tilde{x}^{(k)} + \lambda_k \tilde{A}^* \left( \tilde{b} - \tilde{A} \tilde{x}^{(k)} \right). \qquad (14)$$

By Theorem 1, $\tilde{x}^{(k)}$ converges to

$$\tilde{x}^{(\infty)} = \tilde{A}^+ \tilde{b} + \left( I - \tilde{A}^+ \tilde{A} \right) \tilde{x}^{(0)}. \qquad (15)$$

Therefore, $x^{(k)} = V^{1/2} \tilde{x}^{(k)}$ converges to $V^{1/2} \tilde{x}^{(\infty)}$, which is expressed in (12). Moreover, since (15) is the least-squares solution to $\tilde{A} \tilde{x} = \tilde{b}$ that minimizes the norm $\| \tilde{x} - \tilde{x}^{(0)} \|$, (12) is the solution to the weighted least-squares problem $\min_x \| b - A x \|_W^2$ that minimizes the norm $\| x - x^{(0)} \|_{V^{-1}}$. Finally, since $\| \tilde{x}^{(k)} - \tilde{x}^{(\infty)} \|$ converges to zero linearly when $\lambda_k$ is a constant according to Theorem 1, it follows that $\| x^{(k)} - x^{(\infty)} \|$ converges to zero linearly.

The weighted least-squares and weighted minimum-norm interpretation of Theorem 2 implies that $W$ and $V$ may be used to model the data and the image, respectively. For example, $V$ may be used to select a preferential image among many possible solutions. It is also worth noting the difference between the solution obtained by iteration (2) and that obtained by the conventional penalized least-squares approach. In conventional penalized least squares, a cost function of the following form is minimized:

$$J(x) = \| b - A x \|_W^2 + \beta \| x - x^{(0)} \|_{V^{-1}}^2 \qquad (16)$$

where $\beta$ controls the tradeoff between data accuracy and image model accuracy. The choice of $\beta$ is often difficult and subjective, and changing the value of $\beta$ requires computing a new solution. On the other hand, the convergence limit of (2) provides first the best (exact or least-squares) fit of the data, and then secondarily the best match to the image model. This is equivalent to taking the limit $\beta \to 0$ in (16). Although iteration (2) tries to fit the data exactly, it does not result in unstable reconstruction because in practice, the iteration stops before achieving exact data fit, and the solution at that point is regularized by the norm $\| x - x^{(0)} \|_{V^{-1}}$. Thus, as the iteration proceeds, we gradually emphasize the measured data and deemphasize the a priori image model.

III. EXAMPLES OF THE WEIGHTING MATRICES

We will now give two examples of specific forms for $V$ and $W$. The first example is the widely used SART algorithm, and the second example introduces a frequency domain iteration incorporating stationary image correlation models.

A. SART and Other Diagonal Weighting Iterative Algorithms

Several well-known simultaneous iterative algorithms such as SART, Cimmino's algorithm, DWE, and CAV can be formulated as special cases with both $V$ and $W$ being diagonal and positive definite. Therefore, their convergence properties can be easily studied by invoking Theorem 2. The similarities of these algorithms were reviewed in [2], but the analysis in [2] is much


more complicated than ours. Given the similarities of these algorithms, we only discuss the SART algorithm here. The term SART was first used by Andersen and Kak [5] in 1984. It takes the iterative form

$$x_j^{(k+1)} = \begin{cases} x_j^{(k)} + \dfrac{\lambda_k}{\sum_i a_{ij}} \displaystyle\sum_i a_{ij} \dfrac{b_i - \sum_l a_{il} x_l^{(k)}}{\sum_l a_{il}}, & \sum_i a_{ij} \ne 0 \\ x_j^{(k)}, & \text{otherwise} \end{cases} \qquad (17)$$

where $i$ and $j$ are indices for the data point and image pixel, respectively, and $a_{ij}$ denotes the $(i,j)$th element of $A$. It can be seen that SART is a special case of (2) with

$$V = \operatorname{diag} \left( \frac{1}{\sum_i a_{ij}} \right), \qquad W = \operatorname{diag} \left( \frac{1}{\sum_j a_{ij}} \right). \qquad (18)$$

Therefore, SART converges to a weighted least-squares and weighted minimum-norm solution with weights expressed in (18), provided $0 < \lambda_k < 2/\tilde{\mu}_{\max}$. For computed tomography problems, where $b$ represents line integrals, it was shown in [2] that $\tilde{\mu}_{\max} \le 1$; thus, SART converges if $0 < \lambda_k < 2$.

B. Image Correlation Modeling and Frequency Domain Iteration

As discussed in Section II, $W$ and $V$ may be used to weight data importance and select a preferential reconstruction image. Generally, the preferential solution is described in terms of the norm $\| x - x^{(0)} \|_{V^{-1}}$, while the explicit expression of $V$ is not readily known. This type of image model creates two difficulties for the implementation of (2). First, the explicit expression for $V$ is necessary to apply (2). Second, if $V$ is not sparse, the iteration in (2) is too computationally expensive. Fortunately, for an important class of models where the $V^{-1}$ norm is spatially shift invariant,¹ the matrices $V^{-1}$ and $V$ are both (block) circulant and, therefore, can be diagonalized by the multidimensional discrete Fourier transform (MD-DFT) matrix $F$ ($F$ is normalized so that $F^* F = I$).

Consider a simple example where $x$ represents a length-$N$ 1-D image. Assume that we prefer a solution minimizing gray level differences between neighboring pixels, and start with the initial condition $x^{(0)} = 0$. Then, the norm $\| x \|_{V^{-1}}^2$ could be expressed as

$$\| x \|_{V^{-1}}^2 = \sum_{n=1}^{N} \left( x_n - x_{n+1} \right)^2 + \epsilon \sum_{n=1}^{N} x_n^2 \qquad (19)$$

(with the circular convention $x_{N+1} = x_1$), where $\epsilon$ is a small positive number guaranteeing positive definiteness. It is easy to see that (19) implies the following explicit form for $V^{-1}$:

$$V^{-1} = \begin{bmatrix} 2+\epsilon & -1 & 0 & \cdots & 0 & -1 \\ -1 & 2+\epsilon & -1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ -1 & 0 & 0 & \cdots & -1 & 2+\epsilon \end{bmatrix}. \qquad (20)$$

Columns of $V^{-1}$ are circular shifts of the vector $(2+\epsilon, -1, 0, \ldots, 0, -1)^T$. By the discrete convolution theorem, a circulant matrix is diagonalized by the DFT matrix [22], [23]

$$V^{-1} = F^* \Lambda F \qquad (21)$$

where $\Lambda$ is diagonal and $F$ denotes the normalized DFT matrix as defined by

$$F_{mn} = \frac{1}{\sqrt{N}} e^{-j 2 \pi m n / N}, \qquad m, n = 0, 1, \ldots, N-1. \qquad (22)$$

¹Strictly speaking, the model should be invariant under spatial circular shift. However, for practical size images, the circular boundary condition has little effect.
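To make the diagonal weighting concrete, the following Python sketch (ours; the small nonnegative matrix, data, and iteration count are made up) runs iteration (2) with the SART weights (18) at $\lambda = 1$ and checks the result against the weighted limit (12) of Theorem 2:

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0, 0.5],
              [0.0, 1.0, 1.0, 0.5],
              [1.0, 0.0, 1.0, 1.0]])    # made-up nonnegative, line-integral-like
b = np.array([1.0, 2.0, 3.0])
x0 = np.zeros(4)

V = np.diag(1.0 / A.sum(axis=0))        # (18): inverse column sums
W = np.diag(1.0 / A.sum(axis=1))        # (18): inverse row sums
x = x0.copy()
for _ in range(5000):
    x = x + V @ A.T @ W @ (b - A @ x)   # SART step, lambda = 1

# Theorem 2 limit (12) through the transformed matrix A~ = W^(1/2) A V^(1/2)
Vh, Wh = np.sqrt(V), np.sqrt(W)         # square roots of the diagonal weights
At = Wh @ A @ Vh
Atp = np.linalg.pinv(At)
x_inf = Vh @ (Atp @ Wh @ b + (np.eye(4) - Atp @ At) @ np.linalg.inv(Vh) @ x0)
print(np.allclose(x, x_inf))            # True
```

Since this toy system has full row rank, the limit fits the data exactly, and the weighting only selects which exact solution is reached.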

It follows that $V$ is also circulant and diagonalized by $F$

$$V = F^* \Lambda^{-1} F. \qquad (23)$$

Equation (23) suggests a frequency domain implementation of iteration (2). Specifically, multiplying (2) with $F$ yields (with $W = I$)

$$F x^{(k+1)} = F x^{(k)} + \lambda_k \Lambda^{-1} F A^* \left( b - A x^{(k)} \right). \qquad (24)$$

Let $X^{(k)} = F x^{(k)}$ be the frequency domain image; then

$$X^{(k+1)} = X^{(k)} + \lambda_k \Lambda^{-1} \left( F A^* \right) \left( b - A F^* X^{(k)} \right). \qquad (25)$$

This frequency domain iteration requires the computation of $F A^*$, which is the row-wise DFT of $\bar{A}$ (the conjugate of $A$) and can be implemented with the fast Fourier transform (FFT). For an $N$-point radix-2 FFT, a total of $(N/2) \log_2 N$ multiplications are required [24]. Furthermore, since $\Lambda^{-1}$ is diagonal, computation of $\Lambda^{-1} y$ for an $N$-vector $y$ requires only $N$ multiplications. This compares very favorably against the $N^2$ multiplications for direct computation of $V y$, and makes the frequency domain implementation much more computationally affordable than direct implementation of (2).

Obviously, when the image is 2-D or higher, the same frequency domain technique can be applied, with the formulations (19)–(23) modified correspondingly. For example, in the 2-D case, (19) is modified to include the 2-D neighbors, $F$ becomes the 2-D DFT matrix, while $V^{-1}$ and $V$ are now doubly block circulant [25]. The complexity analysis does not change as long as $N$ is interpreted as the number of pixels. Simulation results of the frequency domain iterative reconstruction algorithm have been reported in [26], where the algorithm was demonstrated to reduce certain noise and artifacts in the reconstructed image.

IV. DISCUSSIONS AND CONCLUSIONS

We have established the convergence properties of the generalized SimBI algorithm, including the limit of convergence and the rate of convergence. A close examination of (2) suggests that $W$ and $V$ may be regarded as linear filters operating on the prediction error and back-projection image, respectively. Generally, these filters are not shift invariant, but in case they are, they can be implemented in the frequency domain for faster computation. With the filter interpretation of $W$ and $V$, it would be interesting to study whether they could be replaced by nonlinear filters (e.g., median filters), and what kind of effects such filters have on the reconstruction.

APPENDIX

We will prove Lemma 1 used in Section II. To do this, we first cite a result obtained in [27].

Lemma 0: Assume that $\{\varepsilon_k\}$, $\{\lambda_k\}$, and $\{a_k\}$ are sequences of nonnegative numbers satisfying

$$a_{k+1} \le \left( 1 - \varepsilon_k \lambda_k \right) a_k \qquad (26)$$

for all $k$, and that $\sum_k \varepsilon_k \lambda_k = \infty$. Then 0 is a cluster point of $\{a_k\}$.

Using this preliminary result, we can prove the desired lemma.

Lemma 1: If for some positive number $\varepsilon$ we have $\varepsilon \le \lambda_k \le (2-\varepsilon)/\mu$ for any positive integer $k$, and $\mu > 0$, then $\prod_{j=0}^{k-1} \left( 1 - \lambda_j \mu \right) \to 0$ when $k \to \infty$.

Proof: From $\varepsilon \le \lambda_k \le (2-\varepsilon)/\mu$ and $\mu > 0$, it follows that $\varepsilon - 1 \le 1 - \lambda_k \mu \le 1 - \varepsilon \mu$ for any positive integer $k$, so that $\left| 1 - \lambda_k \mu \right| \le 1 - \min(\varepsilon, \varepsilon \mu) < 1$. Define the sequence $a_k = \prod_{j=0}^{k-1} \left| 1 - \lambda_j \mu \right|$; it satisfies

$$a_{k+1} = \left| 1 - \lambda_k \mu \right| a_k. \qquad (27)$$

Meanwhile

$$\left| 1 - \lambda_k \mu \right| \le 1 - \min(\varepsilon, \varepsilon \mu)$$

which leads to

$$a_{k+1} \le \left( 1 - \min(\varepsilon, \varepsilon \mu) \right) a_k.$$

Let $\varepsilon_k = \min(\varepsilon, \varepsilon \mu)/\lambda_k$, so that $\varepsilon_k \lambda_k = \min(\varepsilon, \varepsilon \mu)$ and $\sum_k \varepsilon_k \lambda_k = \infty$; invoking Lemma 0, we conclude that 0 is a cluster point of $\{a_k\}$. Furthermore, because $a_k \ge 0$ and $\{a_k\}$ is monotonically decreasing, $\{a_k\}$ converges and its limit coincides with its cluster point, that is

$$\lim_{k \to \infty} \prod_{j=0}^{k-1} \left( 1 - \lambda_j \mu \right) = 0.$$
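As a closing illustration of the frequency domain iteration of Section III-B (our sketch; the system matrix, smoothness parameter $\epsilon$, and step size are made up), the following Python code builds the circulant $V^{-1}$ of (20), obtains its eigenvalues by an FFT of its first column as in (21)–(22), and verifies that the frequency domain update (25) tracks the direct iteration (2) with $W = I$:

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, eps = 8, 5, 0.1
A = rng.standard_normal((M, N))           # made-up measurement matrix
b = rng.standard_normal(M)

# circulant V^{-1} from (20); eigenvalues = unnormalized DFT of first column
first_col = np.zeros(N)
first_col[0], first_col[1], first_col[-1] = 2.0 + eps, -1.0, -1.0
Lam = np.fft.fft(first_col).real          # real and positive for this model
F = np.fft.fft(np.eye(N), norm="ortho")   # normalized (symmetric) DFT matrix
V = (F.conj().T @ np.diag(1.0 / Lam) @ F).real

lam = 1.0 / np.linalg.norm(V @ A.T @ A, 2)   # safe constant step size
x = np.zeros(N)                 # direct iteration (2) with W = I
X = np.zeros(N, dtype=complex)  # frequency domain image, X = F x
for _ in range(100):
    x = x + lam * V @ (A.T @ (b - A @ x))
    xk = np.fft.ifft(X, norm="ortho").real              # back to image domain
    X = X + lam * np.fft.fft(A.T @ (b - A @ xk), norm="ortho") / Lam  # (25)
print(np.allclose(np.fft.ifft(X, norm="ortho").real, x))   # True
```

Note that applying $V$ in the frequency domain costs one FFT pair plus $N$ divisions by the eigenvalues, which is the complexity advantage discussed after (25).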

REFERENCES

[1] M. Jiang and G. Wang, “Convergence of the simultaneous algebraic reconstruction technique (SART),” IEEE Trans. Image Process., vol. 12, no. 8, pp. 957–961, Aug. 2003.
[2] M. Jiang and G. Wang, “Convergence studies on iterative algorithms for image reconstruction,” IEEE Trans. Med. Imag., vol. 22, no. 5, pp. 569–579, May 2003.
[3] R. Gordon, R. Bender, and G. T. Herman, “Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography,” J. Theoret. Biol., vol. 29, pp. 471–481, 1970.
[4] Y. Censor and S. A. Zenios, Parallel Optimization: Theory, Algorithms and Applications. New York: Oxford, 1997.
[5] A. H. Andersen and A. C. Kak, “Simultaneous algebraic reconstruction technique (SART): A superior implementation of the ART algorithm,” Ultrasonic Imag., vol. 6, pp. 81–94, Jan. 1984.
[6] G. Cimmino, “Calcolo approssimato per le soluzioni dei sistemi di equazioni lineari,” La Ricerca Sci. XVI, ser. II, pp. 326–333, 1938.
[7] L. Landweber, “An iteration formula for Fredholm integral equations of the first kind,” Amer. J. Math., vol. 73, pp. 615–624, 1951.
[8] Y. Censor and T. Elfving, “Block-iterative algorithms with diagonally scaled oblique projections for the linear feasibility problem,” SIAM J. Matrix Anal. Appl., vol. 24, pp. 40–58, 2002.
[9] Y. Censor, D. Gordon, and R. Gordon, “Component averaging: An efficient iterative parallel algorithm for large and sparse unstructured problems,” Parallel Comput., vol. 27, pp. 777–808, 2001.
[10] K. Tanabe, “Projection method for solving a singular system of linear equations and its applications,” Numer. Math., vol. 17, pp. 203–214, 1971.
[11] P. P. B. Eggermont, G. T. Herman, and A. Lent, “Iterative algorithms for large partitioned linear systems, with applications to image reconstruction,” Linear Algebra Appl., vol. 40, pp. 37–67, 1981.
[12] Y. Censor, P. P. B. Eggermont, and D. Gordon, “Strong underrelaxation in Kaczmarz’s method for inconsistent systems,” Numer. Math., vol. 41, pp. 83–92, 1983.
[13] H. H. Bauschke, J. M. Borwein, and A. S. Lewis, “The method of cyclic projections for closed convex sets in Hilbert space,” in Proc. Recent Developments in Optimization Theory and Nonlinear Analysis, Jerusalem, Israel, 1997, vol. 204, pp. 1–38.
[14] H. H. Bauschke and M. R. Edwards, “A conjecture by De Pierro is true for translates of regular subspaces,” J. Nonlinear Convex Anal., vol. 6, pp. 93–116, 2005.
[15] G. Wang, D. L. Snyder, J. A. O’Sullivan, and M. W. Vannier, “Iterative deblurring for CT metal artifact reduction,” IEEE Trans. Med. Imag., vol. 15, no. 5, pp. 657–664, Oct. 1996.
[16] G. Wang, M. W. Vannier, M. W. Skinner, M. G. P. Cavalcanti, and G. Harding, “Spiral CT image deblurring for cochlear implantation,” IEEE Trans. Med. Imag., vol. 17, no. 2, pp. 251–262, Apr. 1998.
[17] G. Wang, G. D. Schweiger, and M. W. Vannier, “An iterative algorithm for X-ray CT fluoroscopy,” IEEE Trans. Med. Imag., vol. 17, no. 5, pp. 853–856, Oct. 1998.
[18] K. Mueller and R. Yagel, “Anti-aliased three-dimensional cone-beam reconstruction of low-contrast objects with algebraic methods,” IEEE Trans. Med. Imag., vol. 18, no. 6, pp. 519–537, Jun. 1999.
[19] P. L. Combettes and V. R. Wajs, “Signal recovery by proximal forward-backward splitting,” SIAM J. Multiscale Model. Simul., vol. 4, pp. 1168–1200, 2005.
[20] R. Penrose, “A generalized inverse for matrices,” in Proc. Cambridge Philosophical Soc., 1955, vol. 51, pp. 406–413.
[21] A. Albert, Regression and the Moore-Penrose Pseudoinverse. New York: Academic, 1972, p. 30.
[22] T. Bose, Digital Signal and Image Processing. New York: Wiley, 2004, p. 692.
[23] R. M. Gray, “Toeplitz and Circulant Matrices: A Review,” Tech. Rep., Inf. Syst. Lab., Stanford Univ., Stanford, CA, 1971 [Online]. Available: http://ee.stanford.edu/~gray/toeplitz.pdf
[24] B. Porat, A Course in Digital Signal Processing. New York: Wiley, 1997, p. 141.


[25] A. K. Jain, Fundamentals of Digital Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1989, pp. 28–30.
[26] J. Wang and Y. Zheng, “Frequency domain simultaneous algebraic reconstruction techniques: Algorithm and convergence,” Proc. SPIE, vol. 5674, pp. 344–353, 2005.
[27] M. R. Trummer, “Reconstructing pictures from projections: On the convergence of the ART algorithm with relaxation,” Computing, vol. 26, pp. 189–195, 1981.

Jiong Wang was born in Shanghai, China, in 1980. He received the B.S. degree in electrical engineering from Shanghai Jiaotong University in 2002 and the M.S. degree from the University of Virginia, Charlottesville, in 2005, where he is currently pursuing the Ph.D. degree in the Department of Electrical Engineering. From May 2002 to August 2003, he was a Product Engineer with Intel Technology China, Shanghai. His research interests include statistical signal processing, medical imaging, and image reconstruction and analysis.

Yibin Zheng (M’92–SM’01) received the B.S. degree in electrical engineering from Zhongshan University, Guangzhou, China, in 1988, the M.A. degree in physics from the State University of New York at Buffalo in 1992, and the Ph.D. degree in electrical and computer engineering from Purdue University, West Lafayette, IN, in 1996. From July 1996 to August 2000, he was a Senior Electrical Engineer with GE Global Research Center, Niskayuna, NY. Since August 2000, he has been on the faculty of the Department of Electrical and Computer Engineering, University of Virginia, Charlottesville. His primary research interests are computational imaging, statistical signal processing, and high-performance scientific computing.