A Framelet-Based Approach for Image Inpainting

Raymond Chan, Lixin Shen, and Zuowei Shen

Abstract— In this paper, we present an image inpainting algorithm based on framelet analysis. The motivation for using framelets is that the redundancy provided by the framelets makes it possible to propagate the accurate information from the vicinity of the region to be inpainted to the inside of the region to be inpainted. This is done by small perturbations of the framelet coefficients via thresholding operators. Experimental results show that our proposed algorithm is fast and effective.

Index Terms - Image inpainting, framelets

I. INTRODUCTION

Image inpainting refers to the filling-in of missing data in digital images based on the information available in the observed region. Applications of this technique include film restoration, text or scratch removal, and digital zooming.

Many successful inpainting algorithms have been proposed in this active area of research. Bertalmio, Sapiro, Caselles, and Ballester [1] first introduced the term digital inpainting into the field and proposed a novel third-order PDE inpainting model. An early variational approach was proposed by Masnou and Morel [12], who interpolate the data using an elastica-based variational inpainting model. Chan and Shen have proposed the total variation inpainting model [5], a PDE inpainting model based on curvature-driven diffusions [4], and an Euler's elastica and curvature-based inpainting model [3]. The PDE-based inpainting approach in [1] fills the missing region by propagating information from the outside of the masked region along isophotes, while the total variation inpainting methods [3], [4], and [5] extend level sets into the missing region without smearing discontinuities along the tangential direction of the boundary of the missing regions. Their implementations depend on the numerical PDE methods used.

In this paper, we develop a framelet-based approach for image inpainting. It is well known that errors in a signal can be reduced when a redundant system is used to represent it (see e.g. [8]). It was also pointed out in [11] that redundancy is useful because it leads to a robust signal representation in which partial loss of data can be tolerated without adverse effects. In fact, discrete Fourier transform frames have been applied to a number of problems in signal reconstruction, error-controlled coding, fault tolerance, and spectrum analysis in [11].
Since images can be modeled as piecewise smooth functions and framelets have sparse representations for such functions, we are motivated to develop new algorithms based on tight framelet systems, which, like orthonormal wavelet systems, have good localization in both the time and frequency domains, but are redundant. Our framelet method has a built-in regularization effect in the sense that the framelet coefficients it produces have the smallest $\ell_2$ norm. This effect allows the framelet coefficients from outside of $D$, the region to be inpainted, to affect the missing framelet coefficients inside $D$ in a smooth way. Our method gives images that are 2 to 3 dB better in PSNR than those obtained in [4] and [5], and is more than 10 times faster.

The outline of this paper is as follows. In Section II, we give our algorithm. In Section III, we compare our method with those in [4] and [5]. Finally, in Section IV, we conclude the paper.

Raymond Chan is with the Department of Mathematics, The Chinese University of Hong Kong, Shatin, Hong Kong, China. Email: [email protected]. This work was supported by HKRGC Grant CUHK 400503 and CUHK DAG 2060257. Lixin Shen is with the Department of Mathematics, Western Michigan University, Kalamazoo, MI 49008. Email: [email protected]. Zuowei Shen is with the Department of Mathematics, National University of Singapore, 2 Science Drive 2, Singapore 117543. Email: [email protected]. Research supported in part by several grants at the National University of Singapore.

II. THE INPAINTING ALGORITHM

A. The piecewise cubic tight framelet system

The study of compactly supported (bi)orthogonal wavelet bases of arbitrarily high smoothness has attracted wide attention since Daubechies' celebrated work [7]. Tight framelets generalize orthogonal wavelet systems. They preserve the unitary property of the related analysis and synthesis operators, while sacrificing the orthogonality and the linear independence of the system in order to gain more flexibility. The simplest 'wavelet type' tight frame system that is good for signal and image processing [6] is the translation-invariant wavelet system, which is obtained by oversampling an orthonormal wavelet system to form a redundant system. Although any kind of redundant system can be adopted in our proposed algorithm for image inpainting, we use the spline framelet systems constructed from the unitary extension principle of [14], since they are either symmetric or anti-symmetric and have small supports for a given smoothness order. In other words, they have good time-frequency localization.
In particular, we use the piecewise cubic tight framelet system of [14], as it is simple and efficient and serves our goal well. Define $\tau_0(\omega) = \cos^{2m}(\omega/2)$. The trigonometric polynomial $\tau_0$ is the refinement symbol of the B-spline $\phi$ of order $2m$, whose Fourier transform is

$$\hat{\phi}(\omega) = \left(\frac{\sin(\omega/2)}{\omega/2}\right)^{2m}.$$

The wavelets are given by $\hat{\psi}_n(\omega) = \tau_n(\omega/2)\,\hat{\phi}(\omega/2)$ with wavelet masks

$$\tau_n(\omega) = \sqrt{\binom{2m}{n}}\,\sin^{n}(\omega/2)\,\cos^{2m-n}(\omega/2)$$

for $1 \le n \le 2m$. It is shown in [14] that the system

$$S = \{2^{k/2}\psi_n(2^k \cdot - j) : k, j \in \mathbb{Z};\ n = 1, \ldots, 2m\}$$

is a tight framelet system. The choice of $m = 2$ gives the so-called piecewise cubic tight framelet system.
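As an illustration of how the masks above turn into digital filters, the following sketch expands each mask into filter taps for $m = 2$. It assumes the masks are taken as $\tau_n(\omega) = \sqrt{\binom{2m}{n}}\,\sin^{n}(\omega/2)\cos^{2m-n}(\omega/2)$, i.e. with the same half-angle argument as $\tau_0$, and it drops the unit-modulus phase factor of each sine term, so a filter is recovered only up to sign. The function name `framelet_filters` is hypothetical and not taken from the paper.

```python
from math import comb

import numpy as np


def framelet_filters(m=2):
    """Expand tau_n, n = 0..2m, into real filter taps (each up to a sign)."""
    filters = []
    for n in range(2 * m + 1):
        taps = np.array([1.0])
        for _ in range(n):
            # Each sin(w/2) factor contributes the difference [1, -1] / 2
            # (the unit-modulus phase factor is dropped).
            taps = np.convolve(taps, [0.5, -0.5])
        for _ in range(2 * m - n):
            # Each cos(w/2) factor contributes the average [1, 1] / 2.
            taps = np.convolve(taps, [0.5, 0.5])
        filters.append(np.sqrt(comb(2 * m, n)) * taps)
    return filters


filters = framelet_filters(m=2)
for n, h in enumerate(filters):
    print(n, np.round(16 * h, 4))  # taps scaled by 16 for readability

# Tight-frame check: the squared magnitudes of the masks sum to one.
omega = np.linspace(-np.pi, np.pi, 257)
shifts = np.arange(-2, 3)
E = np.exp(-1j * np.outer(omega, shifts))
total = sum(np.abs(E @ h) ** 2 for h in filters)
print(np.allclose(total, 1.0))
```

The final check reflects the identity $\sum_{n=0}^{2m} |\tau_n(\omega)|^2 = (\cos^2(\omega/2) + \sin^2(\omega/2))^{2m} = 1$, which is one of the conditions behind the tight frame property.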

When only digital samples are given and the tight frame system is obtained via the unitary extension principle of [14], both the analysis and synthesis operators can be implemented by the set of filters associated with the framelet system. This is very similar to the implementation of orthonormal wavelet systems. The details of the implementation can be found in [9]. For the piecewise cubic tight framelet system, the filters are

$$h_0 = \tfrac{1}{16}[1, 4, 6, 4, 1], \quad h_1 = \tfrac{1}{8}[1, 2, 0, -2, -1], \quad h_2 = \tfrac{\sqrt{6}}{16}[-1, 0, 2, 0, -1],$$

$$h_3 = \tfrac{1}{8}[-1, 2, 0, -2, 1], \quad h_4 = \tfrac{1}{16}[1, -4, 6, -4, 1].$$

As we can see, every filter here is either symmetric or anti-symmetric, which is desirable for image processing. The tensor products of the piecewise cubic tight framelet system produce a tight framelet system in $L_2(\mathbb{R}^2)$ with masks defined by $\tau_{i,j}(\omega_1, \omega_2) = \tau_i(\omega_1)\tau_j(\omega_2)$ for $i, j \in \{0, \ldots, 4\}$. Accordingly, the filters are $h_{ij} = h_i^t h_j$ for $i, j \in \{0, \ldots, 4\}$.

Before we go to the next subsection, we define the factors $\alpha_j$ as the sum of the absolute values of the filter taps in $h_j$, which is an upper bound for $\max_\omega |\tau_j(\omega)|$. The factors are used in our algorithms to scale the thresholds so that all framelet coefficients in the different framelet subbands at the same decomposition level are processed with the same threshold. For the piecewise cubic tight framelet filters, $\alpha_0 = 1$, $\alpha_1 = \tfrac{3}{4}$, $\alpha_2 = \tfrac{\sqrt{6}}{4}$, $\alpha_3 = \tfrac{3}{4}$, and $\alpha_4 = 1$. The factors for the corresponding tensor product filters are $\alpha_{ij} = \alpha_i \alpha_j$.

B. Framelet inpainting

We now describe the idea for image inpainting using framelets. There are three steps:

(i) The image to be inpainted is first transformed into the framelet domain via the analysis operator $A$, so that the image is represented by the set of framelet coefficients.
(ii) To propagate the information from the outside of $D$ into $D$, the framelet coefficients are perturbed by a thresholding operator.
(iii) An image is then obtained by the synthesis operator $A^*$ using the perturbed framelet coefficients.

To have a smooth transition of pixel values from the outside of $D$ to the inside of $D$, we iteratively apply (i)–(iii) until convergence.

Since $S$ is a tight frame, $A^*A = I$, the identity operator. However, $AA^* \neq I$ for our $S$ (see e.g. [14]). This is crucial in our method, as we now explain. Let $f$ be a function given by its perturbed coefficient sequence $c := \{c_g\}_{g \in S}$ in Step (iii) above, i.e.

$$f = A^* c = \sum_{g \in S} c_g\, g.$$
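The two operator identities can be checked numerically on a small example. The sketch below is a minimal illustration under stated assumptions, not the implementation of [9]: it uses a single-level, undecimated, one-dimensional transform with periodic boundary conditions, builds the analysis operator as an explicit matrix from the five piecewise cubic filters, and the names `circulant`, `A`, and `P` are hypothetical.

```python
import numpy as np

# Piecewise cubic tight framelet filters (five taps each, centered at tap 2).
H = [
    np.array([1, 4, 6, 4, 1]) / 16,
    np.array([1, 2, 0, -2, -1]) / 8,
    np.sqrt(6) / 16 * np.array([-1, 0, 2, 0, -1]),
    np.array([-1, 2, 0, -2, 1]) / 8,
    np.array([1, -4, 6, -4, 1]) / 16,
]


def circulant(h, n):
    """n x n matrix applying a centered 5-tap filter with periodic wrap-around."""
    C = np.zeros((n, n))
    for k, tap in enumerate(h):
        C += tap * np.roll(np.eye(n), k - 2, axis=1)
    return C


n = 16
A = np.vstack([circulant(h, n) for h in H])  # analysis operator, shape (5n, n)
P = A @ A.T                                  # the operator A A^*

print(np.allclose(A.T @ A, np.eye(n)))       # A^* A = I: perfect reconstruction
print(np.allclose(P, np.eye(5 * n)))         # A A^* = I fails: the system is redundant
print(np.allclose(P @ P, P))                 # A A^* is a projection

# Applying A A^* to an arbitrary coefficient vector keeps the synthesized
# signal and does not increase the l2 norm of the coefficients.
rng = np.random.default_rng(0)
c = rng.standard_normal(5 * n)
print(np.allclose(A.T @ (P @ c), A.T @ c))
print(np.linalg.norm(P @ c) <= np.linalg.norm(c))
```

Because $A^*A = I$, the product $AA^*$ satisfies $(AA^*)^2 = A(A^*A)A^* = AA^*$, so it is an orthogonal projection onto the range of $A$ rather than the identity.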

In Step (i) of the next iteration, when we apply the analysis operator $A$ to $f$, the sequence $Af = AA^*c \neq c$ in general. At the same time, since $f = A^*Af$, $f$ can also be represented by the sequence $Af$. (We note that $f$ can be represented by two different sequences under $S$, as $S$ is redundant.) Frame theory tells us that $Af$ has the minimum $\ell_2$ norm among all sequences $\{c_g\}_{g \in S}$ such that $f = \sum_{g \in S} c_g\, g$ (see e.g. [13]). This means that when the operator $AA^*$ acts on the perturbed sequence $\{c_g\}_{g \in S}$, it gives a new sequence that also represents $f$ but has the minimum $\ell_2$ norm. In particular, the process smooths and regularizes $f$. Through the process, information contained in the framelet coefficients from outside of $D$ can propagate smoothly to the missing framelet coefficients inside $D$. The analysis and synthesis operators can be implemented efficiently via the decomposition and reconstruction algorithms in discrete form, as illustrated in [9].

The simplest and fastest perturbation operator is the thresholding operator, which we introduce next. Let $x$ be a framelet coefficient of the underlying image. The hard thresholding operator (see [10]) is defined by

$$\eta_T(x) = \begin{cases} 0, & |x| \le T, \\ x, & \text{otherwise}, \end{cases} \tag{1}$$

where $T$ is a threshold level. We note that the thresholding operator suppresses the small framelet coefficients completely. Since tight framelet systems can represent images sparsely (see e.g. [2]), the small framelet coefficients carry little significant information about the underlying signal, while random noise normally spreads out over the small framelet coefficients. By thresholding the framelet coefficients, we obtain two extra benefits: first, it removes the random noise in the image, and second, it enhances the edges and high-frequency features of the previous approximations. This is exactly what we want, since images are bound to contain noise and we want to keep the edges and details in the image.

C. The algorithms

In the following we give our algorithms.

Algorithm 1:
1. Let $f^{(0)}$ be the given image. Let $\Omega$ be the image domain and $D$ be the missing region to be inpainted. That is, we have the available information $f^{(0)}\chi_{\Omega \setminus D}$, where $\chi$ is the characteristic function. Let $h_{ij}$, $i, j \in \{0, \ldots, 4\}$, be the tight framelet filters for the piecewise cubic tight framelet system. Let $T$ be the threshold and $\epsilon$ be the error tolerance.
2. Decompose the image $f^{(r)}$ with the tight framelet filters. Let $f^{(r)}_{ij}$ be the framelet coefficients from the filter $h_{ij}$, $(i, j) \neq (0, 0)$. Threshold every coefficient in $f^{(r)}_{ij}$ by (1) with threshold $\alpha_{ij} \cdot T$. Reconstruct the image $\tilde{f}^{(r+1)}$ from the modified framelet coefficients.
3. Set $f^{(r+1)} = \tilde{f}^{(r+1)}\chi_D + f^{(0)}\chi_{\Omega \setminus D}$.
4. Compute the relative error $\|f^{(r+1)} - f^{(r)}\| / \|f^{(r+1)}\|$. If it is less than $\epsilon$, stop; otherwise go back to Step 2.

We note that in Step 3, the pixel values outside of $D$ are reset to their original values, since they may have been changed by Step 2.

One remaining issue in Algorithm 1 is how to choose the threshold $T$. As mentioned before, the large framelet coefficients keep the edges and details of the underlying image. The amplitudes of the framelet coefficients in the neighborhood of $D$ are relatively large, partly because they contain the information of the missing data and partly because of the artifacts created by the missing data. If the threshold $T$ is too small, most of these coefficients will remain unchanged and the artifacts will not be removed. If the threshold $T$ is too large, many small features of the image will disappear. To resolve this, we apply Algorithm 1 several times with decreasing $T$'s.
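The steps of Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration under several assumptions that are not taken from the paper: the transform is a single-level, undecimated tensor-product framelet transform with periodic boundaries computed via the FFT, the stopping rule is read as the relative change between successive iterates, and the names `responses`, `threshold_step`, `algorithm1`, and `missing`, as well as the `max_iter` safeguard, are hypothetical.

```python
import numpy as np

# Piecewise cubic tight framelet filters (Section II-A).
H = [
    np.array([1, 4, 6, 4, 1]) / 16,
    np.array([1, 2, 0, -2, -1]) / 8,
    np.sqrt(6) / 16 * np.array([-1, 0, 2, 0, -1]),
    np.array([-1, 2, 0, -2, 1]) / 8,
    np.array([1, -4, 6, -4, 1]) / 16,
]
ALPHA = [np.abs(h).sum() for h in H]  # sum of absolute taps of each filter


def responses(n):
    """DFT of each filter on a periodic grid of length n >= 5 (center at index 0)."""
    out = []
    for h in H:
        kernel = np.zeros(n)
        kernel[np.arange(-2, 3) % n] = h
        out.append(np.fft.fft(kernel))
    return out


def threshold_step(f, T):
    """Step 2: analysis, hard thresholding with alpha_ij * T, and synthesis."""
    Hr, Hc = responses(f.shape[0]), responses(f.shape[1])
    F = np.fft.fft2(f)
    recon = np.zeros_like(F)
    for i in range(5):
        for j in range(5):
            Hij = np.outer(Hr[i], Hc[j])
            coef = np.fft.ifft2(F * Hij).real            # subband f_ij
            if (i, j) != (0, 0):                         # low-pass band is kept
                thr = ALPHA[i] * ALPHA[j] * T
                coef = np.where(np.abs(coef) <= thr, 0.0, coef)
            recon += np.fft.fft2(coef) * np.conj(Hij)    # adjoint filter
    return np.fft.ifft2(recon).real


def algorithm1(f0, missing, T, eps=1e-4, max_iter=500):
    """Iterate Steps 2-4 for one threshold T; `missing` is True on D."""
    f0 = np.asarray(f0, dtype=float)
    f = f0
    for _ in range(max_iter):
        f_new = np.where(missing, threshold_step(f, T), f0)      # Step 3
        err = np.linalg.norm(f_new - f) / np.linalg.norm(f_new)  # Step 4
        f = f_new
        if err < eps:
            break
    return f


# Demo 1: constant image with a 3 x 3 hole and a very large threshold.
missing = np.zeros((32, 32), dtype=bool)
missing[14:17, 14:17] = True
f0 = np.where(missing, 0.0, 100.0)
flat = algorithm1(f0, missing, T=1000.0)

# Demo 2: smooth periodic image, thresholds decreasing by factors of two.
x = np.arange(32)
img = 128 + 60 * np.outer(np.sin(2 * np.pi * x / 32), np.cos(2 * np.pi * x / 32))
g = np.where(missing, 0.0, img)
for T in (32, 16, 8, 4, 2, 1):
    g = algorithm1(g, missing, T)
```

In the first demo the threshold exceeds every high-pass coefficient, so each iteration reduces to low-pass averaging and the hole fills with approximately the surrounding constant value. The second demo only illustrates the decreasing-threshold strategy described above; with a single decomposition level, the quality of the fill should not be taken as representative of the results reported in the paper.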

CHAN, SHEN AND SHEN: A FRAMELET-BASED APPROACH FOR IMAGE INPAINTING

Algorithm 2:
1. Let $g^{(0)} = f^{(0)}$ be the given image and set $j = 0$. Define the $J + 1$ thresholds $T_j = 2^{J-j}$, $j = 0, 1, \ldots, J$.
2. Perform Algorithm 1 with starting image $g^{(j)}$ and threshold $T = T_j$. The output of Algorithm 1 is denoted by $g^{(j+1)}$.
3. Set $j = j + 1$. Repeat Step 2 until $j > J$.

In Algorithm 2, $T_0$ is large compared to the other $T_j$'s. The output $g^{(1)}$ is a coarse approximation of the image to be recovered, in the sense that large-scale information outside $D$ has been propagated into the region $D$. To refine the coarse approximation, the decreasing sequence of thresholds $T_j$ is chosen so that more detailed information is propagated into the region $D$ in each iteration.

III. SIMULATIONS

We now present experimental results for Algorithm 2. Regarding the parameters, we choose $J = 5$ and the error tolerance $\epsilon = 0.0001$. All images shown were generated by Matlab on a 2.53 GHz Pentium IV machine with 256 MB of memory running Windows XP. For comparison purposes, we also include the results of Chan and Shen's curvature-driven diffusion (CDD) inpainting model [4], with Matlab codes provided by the authors.

The captions of Figures 1–3 show the performance of both methods. They indicate that (i) the text on the images is successfully removed by both methods; (ii) the PSNR of the images from our algorithm is 2 to 3 dB better than that of the images from the CDD model; and (iii) our algorithm is over 10 times faster than the CDD model. We emphasize, however, that neither the CDD codes nor ours are optimized, so the timing is only a rough comparison.

IV. CONCLUSION

In this paper we have proposed a tight framelet-based image inpainting algorithm. The basic idea is to iteratively squeeze information from the outside into the inside of the region to be inpainted by using the redundancy of the tight frame system via perturbing the framelet coefficients. Our algorithm is effective, simple, fast, and easy to implement.

Acknowledgment: We thank Profs. Tony Chan and Jackie Shen for their valuable discussions and for allowing us to use their codes for comparison.

REFERENCES

[1] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester. Image inpainting. In Proceedings of SIGGRAPH, New Orleans, LA, 2000.
[2] L. Borup, R. Gribonval, and M. Nielsen. Bi-framelet systems with few vanishing moments characterize Besov spaces. Appl. Comput. Harmon. Anal., 17(1):3–28, 2004.
[3] T. Chan, S. H. Kang, and J. Shen. Euler's elastica and curvature-based inpainting. SIAM J. Applied Mathematics, 63:564–592, 2002.
[4] T. Chan and J. Shen. Nontexture inpainting by curvature driven diffusion (CDD). J. Visual Comm. Image Rep., 12:436–449, 2001.
[5] T. Chan and J. Shen. Mathematical models for local nontexture inpaintings. SIAM J. Applied Mathematics, 62:1019–1043, 2001.
[6] R. Coifman and D. Donoho. Translation-invariant de-noising. In Wavelets and Statistics, Springer Lecture Notes in Statistics, volume 103, pages 125–150, New York, 1994. Springer-Verlag.
[7] I. Daubechies. Orthogonal bases of compactly supported wavelets. Comm. Pure and Applied Math., 41:909–996, 1988.

[8] I. Daubechies. Ten Lectures on Wavelets, volume 61 of CBMS Conference Series in Applied Mathematics. SIAM, Philadelphia, 1992.
[9] I. Daubechies, B. Han, A. Ron, and Z. Shen. Framelets: MRA-based constructions of wavelet frames. Applied and Computational Harmonic Analysis, 14:1–46, 2003.
[10] D. Donoho and I. Johnstone. Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81:425–455, 1994.
[11] P. J. S. G. Ferreira. Mathematics of multimedia signal processing II: discrete finite frames and signal reconstruction. In Signal Processing for Multimedia, J. S. Byrnes (Ed.), pages 35–54. IOS Press, 1999.
[12] S. Masnou and J.-M. Morel. Level lines based disocclusion. In Int. Conference on Image Proc., pages 259–263, Chicago, IL, 1998.
[13] A. Ron and Z. Shen. Frames and stable bases for shift invariant subspaces of $L_2(\mathbb{R}^d)$. Canadian Journal of Mathematics, 47:1051–1094, 1995.
[14] A. Ron and Z. Shen. Affine systems in $L_2(\mathbb{R}^d)$: the analysis of the analysis operator. Journal of Functional Analysis, 148:408–447, 1997.

Fig. 1. (a) 256 × 256 “Peppers” image with text; (b) Output of Algorithm 2 with PSNR=34.83 dB. The CPU time is 162.08 seconds; and (c) Output of Chan-Shen Model with PSNR=32.91 dB. The CPU time is 1681.81 seconds.

Fig. 2. (a) 512 × 512 “Lena” image with the text; (b) Output of Algorithm 2 with PSNR=37.60 dB. The CPU time is 521.58 seconds; and (c) Output of Chan-Shen Model with PSNR=34.78 dB. The CPU time is 7862.97 seconds.

Fig. 3. (a) 512×512 “Lena” image with another text; (b) Output of Algorithm 2 with PSNR=36.40 dB. The CPU time is 509.08 seconds; and (c) Output of Chan-Shen Model with PSNR=34.38 dB. The CPU time is 7883.61 seconds.

Fig. 4. (a) A 128 × 128 portion of “Peppers” image with text. Inpainted results (b) from Algorithm 2; and (c) from Chan-Shen Model.