Image Interpolation via Low-rank Matrix Completion and Recovery

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCSVT.2014.2372351, IEEE Transactions on Circuits and Systems for Video Technology IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. XX, NO. XX, XXXX 201X


Feilong Cao, Miaomiao Cai, and Yuanpeng Tan

Abstract—Methods of achieving image super-resolution have been the object of research for some time. These approaches suggest that when a low-resolution image is directly downsampled from its corresponding high-resolution image without blurring, i.e., the blurring kernel is the Dirac delta function, the reconstruction becomes an image-interpolation problem. Hence, exploring the linear relationship among neighboring pixels is a pervasive way to reconstruct a high-resolution image from a low-resolution input image. This paper seeks an efficient method to determine the local order of the linear model implicitly. According to the theory of low-rank matrix completion and recovery, a method for performing single-image super-resolution is proposed by formulating the reconstruction as the recovery of a low-rank matrix, which can be solved by the augmented Lagrange multiplier method. In addition, the proposed method can be used to handle noisy data and random perturbations robustly. Experimental results show that the proposed method is effective and competitive compared with other methods.

Index Terms—Image interpolation, super-resolution, reconstruction, low-rank matrix recovery, augmented Lagrange multiplier.

F. L. Cao, M. M. Cai, and Y. P. Tan are with the Department of Information and Mathematics Sciences, China Jiliang University, Hangzhou 310018, Zhejiang Province, China. Corresponding author: F. L. Cao, e-mail: [email protected]. Manuscript received xxx xx, 2013; revised xxxx xx, 201x. Copyright (c) 2014 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to [email protected].

I. INTRODUCTION

Image super-resolution (SR) technology is always desirable in visual information processing to obtain more detail in an image. It aims to reconstruct a high-resolution (HR) image from one or more low-resolution (LR) images [1-2]. This task can essentially be converted into an inverse problem of the image degradation process. However, the SR problem is inherently ill-posed because many HR images can generate the same LR image by downsampling. Therefore, prior knowledge and fundamental assumptions are necessary to obtain high-quality HR images from LR ones. This prior knowledge and these assumptions with respect to image capture, such as the type of motion, blurring, and quality, usually come from common sense or statistical laws [3-4].

In recent years, various learning-based methods have been proposed to stabilize the ill-posed SR problem [5-16]. For instance, an example-based method was proposed in [5], which assumes that lost high-frequency details in an LR image can be learned from trained LR and HR patch pairs. In other words, the HR image can be obtained by learning the co-occurrence relationship between these training patch pairs. However, with unsuitable training samples, example-based SR methods may produce obvious artifacts and unwanted noise in the synthesized image [6]. In [7], Chang et al. proposed another learning-based SR method which used the principle of locally linear embedding (LLE) (see [17]) from manifold learning. In their method, it was assumed that small image patch pairs in training images have the same local geometry [8-9], and therefore this method can reduce the scale of the training set. However, for a one-to-many mapping from an LR to an HR image, the manifold assumption is not always true. Recently, the SR problem has been extended and further developed as a sparse coding scheme, which has attracted increasing interest [18-23]. In [10], Yang et al. proposed a sparse representation SR method which can choose the most relevant reconstruction neighbors adaptively and thus avoid over- or under-fitting. However, performance is still not good enough when textures are not contained in the training database [24].

In practice, in some cases, one cannot obtain more than one LR image for the same scene, such as recovery of old photos, handwriting authentication, and restoration of calligraphy and paintings. Therefore, the single-image super-resolution (SISR) problem is practical and valuable.
However, it is also difficult to make use of the limited information in one LR image. This paper focuses on the SISR problem. Often the observed LR image Y is a blurred and downsampled version of the HR image X [25]. This paper will mainly consider

1051-8215 (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


the situation where Y is directly downsampled from the original image X. In this case, the SISR problem turns into an image-interpolation problem [26], which mainly uses either parametric or nonparametric methods to upscale the LR image. Various interpolation algorithms have been used for SISR, such as the classical bilinear [27], bicubic [28-29], and edge-guided interpolation methods [30-33]. Among these methods, reconstructing the HR image by means of linear relationships among neighboring pixels is a pervasive approach. Nevertheless, almost no traditional interpolation methods can fully accommodate correlations in image edge pixels, and therefore these methods may result in some ringing artifacts and blurring at the edges of the reconstructed HR image. Moreover, because the linear correlations are fixed and predefined in these methods, they cannot sufficiently model the textures in natural images [24]. Fu et al. [24] proposed a new method to solve the SISR problem based on the recently developed technique of low-rank matrix completion. In [24], the linear relationship among neighboring pixels, and hence the order of the linear model, was determined implicitly and adaptively by exploring the low-rank properties of the augmented matrix. The augmented matrix is defined in (7) below. The low rank of the augmented matrix is due to the local structural similarity of the images. In other words, the center pixels can be sufficiently represented by the 8-connected neighboring pixels or a subset of the 8-connected neighboring pixels. However, due to the presence of noise and random perturbations, some entries in the augmented matrix are corrupted. This paper therefore investigates the SISR problem under this condition by using the recently developed low-rank matrix recovery theory [34-44].
Low-rank matrix recovery theory is a new signal-processing method which was proposed in the framework of compressed sensing theory [34-36]. Many recent works such as [37-38] have shown that it is indeed possible to recover a low-rank matrix efficiently and exactly by convex optimization methods when only a small number of samples or incomplete samples are available. Here, the SISR problem is recast as that of completing and recovering a low-rank augmented matrix (MCR) in the presence of random perturbations and noise (see [45]). This problem can be expressed as a rank minimization problem, which can be solved by the augmented Lagrange multiplier (ALM) method [46]. Experimental results show that the proposed method can be used to improve the effect of reconstruction both visually


and numerically and can outperform other traditional interpolation-based SISR methods. Furthermore, the proposed method can be efficiently used for noisy data.

The remainder of this paper is organized as follows. Section II first provides a brief description of the SISR problem by means of the low-rank matrix completion and recovery model. Then an efficient and scalable convex optimization algorithm is presented to recover the augmented matrix accurately. Experimental results are shown in Section III, and Section IV concludes the paper.

II. SISR THROUGH LOW-RANK MATRIX COMPLETION AND RECOVERY

In this section, SISR reconstruction will be formulated as a rank minimization problem. By solving the rank minimization problem, the HR image can be recovered. The recovery process is illustrated in Fig. 1. As in [11], a given color image is first transformed from RGB color space to YCbCr color space. The proposed method is applied to the Y channel only. The color channels (Cb, Cr) are upsampled using the bicubic interpolation method. In the Y channel, the proposed low-rank matrix recovery method is used to estimate the missing pixels in the local window W. Compatibility is enforced between adjacent local windows by processing the local windows in raster-scan order in the image, namely from left to right and from top to bottom.

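Fig. 1. Reconstruction process. [Diagram: the input image is split into the Y, Cb, and Cr channels; Cb and Cr are upsampled by Bicubic interpolation; in the Y channel the HR pixels of each local window W are estimated by MCR; the three channels are combined into the output.]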
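The raster-scan traversal of local windows described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `window_origins` is hypothetical, the defaults (5 × 5 windows with a 1-pixel overlap) mirror the settings reported in the experimental section, and the choices of a stride of `n - overlap` pixels and of skipping windows that would cross the image border are assumptions.

```python
def window_origins(height, width, n=5, overlap=1):
    """Top-left corners of n x n windows, listed in raster-scan order.

    Assumptions (not stated in the paper): consecutive windows are shifted
    by n - overlap pixels, and windows that would cross the image border
    are skipped.
    """
    if not 0 <= overlap < n:
        raise ValueError("overlap must satisfy 0 <= overlap < n")
    step = n - overlap
    origins = []
    for top in range(0, height - n + 1, step):       # top to bottom
        for left in range(0, width - n + 1, step):   # left to right
            origins.append((top, left))
    return origins


# Windows covering a 9 x 9 image with the default settings.
print(window_origins(9, 9))
```

Because each window shares a row or column of pixels with the window processed before it, values estimated earlier are available as known entries when the next window is handled.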

In the following discussion, some notations necessary for the description of the proposed method will first be introduced. Let Y be an input LR image which is a downsampled version of the HR image by a downsampling factor, and let X be the HR image to be estimated from Y . Let xi ∈ X and yi ∈ Y denote the pixels


of X and Y respectively. The neighbors of $x_i$ in X and $y_i$ in Y can be written as $x_i^t$ and $y_i^t$ respectively, where t = 1, 2, ..., 8. As is well known, an LR image is usually modeled as a downsampled version of its corresponding HR counterpart after blurring. This paper mainly considers the situation where the LR image is directly downsampled from its corresponding HR image. Thus the SISR problem becomes an image-interpolation problem. Then, for the pixels in the LR image Y, $y_i \in Y$ implies $y_i \in X$. One can also write an HR pixel $x_i$ as $y_i$ when it is in the LR image.

Fig. 3. 8-connected neighbors of pixels. [Diagram: a pixel $x_i$ surrounded by its eight neighbors $x_i^1, \ldots, x_i^8$, and a pixel $x_j$ surrounded by $x_j^1, \ldots, x_j^8$.]

Similar to [30], the proposed method interpolates the missing pixels in X in two phases. A schematic diagram of the proposed method is shown in Fig. 2, in which there are three kinds of pixels: solid dots, shaded dots, and empty dots. The solid dots are the known LR pixels, and the shaded and empty dots are the missing pixels. To provide enough information to estimate the missing pixels, interpolation is done in two phases. In the first phase, the bilinear interpolation method is first used to obtain initial estimates of the empty dots. Then the solid dots and the empty dots are combined to recover the shaded dots using low-rank matrix recovery theory. In the second phase, the final values of the empty dots are revised using low-rank matrix recovery theory.

Fig. 2. The solid dots are the LR image pixels, the shaded dots are the missing pixels to be estimated in the first phase, and the empty dots are the missing pixels to be estimated in the second phase.

The relationship among neighboring pixels is an important piece of information for estimating missing pixels. The concept of 8-connected neighbors of pixels is illustrated in Fig. 3, which also shows the spatial configuration of known and missing pixels involved in the two phases. For a missing pixel $x_i \in X$, some of its 8-connected neighbors are known LR pixels. In contrast, for a pixel $x_i \in Y$, some of its 8-connected neighbors are missing pixels in X. A local window W is defined as an n × n image patch [30-32], and each $x_i \in W$ can be sufficiently expressed as a linear combination of its 8-connected neighboring pixels $x_i^t$ (t = 1, 2, ..., 8), namely,

$$x_i = \sum_{t=1}^{8} x_i^t \alpha_t, \quad x_i \in W, \tag{1}$$

where $\alpha_t$ (t = 1, 2, ..., 8) are the linear representation coefficients.

Fig. 4. Pixels in windows. [Diagram: a local window W whose interior pixels $x_1, \ldots, x_9$ are surrounded by the border pixels $w_1, \ldots, w_{16}$.]
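To make the 8-connected neighborhood behind Eq. (1) concrete, the sketch below gathers, for every interior pixel of a window, its eight neighbors and evaluates the linear combination of Eq. (1). It is an illustrative sketch under stated assumptions: the helper names are hypothetical, the window is stored as a list of rows, and both the neighbor ordering (row by row, from top-left to bottom-right) and the clockwise labeling $w_1, \ldots, w_{16}$ of the border pixels are inferred from the rows listed in Eq. (3).

```python
# Offsets (row, column) of the 8-connected neighbors, listed row by row:
# top-left, top, top-right, left, right, bottom-left, bottom, bottom-right.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                    (0, -1),           (0, 1),
                    (1, -1),  (1, 0),  (1, 1)]


def interior_and_neighbors(window):
    """Return the interior pixels of a square window and, for each of them,
    the list of its eight neighbors (one row per interior pixel)."""
    size = len(window)
    centers, neighbor_rows = [], []
    for r in range(1, size - 1):
        for c in range(1, size - 1):
            centers.append(window[r][c])
            neighbor_rows.append(
                [window[r + dr][c + dc] for dr, dc in NEIGHBOR_OFFSETS])
    return centers, neighbor_rows


def linear_combination(neighbors, alpha):
    """Right-hand side of Eq. (1): sum over t of x_i^t * alpha_t."""
    return sum(x * a for x, a in zip(neighbors, alpha))


# Symbolic 5 x 5 window (assumed layout): interior x1..x9, border w1..w16.
labels = [["w1",  "w2",  "w3",  "w4",  "w5"],
          ["w16", "x1",  "x2",  "x3",  "w6"],
          ["w15", "x4",  "x5",  "x6",  "w7"],
          ["w14", "x7",  "x8",  "x9",  "w8"],
          ["w13", "w12", "w11", "w10", "w9"]]
centers, rows = interior_and_neighbors(labels)
print(centers[0], rows[0])
```

With numeric pixel values in place of the labels, `linear_combination(rows[k], alpha)` gives the prediction of the k-th interior pixel for a coefficient vector `alpha` of length 8.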

In SISR reconstruction, the structural similarity between the LR and HR images is always direct prior knowledge for high-quality image reconstruction. Structural similarity means that all $x_i \in W$ in the HR image share the same linear representation coefficient vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_8)^\top$. Next, we denote the n-dimensional Euclidean space by $\mathbb{R}^n$; in particular, $\mathbb{R}^1$ reduces to the set of real numbers $\mathbb{R}$. Clearly, Eq. (1) can be rewritten as

$$x_W = D_x \alpha, \tag{2}$$

where $x_W = (x_1, x_2, \ldots, x_n)^\top \in \mathbb{R}^{n \times 1}$ contains all the pixels in the local window W, and the rows of $D_x \in \mathbb{R}^{n \times 8}$ are the 8-connected neighbors of these pixels in W. For example, for the W in Fig. 4, $x_W$ and the matrix $D_x$ in (2) can be written as $x_W = (x_1, x_2, x_3, \ldots, x_7, x_8, x_9)^\top$ and

$$D_x = \begin{pmatrix}
w_1 & w_2 & w_3 & w_{16} & x_2 & w_{15} & x_4 & x_5 \\
w_2 & w_3 & w_4 & x_1 & x_3 & x_4 & x_5 & x_6 \\
w_3 & w_4 & w_5 & x_2 & w_6 & x_5 & x_6 & w_7 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
x_4 & x_5 & x_6 & x_7 & x_9 & w_{12} & w_{11} & w_{10} \\
x_5 & x_6 & w_7 & x_8 & w_8 & w_{11} & w_{10} & w_9
\end{pmatrix} \tag{3}$$

respectively, where row k (k = 1, 2, ..., 9) corresponds to the 8-connected neighbors of $x_k$.

Due to the assumption in this paper that the LR image is directly downsampled from the HR image, the pixel $y_i$ in the LR image is also present in the HR image (see Fig. 5). Based on the structural similarity between the LR and HR images, $y_i$ can also be represented as a linear combination of its 8-connected neighbors in the LR image with the same coefficient vector α:

$$y_i = \sum_{t=1}^{8} y_i^t \alpha_t, \quad y_i \in W. \tag{4}$$

For convenience, (4) can be rewritten as

$$y_W = D_y \alpha, \tag{5}$$

where $y_W \in \mathbb{R}^{m \times 1}$ contains all the LR pixels in the local window W, and the rows of $D_y \in \mathbb{R}^{m \times 8}$ are their 8-connected neighbors in the LR image. According to (2) and (5), the following system of linear equations can be obtained:

$$\begin{cases} x_W = D_x \alpha; \\ y_W = D_y \alpha. \end{cases} \tag{6}$$

Afterward, the augmented matrix of the system of linear equations (6) has the following form:

$$B = \begin{pmatrix} x_W & D_x \\ y_W & D_y \end{pmatrix}. \tag{7}$$

Fig. 5. Local window W and the correspondence relation between LR pixels and HR pixels. [Diagram: an HR pixel $x_i$ with neighbors $x_i^1, \ldots, x_i^8$ inside W in the HR image, and the LR pixel $y_i$ with neighbors $y_i^1, \ldots, y_i^8$ in the LR image.]

Fu et al. [24] explored the low-rank property of the augmented matrix B. From the applications viewpoint, Eq. (7) has many solutions. Actually, the center pixels can be sufficiently represented by the 8-connected neighboring pixels or a subset of the 8-connected neighboring pixels. For example, 4-connected neighboring pixels are explored in [30] and [31]. These observations indicate that the augmented matrix has low rank and that some entries are missing [24]. This property of the augmented matrix provides an alternative way to obtain the HR image by filling in the missing entries of B. In [24], Fu et al. accomplished recovery of missing entries using the theory of matrix completion (MC) [41-43], in other words, finding the target matrix A such that

$$\text{(MC)} \quad \min_{A} \ \operatorname{rank}(A), \quad \text{s.t.} \ A_{ij} = B_{ij}, \ \forall (i, j) \in \Omega, \tag{8}$$

where Ω is the set of indices of known samples in B. Because the rank is a nonconvex function, (8) is an NP-hard problem. To overcome this difficulty, Candès et al. [41] used nuclear norm minimization to complete the low-rank matrix. Liu et al. [44] extended the matrix case to the tensor case based on a novel definition of the tensor trace norm and proposed a new algorithm for missing data completion. Here, (8) is relaxed to the nuclear norm minimization problem

$$\text{(MC)} \quad \min_{A} \ \|A\|_*, \quad \text{s.t.} \ A_{ij} = B_{ij}, \ \forall (i, j) \in \Omega, \tag{9}$$

where $\|A\|_*$ is the sum of the singular values of matrix A.

However, the low-rank structure of matrix B described above is seldom observed due to the presence of noise in real images. Usually, in practical applications, the pixel $x_i$ in W can only be approximately linearly represented by its 8-connected neighbors. This can be formulated as

$$x_i = \sum_{t=1}^{8} x_i^t \alpha_t + v_i, \quad x_i \in W, \tag{10}$$

where $v_i$ is a random perturbation, independent of spatial location i and the image signal, which accounts for both the fractal-like fine details of the image signal and for measurement noise [30]. Then the augmented matrix B is generated by corrupting some of the entries of a low-rank matrix A. This optimization problem can be written as follows [42]:

$$\min_{A, V} \ \|A\|_* + \lambda \|V\|_1, \quad \text{s.t.} \ B = A + V, \tag{11}$$


where V accounts for corruption by noise and perturbation. On the one hand, because some of the entries in matrix B are missing, it can be assumed that these entries do not change. That is, the entries of V outside Ω are zeros. On the other hand, it is known from (10) that only a small fraction of the known entries in B exhibit random perturbation. Hence, most entries in V are zeros, i.e., V is a sparse matrix. Thus, recovery of B becomes the problem of recovering a low-rank matrix with both missing entries and unknown corrupted entries. It can be formulated as a matrix-completion and error-correction problem [45]:

$$\text{(MCR)} \quad \min_{A, V} \ \|A\|_* + \lambda \|V\|_1, \quad \text{s.t.} \ \pi_\Omega B = \pi_\Omega (A + V), \tag{12}$$

where $\|\cdot\|_1$ represents the $\ell_1$ norm, and $\pi_\Omega$ is a linear operator which keeps the entries in Ω unchanged and sets those outside Ω to zero:

$$(\pi_\Omega B)_{ij} = \begin{cases} B_{ij}, & \text{if } (i, j) \in \Omega; \\ 0, & \text{else.} \end{cases}$$

Therefore, the optimization (12) models the following problem: it is expected to recover B, but only a few entries of B can be seen, and among these, a certain number are corrupted [34]. Once A has been found in model (12), not only have the missing pixels been obtained, but also the order of the linear model has been implicitly determined.

Next, the augmented Lagrange multiplier (ALM) method [46] is used to solve the model (12). For (12), the augmented Lagrangian function is given by

$$L(A, V, Y, \mu) = \|A\|_* + \lambda \|V\|_1 + \langle Y, \pi_\Omega (B - A - V) \rangle + \frac{\mu}{2} \|\pi_\Omega (B - A - V)\|_F^2, \tag{13}$$
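As an illustration of the sampling operator $\pi_\Omega$ and of the two constraint-related terms of the augmented Lagrangian (13), a minimal sketch follows. The function names are hypothetical, matrices are stored as lists of rows, and Ω is represented as a set of (i, j) index pairs; these representation choices are assumptions made for illustration only. The nuclear norm and $\ell_1$ terms of (13) are deliberately left out.

```python
def project_onto_omega(matrix, omega):
    """pi_Omega: keep the entries indexed by omega, set all others to zero."""
    return [[value if (i, j) in omega else 0.0
             for j, value in enumerate(row)]
            for i, row in enumerate(matrix)]


def constraint_residual(B, A, V, omega):
    """pi_Omega(B - A - V), the residual of the constraint in (12)."""
    difference = [[b - a - v for b, a, v in zip(row_b, row_a, row_v)]
                  for row_b, row_a, row_v in zip(B, A, V)]
    return project_onto_omega(difference, omega)


def multiplier_terms(B, A, V, Y, omega, mu):
    """<Y, R> + (mu / 2) * ||R||_F^2 with R = pi_Omega(B - A - V)."""
    R = constraint_residual(B, A, V, omega)
    inner = sum(y * r for row_y, row_r in zip(Y, R)
                for y, r in zip(row_y, row_r))
    frobenius_squared = sum(r * r for row_r in R for r in row_r)
    return inner + 0.5 * mu * frobenius_squared


B = [[1.0, 2.0], [3.0, 4.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
omega = {(0, 0), (1, 1)}          # only the diagonal entries are observed
print(project_onto_omega(B, omega))
print(multiplier_terms(B, zeros, zeros, zeros, omega, 2.0))
```

Both terms vanish exactly when A + V agrees with B on the observed entries, which is the constraint of (12).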

where $Y \in \mathbb{R}^{m \times n}$ is a Lagrange multiplier matrix, µ is a positive constant, $\|\cdot\|_F$ denotes the Frobenius norm, and $\langle \cdot, \cdot \rangle$ denotes the matrix inner product. The ALM algorithm iteratively estimates both the Lagrange multiplier and the optimal solution. The basic ALM iteration is given by

$$\begin{aligned}
(A_{k+1}, V_{k+1}) &= \arg\min_{A, V} L(A, V, Y_k, \mu_k), \\
Y_{k+1} &= Y_k + \mu_k \pi_\Omega (B - A_{k+1} - V_{k+1}), \\
\mu_{k+1} &= \rho \mu_k.
\end{aligned} \tag{14}$$

Because it is difficult to minimize with respect to both A and V simultaneously, an alternating minimization method is used. The first step is to fix $A = A_k$ and update $V_{k+1}$:

$$V_{k+1} = \arg\min_{V} \ \lambda \|V\|_1 - \langle Y_k, \pi_\Omega (V) \rangle + \frac{\mu_k}{2} \|\pi_\Omega (B - A_k - V)\|_F^2. \tag{15}$$

Then (15) has a closed-form solution:

$$V_{k+1} = \operatorname{shrink}\left( \pi_\Omega (B) + \frac{1}{\mu_k} Y_k - \pi_\Omega (A_k), \ \frac{\lambda}{\mu_k} \right), \tag{16}$$

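where $\operatorname{shrink}(x, \alpha) = \operatorname{sign}(x) \cdot \max\{|x| - \alpha, 0\}$ is a soft-thresholding operator. The second step is to fix $V = V_{k+1}$ and solve for $A_{k+1}$:

$$\begin{aligned}
(U, \Sigma, T) &= \operatorname{svd}\left( \frac{1}{\mu_k} Y_k + \pi_\Omega (B) - V_{k+1} + \pi_{\Omega^c} (Z) \right), \\
A_{k+1} &= U \operatorname{shrink}\left( \Sigma, \frac{1}{\mu_k} \right) T^\top,
\end{aligned} \tag{17}$$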
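The soft-thresholding operator and the closed-form update (16) can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, matrices are stored as lists of rows, and Ω is assumed to be a set of (i, j) index pairs. The update applies (16) entry by entry exactly as written.

```python
def shrink(x, alpha):
    """Soft-thresholding: sign(x) * max(|x| - alpha, 0)."""
    magnitude = max(abs(x) - alpha, 0.0)
    return magnitude if x >= 0 else -magnitude


def update_v(B, A_k, Y_k, omega, lam, mu_k):
    """Entry-wise form of Eq. (16):
    V = shrink(pi_Omega(B) + Y_k / mu_k - pi_Omega(A_k), lam / mu_k)."""
    V = []
    for i, row in enumerate(B):
        new_row = []
        for j, b in enumerate(row):
            observed = (i, j) in omega
            b_ij = b if observed else 0.0           # pi_Omega(B)
            a_ij = A_k[i][j] if observed else 0.0   # pi_Omega(A_k)
            new_row.append(shrink(b_ij + Y_k[i][j] / mu_k - a_ij,
                                  lam / mu_k))
        V.append(new_row)
    return V


# One observed entry with a large residual, one unobserved entry.
B = [[5.0, 0.0]]
A_k = [[1.0, 0.0]]
Y_k = [[0.0, 0.0]]
print(update_v(B, A_k, Y_k, {(0, 0)}, lam=1.0, mu_k=2.0))
```

The same `shrink` function, applied to each singular value with threshold $1/\mu_k$, gives the thresholding of Σ that appears in (17); residuals smaller than the threshold are set to zero, which is what keeps V sparse.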

where svd(·) represents the singular-value decomposition operator. The whole idea can be summarized as the following Algorithm 1.

Algorithm 1: Matrix Completion and Recovery through ALM.
Input: the augmented matrix B, Ω, and λ > 0.
Output: (A, V) = (A_{k+1}, V_{k+1}).
Step 1. Initialize A_1 = 0, V_1 = 0, Y_1 = 0.
Step 2. While not converged (k = 1, 2, ...) do
    while not converged do
        // solve V_{k+1} = arg min_V L(A_k, V, Y_k, µ_k)
        V_{k+1} = shrink(π_Ω(B) + (1/µ_k) Y_k − π_Ω(A_k), λ/µ_k);
        t = 1; Z = A_k;
        // solve A_{k+1} = arg min_A L(A, V_{k+1}, Y_k, µ_k)
        while not converged do
            (U, Σ, T) = svd((1/µ_k) Y_k + π_Ω(B) − V_{k+1} + π_{Ω^c}(Z));
            A_pre = A_{k+1}; update A_{k+1} by A_{k+1} = U shrink(Σ, 1/µ_k) T^⊤;
            t_pre = t; t = 0.5 (1 + sqrt(1 + 4 t_pre^2)); Z = A_{k+1};
        end while
    end while
    // update Y_{k+1}
    Y_{k+1} = Y_k + µ_k π_Ω(B − A_{k+1} − V_{k+1}); µ_{k+1} = ρ · µ_k;
end while
End.

In the ALM iteration (14), {µ_k} is a monotonically increasing positive sequence and ρ > 1.


Fig. 6. Test images used in this paper. From left to right and from top to bottom: Butterfly, Elephant, Lena, Peppers, Cameraman, Donna, Elaine, Digit.

III. EXPERIMENTAL RESULTS

This section describes a series of tests performed to illustrate the effectiveness of the proposed algorithm. All the experiments described in this paper were carried out in a MATLAB 2010a environment running on a Pentium dual-core E6700 processor with a speed of 3.2 GHz. All test images are shown in Fig. 6 and are available in [47]. The LR images were sampled from the original HR images by extracting the pixels in odd rows and odd columns. The downsampling factors were 2 and 4, giving test images of size 128 × 128 and 64 × 64, respectively. The goal was to reconstruct the HR images from these LR images, especially in the presence of noise. In these experiments, the window W in the HR image was set to 5 × 5 with a 1-pixel overlap between adjacent windows. The LR images were first upsampled using the bilinear interpolation method to obtain the initial HR image.

In these experiments, the peak signal-to-noise ratio (PSNR) was used for quantitative evaluation of the reconstructed images. PSNR was calculated as follows [48]:

$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{255^2}{\mathrm{MSE}} \right), \qquad \mathrm{MSE} = \frac{1}{MN} \|X_1 - X_2\|_F^2, \tag{18}$$

where $X_1$ and $X_2$ represent the ideal high-resolution image and the reconstructed high-resolution image respectively, M × N is the size of the high-resolution image, and $\|\cdot\|_F$ is the Frobenius norm.

In addition, comprehensive experiments were carried out to compare image-recovery effectiveness among the proposed method (MCR), the bicubic interpolation method (Bicubic) [28], the bilinear interpolation method (Bilinear) [27], the wavelet-based method (Wavelet) [49], the improved new edge-directed interpolation approach (Inedi) [50], the method described in [16] (Sisr), and the method described in [24] (MC). In these experiments, the input LR images of Lena, Butterfly, Peppers, Digit, Elephant, and Donna were magnified by a factor of 2 and those of Cameraman and Elaine by a factor of 4. Furthermore, because the proposed method was presented to double the horizontal and vertical resolutions, it can be readily generalized to a scaling factor $Z = 2^k$ with a positive integer k. If the scaling factor Z is not an integer power of two, the input LR image can first be expanded by the proposed method by a factor of $2^k$, where $2^k < Z < 2^{k+1}$. Then a traditional interpolation method, for instance the Bicubic or Bilinear interpolation method, should be used to scale up the output image by a factor s such that $2^k s = Z$.

Fig. 7 shows the reconstruction effects using seven different methods with magnification factor 2. From the local details in the figures, it is clear that all methods have similar reconstruction effects in smooth regions. However, near the edges and in regions of irregular structure, the results generated by the Bicubic, Wavelet, Inedi, Sisr, and Bilinear interpolations are blurry, whereas the result from MC is somewhat clearer but still blurry. In any


Fig. 7. Reconstruction results with a magnification factor of 2 by different methods. [Panels: Input, Bicubic, Bilinear, Wavelet, Inedi, Sisr, MC, MCR.]

Fig. 8. Local details of Fig. 7. [Panels: Input, Bicubic, Bilinear, Wavelet, Inedi, Sisr, MC, MCR.]

TABLE I
PSNRs OF RECONSTRUCTION WITH A MAGNIFICATION FACTOR OF 2.

Image      Bicubic  Bilinear Wavelet  Inedi    Sisr     MC       MCR
Lena       29.6485  29.8218  29.8138  29.7535  28.9940  30.4605  30.7003
Peppers    28.4772  28.5569  28.6194  28.4843  28.2687  29.1139  29.5367
Butterfly  28.2975  28.3226  28.3156  28.2335  28.0684  28.8347  29.1177
Digit      17.9016  17.9520  17.9520  17.6200  17.4145  18.2717  18.3166
Cameraman  23.7067  23.9119  23.7940  23.6704  23.4370  24.3253  24.9376
Donna      30.0561  30.2175  30.2285  29.9667  29.6209  30.6547  31.1214
Elaine     27.4627  27.7225  27.6632  27.5595  27.0207  27.9931  28.6702
Elephant   28.8538  28.8703  28.8984  28.8455  28.5433  29.4426  29.6381
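PSNR values such as those in Table I follow from Eq. (18). A minimal sketch of that computation is given below; the function name `psnr` is hypothetical, images are assumed to be stored as lists of rows of 8-bit gray values, and returning infinity for identical images is a convention chosen here rather than something stated in the paper.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """PSNR of Eq. (18): 10 * log10(peak^2 / MSE), where
    MSE = ||X1 - X2||_F^2 / (M * N)."""
    rows, cols = len(reference), len(reference[0])
    squared_error = sum((a - b) ** 2
                        for row_a, row_b in zip(reference, reconstructed)
                        for a, b in zip(row_a, row_b))
    mse = squared_error / (rows * cols)
    if mse == 0:
        return math.inf   # identical images (convention assumed here)
    return 10.0 * math.log10(peak ** 2 / mse)


ideal = [[10, 10], [10, 10]]
estimate = [[10, 12], [8, 10]]
print(psnr(ideal, estimate))
```

A higher PSNR means a smaller mean squared error with respect to the ideal HR image, which is why larger entries in the tables indicate better reconstructions.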


Fig. 9. Reconstruction results with a magnification factor of 4. [Panels: Input, Bicubic, Bilinear, Wavelet, Inedi, Sisr, MC, MCR.]

case, the proposed method can generate clearer, more detailed HR images. The PSNR values of the test images are reported in Table I. It can be noted that the Bicubic interpolation method and the Inedi method always give the lowest values. Although the Bilinear, Wavelet, Sisr, and MC [24] methods achieved better results than the Bicubic interpolation, their performance is obviously inferior to that of the MCR method described in this paper. In terms of PSNR values, the proposed MCR method achieved much better performance.

Fig. 9 shows the recovery effects of applying different methods to achieve SISR with magnification factor 4. The PSNRs are displayed in Table II. From Fig. 9 and Table II, it is clear that the proposed method (MCR) is superior to the others.

To test the robustness of the proposed algorithm to noise, different levels of salt-and-pepper and Gaussian noise were added to the LR input images. The density of salt-and-pepper noise ranged from 0 to 0.004, and the standard deviation of Gaussian noise ranged from 0 to 18. Fig. 10 and Fig. 11 show the recovery effects of the proposed method with different levels of salt-and-pepper and Gaussian noise, respectively. Fig. 12 shows the local detail of Fig. 11. For comparison, the recovery effects of the other six methods are also shown. It is easy to see that the results from the Bicubic, Wavelet, Inedi, Sisr, and Bilinear interpolation methods are both noisy and blurry. The MC method is good at preserving edges, but it fails to distinguish information from noise and may generate unwanted noisy results. In contrast, the proposed method is capable of performing denoising and SR simultaneously. Table III and Table IV show the PSNRs of the reconstructed images with different densities of salt-and-pepper noise and different levels of Gaussian noise. In terms of PSNRs, the proposed method outperforms the other methods in all cases. Moreover, from Table III and Table IV it can be seen that when the level of noise is high enough, the performance of the MC method is inferior to that of the Bilinear interpolation method for both salt-and-pepper and Gaussian noise.

IV. CONCLUSIONS

In this paper, a new computational framework for SISR has been proposed which is robust to noisy data. The proposed method aims to explore the local linear relationship among neighboring pixels. Unlike previous interpolation-based SISR methods which use fixed-order linear models, the proposed method can implicitly determine the optimum order of the linear model. By considering the low-rank property of the augmented matrix, the super-resolution problem has been reformulated as the recovery of a low-rank matrix from missing and corrupted observations, which can be solved efficiently using the ALM method. Experimental results have demonstrated that the proposed method can achieve better recovery effects than the other methods in terms of


TABLE II
PSNRS (dB) OF RECONSTRUCTION WITH A MAGNIFICATION FACTOR OF 4

Image       Bicubic   Bilinear  Wavelet   Inedi     Sisr      MC        MCR
Cameraman   19.2428   19.6882   19.7528   19.4416   19.2457   19.8068   20.2775
Elaine      22.3066   22.6918   22.6116   22.3276   22.1242   23.0011   23.5602
Lena        24.2549   24.6955   24.6457   24.1178   24.0449   24.9043   25.2107
Elephant    22.2415   22.6056   22.5874   22.3172   22.1615   22.8841   23.1045
Donna       24.0068   24.4234   24.4067   23.9826   23.8969   24.7078   25.0397
Peppers     20.7278   21.0627   21.0571   20.9516   20.1976   21.3391   21.5978
Digit       14.7361   15.0356   15.2766   14.9890   14.6524   15.5698   15.8870
Butterfly   21.5970   22.0753   22.0771   21.8786   21.2039   22.2011   22.6139
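For readers who wish to reproduce figures of this kind, the following is a minimal sketch of a PSNR computation. It assumes 8-bit grayscale images with a peak value of 255 and the standard mean-squared-error definition; the function name `psnr` and the `peak` parameter are illustrative choices, and the exact evaluation protocol behind the table (for example, border handling) is not restated here and may differ.

```python
import numpy as np


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images.

    Assumption: `peak` is the maximum possible pixel value (255 for 8-bit data).
    """
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    if ref.shape != est.shape:
        raise ValueError("images must have the same shape")
    # Mean squared error over all pixels.
    mse = np.mean((ref - est) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```

A typical call would be `psnr(hr_image, reconstructed_image)`, where both arguments are arrays of the same shape; higher values indicate a reconstruction closer to the reference.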

[Figure 10: image grid. Rows: salt-and-pepper noise densities 0, 0.001, and 0.002. Columns: Bicubic, Bilinear, Wavelet, Inedi, Sisr, MC, and MCR.]

Fig. 10. Reconstruction results for Donna by different methods. The ordinate is the density of salt-and-pepper noise.

[Figure 11: image grid. Rows: Gaussian noise levels 0, 8, and 16. Columns: Bicubic, Bilinear, Wavelet, Inedi, Sisr, MC, and MCR.]

Fig. 11. Reconstruction results for Elephant. The ordinate is the level of Gaussian noise.


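[Figure 12: image grid of enlarged regions. Rows: Gaussian noise levels 0, 8, and 16. Columns: Bicubic, Bilinear, Wavelet, Inedi, Sisr, MC, and MCR.]

Fig. 12. Local details of Fig. 11.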

TABLE III
PSNRS (dB) OF DONNA AND ELEPHANT WITH SALT-AND-PEPPER NOISE

Image: Donna
Noise density   0         0.001     0.002     0.003     0.004
Bicubic         30.0561   29.4140   28.5507   27.5855   27.5540
Bilinear        30.2175   29.8039   29.2353   28.4947   28.5035
Wavelet         30.2175   29.8303   29.2039   28.3970   28.1506
Inedi           29.9667   29.7315   28.5066   27.9156   27.2116
Sisr            29.6209   29.5452   28.3232   27.8517   26.8318
MC              30.6547   29.3300   29.2352   28.3115   27.6407
MCR             31.1214   30.6055   29.6091   28.5379   28.5664

Image: Elephant
Noise density   0         0.001     0.002     0.003     0.004
Bicubic         28.8538   28.5074   28.3571   28.0448   28.0417
Bilinear        28.8703   28.5836   28.4927   28.2931   28.2778
Wavelet         28.8984   28.5010   28.1337   27.9217   27.6382
Inedi           28.8455   28.4906   28.0843   27.8526   27.4109
Sisr            28.5433   28.0967   27.8467   27.7327   27.4239
MC              29.4426   29.2840   29.0953   28.2039   28.1001
MCR             29.6381   29.4849   29.2155   28.8723   28.6377

TABLE IV
PSNRS (dB) OF DONNA AND ELEPHANT WITH GAUSSIAN NOISE

Image: Donna
Noise level   0         4         8         12        16        18
Bicubic       30.0561   29.4044   27.8842   27.4028   24.5785   23.7735
Bilinear      30.2175   29.7917   28.7500   27.4176   26.0993   25.3936
Wavelet       30.2175   29.6315   28.0149   27.1123   25.0989   24.2120
Inedi         29.9667   28.9547   27.4037   26.8894   24.0392   23.0573
Sisr          29.6209   28.5946   27.0012   26.1469   23.6490   22.7953
MC            30.6547   30.2160   28.6131   26.8218   25.1737   24.4297
MCR           31.1214   30.8074   29.9911   27.8304   26.5001   25.8811

Image: Elephant
Noise level   0         4         8         12        16        18
Bicubic       28.8538   28.6612   28.1744   27.5248   26.7508   26.3161
Bilinear      28.8703   28.7455   28.4172   27.9993   27.4331   27.1227
Wavelet       28.8784   28.6374   27.9026   26.9232   25.8199   25.3058
Inedi         28.8544   28.7711   28.4738   27.9362   27.2906   26.8990
Sisr          28.5433   28.3982   27.9011   26.6481   26.0344   25.4441
MC            29.4426   29.2597   28.7577   28.0366   27.2107   26.8658
MCR           29.6381   29.4504   28.9768   28.3236   27.5604   27.1579
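The two noise models used in Tables III and IV can be illustrated with a short sketch. It rests on assumptions that are not confirmed by the text: the salt-and-pepper "density" is taken to be the expected fraction of pixels replaced by 0 or 255, and the Gaussian "level" is taken to be the standard deviation in gray levels on a 0 to 255 scale. The function names `add_salt_and_pepper` and `add_gaussian` are hypothetical.

```python
import numpy as np


def add_salt_and_pepper(image, density, rng):
    """Replace roughly a `density` fraction of pixels with 0 or 255.

    Assumption: salt (255) and pepper (0) are equally likely.
    """
    noisy = np.array(image, dtype=np.uint8, copy=True)
    # Choose which pixels to corrupt, then which of those become salt.
    corrupt = rng.random(noisy.shape) < density
    salt = rng.random(noisy.shape) < 0.5
    noisy[corrupt & salt] = 255
    noisy[corrupt & ~salt] = 0
    return noisy


def add_gaussian(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`.

    Assumption: `sigma` is expressed in gray levels; the result is
    rounded and clipped back to the 8-bit range.
    """
    img = np.asarray(image, dtype=np.float64)
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)
```

For example, with `rng = np.random.default_rng(0)`, the calls `add_salt_and_pepper(lr_image, 0.002, rng)` and `add_gaussian(lr_image, 8, rng)` would produce inputs comparable in spirit to the table settings under the stated assumptions.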


PSNR. The biggest advantage of the proposed method is its ability to handle noisy data and random perturbations. The feasibility and effectiveness of the proposed method can also be demonstrated using real images with noisy data. However, existing interpolation-based methods still produce jagged and blurred edges in the recovered images. Therefore, in future work, the authors will explore feasible and effective prior knowledge to guide higher-quality HR image recovery.

ACKNOWLEDGMENT

The authors would like to thank the editors and reviewers for their many valuable suggestions and comments. This paper was supported by the National Natural Science Foundation of China (Nos. 91330118, 61272023).

Feilong Cao received the B.Sc. and M.Sc. degrees in Applied Mathematics from Ningxia University, China, in 1987 and 1998 respectively. In 2003, he received the Ph.D. degree in Applied Mathematics from Xi’an Jiaotong University, China. He was a Research Fellow with the Center of Basic Sciences, Xi’an Jiaotong University, China, from 2003 to 2004. From 2004 to 2006, he was a Post-Doctoral Research Fellow with the School of Aerospace, Xi’an Jiaotong University, China. From June 2011 to December 2011 and October 2013 to January 2014, he was a Visiting Professor with the Department of Computer Science, Chonbuk National University, South Korea, and the Department of Computer Sciences and Computer Engineering, La Trobe University, Melbourne, Australia respectively. He is currently a Professor and the Dean of the College of Sciences, China Jiliang University. He has authored or co-authored over 100 scientific papers in refereed journals. His current research interests include pattern recognition, neural networks, and approximation theory.

Miaomiao Cai received the B.Sc. degree in Information and Computing Science from Dalian Nationality University, China, in 2011. She is currently working towards the M.Sc. degree in Applied Mathematics at China Jiliang University, China. Her research interests include image super-resolution and machine learning.

Yuanpeng Tan received the B.Sc. and M.Sc. degrees in Applied Mathematics from China Jiliang University in 2010 and 2013 respectively. He is currently working towards the D.E. degree at North China Electric Power University. His research interests include machine learning, image processing, and smart grids.
