Signal Processing 118 (2016) 103–114
Rotation invariant multi-frame image super resolution reconstruction using Pseudo Zernike Moments

Hamidreza Rashidy Kanan a,*, Sara Salkhordeh b

a Department of Electrical, Biomedical and Mechatronic Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
b Department of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
Article history: Received 12 September 2014; Received in revised form 17 May 2015; Accepted 23 May 2015; Available online 4 June 2015

Abstract
The purpose of multi-frame super resolution (SR) is to combine multiple low resolution (LR) images to produce one high resolution (HR) image. The major challenge of classic SR approaches is accurate motion estimation between the frames. To address this problem, a fuzzy motion estimation method has been proposed that replaces the value of each pixel with the weighted average of all its neighboring pixels in all LR images. However, in case of rotation between LR images, comparing the gray levels of blocks is not a suitable criterion for calculating the weight; hence, the magnitude of Zernike Moments (ZM) has been used as a rotation invariant feature. Considering the greater robustness of Pseudo Zernike Moments (PZM) to noise and their higher description capability for the same order compared to ZM, in this paper we propose a new method based on the magnitude of PZM as a rotation invariant descriptor for representing the pixels in the weight calculation. Also, since the phase of PZM provides significant information for image reconstruction, we propose a new phase-based PZM descriptor for SR by making the phase coefficients invariant to rotation. Experimental results on several image sequences demonstrate that the proposed algorithm outperforms other currently popular SR techniques from the viewpoint of PSNR, SSIM and visual image quality. © 2015 Elsevier B.V. All rights reserved.
Keywords: Super resolution; Zernike Moments (ZMs); Pseudo Zernike Moments (PZMs); Fuzzy motion estimation
1. Introduction

High Resolution (HR) images are usually desired in various image processing and pattern recognition systems such as remote sensing, diagnosis and video surveillance [1]. However, due to the resolution restriction of physical sensors and disturbances from the external environment, image acquisition systems usually involve optical blur, motion blur, and additive noise. The term Super Resolution (SR), or resolution enhancement, refers to the image
* Corresponding author. E-mail addresses: [email protected], [email protected] (H.R. Kanan), [email protected] (S. Salkhordeh).
http://dx.doi.org/10.1016/j.sigpro.2015.05.015
0165-1684/© 2015 Elsevier B.V. All rights reserved.
processing algorithms that overcome the resolution restriction and the mentioned shortcomings of inexpensive imaging systems. SR algorithms are divided into single-frame and multi-frame categories. In the single-frame approaches, the image resolution is enhanced using the information of a single image, while in the multi-frame algorithms, the information of multiple Low Resolution (LR) images is combined to produce a High Resolution (HR) image. In other words, the main idea of multi-frame SR is the fusion of several blurred and noisy LR images to reconstruct an HR one. The first step in analyzing the multi-frame SR problem is providing a model which relates the original HR image to the LR images. Fig. 1 illustrates this model. If we define the tth HR image as Xt and the tth LR image as yt, then the input–output relation in this model can be
Fig. 1. The model to obtain LR images from HR images: the HR image sequence (Xt) is warped and blurred, down-sampled, and corrupted by additive noise (n) to yield the observed LR image sequence (yt).
defined as

y_t = D B M_t X_t + n,  1 ≤ t ≤ T.

In this equation, considering a down-sampling factor L1 in the horizontal direction and L2 in the vertical direction, the sizes of the HR and LR images are L1N1 × L2N2 and N1 × N2 respectively. Parameter T denotes the number of observed LR images and M_t is a warp matrix that can include local or global translation or rotation. The blur function B represents atmospheric, sensor, or lens effects during the image acquisition process. Matrix D is a down-sampling matrix which down-samples the blurred and distorted HR image by factors L1 and L2 in the horizontal and vertical directions respectively. Also, n is additive white Gaussian noise. The goal of SR is the restoration of Xt from the input set of images yt, reversing the above process.

Existing SR approaches can be categorized into two general classes: frequency domain-based approaches [2–8] and spatial domain-based approaches. Furthermore, the spatial domain-based approaches can be classified into three categories: interpolation-based methods [9–11], reconstruction-based techniques [12–18], and example learning-based algorithms [19–34]. The first research on multi-frame super resolution reconstruction in the frequency domain was by Tsai and Huang [2]. They considered LR images without noise and transformed the LR image data into the Discrete Fourier Transform (DFT) domain. Kim et al. [3,4] extended this approach, considering the same additive noise and the same spatial blurring properties for all LR images. Rhee and Kang [5] also developed the frequency domain-based methods by exploiting the Discrete Cosine Transform (DCT) for LR images. Different wavelet transforms have also been utilized for image/video super resolution in order to improve the restored HR image [6–8]. The major benefit of the frequency domain-based approaches is their theoretical simplicity [35], in which the relationship between the HR image
and the LR images is clearly described in the frequency domain. However, in these approaches, the observation model is limited to global translational motion and linear space-invariant blur. Therefore, the aforementioned methods cannot perform well in real-world applications. Interpolation-based SR methods usually utilize a smooth kernel function to interpolate the HR image from the LR input. The major advantage of these methods is their relatively low computational complexity. However, they usually tend to remove the high-frequency details and therefore produce unclear textures as well as blurring and aliasing artifacts along edges in the resulting image. The reconstruction-based SR techniques are motivated by the fact that the SR computation is inherently an ill-posed inverse problem [36]; certain prior knowledge from successive LR image frames of the same scene is utilized to recover the lost high-frequency details. Edge-directed priors (e.g., edge prior [16], gradient profile prior [15], and total variation [17]) are typically designed to obtain sharp edges in the produced image. Other famous prior models, such as Projection On Convex Sets (POCS) [2], Maximum A-Posteriori (MAP) [37], Bilateral Total Variation (BTV) [38], the wavelet-domain Hidden Markov Tree (HMT) [39], Markov Random Fields (MRF) [40], the simple Gaussian prior [41], and the Gibbs distribution with a Huber potential function [42], have also been utilized in the SR problem. It should be mentioned that in reconstruction-based methods the appearance of the reconstructed image should be consistent with the original LR image(s) via back-projection. The main advantage of reconstruction-based algorithms is their ability to suppress aliasing artifacts and preserve sharper edges. However, the performance of these techniques depends greatly on the rationality of the prior imposed on the up-sampled image.
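As a concrete reference for the observation model y_t = D B M_t X_t + n introduced above, the sketch below simulates one LR frame. It is a minimal illustration, not the authors' code: the warp M_t is simplified to a global circular shift, B is a uniform mean filter, D is plain decimation, and the function name `degrade` and all parameter defaults are our own.

```python
import numpy as np

def degrade(X, shift=(0, 0), blur_size=3, factor=3, noise_std=3.0, rng=None):
    """Simulate y_t = D B M_t X_t + n for one frame (a sketch:
    M_t is reduced to a global circular shift)."""
    rng = np.random.default_rng(0) if rng is None else rng
    # M_t: global translational warp
    warped = np.roll(X.astype(float), shift, axis=(0, 1))
    # B: uniform (mean) blur via a sliding-window average
    k = blur_size
    padded = np.pad(warped, k // 2, mode="edge")
    blurred = np.zeros_like(warped)
    for dy in range(k):
        for dx in range(k):
            blurred += padded[dy:dy + warped.shape[0], dx:dx + warped.shape[1]]
    blurred /= k * k
    # D: down-sampling by `factor` in each axis
    decimated = blurred[::factor, ::factor]
    # n: additive white Gaussian noise
    return decimated + rng.normal(0.0, noise_std, decimated.shape)

X = np.arange(81.0).reshape(9, 9)   # toy 9 x 9 "HR" frame
y = degrade(X, shift=(1, 1))
print(y.shape)                      # (3, 3)
```

Applying `degrade` with T different shifts to the same HR frame yields the kind of LR input sequence the multi-frame methods below operate on.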
Example learning-based SR algorithms, which have drawn much attention in recent years, estimate the high-frequency details missing in an LR image from a large training set of external HR images that encode the mapping relationship between HR and LR image sets. Although example learning-based SR techniques can effectively predict the missing details based on the similarities between the given LR image and the samples in the training set, the performance of these algorithms depends greatly on the quality of the supporting image database, especially when a large magnification ratio is required. Representative techniques include k-Nearest Neighbors (k-NN) learning algorithms [19,20], manifold learning algorithms [21–24], sparse representation algorithms [25–30], sparse regression algorithms [32,33], and multi-scale similarity learning [34]. One of the main drawbacks of most of the aforementioned techniques is that the motion between LR images is assumed to be translational or affine. So, these methods cannot achieve a desirable performance on real scenes with local motion (e.g., a person talking). Some approaches were proposed to reduce the registration error by handling the motion with a fuzzy approach. One of these successful approaches, a generalization of a denoising algorithm [43], is super resolution based on Non-local Means (NLM-based SR) [44]. In this approach, the value of each pixel is replaced by the weighted average of its neighboring pixels, where the weights are similarity criteria between the reference pixel and its neighboring pixels. However, in practical imaging systems, there are many complicating factors such as rotation and small changes of scene. Under these conditions, the similarity between the LR input images decreases, and the NLM-based SR method cannot perform well. In other words, if we define the similarity criterion based on the gray levels of two pixels, then in case of rotation between the LR input images, inappropriate weights will be assigned to the pixels and the reconstructed output image will be distorted. Gao et al.
[45], by utilizing the rotation invariant property of Zernike Moments (ZMs), partially solved this problem. They employed the magnitude of the ZM of image blocks centered on two pixels to define their weight, instead of defining a feature vector for the two blocks based on gray-level values. Considering the lower sensitivity of Pseudo Zernike Moments (PZM) to noise [46,47] and their higher description ability for the same order compared to ZM, in this paper we propose a novel method based on the magnitude of PZM extracted from the blocks as a rotation invariant descriptor to represent the pixels in the weight calculation. Also, due to the fact that the phase of PZM provides meaningful information for image reconstruction, we propose a new phase-based PZM descriptor for comparing the image blocks by making the phase of PZM invariant to rotation. Experimental results on multiple image sequences indicate that the proposed algorithm outperforms other currently popular SR techniques from the viewpoint of PSNR, SSIM and visual image quality. The rest of the paper is organized as follows: Section 2 presents the NLM denoising and NLM-based super resolution algorithms. Section 3 describes the proposed super resolution approach and its capabilities in detail. The experimental results are presented in Section 4. Finally, the paper concludes in Section 5.
2. NLM denoising and NLM-based super resolution algorithms

Since our proposed method is a development of the NLM denoising and NLM-based super resolution techniques, in this section we briefly present these two algorithms. Buades et al. [43] in 2005 proposed an algorithm for denoising image sequences based on the NLM filter. The main idea of this algorithm is based on the assumption that in natural images, the probability of similar pixels existing in different areas of the image is high. Based on this assumption, to denoise image sequences, they calculated the value of each pixel in each input image using a weighted average of the pixels in its three-dimensional neighborhood (considering the previous and next frames). In fact, they defined an energy function as Eq. (1) and achieved this goal by minimizing it:

ε²_T(X) = (1/2) Σ_{t∈[1,…,T]} Σ_{(k,l)∈Ω} Σ_{(i,j)∈N(k,l)} w(k,l,i,j,t) ‖R_{k,l}X − R_{i,j}y_t‖²₂   (1)
where yt denotes the sequence of input noisy images, X is the target noise-free output image, which could be any of the images of the input sequence, and R_{i,j} is an operator which, multiplied by the image y_t, extracts a block of size m × n around pixel (i,j). Also, ‖·‖²₂ denotes the squared L2 norm, and N(k,l) is the area around pixel (k,l) whose pixels contribute to the averaging process. It should be mentioned that using a larger search area, better results can be obtained, but at a higher computational cost. Parameter T denotes the number of noisy input images and Ω is the area of the whole image X. Finally, w(k,l,i,j,t) is the weight expressing the similarity between pixel (i,j) in image y_t and pixel (k,l) in image X, which can be calculated as follows:

w(k,l,i,j,t) = exp(−‖R̂_{k,l}X − R̂_{i,j}y_t‖²₂ / h²)   (2)

where R̂_{k,l} is a patch extraction operator whose size may differ from that of R_{i,j} in Eq. (1), and h is a smoothing parameter that controls the effect of the gray-level difference between the two image patches. After calculating the weights, the image X is obtained by setting the derivative of Eq. (1) to zero:

X̂(k,l) = Σ_{t∈[1,…,T]} Σ_{(i,j)∈N(k,l)} w(k,l,i,j,t) y_t(i,j) / Σ_{t∈[1,…,T]} Σ_{(i,j)∈N(k,l)} w(k,l,i,j,t).   (3)
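Eqs. (2) and (3) can be sketched in a few lines. This is a minimal, unoptimized illustration assuming single-channel float frames and ignoring border handling; the names `nlm_weight` and `nlm_pixel` and the parameter defaults are ours, not the paper's.

```python
import numpy as np

def nlm_weight(patch_ref, patch_cand, h=10.0):
    """Eq. (2): w = exp(-||P_ref - P_cand||_2^2 / h^2)."""
    d2 = np.sum((patch_ref - patch_cand) ** 2)
    return np.exp(-d2 / h ** 2)

def nlm_pixel(frames, k, l, patch=1, search=2, h=10.0):
    """Eq. (3): replace pixel (k, l) by the weighted average of all
    pixels (i, j) in its search window N(k, l), over every frame t.
    `patch` and `search` are half-sizes; borders are not handled."""
    ref = frames[0].astype(float)
    p_ref = ref[k - patch:k + patch + 1, l - patch:l + patch + 1]
    num = den = 0.0
    for y in frames:
        y = y.astype(float)
        for i in range(k - search, k + search + 1):
            for j in range(l - search, l + search + 1):
                p = y[i - patch:i + patch + 1, j - patch:j + patch + 1]
                w = nlm_weight(p_ref, p, h)
                num += w * y[i, j]
                den += w
    return num / den

frames = [np.full((9, 9), 5.0) for _ in range(3)]
print(nlm_pixel(frames, 4, 4))   # 5.0 (all weights are 1 on a flat image)
```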
In 2009, Protter et al. [44] developed the NLM denoising idea into a multi-frame SR method. In the NLM denoising method, the input and output images have the same resolution, but in super resolution the output frame should have a higher resolution than the input frames. So, they first enhanced the resolution of the reference frame using a conventional interpolation method such as Lanczos interpolation [48], and then replaced the pixels of this frame with new values using a method similar to the denoising algorithm. In the NLM-based SR method [44], for each pixel (k,l) in the reference frame (the frame whose resolution we want to enhance), the value X(k,l) is estimated based on Eq. (4) as
follows:

X(k,l) = Σ_{t∈[1,…,T]} Σ_{(i,j)∈N(k,l)} w(k,l,i,j,t) y_t(i,j) / Σ_{t∈[1,…,T]} Σ_{(i,j)∈N(k,l)} w(k,l,i,j,t)   (4)

where T denotes the number of candidate frames, y_t(i,j) is the intensity of candidate pixel (i,j) in the tth LR image, and w(k,l,i,j,t) is defined as follows:

w(k,l,i,j,t) = exp(−‖R_{k,l}y_{t0} − R_{i,j}y_t‖²₂ / h²)   (5)
where y_{t0} is the t0th high resolution (reference) image and y_t is the tth candidate image. Considering the resolution difference between input frames, the resolution of the other input frames should first be enhanced using the Lanczos interpolation method.

3. The proposed super resolution algorithm

In this paper, we propose a new method based on the magnitude of PZM, which is more robust to noise and has higher description ability for the same order compared to ZM, for representing the pixels. Also, due to the fact that
the phase of PZM provides meaningful information for image reconstruction [49], we propose a new phase-based descriptor to compare image blocks by making the phase of PZM invariant to rotation. The following sections describe our algorithm in detail.

3.1. Pseudo Zernike Moment (PZM)

The PZM is a widely utilized orthogonal moment due to its ability for image representation and its low sensitivity to image noise. The kernel of PZMs is a set of orthogonal Pseudo Zernike polynomials defined inside a unit circle. The two-dimensional complex PZMs of order n with repetition m of a continuous image intensity function f(x,y) are defined as [50]

PZM_{n,m}(f(x,y)) = ((n+1)/π) ∬_{x²+y²≤1} V*_{n,m}(x,y) f(x,y) dx dy   (6)

where n = 0, 1, 2, …, ∞ and m takes on positive and negative integer values under the constraint |m| ≤ n. The symbol * denotes the complex conjugate. It should be mentioned that the definition of ZMs is the same as that of PZMs except that in ZM there is a further constraint (i.e., n − |m| = even). The Pseudo Zernike polynomials V_{n,m}(x,y) are defined as

V_{n,m}(x,y) = R_{n,m}(r) e^{jmθ}   (7)

where r = √(x² + y²) is the length of the vector from the origin to the pixel (x,y) and θ = tan⁻¹(y/x) is the angle between the vector r and the x-axis. The radial polynomials R_{n,m}(r) are defined as

R_{n,m}(r) = Σ_{s=0}^{n−|m|} (−1)^s · ((2n+1−s)! / (s!(n−|m|−s)!(n+|m|+1−s)!)) · r^{n−s}.   (8)
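Eq. (8) translates directly into code. The sketch below is a straightforward, unoptimized implementation (the function name `radial_poly` is ours); the spot checks use the closed forms R_{0,0}(r) = 1 and R_{1,0}(r) = 3r − 2 that follow from Eq. (8).

```python
from math import factorial

def radial_poly(n, m, r):
    """Pseudo-Zernike radial polynomial R_{n,m}(r), Eq. (8).
    Uses R_{n,-m} = R_{n,m}, so only |m| matters."""
    m = abs(m)
    return sum(
        (-1) ** s
        * factorial(2 * n + 1 - s)
        / (factorial(s) * factorial(n - m - s) * factorial(n + m + 1 - s))
        * r ** (n - s)
        for s in range(n - m + 1)
    )

print(radial_poly(0, 0, 0.5))   # 1.0  (constant for n = 0)
print(radial_poly(1, 0, 0.5))   # -0.5 (R_{1,0}(r) = 3r - 2)
```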
Fig. 2. Graph of radial polynomials R_{n,m}(r) with orders n = 0–10.
It should be mentioned that R_{n,−m}(r) = R_{n,m}(r). Fig. 2 shows a graph of the radial polynomials R_{n,m}(r) with orders n = 0–10. To calculate the PZM for a digital image, the integrals in Eq. (6) are replaced by summations, and the pixel coordinates of the image must be normalized into the unit circle by a linear
Fig. 3. Pixel coordinate normalization into the unit circle.
mapping transform. In other words, the center of the image is taken as the origin and pixel coordinates are mapped into the range of a unit circle, i.e., x² + y² ≤ 1. There are two possibilities for this mapping transform [51]. The general mapping approach maps an N × N image so that a unit circle is bounded inside it; the pixels placed outside the unit circle are not used for the calculation of PZM, all the information carried by these pixels is lost, and therefore the image representation ability of the extracted PZM deteriorates. In the other method, which we utilize in this paper, the entire N × N image is bounded inside the unit circle. This method ensures that there is no pixel loss during the PZM calculation. This linear transformation, shown in Fig. 3, uses the following equations:

x_q = −√2/2 + (√2/(N−1)) q,  q = 0, 1, …, N−1
y_p = √2/2 − (√2/(N−1)) p,  p = 0, 1, …, N−1.   (9)

Thus, the discrete form of the PZM of order n with repetition m for the mapped digital image intensity f(x_q, y_p) is rewritten as

PZM_{n,m}(f(x_q, y_p)) = ((n+1)/(π λ(N))) Σ_{p=0}^{N−1} Σ_{q=0}^{N−1} V*_{n,m}(x_q, y_p) f(x_q, y_p)   (10)

where the normalization factor λ(N) is the ratio between the number of pixels in the image before normalization and the area of the normalized image; for our mapping method, λ(N) = N²/2.

3.2. Feature vector creation based on the magnitude of PZM

Since PZM_{n,−m}(f(x_q, y_p)) = PZM*_{n,m}(f(x_q, y_p)) and therefore |PZM_{n,−m}(f(x_q, y_p))| = |PZM*_{n,m}(f(x_q, y_p))| = |PZM_{n,m}(f(x_q, y_p))|, the magnitudes of the PZMs of order n = 0 up to
nmax with m ≥ 0 (as listed in Table 1 for the case of nmax = 5) are considered as the magnitude-based PZM feature vector in this paper. This feature vector for an image f(x_q, y_p) can be represented as

PZM^{nmax}[f(x_q, y_p)] = { |PZM_{u,v}(f(x_q, y_p))| : u = 0, 1, …, nmax; v = 0, 1, …, u }.   (11)
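Eqs. (9)–(11) can be combined into a short sketch. This is an illustrative implementation, not the authors' code: the radial-polynomial helper is repeated so the snippet is self-contained, the names `pzm` and `magnitude_features` are ours, and the final assertion-style check relies on the fact that the sample grid of Eq. (9) is closed under 90° rotation, so the magnitudes are exactly preserved under `np.rot90`.

```python
import numpy as np
from math import factorial

def radial_poly(n, m, r):
    # Eq. (8); works elementwise on NumPy arrays of radii
    m = abs(m)
    return sum((-1) ** s * factorial(2 * n + 1 - s)
               / (factorial(s) * factorial(n - m - s) * factorial(n + m + 1 - s))
               * r ** (n - s)
               for s in range(n - m + 1))

def pzm(img, n, m):
    """Discrete PZM of order n, repetition m (Eq. 10), with the whole
    N x N image mapped inside the unit circle (Eq. 9), lambda(N) = N^2/2."""
    N = img.shape[0]
    idx = np.arange(N)
    x = -np.sqrt(2) / 2 + np.sqrt(2) / (N - 1) * idx   # Eq. (9), columns
    y = np.sqrt(2) / 2 - np.sqrt(2) / (N - 1) * idx    # Eq. (9), rows
    X, Y = np.meshgrid(x, y)
    r = np.hypot(X, Y)
    theta = np.arctan2(Y, X)
    V_conj = radial_poly(n, m, r) * np.exp(-1j * m * theta)
    return (n + 1) / (np.pi * (N ** 2 / 2)) * np.sum(V_conj * img)

def magnitude_features(img, nmax=3):
    """Eq. (11): the magnitude descriptor |PZM_{u,v}|, u = 0..nmax, v = 0..u."""
    return np.array([abs(pzm(img, u, v))
                     for u in range(nmax + 1) for v in range(u + 1)])

block = np.arange(121.0).reshape(11, 11)   # toy 11 x 11 image block
feats = magnitude_features(block, nmax=3)
print(feats.shape)                          # (10,) -> 1+2+3+4 moments
```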
Table 1. List of PZMs with order n = 0 up to 5.

n | PZMs                                                               | Number of PZMs
0 | PZM_{0,0}                                                          | 1
1 | PZM_{1,0}, PZM_{1,1}                                               | 2
2 | PZM_{2,0}, PZM_{2,1}, PZM_{2,2}                                    | 3
3 | PZM_{3,0}, PZM_{3,1}, PZM_{3,2}, PZM_{3,3}                         | 4
4 | PZM_{4,0}, PZM_{4,1}, PZM_{4,2}, PZM_{4,3}, PZM_{4,4}              | 5
5 | PZM_{5,0}, PZM_{5,1}, PZM_{5,2}, PZM_{5,3}, PZM_{5,4}, PZM_{5,5}   | 6
According to this feature vector, Eq. (5) for the weight calculation between two pixels can be rewritten as

w_Mag(k,l,i,j,t) = exp(−‖PZM^{nmax}[f(x_q,y_p)](k,l,t0) − PZM^{nmax}[f(x_q,y_p)](i,j,t)‖²₂ / h²)   (12)

where PZM^{nmax}[f(x_q,y_p)](k,l,t0) is the feature vector extracted from the block around pixel (k,l) in the reference frame t0, and PZM^{nmax}[f(x_q,y_p)](i,j,t) is the feature vector extracted from the block around pixel (i,j) in the tth candidate frame, both based on the magnitude of PZM. Based on this weight, the new value of pixel (k,l) in the high resolution image is defined as follows:

X(k,l) = Σ_{t∈[1,…,T]} Σ_{(i,j)∈N(k,l)} w_Mag(k,l,i,j,t) y_t(i,j) / Σ_{t∈[1,…,T]} Σ_{(i,j)∈N(k,l)} w_Mag(k,l,i,j,t).   (13)
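Eqs. (12) and (13) mirror the NLM fusion with feature vectors in place of raw patches. A minimal sketch, assuming the feature vectors have already been computed; the names `pzm_weight` and `fuse_pixel` and the toy values are ours (the demo passes a small h suited to its unit-scale toy features, whereas Section 4.1 uses h = 30 for the actual experiments).

```python
import numpy as np

def pzm_weight(feat_ref, feat_cand, h=30.0):
    """Eq. (12): weight from the squared distance between two
    magnitude-based PZM feature vectors."""
    return np.exp(-np.sum((feat_ref - feat_cand) ** 2) / h ** 2)

def fuse_pixel(weights, intensities):
    """Eq. (13): normalised weighted average over all candidate
    pixels (i, j) and frames t."""
    w = np.asarray(weights, dtype=float)
    y = np.asarray(intensities, dtype=float)
    return float(np.sum(w * y) / np.sum(w))

# Hypothetical pre-computed feature vectors: the reference block, an
# identical block, a dissimilar block, and a near-identical block.
ref = np.array([0.9, 0.2, 0.1])
cands = [ref.copy(), np.array([0.5, 0.5, 0.5]), np.array([0.88, 0.21, 0.1])]
vals = [100.0, 40.0, 98.0]
ws = [pzm_weight(ref, c, h=0.1) for c in cands]   # small h for toy features
x_kl = fuse_pixel(ws, vals)   # dominated by the two similar candidates
```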
3.3. Feature vector creation based on the phase of PZM

The two-dimensional PZM is complex and can therefore be written as a combination of magnitude and phase coefficients as follows:

PZM_{n,m}(f(x_q,y_p)) = |PZM_{n,m}(f(x_q,y_p))| e^{jφ_{n,m}}   (14)

where |PZM_{n,m}(f(x_q,y_p))| and φ_{n,m} are the magnitude and phase coefficients of PZM_{n,m}(f(x_q,y_p)) respectively. Also, the image may be reconstructed to an arbitrary precision by

f(x_q,y_p) = lim_{Z→∞} Σ_{n=0}^{Z} Σ_m PZM_{n,m}(f(x_q,y_p)) V_{n,m}(x_q,y_p)   (15)

where the second sum is taken over all |m| ≤ n and the PZM_{n,m}(f(x_q,y_p)) are computed over the unit circle. Li and Lee [52] proved that, in addition to magnitude coefficients, the phase coefficients of PZM provide useful information for image reconstruction. However, phase coefficients are not inherently rotation invariant and therefore cannot be utilized for feature extraction when the image is affected by rotation. In the following, we modify the phase coefficients of PZM to nullify the impact of rotation on them and define a new rotation invariant image descriptor which will be used for feature extraction in order to obtain the weights. Assume θ is the rotation angle of the image, and let PZM_{n,m}(f(x_q,y_p)) and PZM^R_{n,m}(f(x_q,y_p)) denote the original PZM and its rotated version respectively. Hence, we can write

PZM^R_{n,m}(f(x_q,y_p)) = PZM_{n,m}(f(x_q,y_p)) exp(−jmθ)
= |PZM_{n,m}(f(x_q,y_p))| e^{j(φ_{n,m} − mθ)}, i.e.,

φ^R_{n,m} = φ_{n,m} − mθ   (16)

where |PZM_{n,m}(f(x_q,y_p))| and φ_{n,m} denote the magnitude and phase coefficients respectively. From Eq. (16) it can be seen that the magnitude stays unchanged while the phase is affected by image rotation. In order to remove the effect introduced by a rotation, we combine phase coefficients of different orders and repetitions to create a complex valued rotation invariant PZM as follows:

PZM′_{n,m}(f(x_q,y_p)) = PZM_{n,m}(f(x_q,y_p)) e^{−jm φ_{n0,1}},  n0 = 1, 2, 3, …, nmax.   (17)

The corresponding phase change of the above complex valued rotation invariant PZM can be defined as

φ′_{n,m} = φ_{n,m} − m φ_{n0,1}   (18)

where PZM′_{n,m}(f(x_q,y_p)) and φ′_{n,m} are the modified PZM and the modified phase angle respectively. According to Eq. (16), we can write

φ^R_{n0,1} = φ_{n0,1} − θ,  and hence  m φ^R_{n0,1} = m φ_{n0,1} − mθ.   (19)

Finally, by combining Eqs. (18) and (19), we can conclude that

φ′^R_{n,m} = φ^R_{n,m} − m φ^R_{n0,1} = φ_{n,m} − m φ_{n0,1} = φ′_{n,m}.   (20)

From Eq. (20), it can be observed that the modified phase angle φ′_{n,m} of the original image is exactly equal to the modified phase angle φ′^R_{n,m} of the rotated image. It should be noted that the order n0 could be any number from 1 to nmax. However, since PZMs of higher orders contribute to high-frequency details of the image, which are more sensitive to noise and distortion, we set n0 to 1. Therefore, the modified phase angle can be defined as follows:

φ′_{n,m} = φ_{n,m} − m φ_{1,1}.   (21)

Since φ′_{1,1} = 0, the modified phase coefficients of PZMs of order n = 1 up to nmax with m ≥ 1 (except φ′_{1,1}) are considered as the modified phase-based PZM feature vector in this research. Therefore, this feature vector for an image f(x_q,y_p) can be represented as

φ′^{nmax}[f(x_q,y_p)] = { φ′_{u,v}(f(x_q,y_p)) : u = 1, …, nmax; v = 1, …, u; except (u = 1, v = 1) }.   (22)

According to this feature vector, Eq. (5) for the weight calculation between two pixels can be rewritten as

w_Phase(k,l,i,j,t) = exp(−‖φ′^{nmax}[f(x_q,y_p)](k,l,t0) − φ′^{nmax}[f(x_q,y_p)](i,j,t)‖²₂ / h²)   (23)

where φ′^{nmax}[f(x_q,y_p)](k,l,t0) is the feature vector extracted from the block around pixel (k,l) in the reference frame t0, and φ′^{nmax}[f(x_q,y_p)](i,j,t) is the feature vector extracted from the block around pixel (i,j) in the tth candidate frame, both based on the phase of PZM. Based on this weight, the new value of pixel (k,l) in the high resolution image is defined as follows:

X(k,l) = Σ_{t∈[1,…,T]} Σ_{(i,j)∈N(k,l)} w_Phase(k,l,i,j,t) y_t(i,j) / Σ_{t∈[1,…,T]} Σ_{(i,j)∈N(k,l)} w_Phase(k,l,i,j,t).   (24)
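The invariance proven in Eqs. (16)–(21) can be checked numerically without computing any moments from pixels: under a rotation by θ, each moment simply picks up a factor e^{−jmθ} (Eq. 16). The sketch below applies that factor to a set of hypothetical complex moment values and verifies that the modified phases of Eq. (21) agree modulo 2π; the name `modified_phases` and the random toy moments are ours.

```python
import numpy as np

def modified_phases(moments):
    """Eqs. (21)-(22): phi'_{n,m} = phi_{n,m} - m * phi_{1,1} for
    n = 1..nmax, 1 <= m <= n, excluding (1, 1) itself.
    `moments` maps (n, m) -> complex PZM value (assumed pre-computed)."""
    phi_11 = np.angle(moments[(1, 1)])
    return {(n, m): np.angle(z) - m * phi_11
            for (n, m), z in moments.items()
            if m >= 1 and (n, m) != (1, 1)}

# Hypothetical PZM values for n = 1..3; under a rotation by theta each
# moment picks up a factor exp(-j m theta), Eq. (16).
rng = np.random.default_rng(1)
moments = {(n, m): complex(rng.normal(), rng.normal())
           for n in range(1, 4) for m in range(n + 1)}
theta = 0.3
rotated = {(n, m): z * np.exp(-1j * m * theta) for (n, m), z in moments.items()}

a, b = modified_phases(moments), modified_phases(rotated)
# Compare on the unit circle to avoid 2*pi wrap-around issues
invariant = all(np.isclose(np.exp(1j * a[k]), np.exp(1j * b[k])) for k in a)
print(invariant)   # True
```

Note that for nmax = 3 the descriptor has 5 entries, matching Eq. (22) after excluding the real moments PZM_{n,0} and the anchor phase φ_{1,1}.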
4. Experimental results

In order to evaluate the feasibility and effectiveness of the proposed PZM-based SR algorithm, an extensive experimental investigation is conducted using different image sequences. The first image sequence (a synthetic sequence) includes 9 images (frames) with global motion. The cameraman image is used for building this sequence: the image is shifted by zero, one and two pixels in both the horizontal and vertical directions (i.e., dx ∈ {0, 1, 2} and dy ∈ {0, 1, 2}). Thus, we have 9 HR frames with global shifts relative to each other. This sequence is then blurred using a 3 × 3 uniform mask (mean filter), decimated by a factor of 1:3 (in each axis) and then contaminated by additive Gaussian noise with zero mean and standard deviation 3. After applying these operators, a sequence of 9 LR, blurred and noisy frames with global motion is obtained. The second type of sequence includes images with local shifts, for example "Miss America" and "mobile". As for the previous image sequence, the down-sampling, noise and blurring operators are applied to all images of these sequences. In all the experiments, we use three evaluation methods for performance measurement: subjective (visual) evaluation, Structural SIMilarity (SSIM) [53] and Peak Signal-to-Noise Ratio (PSNR). The PSNR in decibels (dB) is defined as

PSNR = 10 log₁₀((255)² / MSE)   (25)

where MSE is the mean-square error per pixel. The performance of the proposed algorithm is compared with four benchmark approaches: 3:1 pixel replication in each axis and the Lanczos interpolation method [48], which are two widely used baselines, and the Nonlocal-Means-based (NLM-based) [44] and Zernike Moment-based (ZM-based) [45] approaches, which are two state-of-the-art algorithms.

4.1. Determination of parameters

In this part, we specify the parameters involved in the proposed algorithm. It is obvious from theory that a PZM of zero order represents the mean value of the image
pixels, which changes with different illumination conditions. According to Eq. (8), and as indicated in Fig. 2, the value of the radial polynomial R_{n,m}(r) for n = 0 does not change with radius r and always equals 1. This means that V_{0,0}(x_q,y_p) = 1, so Eq. (10) for n = 0 can be rewritten as

PZM_{0,0}(f(x_q,y_p)) = (2/(πN²)) Σ_{p=0}^{N−1} Σ_{q=0}^{N−1} V*_{0,0}(x_q,y_p) f(x_q,y_p) = (2/π) f̄(x_q,y_p)   (26)

where f̄(x_q,y_p) is the mean value of the image pixels. Therefore, the minimum order in our PZM is set to 1. In order to make a direct comparison of the proposed method against the ZM-based super resolution algorithm [45], the same value of nmax as used in [45] was selected, namely nmax = 3. It should be mentioned that since a PZM of any order with repetition 0 (i.e., PZM_{n,0}) is real and its phase coefficient is therefore zero (see Eqs. (7) and (8)), we exclude these moments from the modified phase-based PZM feature vector.
Table 2. Performance comparison (PSNR/SSIM) of the proposed method and other benchmark approaches in the first experiment.

Frame | Pixel Replication | Lanczos [48]   | NLM-based SR [44] | ZM-based SR [45] | Proposed (magnitude) | Proposed (modified phase)
2     | 22.6397/0.7383    | 24.0916/0.7771 | 22.5330/0.7496    | 24.3226/0.8133   | 26.9165/0.8656       | 26.1669/0.8541
5     | 22.6616/0.7365    | 24.1229/0.7756 | 22.6830/0.7512    | 24.4884/0.8128   | 26.8936/0.8627       | 26.1044/0.8531
8     | 22.5938/0.7379    | 24.1774/0.7773 | 22.6039/0.7535    | 24.5820/0.8167   | 26.9145/0.8646       | 26.2767/0.8548
Fig. 4. Results for the 5th frame from the "Cameraman" sequence with 5° rotation. The first row from left to right: rotated original image (ground truth) and rotated LR image. The second row from left to right: Pixel Replication method; Lanczos interpolation method and NLM-based method. The third row from left to right: ZM-based method; the magnitude-based proposed algorithm and the modified phase-based proposed algorithm.
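The PSNR figures reported in Tables 2–4 follow Eq. (25) directly; a minimal implementation for 8-bit images (the function name `psnr` is ours):

```python
import numpy as np

def psnr(ref, test):
    """Eq. (25): PSNR in dB for 8-bit images (peak value 255)."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse)

a = np.zeros((8, 8))
b = np.full((8, 8), 5.0)      # constant error of 5 gray levels -> MSE = 25
print(round(psnr(a, b), 2))   # 34.15
```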
Table 3. Performance comparison (PSNR/SSIM) of the proposed method and other benchmark approaches in the second experiment.

Frame | Pixel Replication | Lanczos [48]   | NLM-based SR [44] | ZM-based SR [45] | Proposed (magnitude) | Proposed (modified phase)
8     | 31.5751/0.8646    | 34.1683/0.9100 | 33.1894/0.9047    | 33.2828/0.9100   | 34.9425/0.9190       | 33.5704/0.9116
18    | 31.5822/0.8656    | 34.0795/0.9101 | 33.6937/0.9141    | 34.0790/0.9165   | 35.7732/0.9198       | 34.4533/0.9176
28    | 31.3331/0.8636    | 33.8079/0.9091 | 33.4371/0.9104    | 33.7706/0.9060   | 35.4718/0.9188       | 34.0039/0.9148

Also, for the proposed method and the NLM-based [44] and ZM-based [45] algorithms, the similarity block size used for computing the weights, which did not vary between tests, was set to 11 × 11 pixels. Moreover, the search area was set to 5 × 5 pixels in order to decrease the computational time. The parameter h was set to 30, as in [45].

4.2. Experiments

To compare the proposed method with the Pixel Replication method, the Lanczos interpolation method [48], and the NLM-based [44] and ZM-based [45] super resolution approaches, three experiments were conducted. In the first experiment, one frame of the cameraman sequence, rotated by 5°, is reconstructed utilizing the 8 other rotation-free frames. The second experiment is performed on a sequence containing 30 frames of the "Miss America" sequence; similar to the first experiment, one frame rotated by 5° is reconstructed utilizing the 29 other rotation-free frames. The third experiment is carried out on the "mobile" sequence in normal conditions (i.e., without any rotation). The results of the proposed method, the Pixel Replication method, the Lanczos interpolation method [48], and the NLM-based [44] and ZM-based [45] approaches in the first experiment are tabulated in Table 2. It can be seen from Table 2 that in case of rotation of even one of the sequence images, the performance of the NLM-based method is approximately equal to that of the Pixel Replication method and worse than that of the Lanczos interpolation method. From Table 2, it can also be seen that the proposed method based on the magnitude coefficients of PZM outperformed the Pixel Replication method, the Lanczos interpolation method (the most widely used baseline for super resolution), the NLM-based method and the ZM-based algorithm. Table 2 also indicates that the proposed algorithm based on the modified phase coefficients is superior to all the compared methods under rotation.

The reconstructed images by the Pixel Replication, Lanczos interpolation, NLM-based, ZM-based and proposed methods in this experiment are displayed in Fig. 4. In Fig. 4, the first row from left to right shows the rotated original image (ground truth) and the rotated LR image respectively. The second row from left to right shows the reconstructed images by the Pixel Replication method, Lanczos
interpolation and NLM-based methods respectively. The third row from left to right shows the reconstructed images by the ZM-based method, the magnitude-based proposed algorithm and the modified phase-based proposed algorithm respectively. It can be observed from Fig. 4 that, in comparison with the Pixel Replication method, the Lanczos interpolation method, the NLM-based SR method and the ZM-based SR algorithm, our proposed algorithm (based on both the magnitude and the modified phase) also improves the performance in subjective (visual) evaluation.

In the second experiment, the 8th frame of the "Miss America" sequence was first rotated by 5° and then reconstructed using the other 29 rotation-free images with an upscaling factor of 3. This process was repeated for the 18th and 28th frames of the sequence. The results of the proposed method, the Pixel Replication method, the Lanczos interpolation method [48], and the NLM-based [44] and ZM-based [45] approaches in terms of PSNR and SSIM are tabulated in Table 3. It can be seen from Table 3 that both the magnitude-based and the modified phase-based proposed algorithms are superior to all the compared methods under rotation. The proposed approach also outperformed the ZM-based method, which is a rotation invariant algorithm; the reason could be the higher description capability of PZM compared to ZM for the same order. The reconstructed images by the Pixel Replication, Lanczos interpolation, NLM-based, ZM-based and proposed methods in this experiment are displayed in Fig. 5. In Fig. 5, the first row from left to right shows the rotated original image (ground truth) and the reconstructed images by the Pixel Replication and Lanczos interpolation methods respectively. The second row from left to right shows the reconstructed images by the NLM-based method, the ZM-based method, the magnitude-based proposed algorithm and the modified phase-based proposed algorithm respectively. From Fig. 5, it can be observed that our proposed algorithm (based on both the magnitude and the modified phase) also improves the subjective (visual) evaluation performance in comparison with the Pixel Replication method, the Lanczos interpolation method, the NLM-based SR method and the ZM-based SR algorithm.

The third experiment is performed on 30 frames of the "mobile" sequence. In this experiment, the capability of the
H.R. Kanan, S. Salkhordeh / Signal Processing 118 (2016) 103–114
111
Fig. 5. Results for the 28th frame of the "Miss America" sequence with 5° rotation. The first row, from left to right: rotated original image (ground truth); Pixel Replication method; Lanczos interpolation method. The second row, from left to right: NLM-based method; ZM-based method; the magnitude-based proposed algorithm; the modified phase-based proposed algorithm.
proposed algorithm on rotation-free images, in which the 8th, 18th and 28th frames are reconstructed with upscaling factor 3, is studied. The results obtained by the proposed method, the Pixel Replication method, the Lanczos interpolation method [48] and the NLM-based [44] and ZM-based [45] approaches, in terms of PSNR and SSIM, are tabulated in Table 4. It can be seen from Table 4 that the magnitude-based proposed algorithm is superior to all the compared methods under natural (rotation-free) conditions. Table 4 also indicates that the modified phase-based proposed algorithm outperforms the Pixel Replication, Lanczos interpolation, NLM-based SR and ZM-based SR methods on rotation-free images. Together with the results of the previous experiments, this confirms that the proposed modified phase-based method performs better than the above benchmark approaches not only on rotated images but also under natural conditions. The images reconstructed by the Pixel Replication, Lanczos interpolation, NLM-based, ZM-based and proposed methods in this experiment are displayed in Fig. 6.

Table 4
Performance comparison of the proposed method and other benchmark approaches in the third experiment (PSNR/SSIM).

Frame | Pixel Replication | Lanczos interpolation [48] | NLM-based SR [44] | ZM-based SR [45] | Proposed (magnitude) | Proposed (modified phase)
8     | 19.0070/0.5903    | 19.5717/0.5980             | 20.1925/0.6980    | 20.6893/0.7086   | 21.5237/0.7487       | 20.9955/0.7119
18    | 19.6114/0.6099    | 20.2740/0.6248             | 20.8337/0.7086    | 21.2900/0.7176   | 22.3271/0.7766       | 21.4499/0.7476
28    | 19.4829/0.6120    | 20.3658/0.6341             | 20.7505/0.7038    | 21.1737/0.7100   | 22.1930/0.7690       | 21.3731/0.7340
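The PSNR/SSIM pairs reported in these tables follow the standard definitions of the two metrics. For reference, a minimal NumPy sketch (the `global_ssim` below is a simplified single-window SSIM, not the locally windowed implementation typically used for reported SSIM scores):

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio (dB) between a reference and a test image."""
    mse = np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def global_ssim(ref, img, peak=255.0):
    """SSIM statistic computed over the whole image in a single window
    (standard SSIM averages this statistic over small local windows)."""
    x, y = ref.astype(np.float64), img.astype(np.float64)
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.integers(0, 256, (64, 64)).astype(np.float64)
    noisy = np.clip(gt + rng.normal(0, 5, gt.shape), 0, 255)
    print(f"PSNR = {psnr(gt, noisy):.2f} dB")      # higher is better
    print(f"SSIM = {global_ssim(gt, noisy):.4f}")  # 1.0 means identical
```

In practice SSIM is averaged over local windows (e.g. 11×11 with Gaussian weighting), which yields values of the kind reported in the tables above.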
In Fig. 6, the first row shows the original image (ground truth). The second row, from left to right, shows the images reconstructed by the Pixel Replication and Lanczos interpolation methods; the third row, from left to right, the reconstructions by the NLM-based and ZM-based methods; and the fourth row, from left to right, the reconstructions by the magnitude-based and the modified phase-based proposed algorithms. It can be observed from Fig. 6 that the proposed algorithm (both variants) again improves subjective (visual) quality in comparison with the Pixel Replication, Lanczos interpolation, NLM-based SR and ZM-based SR methods. Table 5 presents the mean PSNR and mean SSIM over all 30 frames of each test sequence. It can be seen from Table 5 that the proposed PZM-based SR algorithm, with either the magnitude coefficients or the modified phase coefficients, is superior to all the compared methods.
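The rotation invariance underlying these results can be checked numerically: rotating the image by an angle α multiplies the pseudo-Zernike moment A_nm by e^{-imα}, so |A_nm| is unchanged while its phase shifts by -mα. A small self-contained sanity check on a synthetic image (illustrative only; the grid size, test image and moment order are arbitrary choices, not the paper's settings):

```python
import numpy as np
from math import factorial, pi

def pzm(f, n, m, grid=256):
    """Pseudo-Zernike moment A_nm of an image function f(x, y) over the unit
    disk, approximated by a midpoint-rule sum on a grid x grid lattice."""
    xs = (np.arange(grid) + 0.5) * (2.0 / grid) - 1.0
    x, y = np.meshgrid(xs, xs)
    rho, theta = np.hypot(x, y), np.arctan2(y, x)
    # pseudo-Zernike radial polynomial R_nm(rho)
    R = np.zeros_like(rho)
    for s in range(n - abs(m) + 1):
        c = (-1) ** s * factorial(2 * n + 1 - s) / (
            factorial(s) * factorial(n + abs(m) + 1 - s) * factorial(n - abs(m) - s))
        R += c * rho ** (n - s)
    integrand = f(x, y) * R * np.exp(-1j * m * theta) * (rho <= 1.0)
    return (n + 1) / pi * integrand.sum() * (2.0 / grid) ** 2

def img(x, y):            # a smooth, rotationally asymmetric test image
    return np.exp(-((x - 0.2) ** 2 + (y + 0.1) ** 2) / 0.05)

alpha = np.deg2rad(5.0)   # the 5-degree rotation used in the second experiment
def img_rot(x, y):        # img rotated counter-clockwise by alpha
    return img(x * np.cos(alpha) + y * np.sin(alpha),
               -x * np.sin(alpha) + y * np.cos(alpha))

a0, a1 = pzm(img, 3, 2), pzm(img_rot, 3, 2)
print(abs(a0), abs(a1))   # magnitudes agree up to discretization error
print(np.angle(a1 / a0))  # phase shift, approximately -m*alpha = -2*alpha
```

Evaluating the rotated image analytically (rather than resampling pixels) keeps interpolation error out of the check, so any residual difference in the magnitudes is purely discretization error.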
Fig. 6. Results for the 8th frame from the “Mobile” sequence. The first row: original image (ground truth). The second row from left to right: Pixel Replication method and Lanczos interpolation method. The third row from left to right: NLM-based method and ZM-based method. The fourth row from left to right: the magnitude-based proposed algorithm and the modified phase-based proposed algorithm.
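Because the phase of A_nm shifts by -mα under a rotation by α, a rotation-invariant phase feature can be obtained by cancelling that shift against a reference moment of repetition one. The sketch below shows this standard construction on hypothetical toy moment values; it illustrates the general idea behind a rotation-invariant phase descriptor and is not claimed to be the paper's exact "modified phase" formulation:

```python
import cmath

def invariant_phase(a_nm, m, a_ref):
    """Rotation-invariant phase feature. Under a rotation by alpha,
    arg(a_nm) shifts by -m*alpha and arg(a_ref) (a moment of repetition 1)
    shifts by -alpha, so the combination below is unchanged (mod 2*pi)."""
    return (cmath.phase(a_nm) - m * cmath.phase(a_ref)) % (2 * cmath.pi)

# toy moment values; a rotation by alpha multiplies A_nm by exp(-1j*m*alpha)
alpha = 0.35
a_32 = 0.8 * cmath.exp(1j * 1.1)            # hypothetical A_32 (m = 2)
a_21 = 0.5 * cmath.exp(1j * 0.4)            # hypothetical A_21 (m = 1), reference
a_32r = a_32 * cmath.exp(-1j * 2 * alpha)   # the same moments after rotation
a_21r = a_21 * cmath.exp(-1j * 1 * alpha)

print(invariant_phase(a_32, 2, a_21))       # ~0.3
print(invariant_phase(a_32r, 2, a_21r))     # same value: rotation invariant
```

With this normalization the phase coefficients can be used as features alongside the magnitudes, rather than being discarded.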
5. Conclusions

In this paper, a novel super resolution algorithm based on Pseudo Zernike Moments is proposed. In the presented method, the magnitudes of PZMs, which are more resistant to noise and have higher description capability than Zernike Moments, are extracted as a rotation-invariant descriptor for representing the pixels. Also,
considering that the phase coefficients of PZMs, which are not inherently rotation invariant, carry a substantial amount of information for image reconstruction, a novel modified phase-based PZM descriptor is introduced that is rotation invariant and is utilized for feature extraction. The proposed algorithm has been evaluated against four benchmark approaches: the Pixel Replication method; the Lanczos interpolation method [48], a widely used baseline; and the Nonlocal-Means-based (NLM-based) [44] and Zernike-Moment-based (ZM-based) [45] approaches, two previously successful methods. The comparison covers both objective (PSNR and SSIM) and subjective evaluation, and it is a very encouraging finding that the proposed algorithm performs better than all the compared benchmark approaches.

Table 5
The mean PSNR and mean SSIM over the two test sequences of 30 frames for the proposed method and other benchmark approaches (PSNR/SSIM).

Sequence     | Pixel Replication | Lanczos interpolation [48] | NLM-based SR [44] | ZM-based SR [45] | Proposed (magnitude) | Proposed (modified phase)
Miss America | 31.5060/0.8648    | 33.9385/0.9145             | 33.5660/0.9116    | 33.9820/0.9148   | 35.3417/0.9287       | 34.2770/0.9185
Mobile       | 19.5195/0.6013    | 19.8080/0.6098             | 20.1732/0.6980    | 21.2960/0.7276   | 22.3501/0.7760       | 21.6676/0.7497

References

[1] J.D. Van Ouwerkerk, Image super-resolution survey, Image Vis. Comput. 24 (10) (2006) 1039–1052.
[2] R.Y. Tsai, Thomas S. Huang, Multiframe image restoration and registration, Adv. Comput. Vis. Image Process. 1 (2) (1984) 317–339.
[3] S.P. Kim, Nirmal K. Bose, H.M. Valenzuela, Recursive reconstruction of high resolution image from noisy undersampled multiframes, IEEE Trans. Acoust. Speech Signal Process. 38 (6) (1990) 1013–1027.
[4] Seung P. Kim, W.-Y. Su, Recursive high-resolution reconstruction of blurred multiframe images, IEEE Trans. Image Process. 2 (4) (1993) 534–539.
[5] Seunghyeon Rhee, Moon Gi Kang, Discrete cosine transform based regularized high-resolution image reconstruction algorithm, Opt. Eng. 38 (8) (1999) 1348–1356.
[6] Yinji Piao, Il-hong Shin, HyunWook Park, Image resolution enhancement using inter-subband correlation in wavelet domain, in: Proceedings of the IEEE International Conference on Image Processing (ICIP 2007), vol. 1, IEEE, 2007.
[7] Hasan Demirel, Gholamreza Anbarjafari, Image resolution enhancement by using discrete and stationary wavelet decomposition, IEEE Trans. Image Process. 20 (5) (2011) 1458–1460.
[8] Sara Izadpanahi, Hasan Demirel, Motion based video super resolution using edge directed interpolation and complex wavelet transform, Signal Process. 93 (7) (2013) 2076–2086.
[9] Xin Li, Michael T. Orchard, New edge-directed interpolation, IEEE Trans. Image Process. 10 (10) (2001) 1521–1527.
[10] D. Zhang, Xiaolin Wu, An edge-guided image interpolation algorithm via directional filtering and data fusion, IEEE Trans. Image Process. 15 (8) (2006) 2226–2238.
[11] Min Li, Truong Q. Nguyen, Markov random field model-based edge-directed image interpolation, IEEE Trans. Image Process. 17 (7) (2008) 1121–1128.
[12] Moshe Ben-Ezra, Zhouchen Lin, Bennett Wilburn, Penrose pixels super-resolution in the detector layout domain, in: Proceedings of the IEEE 11th International Conference on Computer Vision (ICCV 2007), IEEE, 2007.
[13] Zhouchen Lin, Heung-Yeung Shum, Fundamental limits of reconstruction-based super-resolution algorithms under local translation, IEEE Trans. Pattern Anal. Mach. Intell. 26 (1) (2004) 83–97.
[14] Michal Irani, Shmuel Peleg, Motion analysis for image enhancement: resolution, occlusion, and transparency, J. Vis. Commun. Image Represent. 4 (4) (1993) 324–335.
[15] Jian Sun, Zongben Xu, Heung-Yeung Shum, Image super-resolution using gradient profile prior, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), IEEE, 2008.
[16] Yu-Wing Tai, et al., Super resolution using edge prior and single image detail synthesis, in: Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2010.
[17] Antonio Marquina, Stanley J. Osher, Image super-resolution by TV-regularization and Bregman iteration, J. Sci. Comput. 37 (3) (2008) 367–382.
[18] Kaibing Zhang, et al., Single image super-resolution with non-local means and steering kernel regression, IEEE Trans. Image Process. 21 (11) (2012) 4544–4556.
[19] William T. Freeman, Thouis R. Jones, Egon C. Pasztor, Example-based super-resolution, IEEE Comput. Graph. Appl. 22 (2) (2002) 56–65.
[20] Jian Sun, et al., Image hallucination with primal sketch priors, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, IEEE, 2003.
[21] Hong Chang, Dit-Yan Yeung, Yimin Xiong, Super-resolution through neighbor embedding, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), vol. 1, IEEE, 2004.
[22] Wei Fan, Dit-Yan Yeung, Image hallucination using neighbor embedding over visual primitive manifolds, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR'07), IEEE, 2007.
[23] Kaibing Zhang, et al., Partially supervised neighbor embedding for example-based image super-resolution, IEEE J. Sel. Top. Signal Process. 5 (2) (2011) 230–239.
[24] Xinbo Gao, et al., Joint learning for single-image super-resolution via a coupled constraint, IEEE Trans. Image Process. 21 (2) (2012) 469–480.
[25] Jianchao Yang, et al., Image super-resolution as sparse representation of raw image patches, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), IEEE, 2008.
[26] Xinbo Gao, et al., Image super-resolution with sparse neighbor embedding, IEEE Trans. Image Process. 21 (7) (2012) 3194–3205.
[27] Weirong Liu, Shutao Li, Sparse representation with morphologic regularizations for single image super-resolution, Signal Process. 98 (2014) 410–422.
[28] Jianchao Yang, et al., Image super-resolution via sparse representation, IEEE Trans. Image Process. 19 (11) (2010) 2861–2873.
[29] Roman Zeyde, Michael Elad, Matan Protter, On single image scale-up using sparse-representations, in: Curves and Surfaces, Springer, Berlin, Heidelberg, 2012, pp. 711–730.
[30] Weisheng Dong, et al., Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization, IEEE Trans. Image Process. 20 (7) (2011) 1838–1857.
[31] Weisheng Dong, D. Zhang, Guangming Shi, Centralized sparse representation for image restoration, in: Proceedings of the 2011 IEEE International Conference on Computer Vision (ICCV), IEEE, 2011.
[32] Kwang In Kim, Younghee Kwon, Single-image super-resolution using sparse regression and natural image prior, IEEE Trans. Pattern Anal. Mach. Intell. 32 (6) (2010) 1127–1133.
[33] Yi Tang, et al., Single-image super-resolution via sparse coding regression, in: Proceedings of the 2011 Sixth International Conference on Image and Graphics (ICIG), IEEE, 2011.
[34] Kaibing Zhang, et al., Single image super-resolution with multiscale similarity learning, IEEE Trans. Neural Netw. Learn. Syst. 24 (10) (2013) 1648–1659.
[35] Sung Cheol Park, Min Kyu Park, Moon Gi Kang, Super-resolution image reconstruction: a technical overview, IEEE Signal Process. Mag. 20 (3) (2003) 21–36.
[36] M. Bertero, P. Boccacci, Introduction to Inverse Problems in Imaging, IOP Publishing, Bristol, UK, 1998.
[37] Huanfeng Shen, et al., A MAP approach for joint motion estimation, segmentation, and super resolution, IEEE Trans. Image Process. 16 (2) (2007) 479–490.
[38] Sina Farsiu, et al., Fast and robust multiframe super resolution, IEEE Trans. Image Process. 13 (10) (2004) 1327–1344.
[39] Shen Lijun, Xiao ZhiYun, Han Hua, Image super-resolution based on MCA and wavelet-domain HMT, in: Proceedings of the 2010 International Forum on Information Technology and Applications (IFITA), vol. 2, IEEE, 2010.
[40] Stuart Geman, Donald Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Mach. Intell. 6 (1984) 721–741.
[41] R.O. Lane, Non-parametric Bayesian super-resolution, IET Radar Sonar Navig. 4 (4) (2010) 639–648.
[42] Lyndsey Pickup, et al., Bayesian image super-resolution, continued, in: Advances in Neural Information Processing Systems, 2006.
[43] Antoni Buades, Bartomeu Coll, Jean-Michel Morel, Denoising image sequences does not require motion estimation, in: Proceedings of the IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS 2005), IEEE, 2005.
[44] Matan Protter, et al., Generalizing the nonlocal-means to super-resolution reconstruction, IEEE Trans. Image Process. 18 (1) (2009) 36–51.
[45] Xinbo Gao, et al., Zernike-moment-based image super resolution, IEEE Trans. Image Process. 20 (10) (2011) 2738–2747.
[46] C.-H. Teh, Roland T. Chin, On image analysis by the methods of moments, IEEE Trans. Pattern Anal. Mach. Intell. 10 (4) (1988) 496–513.
[47] Hamidreza Rashidy Kanan, Karim Faez, Yongsheng Gao, Face recognition using adaptively weighted patch PZM array from a single exemplar image per person, Pattern Recognit. 41 (12) (2008) 3799–3812.
[48] G. Wolberg, Digital Image Warping, IEEE Computer Society Press, Washington, DC, 1990.
[49] Jan Flusser, Barbara Zitova, Tomas Suk, Moments and Moment Invariants in Pattern Recognition, John Wiley & Sons, 2009.
[50] A.B. Bhatia, E. Wolf, Proc. Camb. Philos. Soc. 50 (1954) 40–48.
[51] Chong-Yaw Wee, Raveendran Paramesran, On the computational aspects of Zernike moments, Image Vis. Comput. 25 (6) (2007) 967–980.
[52] Shan Li, Moon-Chuen Lee, Chi-Man Pun, Complex Zernike moments features for shape-based image retrieval, IEEE Trans. Syst. Man Cybern. Part A: Syst. Hum. 39 (1) (2009) 227–237.
[53] Xinbo Gao, et al., Image quality assessment based on multiscale geometric analysis, IEEE Trans. Image Process. 18 (7) (2009) 1409–1423.