Super-Fusion: A Super-Resolution Method Based on Fusion

W. Zhao, H. Sawhney, M. Hansen, and S. Samarasekera
Sarnoff Corporation, Princeton, NJ 08540

Abstract

Reconstruction-based super-resolution algorithms require very accurate alignment and a good choice of filters to be effective. Often these requirements are hard to satisfy, for example, when we adopt optical flow as the motion model. In addition, the condition of having enough sub-samples may vary from pixel to pixel. In this paper, we propose an alternative super-resolution method based on image fusion (called super-fusion hereafter). Image fusion has been proven to be effective in many applications. Extending image fusion to super-resolve images, we show that super-fusion is a faster alternative that imposes fewer requirements and is more stable than traditional super-resolution methods.

1 Introduction

Enhancement of image resolution, called super-resolution, by processing multiple video images has been studied by many researchers over the past decade. Work on super-resolution can be divided into two main categories: reconstruction-based methods [5, 6] and learning-based methods [7, 8]. The theoretical foundations for reconstruction methods are (non-)uniform sampling theorems, while learning-based methods employ generative models that are learned from samples. The goal of the former is to reconstruct the original (super-sampled) signal, while that of the latter is to create/hallucinate the signal based on learned generative models. In contrast with reconstruction methods, learning-based super-resolution methods assume that corresponding low-resolution and high-resolution training image pairs are available.

The majority of super-resolution algorithms belong to the signal reconstruction paradigm, which formulates the problem as signal reconstruction from multiple samples. Among this category are frequency-based methods, Bayesian methods, BP (back-projection) methods, POCS (projection onto convex sets) methods, and hybrid methods. However, there exist at least three issues in such model-based super-resolution methods. First of all, ensuring the accuracy of sample locations from multiple images demands adequate alignment between images that may be related through arbitrarily complex motion models, e.g., optical flow. Second, choosing correct blurring filters, which may vary from pixel to pixel, is not guaranteed; yet choosing the right filter is critical to the quality of the reconstructed image, as demonstrated in [9]. Finally, the condition of having enough samples is necessary for super-resolution [8], but how to determine whether or not there are enough samples at each pixel is still a difficult issue.

Pyramid fusion [1] has proven to be very effective and robust in many applications: fusing images from different sensors, or from the same sensor but with different exposures, and extending depth-of-field [1] or dynamic range [2]. Efficient noise reduction has also been demonstrated based on pyramids/wavelets [3]. More recently, it has been successfully applied to remove the scintillation distortion introduced by an unstable/non-uniform atmosphere due to factors such as heat [4]. In this paper we further extend pyramid fusion to super-fusion, which provides a stable and much faster alternative to traditional super-resolution methods. For details on pyramid fusion, please refer to [1].

This paper is organized as follows: After describing super-fusion in Section 2, Section 3 presents experimental results based on real video clips. Conclusions are drawn in Section 4.

2 Fusion-based super-resolution

We first motivate the fusion-based super-resolution approach and then present the details of super-fusion.

2.1 Motivation

As mentioned earlier, pyramid- or wavelet-based image fusion has been applied successfully to many applications. Several factors contribute to such wide applicability. We list them in the following and explain why they are important in the case of super-resolution.

The first advantage of pyramid fusion comes from the fact that pyramid/wavelet representations are localized in spatial frequency as well as in space. This helps to reconstruct images with decent global structure and fine local details. This is an important feature for creating a better-quality image, e.g., the super-resolved image, compared with fusing images at the highest resolution only.

1051-4651/02 $17.00 (c) 2002 IEEE

The second advantage is the simple choice of salience patterns. In theory, the best salient features should be selected depending upon a given task. In practice, a Laplacian pattern can serve as a general quality measure for various tasks. In such a case, the energy of the local Laplacian pattern is selected as the salience measure, which has proven successful in many tasks [1, 2, 4]. Employing a Laplacian or similar pyramid, fusion can be used to enhance/create local structures, the details, from multiple images. This is important for deblurring and critical for recovering perceptual details for super-resolution.

Finally, the third advantage of pyramid fusion, as pointed out in [1], is its remarkable insensitivity to the choice of parameters such as the size of the pixel neighborhood. This implies that it is a robust method that can be used in many applications, including the super-resolution presented in this paper.

Being able to generate super-resolved images in a single processing step, super-fusion is fast and does not suffer the null-space problem that reconstruction-based methods do when there are not enough samples [8]. In summary, we argue that super-fusion can be a stable and much faster alternative to traditional super-resolution methods.

However, certain drawbacks also exist in super-fusion. First, there is no guarantee that perfect reconstruction can be achieved even when the alignment and choice of filters are good. Second, some artifacts may appear in the super-resolved image due to the generative nature of fusion.
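To make the Laplacian-energy salience rule concrete, the following is a minimal sketch in NumPy. The box-blur high-frequency band, the function names, and the single-band simplification are our own illustration and not from the paper; a real implementation would use full Laplacian pyramids as in [1].

```python
import numpy as np

def laplacian_band(img):
    """High-frequency band: image minus a 3x3 box blur (a crude stand-in
    for one Laplacian-pyramid level)."""
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    blur = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return img - blur

def fuse_by_salience(images):
    """Per pixel, keep the Laplacian coefficient whose local energy
    (salience) is largest across the input images."""
    bands = np.stack([laplacian_band(im) for im in images])    # (N, H, W)
    lows = np.stack([im - b for im, b in zip(images, bands)])  # smooth residuals
    salience = bands ** 2                                      # energy of the Laplacian pattern
    pick = np.argmax(salience, axis=0)                         # most salient source per pixel
    fused_band = np.take_along_axis(bands, pick[None], axis=0)[0]
    return lows.mean(axis=0) + fused_band                      # average smooth part, keep salient detail
```

With two inputs where only one contains a sharp feature, the selection rule preserves that feature instead of averaging it away.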

2.2 Super-Fusion

A straightforward implementation of super-fusion involves three steps: 1) computation of a global parametric model plus local flow from each frame to the reference frame that is to be super-resolved, 2) warping all other frames to the reference frame and zooming up via interpolation, 3) fusing the interpolated reference frame and the warp-interpolated frames.

However, two issues exist in this implementation. First, step 2 involves two interpolation procedures: one for warping at low resolution and one for zooming up. To obtain high-quality (sharper) images, we would prefer fewer interpolation procedures. Second, step 3 involves fusion based on pyramids constructed from interpolated images. Let us assume that the high-resolution image to be constructed is at level 0 (the bottom level), and that we have low-resolution images available at level L. Now if we construct the pyramids from level-0 […] pyramids of the super-fused image and the true high-resolution image. The super-fusion scheme is illustrated in Fig. 1.

[Figure 1: The super-fusion scheme (input video, motion estimation, fusion).]
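The three-step pipeline above can be sketched as follows. This is a toy rendering under stated assumptions: the flow fields are assumed given, warping is nearest-neighbor, zooming is pixel replication rather than interpolation, and the fusion rule is a pixelwise median used purely as a short stand-in for Laplacian-salience selection. None of the function names come from the paper.

```python
import numpy as np

def warp_nearest(frame, flow):
    """Warp `frame` toward the reference: pull each pixel along a dense flow
    field (flow[..., 0] = x displacement, flow[..., 1] = y displacement)."""
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    return frame[src_y, src_x]

def zoom_up(frame, factor):
    """Magnify by pixel replication; the paper's pipeline would interpolate."""
    return np.kron(frame, np.ones((factor, factor)))

def super_fuse(reference, frames, flows, factor):
    """Steps 1-3: warp every frame to the reference, zoom everything up,
    and fuse pixelwise (median stand-in for the salience rule)."""
    zoomed = [zoom_up(reference, factor)]
    zoomed += [zoom_up(warp_nearest(f, fl), factor) for f, fl in zip(frames, flows)]
    return np.median(np.stack(zoomed), axis=0)
```

With zero flow and identical frames, the output reduces to the zoomed-up reference, which is a useful sanity check for the warp and zoom stages.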


2.2.2 Rejecting Warping Outliers

In order to cover all aspects of motion error, a superresolution algorithm needs to account for warping errors. For example, in the case of large object occlusions or scene changes in the video, the alignment is inherently wrong and super-resolution is not possible. In such cases, we need to detect prominent errors in flow based on warping. We compute the cross-correlation between the reference frame and the warped frame, and if the correlation score at a point is below a certain threshold, the corresponding warped pixels are ignored in the super-fusion process.
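The correlation test can be sketched as below. The window radius, the threshold value, and the function names are our own assumptions (the paper does not specify them); the local statistics are computed with plain NumPy box filters.

```python
import numpy as np

def box(img, r):
    """Local mean over a (2r+1)^2 window via padded shifts (no SciPy needed)."""
    h, w = img.shape
    k = 2 * r + 1
    p = np.pad(img, r, mode="edge")
    return sum(p[i:i + h, j:j + w] for i in range(k) for j in range(k)) / k**2

def valid_mask(ref, warped, r=2, thresh=0.7, eps=1e-8):
    """Local normalized cross-correlation between the reference frame and a
    warped frame; pixels whose score falls below `thresh` are flagged as
    warping outliers and excluded from fusion."""
    mu_r, mu_w = box(ref, r), box(warped, r)
    cov = box(ref * warped, r) - mu_r * mu_w
    var_r = np.clip(box(ref * ref, r) - mu_r**2, 0, None)
    var_w = np.clip(box(warped * warped, r) - mu_w**2, 0, None)
    ncc = cov / np.sqrt(var_r * var_w + eps)
    return ncc >= thresh
```

A perfectly warped frame scores near 1 almost everywhere, while a grossly misaligned (here, inverted) frame is rejected wholesale.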

3 Experiments

We have applied our super-fusion algorithm to real video clips ranging from home video (VHS format) and video captured with DV (digital video) camcorders to law-enforcement video. In most of these experiments, flow is computed using the hierarchical model-based motion estimation method [11].

3.1 Super-Resolution

In this experiment, we ran our algorithms with a zoom factor of 5 on a test pattern of size 54 × 56. The results are plotted in Fig. 2. For comparison, we also plot the results based on bi-cubic interpolation and the BP method [6]. The BP result is obtained using the same number of frames as super-fusion, but is based on a new optical flow algorithm that improves the super-resolution significantly [10]. However, two factors contribute to the unsatisfactory results of traditional super-resolution: 1) large flow error at the super-resolved resolution, 2) an insufficient number of sub-samples. These images are provided through the TSWG program.¹

To test our algorithm on a video sequence with an arbitrary scene, we ran our algorithm on a beach sequence with a zoom factor of 3. The original image (one video field) size is 720 × 240. A snapshot of the super-fused result is plotted in Fig. 3 together with the bi-cubic interpolated result.

3.2 Deblurring and Denoising

As a special case of super-resolution, efficient deblurring and denoising is a critical feature of video enhancement. We have applied our super-fusion algorithm to an unstable, blurry and noisy football video sequence with image size 720 × 480. We first stabilize the video sequence before we deblur it. In Fig. 4, we show a snapshot of the original and deblurred frames. It is well known that the popular fusion criterion based on maximum energy is sensitive to image noise. To address this issue, we choose the trimmed mean as the fusion criterion and compare it with the maximum criterion.

¹The Technical Support Working Group (TSWG) is the U.S. national forum that identifies, prioritizes, and coordinates interagency and international research and development (R&D) requirements for combating terrorism.
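The two criteria can be contrasted in a few lines. The `trim` amount, the array layout (frames stacked on axis 0), and the function names are our own illustration, not the paper's implementation.

```python
import numpy as np

def fuse_max(coeffs):
    """Maximum-energy criterion: per pixel, keep the coefficient with the
    largest magnitude across frames. Sensitive to noise spikes."""
    idx = np.argmax(np.abs(coeffs), axis=0)
    return np.take_along_axis(coeffs, idx[None], axis=0)[0]

def fuse_trimmed_mean(coeffs, trim=1):
    """Trimmed-mean criterion: per pixel, sort coefficients across frames,
    drop the `trim` smallest and largest, and average the rest."""
    s = np.sort(coeffs, axis=0)
    return s[trim:coeffs.shape[0] - trim].mean(axis=0)
```

A single noise spike wins under the maximum criterion but is discarded by the trimmed mean, which is why the latter behaves better on noisy video.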

4 Conclusions

We have presented super-fusion as a stable and fast alternative to traditional super-resolution algorithms. The efficacy of the proposed algorithm has been demonstrated on many real video clips. In the current implementation of super-fusion, we hallucinate the pyramids at high resolutions by simple interpolation. A better way might be to track the deep structure of the scale space [12].

References

[1] P. Burt and R. Kolczynski, "Enhanced Image Capture Through Fusion," In Proc. Int. Conf. Comp. Vision, pp. 173-182, 1993.
[2] L. Bogoni and M. Hansen, "Pattern-Selective Color Image Fusion," Pattern Recognition, 2000.
[3] E. Simoncelli and E. Adelson, "Noise Removal via Bayesian Wavelet Coring," In Proc. Int. Conf. Image Proc., 1996.
[4] W. Zhao, L. Bogoni, and M. Hansen, "Video Enhancement by Scintillation Removal," In Proc. Int. Conf. Multimedia and Expo, 2001.
[5] M. Elad and A. Feuer, "Restoration of a single superresolution image from several blurred, noisy and undersampled measured images," IEEE Trans. on Image Processing, pp. 1646-1658, 1997.
[6] M. Irani and S. Peleg, "Motion Analysis for Image Enhancement: Resolution, Occlusion, and Transparency," Journal of Visual Comm. and Image Repre., Vol. 4, pp. 324-335, 1993.
[7] W. Freeman and E. Pasztor, "Learning low-level vision," In Proc. Int. Conf. Comp. Vision, 1999.
[8] S. Baker and T. Kanade, "Limits on Super-Resolution and How to Break Them," In Proc. Conf. Comp. Vision and Patt. Recog., 2000.
[9] D. Capel and A. Zisserman, "Super-resolution enhancement of text image sequences," In Proc. Int. Conf. Pattern Recognition, 2000.
[10] W. Zhao and H. Sawhney, "Is Super-Resolution with Optical Flow Feasible?," In Proc. Euro. Conf. Computer Vision, 2002.
[11] J. Bergen, P. Anandan, K. Hanna, and R. Hingorani, "Hierarchical Model-Based Motion Estimation," In Proc. European Conf. Comp. Vision, pp. 237-252, 1992.
[12] J. Koenderink, "The Structure of Images," Biol. Cybern., Vol. 50, pp. 363-370, 1984.


[Figure 2: Super-resolution results based on different methods — panels: Original, Bi-cubic interpolated, BP with better flow, Super-fused.]

[Figure 3: Super-resolution results based on bi-cubic interpolation and super-fusion — panels: Original, Super-fused.]

[Figure 4: Results using different fusion criteria — panels: Maximum criterion, Trimmed mean criterion. Stabilization introduces a border effect and we leave the border black.]
