Wavelet based seam carving for content-aware image resizing

Jong-Woo Han, Kang-Sun Choi, Tae-Shick Wang, Sung-Hyun Cheon, and Sung-Jea Ko
School of Electrical Engineering, Korea University, Seoul, Korea

ABSTRACT

In this paper, a novel content-aware image resizing method based on wavelet analysis is proposed. We estimate the local energy map of an image by weighting its multiscale subbands appropriately. Based on the energy map, the image is resized by repeatedly carving out or inserting a connected path of pixels that is least significant in terms of energy. Since wavelet analysis is similar to the way the human visual system operates, the obtained energy map reflects human perception with fidelity, and thus the semantic information in the image can be preserved faithfully in the resizing process. The experimental results show that the proposed method produces images of higher subjective quality than scaling and conventional content-aware image resizing techniques.

Index Terms: Content-aware image resizing; seam carving; wavelet decomposition

1. INTRODUCTION

Since multimedia appliances such as TVs, portable multimedia players, and mobile phones adopt display panels of different resolutions, images frequently need to be resized to fit each display. In general, cropping and scaling are used to resize images, but these techniques employ only simple geometric manipulation and therefore cannot provide satisfactory image quality: important semantic information in the image is lost during resizing, especially when the scaling factor is very small.

Avidan and Shamir have introduced an image operator called seam carving [1]. A seam is defined as an 8-connected path of pixels following minimum energy values in the image. Among the seams, the optimal seam is the least perceivable one, where optimality is defined by an energy function based on the image gradient.
Then, by repeatedly inserting or carving out the optimal seam, the image can be resized while preserving the important contents. However, with an energy function based on the image gradient, conventional seam carving can produce a distorted image when the intensity variation of the background is larger than that of the main objects of interest. To solve this problem, Hwang and Chien applied a human attention model to the energy function by using saliency and face detectors [2]. Both detectors estimate the region of interest (ROI), where large energy values are imposed, so that important contents such as people's faces are preserved after the image is resized. However, this method has the restriction that the input image should contain faces.

In this paper, we propose a content-aware image resizing method with a modified energy function exploiting wavelet decomposition. In the modified energy function, the local energy of the image is estimated by weighting wavelet subband components. Since wavelet analysis, which decomposes the image into multiscale subbands and processes the individual subbands, is similar to the way the human visual system (HVS) operates, the obtained energy map reflects human perception with fidelity, and thus the semantic information in the image can be preserved faithfully during the resizing process.

The rest of this paper is organized as follows. Section 2 briefly reviews conventional seam carving and wavelet analysis and then introduces the proposed method. The experimental results are discussed in Section 3. Finally, the conclusion is given in Section 4.

This research was supported by the Seoul Future Contents Convergence (SFCC) Cluster established by the Seoul R&BD Program, and by a Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST) (No. 2009-0080547).

978-1-4244-5654-3/09/$26.00 ©2009 IEEE

2. PROPOSED METHOD

2.1. Conventional seam carving

Conventional seam carving employs the following simple energy function to split the image into a set of seams. Using the energy function, the optimal seam, regarded as the least perceivable feature in the image, is selected. Let I be an N × M image. Then, the energy of the (m, n)th pixel is obtained by

$$ e(I(m,n)) = \left| \frac{\partial}{\partial x} I(m,n) \right| + \left| \frac{\partial}{\partial y} I(m,n) \right|. \tag{1} $$
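As a rough illustration of Eq. (1) and of the 8-connected seam described in the introduction, the following Python sketch computes a gradient energy map and extracts one minimum-energy vertical seam. The forward-difference approximation of the derivatives, the replicated border, the dynamic-programming search (the approach used in [1]; this paper does not spell out the search), and all function names are assumptions made for illustration. The image is a plain list of rows of gray levels.

```python
def gradient_energy(img):
    """Eq. (1): |dI/dx| + |dI/dy|, using forward differences (an assumption)."""
    rows, cols = len(img), len(img[0])
    energy = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Replicate the border so the last row/column has zero difference.
            dx = img[r][min(c + 1, cols - 1)] - img[r][c]
            dy = img[min(r + 1, rows - 1)][c] - img[r][c]
            energy[r][c] = abs(dx) + abs(dy)
    return energy


def min_vertical_seam(energy):
    """Return one column index per row for a minimum-energy 8-connected seam."""
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = cheapest cumulative energy of any seam ending at (r, c).
    cost = [row[:] for row in energy]
    for r in range(1, rows):
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            cost[r][c] += min(cost[r - 1][lo:hi + 1])
    # Backtrack from the cheapest pixel in the last row.
    c = min(range(cols), key=lambda j: cost[rows - 1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Carve the seam out: drop one pixel per row, shrinking the width by 1."""
    return [row[:c] + row[c + 1:] for row, c in zip(img, seam)]
```

Calling `min_vertical_seam` and `remove_vertical_seam` repeatedly, recomputing the energy after each removal, reduces the image width one column at a time; applying the same functions to the transposed image handles horizontal seams.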

A vertical seam is given by

$$ S^{X} = \{ s_i^{x} \}_{i=1}^{N} = \{ (x(i), i) \}_{i=1}^{N}, \quad \text{s.t. } \forall i,\ |x(i) - x(i-1)| \le 1, \tag{2} $$

where x is a mapping x: [1, …, N] → [1, …, M], as in [1]. Then, the optimal seam in the vertical direction is selected from among the seams as

$$ S^{*} = \arg\min_{S} \sum_{i=1}^{N} e(I(s_i^{x})). \tag{3} $$

Basically, conventional seam carving preserves the important contents better than scaling, as shown in Fig. 1. However, as Avidan and Shamir described in [1], conventional seam carving has an inevitable problem: a seam can cut across the important contents, depending on the layout and amount of content [1]-[2]. In this case, the shape of the important contents is distorted by conventional seam carving. Furthermore, since it utilizes only the gradient of the image, textured regions such as small twigs of trees or ripples on water tend to be preserved as important contents even though they are not in the region of interest, as shown in Fig. 2. Therefore, the energy function is required to respond to image features in the way that the HVS does.

Fig. 1. Example of image resizing using the test image Tunnel. (a) Original image. (b) Scaling. (c) Seam carving.

Fig. 2. Example of image resizing using the test image Boat. (a) Original image. (b) Scaling. (c) Seam carving.

ICIP 2009

2.2. Wavelet analysis for image classification

In the computer vision field, researchers have demonstrated that multiscale transforms, such as the wavelet transform, are essential for extracting and analyzing the information content of images [3]. In addition, psychophysical and physiological experiments have shown that multiscale transforms bear a high degree of similarity to receptive field profiles recorded from the mammalian retina and the first stages of the visual cortex [4]-[5]. The wavelet transform provides a successful multiscale analysis that decomposes an image into different frequency subbands, similar to the way the HVS operates [6]. Moreover, wavelet transforms achieve much higher energy compaction than other transforms, so that wavelet decomposition represents the important features effectively [7]. These properties motivate us to adopt the wavelet transform for content-aware image resizing.

The discrete wavelet transform (DWT) represents an image in terms of one approximation image and three detail images. The wavelet coefficients can be calculated efficiently by using a filterbank consisting of low-pass and high-pass filters [6]. Let h and g be the low-pass and high-pass filters, respectively. Then, the 2-D DWT of an N × M image I can be obtained as follows:

$$ I_{LL}^{P+1}(m,n) = \sum_{i} \sum_{j} h(i)\, h(j)\, I_{LL}^{P}(m-i,\, n-j), \tag{4} $$

$$ I_{LH}^{P+1}(m,n) = \sum_{i} \sum_{j} h(i)\, g(j)\, I_{LL}^{P}(m-i,\, n-j), \tag{5} $$

$$ I_{HL}^{P+1}(m,n) = \sum_{i} \sum_{j} g(i)\, h(j)\, I_{LL}^{P}(m-i,\, n-j), \tag{6} $$

$$ I_{HH}^{P+1}(m,n) = \sum_{i} \sum_{j} g(i)\, g(j)\, I_{LL}^{P}(m-i,\, n-j), \tag{7} $$

where P denotes the level of wavelet decomposition.

2.3. Proposed energy function based on wavelet analysis

346

The overall outline of objects is captured well by multiscale transforms, which appear to operate in the visual cortex of mammals [4]. Note that the prominent edges of large objects usually appear at the coarse scales, while the coefficients located in image details, including textures, are large at the fine scales. Considering these multiscale characteristics, the proposed energy function is defined as

$$ e(I(m,n)) = \sum_{P=1}^{L} \omega_P \left( \left| I_{LH}^{P}(m,n) \right| + \left| I_{HL}^{P}(m,n) \right| + \left| I_{HH}^{P}(m,n) \right| \right), \tag{8} $$

where $L$ and $\omega_P$ denote the maximum level of wavelet decomposition and the weighting factor for the Pth level, respectively. To further enhance the overall outline, individual subbands are weighted differently depending on their scale levels. Specifically, the weighting factor for a higher (coarser) level is larger than that for a lower level, to emphasize the outlines of the main objects in the image. Let $E_P$ denote the total energy contained in the subbands of the Pth level, which is given by

$$ E_P = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \left( \left| I_{LH}^{P}(m,n) \right| + \left| I_{HL}^{P}(m,n) \right| + \left| I_{HH}^{P}(m,n) \right| \right). \tag{9} $$

In the proposed method, we fix the ratio of the weighted energy of the Pth level, $\omega_P E_P$, to the total energy $\sum_{k=1}^{L} E_k$; this ratio is denoted by $\alpha_P$. Then, $\omega_P$ is determined as follows:

$$ \omega_P = \frac{\alpha_P \sum_{k=1}^{L} E_k}{E_P}. \tag{10} $$

TABLE I. WEIGHTING FACTORS FOR TEST IMAGES

Image      $\omega_1$   $\omega_2$   $\omega_3$
Ice Age    0.32         1.15         4.57
Beach      0.32         1.13         4.34
Skier      0.32         1.13         4.24
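Eqs. (8)-(10) can be sketched in Python as follows. This is only an illustration under stated assumptions: the detail subbands of every level are taken as already computed and stored at the same size as the image (for example, from an undecimated transform or after upsampling, neither of which the paper specifies), every level is assumed to have nonzero energy, and the function names are hypothetical. Each level is a tuple `(LH, HL, HH)` of lists of rows.

```python
def level_energy(lh, hl, hh):
    """Eq. (9): sum of absolute detail coefficients over one level."""
    return sum(
        abs(a) + abs(b) + abs(c)
        for row_lh, row_hl, row_hh in zip(lh, hl, hh)
        for a, b, c in zip(row_lh, row_hl, row_hh)
    )


def level_weights(energies, alphas):
    """Eq. (10): omega_P = alpha_P * (sum of all E_k) / E_P.

    Assumes every E_P is nonzero; otherwise the division fails.
    """
    total = sum(energies)
    return [alpha * total / e for alpha, e in zip(alphas, energies)]


def wavelet_energy_map(subbands, weights):
    """Eq. (8): per-pixel weighted sum of |LH| + |HL| + |HH| over all levels.

    Assumes all subbands share the image size (see the lead-in).
    """
    rows, cols = len(subbands[0][0]), len(subbands[0][0][0])
    energy = [[0.0] * cols for _ in range(rows)]
    for (lh, hl, hh), w in zip(subbands, weights):
        for r in range(rows):
            for c in range(cols):
                energy[r][c] += w * (abs(lh[r][c]) + abs(hl[r][c]) + abs(hh[r][c]))
    return energy


# Toy 1x2 "image" with three levels; the numbers are made up for illustration.
subbands = [
    ([[3, -1]], [[1, 0]], [[0, 0]]),   # level 1 (finest)
    ([[1, 0]], [[0, -1]], [[0, 0]]),   # level 2
    ([[0, 1]], [[0, 0]], [[0, 0]]),    # level 3 (coarsest)
]
energies = [level_energy(*bands) for bands in subbands]
weights = level_weights(energies, [0.2, 0.3, 0.5])
energy_map = wavelet_energy_map(subbands, weights)
```

Because the ratios sum to 1, the weighted energy of level P equals its prescribed share of the total, so the coarse levels, which usually carry less raw energy, receive the larger weights.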

3. EXPERIMENTAL RESULTS

To evaluate the performance of the proposed method, three images were tested: Ice Age (1280×1024), Beach (768×512), and Skier (1280×800). The decomposition level $L$ was set to 3 for all images, and the predefined ratios $\alpha_1$, $\alpha_2$, and $\alpha_3$ were set to 0.2, 0.3, and 0.5, respectively. Given the ratios, the corresponding weighting factors are obtained as shown in Table I.

Fig. 3 illustrates the performance of the proposed method on a synthesized image from the animation “Ice Age”. Since the face of the squirrel, which appears to be the ROI, is located toward the left side of the image, scaling cannot produce a high-quality image that preserves the contents (Fig. 3(b)). Conventional seam carving carves out the sky on the right side to preserve the main figure. However, the tail is overprotected, while the face is severely distorted (Fig. 3(c)). With the proposed method, the overall outline of the squirrel, including the face, is preserved with fidelity (Fig. 3(d)).

In Fig. 4, the energy distributions obtained by the conventional and proposed methods are compared. To clarify the difference, the energy profile generated by accumulating the energy values of the energy distribution in the vertical direction is also provided. Note that this is not an exact representation of the seam energy. In the conventional method, large energy is mainly distributed in the region of the squirrel’s tail, as shown in Fig. 4(a). Similarly, the energy in the finest subband of the wavelet decomposition is also concentrated on the tail of the squirrel, as shown in Fig. 4(b). As the wavelet level increases, however, the energy in the textured region is suppressed, whereas the energy in the structural regions becomes prominent. This is why the proposed method can protect the overall outline of the image with high fidelity.
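The vertically accumulated energy profile mentioned above, and the reason it is not an exact seam energy, can be illustrated with a short Python sketch. The function names and the toy energy map are assumptions for illustration only; the seam cost uses the usual dynamic-programming recurrence for 8-connected vertical seams.

```python
def vertical_energy_profile(energy):
    """Sum each column of the energy map (one value per column)."""
    return [sum(column) for column in zip(*energy)]


def min_seam_cost(energy):
    """Cheapest total energy of any 8-connected vertical seam."""
    cost = list(energy[0])
    for row in energy[1:]:
        # Each pixel extends the cheapest of its up-to-three upper neighbors.
        cost = [row[c] + min(cost[max(c - 1, 0):c + 2]) for c in range(len(row))]
    return min(cost)


# Toy map: every straight column costs 10, but a bending seam costs only 2.
toy_energy = [[1, 9],
              [9, 1]]
profile = vertical_energy_profile(toy_energy)
best = min_seam_cost(toy_energy)
```

A seam may shift by one column per row, so its cost can fall below every entry of the column profile; the profile is therefore a visualization aid rather than the quantity that seam carving minimizes.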

Fig. 3. Resizing results of the image Ice Age. (a) Original image. (b) Result of the scaling. (c) Result of the conventional seam carving. (d) Result of the proposed method.

Fig. 4. Comparison of the energy distribution and the energy profile accumulated vertically. (a) For the conventional seam carving. (b) For the subband at P=1 (the finest scale). (c) For the subband at P=2. (d) For the subband at P=3 (the coarsest scale).

Fig. 5. Resizing results of the image Beach. (a) Original image. (b) Result of the scaling. (c) Result of the conventional seam carving. (d) Result of the proposed method. (e) Result of the proposed method in horizontal and vertical directions with the resizing factor of 2/3. (f) Result of the proposed method in both directions with the resizing factor of 1/2.

The Beach image in Fig. 5 contains many ripples on the left side. Since the ripples increase the gradient energy, conventional seam carving regards the ripples, rather than the people in the center of the image, as the important features. As a result, the seam carving operator preserves the left side of the image while carving out the other parts, severely damaging even the man, as shown in Fig. 5(c). In contrast, since the proposed method utilizes the overall outline, it produces a more desirable image than the other resizing methods, as shown in Fig. 5(d). Figs. 5(e) and (f) show that the proposed method can resize the image faithfully with various resizing factors in both the horizontal and vertical directions.

In Fig. 6, the main object lies along the diagonal direction, and the boundary between snow and sky has a large amount of energy. As shown in Fig. 6(b), the conventional scaling method shrinks the skier so that his face becomes unrecognizable. The face of the skier is retained by conventional seam carving, as shown in Fig. 6(c). However, his legs are distorted, since conventional seam carving focuses on high-gradient regions without considering the structure of the figure. With the proposed method, the main outline of the skier is well preserved, with only a small loss in his right arm, as shown in Fig. 6(d).

Fig. 6. Resizing results of the image Skier. (a) Original image. (b) Result of the scaling. (c) Result of the conventional seam carving. (d) Result of the proposed method.

4. CONCLUSION


In this paper, we proposed a novel content-aware image resizing technique based on wavelet analysis. The proposed energy function utilizes weighted wavelet coefficients in the multiscale domain to obtain an energy distribution for the image that follows the way the HVS perceives it. Therefore, the obtained energy map reflects human perception with fidelity, and the image can be resized with only a small loss of semantic information. Experimental results confirmed that the proposed algorithm preserves the shape of the main objects in the scene faithfully during the resizing operation and consequently provides more pleasing resized results than the conventional methods.

5. REFERENCES

[1] S. Avidan and A. Shamir, “Seam carving for content-aware image resizing,” in Proc. SIGGRAPH 2007, 2007.
[2] D.-W. Hwang and S.-Y. Chien, “Content-aware image resizing using perceptual seam carving with human attention model,” in Proc. ICME 2008, 2008.
[3] S. Mallat, “Wavelets for a vision,” Proceedings of the IEEE, vol. 84, no. 4, pp. 604-614, Apr. 1996.
[4] J. G. Daugman, “Two-dimensional spectral analysis of cortical receptive field profiles,” Vision Res., vol. 20, pp. 847-856, 1980.
[5] G. Brooks, “Wavelet characteristics of early vision,” Neural Networks for Signal Processing [1997] VII. Proceedings of the 1997 IEEE Workshop, 1997.
[6] K. Huang and S. Aviyente, “Wavelet feature selection for image classification,” IEEE Trans. Image Processing, vol. 17, no. 9, pp. 1709-1720, Sep. 2008.
[7] R. Neelamani, H. Choi, and R. Baraniuk, “ForWaRD: Fourier-wavelet regularized deconvolution for ill-conditioned systems,” IEEE Trans. Signal Processing, vol. 52, no. 2, pp. 418-433, Feb. 2004.