SEAM CARVING WITH RATE-DEPENDENT SEAM PATH INFORMATION
Yuichi Tanaka, Madoka Hasegawa, and Shigeo Kato
Department of Information Science, Utsunomiya University
7-1-2 Yoto, Utsunomiya, Tochigi, 321-8585 Japan
email: {tanaka, madoka, kato}@is.utsunomiya-u.ac.jp

ABSTRACT
In this paper, we present a seam carving method, a well-known content-aware image resizing technique, with a constraint on the bitrate of the seam path information (SPI). The SPI specifies the pixel positions to be pruned, and it generally requires a high bitrate to store or transmit. However, the SPI has to be transmitted when the receivers are devices with low computing power, such as cell phones and PDAs, since seam carving at the receiver is computationally demanding. We resolve the problem by applying piecewise linear approximation to the seam paths. In the experimental results, the retargeted images yielded by the original seam carving and the proposed method are very similar, whereas the bitrate required for the SPI in the proposed method is 34-97% lower than that of the original.
Index Terms: Image retargeting, seam carving, content-aware image resizing

1. INTRODUCTION
The number of cell phone and PDA users is growing at a rapid pace. The display resolutions of these portable devices are usually restricted by size constraints and battery limitations. Nevertheless, users want to see good-quality images on their devices. A restricted display can show a scaled image. Unfortunately, objects in the scaled image appear too small if the original image is large or the objects in it are relatively small. Moreover, object shapes differ markedly from the original ones when the aspect ratio of the image is changed. Cropping is another traditional approach to image resizing; however, it is not suitable for a complex image, such as one whose important objects are located far apart from each other.

To resolve the problem, content-aware image/video retargeting methods have been proposed [1–6]. They remove insignificant regions from the original image, so that the retargeted image maintains the size and shape of the important regions.

Seam carving (SC) is a pioneering work in content-aware image retargeting. It is based on a graph-cut of an image. A seam is defined as an eight-connected path of pixels from the top (left) to the bottom (right) of the image. When an optimal seam path is found, the pixels on the path are simply discarded and the image is shrunk by one pixel in width/height. Conversely, by inserting a seam next to the optimal path with some interpolation method, the image width/height is expanded. The SC process is iterated until the resized image reaches the desired size. We focus on reducing image size by the SC in this paper.

This work was supported in part by KAKENHI 22760263 and the Kayamori Foundation of Informational Science Advancement.
978-1-4577-0539-7/11/$26.00 ©2011 IEEE
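Fig. 1. Pre-seam calculation model for SC at a portable device. [Diagram: the server (high performance) performs seam calculation and image compression (if needed) and sends the image data and the SPI to the client, a portable device (low performance), which performs seam pruning and image decompression (if needed).]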
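The client-side seam pruning step in Fig. 1 is inexpensive compared with the seam search, since it only deletes one pixel per row for each received seam. A minimal NumPy sketch of that step is given below. The function name `prune_vertical_seam` and the 0-based column indices are assumptions of this illustration and are not taken from the paper.

```python
import numpy as np

def prune_vertical_seam(image, seam):
    """Remove one pixel per row; seam[i] is the 0-based column dropped in row i."""
    h, w = image.shape[:2]
    seam = np.asarray(seam)
    if seam.shape != (h,):
        raise ValueError("seam must hold exactly one column index per row")
    if np.any(np.abs(np.diff(seam)) > 1):
        raise ValueError("seam is not eight-connected")
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    # Boolean indexing flattens the kept pixels in row-major order; every row
    # keeps exactly w - 1 pixels, so the reshape restores the row structure.
    return image[keep].reshape((h, w - 1) + image.shape[2:])

# Usage: a 3 x 4 grayscale image and a seam through columns 1, 2, 2.
img = np.arange(12).reshape(3, 4)
narrower = prune_vertical_seam(img, [1, 2, 2])
```

The same call works for an RGB array of shape (height, width, 3), because the mask is applied to the first two axes only.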
As previously mentioned, the SC is useful for portable devices. Unfortunately, it is computationally demanding for gadgets with low computing power, since finding optimal seam paths requires numerous repetitive calculations. There are two approaches to this problem: one is to construct an SC with acceptable execution time for devices with low computing power, and the other is to modify the SC so that it yields seam path information (SPI) with an acceptably low bitrate for transmission to the receiver. We propose the latter approach in this paper since it suits client-server computing in which the server has high computing power. It is illustrated in Fig. 1.

Another benefit of transmitting the SPI to the client side is flexibility. If the normal SC is performed at the server and only the resulting image is sent to the client, other resizing methods such as scaling and cropping cannot be used at the client even if a user requests them. In contrast, our model leaves the client free to re-edit the image.

To reduce the bitrate required for the SPI, our method imitates the original seam paths under a bitrate constraint. It is realized by a top-down estimation of the original seam paths with piecewise linear approximation. In the experimental results, the proposed SC produces retargeting results very similar to those of the original one, whereas the bitrate for the SPI is significantly reduced.

1.1. Notations
Table 1 lists the symbols used for an image I in this paper. To simplify the description, we use Matlab-based notation for sets of rows and columns in a matrix. For simplicity, only a vertical seam, defined as an eight-connected path of pixels with one-pixel width from the top to the bottom of an image, is considered in this paper. The horizontal case is a straightforward extension.

2. SEAM CARVING
SC [1, 2] is based on a graph-cut of an image.
ICASSP 2011

A dynamic programming approach is used to decide which seams can be pruned. A cumulative cost $M(i, j)$ for each vertical seam is calculated by the following equation:

$$M(i, j) = \min \begin{cases} M(i-1, j-1) + C_L(i, j) \\ M(i-1, j) + C_U(i, j) \\ M(i-1, j+1) + C_R(i, j) \end{cases} \tag{1}$$

where

$$\begin{aligned} C_L(i, j) &= |I(i, j+1) - I(i, j-1)| + |I(i-1, j) - I(i, j-1)| \\ C_U(i, j) &= |I(i, j+1) - I(i, j-1)| \\ C_R(i, j) &= |I(i, j+1) - I(i, j-1)| + |I(i-1, j) - I(i, j+1)|. \end{aligned} \tag{2}$$

This is called the "forward energy" criterion [2]. Finally, the seam with the smallest $M(H_I, j)$ is pruned. Under this criterion, the retargeted image avoids a large gap between the pixel values to the left and right of the pruned seam. The process is iterated until the desired image width is reached. For other details, please refer to [1, 2]. To avoid confusion, we refer to this type of SC as FE-SC, where FE stands for forward energy.

Note that the FE-SC is not a reversible process. That is, once a seam has been pruned, its path cannot be recovered from the pruned image of size $H_I \times (W_I - 1)$ without the SPI. Therefore, the SPI should be sent to the client side if the pre-SC is performed at the server side. However, the FE-SC consumes a very high bitrate for the SPI, since $(H_I - 1)$ pixels of each seam require connection information to recover the SPI. Hence, the FE-SC cannot be applied straightforwardly for the purpose of this paper.

Table 1. Reference Table of Symbol Notation for an Image I
Symbol             Description
I(i, j)            Pixel value at i-th row and j-th column
I(i, :)            i-th row
I(:, j)            j-th column
I(L0 : L1, :)      Sub-image where L0 ≤ i ≤ L1
H_I                Height
W_I                Width

3. RATE-DEPENDENT SC: SEAM ESTIMATION WITH PIECEWISE LINEAR APPROXIMATION
In this section, we present an SC which estimates the seam path with piecewise linear approximation (PLA) of optimal seams. It uses recursive processing to approximate seams. Hereafter, we refer to it as PLA-SC.

3.1. Slope Information
Let $s_{O,n}$ and $s_{O,n}(i)$ $(1 \le i \le H_I)$ be the path of the n-th seam calculated by the FE-SC and its i-th element, respectively. The pruned pixel position is given by $I(i, s_{O,n}(i))$. The PLA of $s_{O,n}$ is specified as a set of slopes. A slope $G$ is implemented as

$$G = \arctan \frac{\Delta j}{\Delta i} \tag{3}$$

where $\Delta i$ and $\Delta j$ are the vertical and horizontal distances from the starting point (in pixel coordinates), $\Delta i \ge 0$, and $\left| \frac{\Delta j}{\Delta i} \right| \le \tan \frac{\pi}{4}$. The last condition is the requirement from a vertical seam.

Let $s_{E,n}$ be the n-th estimated seam path to be pruned. In this paper, $s_{E,n}$ is approximated from $s_{O,n}$ for each $L$ rows of $I$, and we restrict $s_{E,n}(mL + 1) = s_{E,n}(mL)$, i.e., the last seam coordinate of the previous $L$ rows and the first one of the current $L$ rows are the same, to reduce the amount of side information. Consequently, the SPI includes a set of used $G$, the overall starting point $s_{E,n}(1)$, and a set of $L$.

3.2. Recursive Optimization of Estimated Seams
We assume $s_{O,n}$ has already been calculated before the estimation. Therefore, the parameters known just before the optimization are $s_{E,n}(1) := s_{O,n}(1)$ and $s_{O,n}$. Clearly, the optimal unit length $L_{opt} \le L$ differs in each portion of $I$. To estimate $s_{E,n}$ effectively, we define the simplified forward energy criterion as

$$C_{SFE}(s_n) = \frac{1}{L_{s_n}} \sum_{k \,\in\, \text{row index of } s_n} |I(k, s_n(k) - 1) - I(k, s_n(k) + 1)| \tag{4}$$

where $L_{s_n}$ is the length of $s_n$. It is similar to the forward energy criterion of the FE-SC [2] but is less complex. We use it because $C_{SFE}(s_n)$ has to be calculated many times during the optimization.

The following Algorithm 1 presents the detailed optimization process of $s_{E,n}$ for $I(L_0 : L_0 + L, :)$. In Algorithm 1, $T_{PLA} > 0$ is a threshold, $\mathcal{G} = \{G_0, \ldots, G_{N_G - 1}\}$ is the set of slope candidates, where $N_G$ is the number of slope candidates, DIRECQUANT$(a_0, a)$ is a function which finds the value in a set $a$ nearest to $a_0$, and $f_{PLA}(j, G, l)$ returns a set of integer lattice points formalized as follows:

$$f_{PLA}(j, G, l) = \{ x \mid x = \lfloor G y + 0.5 \rfloor + j \,\wedge\, 0 \le y \le l \} \tag{5}$$
where $\lfloor \cdot \rfloor$ is the floor operator. Clearly, $s_{E,n}$ can be completely recovered from the returned parameters: the PLA unit length $L$, the terminal coordinate $s_{E,n}(L_0 + L)$, which is equal to $s_{E,n}(L_0 + L + 1)$, and the optimal slope $G_{opt}$. The brief structure of this algorithm is shown in Fig. 2.

Algorithm 1 Recursive Seam Estimation with PLA
function SEAMESTPLA($I(L_0 : L_0 + L, :)$, $s_{E,n}(L_0)$, $s_{O,n}(L_0 : L_0 + L)$, $\mathcal{G}$, $T_{PLA}$)
    $G_{opt}$ ← DIRECQUANT($(s_{O,n}(L_0 + L) - s_{E,n}(L_0)) / L$, $\mathcal{G}$)
    $s_{E,n}(L_0 : L_0 + L)$ ← $f_{PLA}(s_{E,n}(1), G_{opt}, L)$
    $J_{orig}$ ← $C_{SFE}(s_{O,n}(L_0 : L_0 + L))$, $J_{est}$ ← $C_{SFE}(s_{E,n}(L_0 : L_0 + L))$
    if $(J_{orig} - J_{est}) > T_{PLA}$ and $L > 2$ then
        SEAMESTPLA($I(L_0 : L_0 + L/2, :)$, $s_{E,n}(1)$, $s_{O,n}(L_0 : L_0 + L/2)$, $\mathcal{G}$, $T_{PLA}$)
        SEAMESTPLA($I(L_0 + L/2 + 1 : L_0 + L, :)$, $s_{E,n}(L_0 + L/2)$, $s_{O,n}(L_0 + L/2 + 1 : L_0 + L)$, $\mathcal{G}$, $T_{PLA}$)
    else
        return $L$, $s_{E,n}(L_0 + L)$, $G_{opt}$
    end if
end function
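The recursion of Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: indices are 0-based, slopes are stored as the column step per row (the quantity multiplied by $y$ in (5)), each segment starts from the column where the previous one ended, seams are clamped at the image borders when evaluating (4), and the function returns a list with one (length, end column, slope) triple per final linear piece. All function names are hypothetical.

```python
import numpy as np

def direc_quant(a0, candidates):
    """Nearest candidate slope to a0 (the role of DIRECQUANT)."""
    candidates = np.asarray(candidates, dtype=float)
    return float(candidates[np.argmin(np.abs(candidates - a0))])

def f_pla(j, g, l):
    """Eq. (5): columns floor(g*y + 0.5) + j for y = 0, ..., l."""
    y = np.arange(l + 1)
    return np.floor(g * y + 0.5).astype(int) + j

def c_sfe(img, rows, cols):
    """Eq. (4): mean |left - right| pixel difference along a seam segment."""
    img = np.asarray(img, dtype=float)
    last = img.shape[1] - 1
    left = np.clip(cols - 1, 0, last)    # border clamping is an assumption
    right = np.clip(cols + 1, 0, last)
    return float(np.mean(np.abs(img[rows, left] - img[rows, right])))

def seam_est_pla(img, s_orig, row0, length, start_col, slopes, t_pla):
    """Approximate s_orig on rows row0 .. row0+length by linear pieces."""
    s_orig = np.asarray(s_orig)
    rows = np.arange(row0, row0 + length + 1)
    g_opt = direc_quant((s_orig[row0 + length] - start_col) / length, slopes)
    est = f_pla(start_col, g_opt, length)
    j_orig = c_sfe(img, rows, s_orig[rows])
    j_est = c_sfe(img, rows, est)
    # Split test with the sign convention copied from Algorithm 1.
    if (j_orig - j_est) > t_pla and length > 2:
        half = length // 2
        first = seam_est_pla(img, s_orig, row0, half, start_col, slopes, t_pla)
        # The second half starts at the end column of the first half.
        second = seam_est_pla(img, s_orig, row0 + half, length - half,
                              first[-1][1], slopes, t_pla)
        return first + second
    return [(length, int(est[-1]), g_opt)]

# Usage on a synthetic 9 x 9 image whose pixel value is the squared column index.
slopes = [-1.0, -0.5, 0.0, 0.5, 1.0]
img = np.tile(np.arange(9.0) ** 2, (9, 1))
pieces = seam_est_pla(img, [3, 4, 5, 6, 7, 6, 5, 4, 3], 0, 8, 3, slopes, 1.0)
```

In this toy case the single straight piece is rejected by the split test, and the two halves are then matched exactly by slopes +1 and -1, so two triples describe a seam that would otherwise need one connection symbol per row.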
Fig. 2. Recursive seam estimation process with PLA (Steps 1 to 3). Each panel shows $s_{O,n}$, $s_{E,n}$ at the current step, and the already determined $s_{E,n}$. Red lines represent division points of $s_{O,n}$.

3.3. Iterative Optimization
We can obtain an optimal slope set by the recursive method above. However, the threshold $T_{PLA}$ in Algorithm 1 should be changed for each image portion to achieve a good tradeoff between the required bitrate for the SPI and the retargeted image quality. In this paper, the rate-dependent seams are iteratively optimized by using a cost function with a Lagrange multiplier.

Algorithm 2 Iterative Optimization of Seams with PLA
function ITEROPTSEAM($I$, $\lambda$, $\mathcal{G}$, $\mathcal{T}_{PLA}$)
    $s_{start}$ ← $s_{O,n}(1)$
    for each $L$ rows of $I$ do
        Extract $I_L$ and $s_{O,L}$
        $J_{prev}$ ← $\infty$, $J_{cur}$ ← $D_{PLA}(s_{O,L}) + \lambda R_{PLA}(s_{O,L})$
        while $J_{cur} < J_{prev}$ and $\mathcal{T}_{PLA} \ne \emptyset$ do
            $T$ ← $\min(\mathcal{T}_{PLA})$, remove $T$ from $\mathcal{T}_{PLA}$
            SEAMESTPLA($I_L$, $s_{start}$, $s_{O,L}$, $\mathcal{G}$, $T$)
            $s_{start}$ ← $s_{E,n}(L)$
            $s_{E,n}$ ← an overall estimated seam
            $J_{prev}$ ← $J_{cur}$, $J_{cur}$ ← $D_{PLA}(s_{E,n}) + \lambda R_{PLA}(s_{E,n})$
        end while
        Restore $\mathcal{T}_{PLA}$
    end for
end function

The purpose of the PLA-SC is to find the seam $s_{E,n}$ nearest to $s_{O,n}$ under a bitrate constraint for the SPI. The cost function is represented as follows:

$$s_{E,n} = \arg\min_{s_C} J_{PLA}(s_C), \qquad J_{PLA}(s_C) = D_{PLA}(s_C) + \lambda R_{PLA}(s_C) \tag{6}$$
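How the multiplier in (6) steers the choice among candidate seams can be illustrated with a short Python sketch. The three (distortion, rate) pairs below are made-up numbers chosen only to show the tradeoff; they are not measurements from the paper, and the function name `select_candidate` is hypothetical.

```python
def select_candidate(candidates, lam):
    """Return the key of the candidate minimizing D + lam * R, as in Eq. (6)."""
    return min(candidates,
               key=lambda name: candidates[name][0] + lam * candidates[name][1])

# Hypothetical (distortion, rate in bits) pairs for three approximations
# of the same seam, from many short pieces to a single straight piece.
candidates = {
    "fine": (0.2, 40.0),
    "medium": (1.0, 20.0),
    "coarse": (6.0, 5.0),
}

# A larger multiplier penalizes rate more heavily and favors coarser seams.
choices = {lam: select_candidate(candidates, lam) for lam in (0.0, 0.1, 1.0)}
```

With these numbers the selection moves from "fine" at a multiplier of 0 to "medium" at 0.1 and "coarse" at 1, which is the behavior the rate constraint is meant to produce.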
In (6), $s_C$ is a seam path candidate and $\lambda \ge 0$ is a Lagrange multiplier. In this approach, we define $D_{PLA}(s_C)$ based on the reconstruction error of a simple interpolation, as shown below:

$$D_{PLA}(s_C) = D_{MSE}(I, s_{O,n}) - D_{MSE}(I, s_C) \tag{7}$$

$$D_{MSE}(I, v) = \frac{1}{L_v} \sum \bigl( I(:, v) - U(I, v) \bigr)^2 \tag{8}$$

where $L_v$ is the length of $v$ and $U(I, v)$ returns interpolated pixel values at $I(:, v)$ computed from the pixels surrounding $I(:, v)$. In this paper, we use a simple average of the six surrounding pixels of a target pixel.¹ For a given $\lambda$ in (6), the threshold $T_{PLA}$ in Algorithm 1 is the control parameter of the required bitrate for $s_{E,n}$. Therefore, (6) is iteratively optimized as shown in Algorithm 2, where $\mathcal{T}_{PLA} = \{T_{PLA,0}, T_{PLA,1}, \ldots, T_{PLA,N_T}\}$ with $T_{PLA,0} < T_{PLA,1} < \cdots < T_{PLA,N_T}$. Additionally, $I_L$ and $s_{O,L}$ are the original sub-image and $s_{O,n}$ for the corresponding loop, respectively. With this iterative optimization, $s_{E,n}$ for a given $\lambda$ is obtained.

3.4. Bitrate Calculation
In this paper, we calculate the bitrate for the unit length set with a simple binary representation. In Algorithm 1, "0" is stored if the seam is not divided, whereas "1" is stored if the seam is divided. After the recursive optimization, we obtain a binary vector $b_{div}$ holding the division information of the candidate $s_C$. Finally, $R_{PLA}(s_C)$ can be represented as follows:

$$R_{PLA}(s_C) = N(b_{div}) \log_2 N_G + N(b_{div}) \tag{9}$$

where $N(b_{div})$ is the number of elements in $b_{div}$. The first and second terms in (9) are for the slope information and the division information, respectively.

¹ Two pixels out of the eight neighboring pixels are not available due to seam pruning.

4. EXPERIMENTAL RESULTS

Fig. 3. Test images. Left: Yachts. Right: Prague.
In this section, we compare image retargeting performance and bitrates for the SPI. For the experiment, the 24-bit color images Yachts (512 × 900 pixels) and Prague (512 × 680 pixels), shown in Fig. 3, are used. For comparison, the FE-SC and bicubic scaling are also applied to the images. We used the imresize function in Matlab for the scaling. In image retargeting, the luminance component of a color image is extracted and used for the calculation of the SPI; the calculated SPI is then used to prune seams in the RGB components. The target image widths are set to 600 for Yachts and 512 for Prague. In this paper, the SPI of the PLA-SC is not entropy coded, since entropy coding does not yield a significant bitrate reduction. In contrast, the SPI of the FE-SC is encoded by an arithmetic encoder similar to [7].

First, the estimated seams and their associated bitrates are compared. Fig. 4 shows the seams yielded by the FE-SC and the PLA-SC (λ = 10 and 100). In spite of the arithmetic coding of the SPI, the FE-SC still needs a high bitrate for its SPI. The bitrate required for the SPI of the PLA-SC varies according to the control parameter λ, but it is significantly lower than that of the FE-SC. Numerically, the PLA-SC reduces the bitrate for the SPI by 34-97% compared to the FE-SC. For Yachts, which has a relatively simple structure, the bitrate reduction is significant.

Figs. 5 and 6 present the resized results, together with enlarged portions of the resized images. Since the target image size changes the aspect ratio of the original image, the scaled image has an unacceptable quality, i.e., the yachts are too short and towers
become thin, whereas the retargeted images preserve the shapes of objects in the original image. As is clearly visible, the retargeted images yielded by the FE-SC and the PLA-SC are very similar to each other.

Fig. 4. Seam estimation results where seam paths are depicted in black. From left to right: FE-SC, PLA-SC with λ = 10, and PLA-SC with λ = 100. Top row: Yachts. Bottom row: Prague. Bitrates for SPI:
Image     FE-SC        PLA-SC (λ = 10)    PLA-SC (λ = 100)
Yachts    0.466 bpp    0.074 bpp          0.013 bpp
Prague    0.347 bpp    0.230 bpp          0.118 bpp

Fig. 5. Resizing results for Yachts. From top to bottom: resized image and its enlarged portion, respectively. From left to right: scaled image, retargeted image by the FE-SC, and retargeted image by the PLA-SC with λ = 10.

5. CONCLUSIONS
In this paper, a method for reducing the bitrate required for the SPI of the SC was presented. It is based on PLA of the original seam paths. In the estimation process, a top-down PLA is recursively applied to yield rate-dependent SPI. In terms of image retargeting performance, the proposed SC maintains the advantage of the original one over simple scaling, whereas the bitrate required for the SPI is significantly reduced. The method is suitable for clients with low computing power, and the bit budget for side information can be negligible when the image itself is compressed at a high enough bitrate.

6. ACKNOWLEDGMENT
We would like to thank david.nikonvscanon of flickr (www.flickr.com) for making the Prague image available under a Creative Commons license.
Fig. 6. Resizing results for Prague. From top to bottom: resized image and its enlarged portion, respectively. From left to right: scaled image, retargeted image by the FE-SC, and retargeted image by the PLA-SC with λ = 10.

7. REFERENCES
[1] S. Avidan and A. Shamir, "Seam carving for content-aware image resizing," ACM Trans. Graph., vol. 26, no. 3, 2007.
[2] M. Rubinstein, A. Shamir, and S. Avidan, "Improved seam carving for video retargeting," ACM Trans. Graph., vol. 27, no. 3, 2008.
[3] L. Wolf, M. Guttmann, and D. Cohen-Or, "Non-homogeneous content-driven video-retargeting," in Proc. ICCV'07, 2007.
[4] Y. Wang, C.-L. Tai, O. Sorkine, and T.-Y. Lee, "Optimized scale-and-stretch for image resizing," ACM Trans. Graph. (Proc. SIGGRAPH Asia), vol. 27, no. 5, 2008.
[5] B. Chen and P. Sen, "Video carving," in Proc. Eurographics, 2008.
[6] Z. Li, P. Ishwar, and J. Konrad, "Video condensation by ribbon carving," IEEE Trans. Image Process., vol. 18, no. 11, pp. 2572–2583, 2009.
[7] N. T. N. Anh, W. Yang, and J. Cai, "Seam carving extension: a compression perspective," in Proc. seventeenth ACM international conference on Multimedia, 2009, pp. 825–828.