Classified Quadtree-based Adaptive Loop Filter

Qian Chen†, Yunfei Zheng‡, Peng Yin‡, Xiaoan Lu‡, Joel Solé‡, Qian Xu‡, Edouard Francois‡, and Dapeng Wu†
† School of Electrical and Computer Engineering, University of Florida, Gainesville, FL, USA
‡ Research and Innovations, Technicolor, Princeton, NJ, USA

Abstract—In this paper, we propose a classified quadtree-based adaptive loop filter (CQALF) for video coding. Pixels in a picture are classified into two categories according to the impact of the deblocking filter: pixels that are modified by the deblocking filter and pixels that are not. A Wiener filter is carefully designed for each category and the filter coefficients are transmitted to the decoder. For the pixels modified by the deblocking filter, the filter is estimated at the encoder by minimizing the mean square error between the original input frame and a combined frame, a weighted average of the reconstructed frames before and after the deblocking filter. For the pixels the deblocking filter does not modify, the filter is estimated by minimizing the mean square error between the original frame and the reconstructed frame. The proposed algorithm is implemented on top of the KTA software and is compatible with the quadtree-based adaptive loop filter. Compared with the kta2.6r1 anchor, the proposed CQALF achieves 10.05%, 7.55%, and 6.19% BD bitrate reduction on average for the intra-only, IPPP, and HB coding structures, respectively.

I. INTRODUCTION

Many techniques have been developed to improve the coding performance of H.264/AVC for high-definition video content. Among them, the adaptive in-loop filter (ALF) achieves an attractive gain over H.264/AVC. The basic idea of ALF is to estimate one or more filters at the encoder that restore the reconstructed frame towards its original version. Working together with the deblocking filter, this technique can effectively remove more general coding artifacts and prepare better reference frames for coding future frames, thereby improving the coding performance. In [1], the Wiener filter was used as a post filter to improve subjective and objective quality. [2] and [3] are two of the earliest ALF proposals to place the Wiener filter in the coding loop. Although the filter is temporally adaptive, with the coefficients updated frame by frame, it usually remains spatially invariant. To achieve better adaptation, several techniques have been proposed in recent academic and standardization activities. In [4], a block-based adaptive in-loop filter (BALF) was proposed: the reconstructed frame is partitioned into non-overlapping, equal-size blocks, and for each block a flag signals whether the block is filtered. The quadtree-based adaptive in-loop filter (QALF) further improves performance by allowing blocks of different sizes and provides a signaling scheme based on a quadtree data structure [5]. Building on QALF, a multiple-filter scheme was proposed that estimates a set of filters [6] instead of the single filter of QALF. Whenever a QALF partition is indicated to be filtered, a specific filter from the set is chosen for each pixel in the block based on a measure of the local characteristics of that pixel. These filters take the local characteristics of a frame into account and allow more spatial adaptation. In [7], the concept of a multiple-input filter was proposed, taking the reconstructed frame, the prediction frame, and the prediction residual frame as filter inputs in order to incorporate more hypotheses and account for the different characteristics of the input signals during restoration. The filter coefficients automatically adapt to the different inputs during filter estimation.

Although the Wiener filter can efficiently remove some artifacts caused by compression, it is less efficient at removing blocky artifacts than the traditional deblocking filter. Hence, in all previous works the Wiener filter is placed after the deblocking filter. The two filters operate sequentially but are designed independently, which may render both sub-optimal. In [12], the deblocking filter and the adaptive in-loop filter are unified in a Wiener filter framework, showing that joint design of the two filters matters for both performance and complexity.

In this paper, we propose a classified quadtree-based adaptive in-loop filter (CQALF) that takes advantage of both the multiple-filter and the multiple-input filter schemes. The proposed scheme is based on a hybrid video coding framework such as H.264/AVC. Instead of unifying the deblocking filter and the adaptive in-loop filter into one filter, we keep the traditional structure, which makes the problem simpler. However, the impact of the H.264/AVC deblocking filter is considered during the Wiener filter restoration. The output pixels of the deblocking filter are classified into two categories: those modified by the deblocking filter and those left untouched. A filter is designed for each category.
For the pixels modified by the deblocking filter, the adaptive loop filter takes two inputs: the frames before and after the deblocking filter. A weighted average of the two inputs is used to estimate the filter, which avoids the additional overhead of transmitting more filter coefficients. For the pixels not modified by the deblocking filter, only the frame after the deblocking filter is used to estimate the filter coefficients. CQALF classifies pixels according to the impact of the deblocking filter in order to avoid possible over-filtering. No overhead is needed to indicate the classification of each pixel, because the classification is based on the reconstructed frame, which is available at both the encoder and the decoder.

The rest of the paper is organized as follows. Section II briefly revisits the deblocking filter of H.264/AVC and the ALF

technique in KTA. In Section III, the proposed CQALF and its implementation details are elaborated. Section IV shows simulation results of the CQALF. Finally, conclusions are drawn in Section V.

II. DEBLOCKING FILTER AND ADAPTIVE IN-LOOP FILTER

A. Deblocking Filter

In H.264/AVC, a deblocking filter is used to remove blocky artifacts around block boundaries in order to improve the quality of a coded frame. To preserve image details while removing blockiness, a boundary strength (Bs) parameter is assigned to every boundary between two blocks, and a pre-defined filter is applied only if the filtering conditions are met. Filtering does not take place for edges with Bs = 0. For non-zero Bs values, filtering of the edge takes place only when the following three conditions are all met:

$|p_0 - q_0| < \alpha(\text{Index}_A)$   (1)
$|p_1 - p_0| < \beta(\text{Index}_B)$   (2)
$|q_1 - q_0| < \beta(\text{Index}_B)$   (3)

where $p_0$ and $q_0$ are block boundary pixels as shown in Fig. 1, and $\alpha$ and $\beta$ depend on the quantization parameter (QP) used over the edge as well as on encoder offset values that control the properties of the deblocking filter at the slice level [8]. The deblocking filter can successfully remove blocky artifacts, but it is not effective against more general artifacts, such as ringing and blurring, that may be produced during compression.
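The edge-level gating of conditions (1)-(3) can be sketched in code. This is a minimal illustration, not the normative H.264/AVC decision process (which additionally selects between strong and weak filters per sample); the function name and its threshold arguments are chosen for exposition, and in the standard α and β come from QP-indexed look-up tables.

```python
def edge_filtered(p1, p0, q0, q1, bs, alpha, beta):
    """Decide whether a block edge between samples p0 | q0 is deblocked.

    p1, p0 | q0, q1 are the pixels straddling the edge (Fig. 1); bs is
    the boundary strength; alpha and beta stand in for the thresholds
    alpha(Index_A) and beta(Index_B).
    """
    if bs == 0:
        # Bs = 0: the edge is never filtered.
        return False
    # Conditions (1)-(3): only small cross-edge differences are treated
    # as blocking artifacts; larger differences indicate a real edge
    # that must be preserved.
    return (abs(p0 - q0) < alpha and
            abs(p1 - p0) < beta and
            abs(q1 - q0) < beta)
```

For example, a gentle step across the edge passes all three tests and is filtered, while a large jump in |p0 − q0| is left alone.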

Fig. 1. Notation of pixels across a block boundary.

B. Adaptive In-loop Filter

In recent video coding standardization activities, a Wiener-filter-based adaptive in-loop filter (ALF) was proposed to further improve the quality of the reconstructed frame. It restores the reconstructed frame output by the deblocking filter towards the corresponding original frame. At the encoder, a filter is adaptively estimated for each frame using Wiener filtering theory, and the filter coefficients are transmitted to the decoder as overhead. Suppose $s$ is a pixel in the original frame, $x_0$ is its corresponding degraded pixel in the reconstructed frame, and $X_0 = [x_0, x_1, \cdots, x_{n-1}]^T$ is the set of pixels surrounding $x_0$ within a defined filter support $N$. The filtered value of $x_0$ can be formulated as:

$\hat{x}_0 = \sum_{i=0}^{n-1} w_i x_i = H^T X_0$   (4)

where $H = [w_0, w_1, \cdots, w_{n-1}]^T$ holds the filter coefficients. Assuming the filter is spatially invariant, the Wiener filter is the one that minimizes the mean square error over the whole frame:

$MSE = \sum_{i=0}^{M} \|s_i - \hat{x}_i\|^2 = \sum_{i=0}^{M} \|s_i - H^T X_i\|^2$   (5)

where $M$ is the number of pixels in the frame. The Wiener filter can be estimated by:

$H^* = \arg\min_H MSE = (C^T C)^{-1} C^T S$   (6)

where $C = [X_0, X_1, \cdots, X_M]^T$ and $S = [s_0, s_1, \cdots, s_M]^T$. As mentioned in Section I, the filter coefficients are updated frame by frame to achieve temporal adaptation. A quadtree-based signaling scheme is proposed to turn the filtering operation on or off in different regions.

III. CLASSIFIED QUADTREE-BASED ADAPTIVE IN-LOOP FILTER

A. Problem Formulation

In the KTA and the recently released TMuC software, the deblocking filter and the adaptive in-loop filter are concatenated to filter the reconstructed frame sequentially, but the two filters are designed independently. Under this strategy, some pixels may be filtered twice, which can cause over-filtering. Over-filtering can degrade the performance of both filters, so the filters should be designed and operated jointly in order to achieve the best possible performance.

In this paper, we propose to consider the deblocking filter and the ALF jointly by classifying the pixels into two categories: pixels that are modified by the deblocking filter and pixels that are untouched by it. To avoid over-filtering, the idea is to design a different filtering operation for the pixels that have already been filtered by the deblocking filter. In the remainder of the paper, we denote the original input frame by $s$, the reconstructed frame before the deblocking filter by $s'$, and the one after the deblocking filter by $s''$. According to the filtering rule of the deblocking filter, only some pixels (mostly around block boundaries) are filtered, while the rest are unchanged. The pixel classification is done by comparing the frames $s'$ and $s''$ pixel by pixel. We denote by Category I the pixels that are modified by the deblocking filter and by Category II the pixels that are not. The coding diagram of CQALF is illustrated in Fig. 2. For Category I, each pixel has two hypotheses, one from $s'$ and one from $s''$. We take both of them as inputs of the filter corresponding to Category I.
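The frame-level estimation of Eqs. (4)-(6) amounts to an ordinary least-squares fit. The following NumPy sketch uses a 1-D filter support for brevity (the actual ALF uses a 2-D support); `estimate_wiener` and `apply_wiener` are illustrative names, not functions of the KTA software.

```python
import numpy as np

def estimate_wiener(recon, orig, n=5):
    """Least-squares estimate of the filter H of Eqs. (4)-(6):
    H* = argmin_H sum_i ||s_i - H^T X_i||^2 = (C^T C)^{-1} C^T S.

    recon: degraded (reconstructed) samples; orig: original samples.
    Row i of C holds the n samples of recon centered on pixel i.
    """
    pad = n // 2
    r = np.pad(np.asarray(recon, dtype=float), pad, mode='edge')
    C = np.stack([r[i:i + n] for i in range(len(recon))])
    S = np.asarray(orig, dtype=float)
    # Solve the normal equations stably with lstsq instead of forming
    # an explicit inverse of C^T C.
    H, *_ = np.linalg.lstsq(C, S, rcond=None)
    return H

def apply_wiener(recon, H):
    """Filtering of Eq. (4): x_hat_0 = H^T X_0 at every pixel."""
    n = len(H)
    r = np.pad(np.asarray(recon, dtype=float), n // 2, mode='edge')
    return np.array([r[i:i + n] @ H for i in range(len(recon))])
```

On the training frame itself the filtered MSE can never exceed that of the unfiltered reconstruction, since the identity filter is one of the candidates the least-squares fit considers.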
To avoid too much additional overhead, we apply the same set of filter coefficients to both inputs instead of using two sets of filter coefficients as in [7]. To handle the two inputs differently according to their characteristics, we combine them by a weighted average controlled by a weighting factor $\omega$. Based on off-line training, we found a relationship between the weighting factor and the quantization parameter (QP) for each frame type, which is elaborated in Section III-B. For the pixels in Category

II, only one input, $s''$, is used. Although either $s'$ or $s''$ could serve as the input, as they are exactly equal at every pixel to be filtered, we choose $s''$: the surrounding pixels within the filter support $N$ are generally of better quality in $s''$ than in $s'$, based on the observation that the reconstructed frame after the deblocking filter shows an overall improvement in both objective and subjective quality over the frame before it.
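Because the category mask is derived only from data available on both sides, the classification can be sketched as a simple pixelwise comparison; this is illustrative code, not the KTA implementation.

```python
import numpy as np

def classify_pixels(s_before, s_after):
    """Split pixels into the two CQALF categories by comparing the
    reconstructions before (s') and after (s'') the deblocking filter.

    Returns a boolean mask: True for Category I (modified by the
    deblocking filter), False for Category II (untouched). Both the
    encoder and the decoder can derive this mask, so no side
    information needs to be transmitted.
    """
    return np.asarray(s_before) != np.asarray(s_after)
```

A Category-I pixel is simply any pixel whose value changed during deblocking; everything else falls into Category II.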

Fig. 2. Block Diagram of Encoder with CQALF.

Suppose the filter length is $n$ for both filters. Let the filter of Category I be $H_1 = [w_{10}, w_{11}, \cdots, w_{1,n-1}]^T$, the filter of Category II be $H_2 = [w_{20}, w_{21}, \cdots, w_{2,n-1}]^T$, the input from $s'$ be $X = [x_0, x_1, \cdots, x_{n-1}]^T$, the input from $s''$ be $Y = [y_0, y_1, \cdots, y_{n-1}]^T$, and $\hat{s}$ be the output of the filter. For Category I, a pixel is filtered as

$\hat{s}_i = H_1^T (\omega X_i + (1 - \omega) Y_i)$   (7)

and the filter is estimated by

$H_1 = \arg\min_{H_1} \sum_{i=0}^{M_1} \|s_i - H_1^T (\omega X_i + (1 - \omega) Y_i)\|^2$   (8)

where $M_1$ denotes the number of pixels in Category I. The pixels in Category II are filtered as

$\hat{s}_i = H_2^T Y_i$   (9)

and the filter is estimated by

$H_2 = \arg\min_{H_2} \sum_{i=0}^{M_2} \|s_i - H_2^T Y_i\|^2$   (10)

where $M_2$ denotes the number of pixels in Category II.

B. Parameter Settings

As can be seen from conditions (1)-(3), the deblocking filtering strength is controlled by the boundary strength and the two parameters $\alpha$ and $\beta$: filtering takes place only when the differences between neighboring pixels are smaller than the corresponding thresholds $\alpha$ and $\beta$. Their values increase with QP and are obtained from a look-up table specified by H.264/AVC. In particular, at the low end of the table, where $\text{Index}_A < 16$ and $\text{Index}_B < 16$, $\alpha$ and/or $\beta$ are clipped to 0 and filtering is turned off [8]. This implies that when QP is very small, few pixels are modified by the deblocking filter due to the small $\alpha$ and $\beta$, so $s'$ and $s''$ differ in few pixels. Under such circumstances, Category I is almost empty and CQALF reduces to the same algorithm as ALF.
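Under the same 1-D least-squares reading as in Section II-B, the Category-I estimation of Eqs. (7)-(8) can be sketched as below. The function name, the 1-D support, and the signal layout are assumptions made for illustration only.

```python
import numpy as np

def estimate_category1_filter(s, s_before, s_after, mask, omega, n=5):
    """Estimate H1 of Eq. (8): the Category-I Wiener filter fed with
    the weighted combination omega*X + (1-omega)*Y, where X is taken
    from the frame before deblocking (s') and Y from the frame after
    it (s''). mask marks the Category-I pixels; 1-D signals for brevity.
    """
    pad = n // 2
    x = np.pad(np.asarray(s_before, dtype=float), pad, mode='edge')
    y = np.pad(np.asarray(s_after, dtype=float), pad, mode='edge')
    rows, targets = [], []
    for i in np.flatnonzero(mask):
        # Combined filter support omega*X_i + (1-omega)*Y_i of Eq. (7);
        # one shared coefficient set serves both inputs.
        rows.append(omega * x[i:i + n] + (1 - omega) * y[i:i + n])
        targets.append(s[i])
    C = np.stack(rows)
    H1, *_ = np.linalg.lstsq(C, np.asarray(targets, dtype=float),
                             rcond=None)
    return H1
```

The Category-II filter of Eqs. (9)-(10) is the same fit restricted to the complementary mask with only the $s''$ windows as input.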

TABLE I
ω VALUE FOR DIFFERENT FRAME TYPE AND QP

Frame type   QP ≤ 25   QP > 25
I            0.9       0.5
P            0.5       0.5
B            0.5       0.5
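The selection implied by Table I can be written as a small lookup. Note that the function name and the exact QP breakpoint (taken here as QP ≤ 25, our reading of the table) are assumptions for illustration.

```python
def weighting_factor(frame_type, qp, qp_threshold=25):
    """Return the Category-I weighting factor omega of Eq. (7) per our
    reading of Table I: omega = 0.9 only for I frames at low QP,
    and 0.5 otherwise. qp_threshold = 25 is an assumption.
    """
    if frame_type == 'I' and qp <= qp_threshold:
        return 0.9
    return 0.5
```

For very small QP the choice hardly matters in practice, since (as noted above) Category I is then almost empty.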

As mentioned before, the deblocking filter and the adaptive in-loop filter should be designed jointly. To improve the overall performance, we observed that slightly increasing the deblocking parameters $\alpha$ and $\beta$ gives better performance in the relatively low QP range; a similar strategy has been reported in [9]. In our algorithm, the better of the two options, with or without the increased deblocking parameters, is chosen at the encoder based on the rate-distortion (RD) cost, and a flag in the slice header signals the selected deblocking parameter setting when CQALF is enabled.

As proposed in CQALF, the filter corresponding to Category I has two inputs, combined with a weighting factor $\omega$ as in Eq. (7). As $\omega$ varies from 0 to 1, the input $s'$ becomes gradually more involved in the filter. Since $s'$ is usually of lower quality than $s''$, how much it is involved influences the performance of CQALF. In general, the weighting factor is QP dependent. Based on off-line training, we summarize in Table I the values that give the best coding performance.

C. Compatibility with QALF

Since the estimated filter is optimal only in the sense of minimizing the MSE of the whole frame, it may be far from optimal for individual pixels; indeed, degradation can be observed for some pixels filtered by the estimated filter. To address this issue, QALF introduces a region-based signaling scheme to indicate whether each region is filtered. It adopts a bottom-up rate-distortion optimization (RDO) of the quadtree data structure to signal the block partition and the filtering decisions [5]. We integrate our scheme on top of the QALF framework to take advantage of this region-based filtering scheme. Note that in CQALF, although the pixel classification enables a more adaptive filter design and usually lower distortion than no classification, it costs bit rate by transmitting two sets of filter coefficients per frame.
Consequently, for each frame we compare the RD cost with and without classification and select the option with the lower RD cost. If classification is selected, the proposed CQALF is used; otherwise, the algorithm falls back to QALF. The quadtree signaling scheme is adopted to indicate the partition and the filtering decisions, similar to QALF. Since the pixels are classified into two categories, within each block to be filtered the pixels of each category are filtered by the corresponding filter.

IV. EXPERIMENTAL RESULTS

In this section, extensive experiments are conducted to verify the performance of the proposed CQALF. Simulations use the class B, C, D, and E test sequences recommended by the MPEG Call for Proposals, with the test conditions based on [10]. KTA software version kta2.6r1 is used as

the reference, with all KTA tools enabled except QALF. BD-PSNR is measured and compared according to [11]. To verify the performance of CQALF, we compare the BD bitrate savings of QALF and CQALF over the kta2.6r1 anchor without an adaptive in-loop filter, as shown in Table II. We tested three coding structures, intra only, IPPP, and HB, on the first 50 frames of each sequence. QP ranges from 22 to 37 with a step of 5, and the weighting factor ω is set according to Table I. As shown in Table II, CQALF outperforms QALF in all three cases.

TABLE II
BD BITRATE SAVING (%) OF QALF & CQALF OVER ANCHOR

                                       Intra            IPPP             HB
Sequence                            QALF  CQALF     QALF  CQALF     QALF  CQALF
Kimono1 (1920x1080, 24fps)         21.39  24.91     5.42   5.70     6.17   6.57
ParkScene (1920x1080, 24fps)        6.39  11.13     1.87   3.67     2.27   4.03
Cactus (1920x1080, 50fps)           5.36   8.42     4.48   5.38     4.31   5.03
BasketballDrive (1920x1080, 50fps)  3.74  11.59     7.29   9.78     5.45   7.80
BQTerrace (1920x1080, 60fps)        3.58   7.04     5.22   6.43     7.61   8.49
BasketballDrill (832x480, 50fps)   10.30  11.96    10.06  11.96     5.69   7.90
BQMall (832x480, 60fps)             3.05   5.33     3.71   4.71     2.58   3.26
PartyScene (832x480, 50fps)         1.53   3.57     3.53   4.07     5.68   5.96
RaceHorses (832x480, 30fps)         5.68   7.99     2.18   2.39     1.55   1.83
BasketballPass (416x240, 50fps)     5.36   9.91     4.44   6.72     3.35   4.90
BQSquare (416x240, 60fps)           1.61   4.18     5.56   7.14    13.15  13.75
BlowingBubbles (416x240, 50fps)     1.94   3.32     2.48   3.32     2.89   3.31
RaceHorses (416x240, 30fps)         7.30  10.06     1.96   2.19     1.18   1.60
vidyo1 (1280x720, 60fps)            8.54  11.99     8.92  20.53     5.06   6.76
vidyo3 (1280x720, 60fps)           12.10  15.03    11.46  12.84     8.96   9.75
vidyo4 (1280x720, 60fps)            7.25  11.33    10.37  14.02     6.78   8.02
Average                             6.57  10.05     5.56   7.55     5.17   6.19

We observe that the coding gain achieved by CQALF in the intra-only case is the largest of the three. In intra frames, more pixels are modified by the deblocking filter than in P and B frames, so over-filtering is more severe in I frames. Since CQALF targets exactly this problem, it is to be expected that the intra-only case demonstrates the improvement of CQALF most clearly.

Fig. 3 shows selected partial frames from two sequences to demonstrate the visual performance of CQALF. It can be seen that the proposed CQALF achieves competitive or even better subjective quality than QALF.

Fig. 3. Reconstructed BasketballDrill (832x480) sequence at the 10th frame with QP=37: (a) QALF: 32.12dB; (b) CQALF: 32.72dB. Reconstructed BasketballPass (416x240) sequence at the 9th frame with QP=37: (c) QALF: 31.96dB; (d) CQALF: 32.00dB.

V. CONCLUSION

We propose a classified quadtree-based adaptive loop filter that improves coding efficiency by jointly considering the design of the deblocking filter and the adaptive loop filter. Pixels in a coded frame are classified into two categories: pixels modified and pixels not modified by the deblocking filter. A Wiener filter is carefully designed for each category. For the pixels modified by the deblocking filter, the reconstructed pixels before and after the deblocking filter are combined as the input of the filter; for the pixels not modified, only the reconstructed pixels after the deblocking filter are used. Extensive experiments verify the performance of the proposed CQALF. Compared with the kta2.6r1 anchor, CQALF provides an average bitrate reduction of 10.05% for the intra-only case, 7.55% for IPPP, and 6.19% for HB.

REFERENCES

[1] S. Wittmann and T. Wedi, "Transmission of post-filter hints for video coding schemes", Proc. IEEE International Conference on Image Processing, 2007.
[2] T. Chujoh, A. Tanizawa, and T. Yamakage, "Adaptive Loop Filter for Improving Coding Efficiency", ITU-T SG16 Contribution C402, Geneva, April 2008.
[3] Y.-J. Chiu and L. Xu, "Adaptive (Wiener) Filter for Video Compression", ITU-T SG16 Contribution C437, Geneva, April 2008.
[4] T. Chujoh, A. Tanizawa, and T. Yamakage, "Block-based Adaptive Loop Filter", ITU-T SG16 Q.6 Document VCEG-AJ13, San Diego, October 2008.
[5] T. Chujoh, N. Wada, T. Watanabe, G. Yasuda, and T. Yamakage, "Specification and Experimental Results of Quadtree-based Adaptive Loop Filter", ITU-T SG16 Q.6 Document VCEG-AK22, Japan, April 2009.
[6] M. Karczewicz, P. Chen, et al., "Video coding technology proposal by Qualcomm Inc.", ITU-T SG16, JCTVC-A121, Germany, April 2010.
[7] J. Jung, T. K. Tan, S. Wittmann, E. Francois, et al., "Description of video coding technology proposal by France Telecom, NTT, NTT DOCOMO, Panasonic and Technicolor", ITU-T SG16, JCTVC-A114, Germany, April 2010.
[8] P. List, A. Joch, J. Lainema, G. Bjøntegaard, and M. Karczewicz, "Adaptive Deblocking Filter", IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, July 2003.
[9] W. Yang, "Image/video Processing Techniques: Header Compression and Post-processing", MPhil Thesis, Hong Kong University of Science and Technology, May 2010.
[10] T. K. Tan, G. J. Sullivan, and T. Wedi, "Recommended Simulation Common Conditions for Coding Efficiency Experiments Revision 1", ITU-T SG16 Q.6 Document VCEG-AJ10r1, San Diego, October 2008.
[11] S. Pateux and J. Jung, "An Excel Add-in for Computing Bjøntegaard Metric and its Evolution", ITU-T SG16 Q.6 Document VCEG-AE07, Marrakech, January 2007.
[12] Y. Liu, "Unified Loop Filter for Video Compression", IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, no. 10, pp. 1378-1382, Oct. 2010.
