Proceedings of the IEEE International Conference on Multimedia and Expo (ICME) 2017
10-14 July 2017
A BLOCK LEVEL ADAPTIVE MV RESOLUTION FOR VIDEO CODING
Bappaditya Ray¹,², Joël Jung¹ and Mohamed-Chaker Larabi²
¹ Orange Labs, ² XLIM, Université de Poitiers, France
{bappaditya.ray, joelb.jung}@orange.com {bappaditya.ray, chaker.larabi}@univ-poitiers.fr
ABSTRACT
The latest video coding standard, HEVC, uses quarter pixel motion vector (MV) resolution for motion compensation. The adaptation of MV resolution supported by PMVR (progressive MV resolution) brings further performance improvement by progressively adjusting the resolution according to the distance between the MV and the MV predictor (MVP). In this work, we propose to improve PMVR by adapting MV resolution at the prediction unit (PU) level, relying on its size and its average absolute gradient. We additionally perform a smarter motion estimation around multiple MV predictors to take full advantage of the proposed scheme. Compared to the HEVC reference software (HM-16.6), the proposed method provides 1.2%, 3.2% and 1.2% average BD-rate savings for the random access (RA), low-delay P (LP) and low-delay B (LD) configurations, respectively.
Index Terms: motion compensation, motion vector resolution, HEVC, PMVR.
1. INTRODUCTION
Motion compensation plays an important role in removing temporal redundancies in video coding schemes. Block-based video coders use block matching for motion estimation. For each prediction unit (PU), the associated motion vector (MV) is signaled in the bitstream for decoding purposes. This MV is not necessarily restricted to integer pixel positions: sub-pixel motion resolution can be used to represent motion more accurately, thus improving the motion compensation process. However, a fine MV resolution also generates more signaling due to the coding of high precision MVs. Consequently, a tradeoff should be found for MV resolution to balance the accuracy of the motion compensation process against the bit budget used for signaling MVs. In the latest video coding standard (HEVC) [1], the MV is represented using 1/4 pixel accuracy and this MV resolution is uniform across all blocks and frames. However, this uniformity of MV resolution is not optimal because of the varying characteristics among the different blocks.
During the last decade, adaptive MV resolution has been an active research topic. Ribas-Corbera and Neuhoff provided a theoretical framework for optimizing MV accuracy using a rate model for MV and residual coding [2] at both frame and block levels. Corrado et al. presented an H.264/AVC based scheme using quantized MVs [3]. Accordingly, a new coding mode was added to the H.264/AVC framework, which uses quantized MVs (QMV) for motion compensation. Recently, in the framework of screen content video coding (HEVC-SCC), a scheme of adaptive MV resolution, where slice-level control is enabled to switch between full-pel and fractional-pel resolutions, was presented [4]. With the aim of avoiding overhead signaling, Guo et al. proposed a scheme where MV resolution was implicitly signaled by computing the variance of the reconstructed residue block [5]. Focusing on residual energy modeling, Wang et al. proposed a scheme relying on a theoretical framework for modeling the residual energy as a function of motion vector resolution, image complexity and inter-frame noise [6]. The latter model was used for frame level optimization of MV resolution.
A progressive MV resolution (PMVR) scheme, where MV resolution is gradually decreased as the distance between MV and MVP increases, has been introduced in [7]. Indeed, MV positions closer to the MVP [8] are more likely to be optimal in the rate-distortion (RD) sense [9], i.e. fine MV resolution is beneficial when the MV is close to the MVP and vice versa. As a part of that work, a novel MV difference (MVD) coding scheme was proposed, showing noticeable gains over the HEVC reference software, specifically for the low-delay P (LP) configuration [10]. The main limitation is that MV resolution depends only on the distance between the MV and the MVP, irrespective of the size or spatial characteristics of the block. PMVR has been improved by Cho et al. using adaptive thresholding based on CU depth and strength of the frame [11]. Hence, finer (resp. coarser) MV resolution is employed when CU depth is small (resp. large). Even though this scheme brings some additional improvement when using low-delay configurations, the aforementioned strength is calculated globally per frame, and ignores the local variations of spatial characteristics that the frame may contain. Finally, in the very recent work of Wang et al., an adaptive PMVR scheme is proposed [12]. The scheme determines the optimal progressive MV resolution by decision trees constructed with the rate-distortion model, and significantly outperforms HEVC and PMVR.
In this paper, a more flexible framework for the PMVR scheme is proposed. The scheme allows an enhanced support of MV resolutions and provides a mechanism for selecting the appropriate MV resolution for each block depending on its size and spatial characteristics. Thus, it provides MV resolution adaptation at block level, accounting for the local variations of spatial characteristics inside the frame. Additionally, a smarter motion estimation around multiple MV predictors is proposed to further enhance the performance. The proposed scheme is described in Section 2 and the proposed configuration is given in Section 3. Section 4 discusses the simulation results obtained in HEVC, followed by some conclusions and ideas about future work.
978-1-5090-6067-2/17/$31.00 ©2017 IEEE
2. PROPOSED METHOD FOR MV RESOLUTION
The proposed method is an extension of the basic PMVR scheme. This section starts with a brief description of PMVR in order to introduce all the concepts necessary for the explanation of the proposed extension. The introduction is followed by the description of the proposed method, namely flexible MV resolution prediction (FMVRP).
2.1. Progressive MV resolution (PMVR)
In both H.264/AVC and HEVC, it is observed that MV positions closer to the MV predictor (MVP) [8] are more likely to be optimal in the RD optimization sense [9]. For those MV positions, it is therefore beneficial to search for MVs using a higher MV resolution. Conversely, MV positions far from the MVP are less likely to be optimal, leading to the use of a lower MV resolution [7]. In PMVR, three different MV resolutions are commonly used, i.e. 1/2, 1/4 and 1/8 pixel. MV resolution is progressively decreased as the MV moves away from the MV predictor. Around the MV predictor, a set of thresholds is defined to segment the search area into different MV resolution regions, as illustrated in Fig. 1. Thresholds TH_e and TH_q define square ranges, centered on the MVP, of 1/8 and 1/4 pixel MV resolution positions, respectively.
Fig. 1. Illustration of the PMVR scheme with thresholds TH_e = 2 and TH_q = 4, defined around the MVP: region A (1/8 pixel), region B (1/4 pixel), and region C (1/2 pixel) MV resolution positions.
With the defined 1/2, 1/4 and 1/8 pixel positions, the whole MV space is divided into three different regions: 1) Region A, where 1/8 pixel positions are allowed (inside the square range of size 2 × TH_e), 2) Region B, where 1/4 pixel positions are allowed (inside the square range of size 2 × TH_q) and 3) Region C, where 1/2 pixel positions are allowed (outside the square range of size 2 × TH_q). In the PMVR scheme, a piecewise MVD coding scheme is proposed. Thus, for an MV in Region A, the MVD is calculated as the difference between MV and MVP with 1/8 pixel accuracy. For an MV in Region B, the part of the MV exceeding the 1/8 pixel range (limited by TH_e) is calculated with 1/4 pixel accuracy. Further, for an MV in Region C, the part of the MV exceeding the 1/4 pixel range (limited by TH_q) is calculated with 1/2 pixel accuracy. Note that there is no change in the binarization and context modeling of the MVD for the encoding and decoding process. However, all the PUs use this same criterion for deriving MV resolution, unfortunately ignoring the intrinsic characteristics of the PU.
2.2. Flexible MV resolution prediction (FMVRP)
In our work, a more flexible framework denoted as FMVRP is introduced using the following features on top of PMVR.
2.2.1. Threshold adaptation based on PU size
As shown in [2], optimal MV resolution depends on the block size: it is finer for bigger blocks and coarser for smaller ones. In this work, the threshold is adapted depending on the area of the PU (PU_area = PU_height × PU_width). For HEVC, the PU size for inter prediction ranges from 4 × 8 or 8 × 4 to 64 × 64. Depending on PU_area, PUs are classified into different classes and each class has its own set of thresholds. Overall, PUs with larger PU_area use fine MV resolution and PUs with smaller PU_area use relatively coarse MV resolution. Hence, the thresholds (TH_e and TH_q) can be adapted depending on the PU size. Although this technique seems similar to the one used in [11], the fundamental difference is that in [11] the threshold is adapted depending on CU depth, where a CU at a particular depth may consist of multiple PUs of different sizes (due to asymmetric motion partitioning [13]) and all such PUs use the same threshold. In our scheme, thresholds are adapted at the PU level, thus providing finer granularity in the use of adaptive MV resolution.
2.2.2. Threshold adaptation based on PU strength
Optimal MV resolution also depends on the spatial complexity, or strength, of the block. So, apart from PU area, thresholds are adapted depending on the PU strength. In accordance with the findings in [2], finer MV resolution is used for high complexity PUs and vice versa. For each PU, the strength is estimated using the horizontal gradient G_h(x, y) and vertical gradient G_v(x, y), obtained by applying the commonly used Sobel filter to the PU pixels (high gradient values indicate a high complexity of the block). The gradient magnitude G(x, y) is obtained using eqn. 1, and the average gradient (G_avg) of the block is calculated as the average of the gradient magnitude over the block's pixels, as defined by eqn. 2. Here B and N_B correspond to the reference block and its number of pixels, respectively. Using G_avg for each block, the threshold can be adapted accordingly, where finer MV resolution is used for high strength PUs (high gradient value) and coarser MV resolution is used for low strength PUs (low gradient value).

\[ G(x, y) = |G_h(x, y)| + |G_v(x, y)| \tag{1} \]
\[ G_{avg} = \frac{1}{N_B} \sum_{(x,y) \in B} G(x, y) \tag{2} \]

However, as the current PU is not causal information (it is not available at the decoder), the gradient is calculated from the reference block. To derive the reference block, the closest reference picture of list 0 (the one with the minimum picture order count (POC) difference with respect to the current picture) is first selected. Then, the MVP is derived using that reference picture and rounded to integer pixel precision. Subsequently, it is used to fetch the reference block from the reference picture. Because the MVP has integer pixel precision, fetching the reference block does not require any fractional pixel interpolation, which significantly reduces the complexity of the overall process. Note that the reference block derived in this process is also available at the decoder, so no further signaling is needed.
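The strength estimation described above can be sketched as follows. This is a minimal illustration, not the HM implementation: the function names, the quarter-pel MVP representation, the round-half-up rule and the border replication used by the Sobel filter are all assumptions, since the text does not specify them.

```python
# Hypothetical helper names; border handling and rounding are assumptions.
SOBEL_H = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))   # horizontal gradient kernel
SOBEL_V = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))   # vertical gradient kernel


def closest_reference_poc(current_poc, list0_pocs):
    """Pick the list-0 reference picture with the minimum POC difference."""
    return min(list0_pocs, key=lambda poc: abs(poc - current_poc))


def round_mvp_to_integer(mvp_quarter_pel):
    """Round an MVP given in quarter-pel units to whole pixels (assumed rule)."""
    return tuple((c + 2) // 4 for c in mvp_quarter_pel)


def average_gradient(block):
    """G_avg of eqn. 2: mean of |G_h| + |G_v| (eqn. 1) over the block."""
    height, width = len(block), len(block[0])

    def pixel(y, x):
        # Assumption: replicate the block border for out-of-range samples.
        return block[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    total = 0
    for y in range(height):
        for x in range(width):
            gh = gv = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    p = pixel(y + dy, x + dx)
                    gh += SOBEL_H[dy + 1][dx + 1] * p
                    gv += SOBEL_V[dy + 1][dx + 1] * p
            total += abs(gh) + abs(gv)
    return total / (height * width)


flat_block = [[128] * 4 for _ in range(4)]        # no texture
edge_block = [[0, 0, 10, 10] for _ in range(4)]   # one vertical edge
```

A flat block yields a zero average gradient, while the block with a vertical edge yields a positive one, which matches the intent that textured PUs are classified as higher strength.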
2.2.3. Integer pixel MV resolution
To enhance the flexibility of MV resolution, an integer pixel layer is added on top of PMVR. As illustrated in Fig. 2, an additional threshold (TH_h) is defined to represent the boundary between 1/2 and integer pixel. The resulting configuration adds one more layer (Region D) of MV resolution.
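The layered regions (A to D) and the piecewise MVD computation of Section 2.1 can be sketched as below. Several points are assumptions made for illustration: thresholds are taken in whole-pixel units, the distance to the MVP is measured as the larger of the two component offsets (so that regions are squares), region boundaries are treated as exclusive, and the MVD is handled one component at a time. The function names are hypothetical.

```python
from fractions import Fraction
import math


def mv_resolution(dx, dy, th_e, th_q, th_h):
    """Return the MV resolution allowed at offset (dx, dy) from the MVP.

    Assumption: a position lies in a layer when its larger component
    offset is strictly below the layer threshold.
    """
    dist = max(abs(dx), abs(dy))
    if dist < th_e:
        return Fraction(1, 8)   # Region A
    if dist < th_q:
        return Fraction(1, 4)   # Region B
    if dist < th_h:
        return Fraction(1, 2)   # Region C
    return Fraction(1)          # Region D (integer pixel)


def piecewise_mvd(offset, th_e, th_q, th_h):
    """Count MVD steps for one MV component, each segment at its own accuracy.

    The part inside TH_e is counted in 1/8-pel steps, the part between
    TH_e and TH_q in 1/4-pel steps, and so on.
    """
    magnitude = abs(Fraction(offset))
    steps = Fraction(0)
    lower = 0
    for upper, steps_per_pel in ((th_e, 8), (th_q, 4), (th_h, 2), (math.inf, 1)):
        if magnitude > lower:
            steps += (min(magnitude, upper) - lower) * steps_per_pel
        lower = max(lower, upper)
    return steps if offset >= 0 else -steps
```

Under these assumptions, an offset of 3 pixels with (TH_e, TH_q, TH_h) = (2, 4, ∞) costs 2 × 8 + 1 × 4 = 20 steps, compared with 24 steps at a uniform 1/8 pixel accuracy, which illustrates how coarser outer layers shorten the MVD to be coded.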
Fig. 3. Motion estimation around multiple MV predictors (MEMVP) for TH_e = 2, TH_q = 4.
Fig. 2. Illustration of the PMVR scheme with an additional MV resolution layer for integer pixel obtained using TH_h = 8.

Table 1. Different sets of thresholds (TH_e, TH_q, TH_h) for block size and strength (PS) based threshold adaptation.
Class     PU_area (pels)   PS = 0        PS = 1        PS = 2
Class 3   > 1024           (0, 16, ∞)    (2, 28, ∞)    (8, 64, ∞)
Class 2   > 256            (0, 8, ∞)     (2, 16, ∞)    (4, 32, ∞)
Class 1   > 64             (0, 4, ∞)     (2, 16, ∞)    (2, 32, ∞)
Class 0   ≤ 64             (0, 0, 32)    (0, 8, ∞)     (2, 16, ∞)

2.2.4. Motion estimation around multiple MV predictors (MEMVP)
In HEVC, an MV is estimated for each block in order to perform motion compensation. To reduce MV coding bits, two MV predictors are derived from the causal spatio-temporal neighborhood (AMVP) and act as predictors for the MV of the current block [8]. In the reference software, however, the MV search is only performed around the best MV predictor (amongst the two AMVP predictors). In such a process, the best MV predictor gets more preference for MV search than the other MV predictor, which is sub-optimal. Moreover, for PMVR, the allowed MV resolutions also depend on the position of the MVP. In Fig. 3, it can be observed that, when using MVP1 for MV estimation, the fine MV resolution points around MVP2 are not searched because of their distance from MVP1, even though searching them could be beneficial, since MVP2 can also serve as a very good predictor. To overcome this limitation, we propose to perform the MV search around both AMVP candidates (MVP1 and MVP2) separately. Note that the MEMVP scheme is non-normative (specific to the encoding process) and does not impact the decoding process.
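The idea of searching around both predictors can be illustrated with a toy encoder-side sketch. It is not the search used in the reference software: the candidate set is restricted to the 1/8 pixel region around each predictor, and the cost function, the Lagrangian weight and all names are hypothetical stand-ins for the encoder's real RD cost.

```python
from fractions import Fraction


def fine_candidates(mvp, th_e=2, step=Fraction(1, 8)):
    """1/8-pel positions in the square of half-width th_e around one predictor."""
    n = int(th_e / step)
    return [(mvp[0] + i * step, mvp[1] + j * step)
            for i in range(-n, n + 1) for j in range(-n, n + 1)]


def memvp_search(predictors, cost_fn):
    """Search around every predictor separately and keep the cheapest MV."""
    best = None
    for mvp in predictors:
        for mv in fine_candidates(mvp):
            cost = cost_fn(mv, mvp)
            if best is None or cost < best[0]:
                best = (cost, mv, mvp)
    return best[1], best[2]


# Toy RD cost (assumption): squared distance to a known "true" motion as the
# distortion term, plus a small penalty on the MVD magnitude as the rate term.
TRUE_MV = (Fraction(81, 8), Fraction(10))


def toy_cost(mv, mvp, lam=Fraction(1, 100)):
    distortion = (mv[0] - TRUE_MV[0]) ** 2 + (mv[1] - TRUE_MV[1]) ** 2
    rate = abs(mv[0] - mvp[0]) + abs(mv[1] - mvp[1])
    return distortion + lam * rate


best_mv, best_mvp = memvp_search([(0, 0), (10, 10)], toy_cost)
```

In this toy setting the true motion sits at a 1/8 pixel position next to the second predictor, so a search confined to the first predictor cannot reach it at fine resolution, whereas the search around both predictors does.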
3. PROPOSED FMVRP CONFIGURATION
As mentioned previously, in this work, the thresholds are adapted depending on PU size and strength. The FMVRP configuration is represented using the set of thresholds TH = (TH_e, TH_q, TH_h). When a threshold is equal to 0, the corresponding MV resolution layer is not present. Conversely, when a threshold is equal to ∞, the corresponding MV resolution layer is present everywhere beyond the finer MV resolution region(s) and no subsequent coarser MV resolution is used. In order to achieve an appropriate threshold adaptation, PUs are divided into 4 different classes depending on PU area. Besides, to account for PU strength, as described in Section 2.2.2, the PU strength or spatial complexity is computed using the average gradient magnitude G_avg of the reference block. Here, PUs are classified according to the G_avg value of the PU (in addition to the PU size). The PUs are classified into three different PU strengths (PS), using two gradient thresholds T_G1 and T_G2, as shown by eqn. 3.

\[ PS = \begin{cases} 0 & \text{for } G_{avg} < T_{G1} \\ 1 & \text{for } T_{G1} \le G_{avg} < T_{G2} \\ 2 & \text{for } G_{avg} \ge T_{G2} \end{cases} \tag{3} \]

Based on the above, the set of thresholds for a given block is jointly decided depending on PU size and PS. Table 1 reports the different sets of thresholds corresponding to the 4 classes of PU size and the 3 PU strengths. These values are obtained through an optimization process that gradually and empirically finds suitable values for each case. The optimal sets have been derived using (T_G1, T_G2) = (50, 400).
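The joint selection by PU size and PU strength amounts to a table lookup, sketched below with the values of Table 1 and (T_G1, T_G2) = (50, 400). The function and constant names are hypothetical; only the class limits, gradient thresholds and threshold triplets come from the text.

```python
import math

INF = math.inf
TG1, TG2 = 50, 400  # gradient thresholds reported in the text

# Table 1: PU class -> thresholds (TH_e, TH_q, TH_h) for PS = 0, 1, 2.
TABLE_1 = {
    3: ((0, 16, INF), (2, 28, INF), (8, 64, INF)),
    2: ((0, 8, INF), (2, 16, INF), (4, 32, INF)),
    1: ((0, 4, INF), (2, 16, INF), (2, 32, INF)),
    0: ((0, 0, 32), (0, 8, INF), (2, 16, INF)),
}


def pu_class(width, height):
    """Classify a PU by its area in pixels."""
    area = width * height
    if area > 1024:
        return 3
    if area > 256:
        return 2
    if area > 64:
        return 1
    return 0


def pu_strength(g_avg):
    """PS of eqn. 3 from the average gradient of the reference block."""
    if g_avg < TG1:
        return 0
    if g_avg < TG2:
        return 1
    return 2


def fmvrp_thresholds(width, height, g_avg):
    """Return (TH_e, TH_q, TH_h) for one PU."""
    return TABLE_1[pu_class(width, height)][pu_strength(g_avg)]
```

For example, a 64 × 64 PU with a high average gradient gets the widest fine-resolution regions, whereas a small, smooth PU gets no 1/8 or 1/4 pixel layer at all.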
Table 2. List of sequences used in the experimental part.
Resolution                Name                    Frame rate (fps)
Class A (2560 × 1600)     Traffic                 30
                          PeopleOnStreet          30
                          NebutaFestival          60
                          SteamLocomotiveTrain    60
Class B (1920 × 1080)     Kimono                  24
                          ParkScene               24
                          Cactus                  50
                          BasketballDrive         50
                          BQTerrace               60
Class C (832 × 480)       BasketballDrill         50
                          BQMall                  60
                          PartyScene              50
                          RaceHorses              30
Class D (416 × 240)       BasketballPass          50
                          BQSquare                60
                          BlowingBubbles          50
                          RaceHorses              30
Class E (1280 × 720)      FourPeople              60
                          Johnny                  60
                          KristenAndSara          60

Table 3. BD-rate and complexity comparison between PMVR (anchor), and PMVR with MEMVP for RA, LP and LD configurations and different classes of the dataset (see Table 2).
            RA       LP       LD
Class A     -0.1%    n/a      n/a
Class B     -0.2%    -0.1%    -0.1%
Class C     -0.2%    -0.3%    -0.2%
Class D     -0.4%    -0.2%    -0.2%
Class E     n/a      -0.1%    -0.1%
Average     -0.2%    -0.2%    -0.1%
EncT        123%     125%     121%
DecT        100%     100%     101%

4. EXPERIMENTAL RESULTS
To evaluate the performance of the proposed method, FMVRP is implemented in the HEVC reference software (HM-16.6). Simulations are carried out following the common test conditions (CTC) [10]. The tests are run with the main 10 bits profiles (10 bits for internal processing) of the RA (random access), LP (low-delay P) and LD (low-delay B) configurations, with the set of quantization parameters QP = {22, 27, 32, 37}. The test sequences used are listed in Table 2. Simulations are performed for the first second of each sequence. PSNR is used as the distortion metric for calculating the BD-rate, and MS-SSIM is used to report perceptual visual quality [14].

Table 4. BD-rate and complexity comparison between HM-16.6 (anchor), PMVR (anchor), and proposed FMVRP configuration for RA, LP and LD configurations and different classes of the dataset (see Table 2).
            Anchor: HM-16.6             Anchor: PMVR
            RA       LP       LD        RA       LP       LD
Class A     -0.3%    n/a      n/a       -0.4%    n/a      n/a
Class B     -0.7%    -2.2%    -0.5%     -0.3%    -0.4%    -0.2%
Class C     -1.6%    -3.6%    -1.7%     -0.4%    -0.6%    -0.5%
Class D     -2.3%    -5.3%    -2.2%     -0.9%    -0.8%    -0.7%
Class E     n/a      -1.2%    -0.6%     n/a      -0.5%    -0.3%
Average     -1.2%    -3.2%    -1.2%     -0.5%    -0.5%    -0.4%
EncT        143%     142%     141%      125%     122%     121%
DecT        113%     105%     109%      111%     105%     108%

4.1. Performance of the MEMVP scheme
To investigate the performance of the MEMVP scheme, simulations are carried out with the configuration (TH_e, TH_q, TH_h) = (2, 4, ∞), equivalent to the PMVR scheme, as the anchor. The MEMVP scheme is enabled on top of this anchor. From Table 3, it can be observed that the MEMVP scheme provides around 0.2%, 0.2% and 0.1% gain for the RA, LP and LD configurations, respectively. The scheme increases the encoding complexity by 20-25%, with almost no impact on the decoding complexity. The increase in encoding complexity is due to the increased number of MV searches during the motion estimation process.
4.2. Performance of the proposed FMVRP scheme
Simulations are also carried out with the proposed FMVRP configuration described in Section 3, with the MEMVP scheme enabled. The proposed scheme is compared against the corresponding HEVC anchor (HM-16.6) and PMVR. From Table 4, it can be seen that the proposed configuration provides 1.2%, 3.2% and 1.2% BD-rate gain for the RA, LP and LD configurations, respectively. Compared to PMVR, it provides 0.5%, 0.5% and 0.4% gain for the respective configurations. The gains in the LP configuration are higher than in the RA and LD ones. Overall, the proposed scheme increases encoding runtime by around 40% and decoding runtime by around 10% when compared to HM-16.6. When compared with PMVR, the increases in encoding and decoding runtime are lower, around 25% and 10% respectively. The increase in decoding complexity is mainly due to the average gradient calculation of the reference block. A comparison of BD curves is also presented for the BQSquare sequence in the LP configuration in Fig. 4, with PSNR and MS-SSIM as distortion metrics. It can be seen that FMVRP significantly outperforms HM-16.6 with respect to both metrics (PSNR and MS-SSIM). Simulations are also carried out by disabling MEMVP for the proposed configuration. The obtained performance, given in Table 5, shows lower gains and lower encoding runtime compared to the configuration with MEMVP enabled.
Fig. 4. Comparison of BD curves between HM-16.6 and FMVRP for BQSquare sequence with LP configuration (Left: PSNR, Right: MS-SSIM).
Table 5. BD-rate and complexity comparison between HM-16.6 (anchor), PMVR (anchor), and proposed FMVRP configuration with MEMVP disabled for RA, LP and LD configurations and different classes of the dataset (see Table 2).
            Anchor: HM-16.6             Anchor: PMVR
            RA       LP       LD        RA       LP       LD
Class A     -0.1%    n/a      n/a       -0.1%    n/a      n/a
Class B     -0.6%    -2.1%    -0.3%     -0.2%    -0.2%    0.0%
Class C     -1.3%    -3.3%    -1.5%     -0.2%    -0.2%    -0.3%
Class D     -2.1%    -5.0%    -1.8%     -0.7%    -0.5%    -0.4%
Class E     n/a      -1.0%    -0.5%     n/a      -0.3%    -0.1%
Average     -1.0%    -2.9%    -1.0%     -0.3%    -0.3%    -0.2%
EncT        117%     115%     119%      102%     99%      102%
DecT        117%     112%     117%      115%     112%     116%

4.3. Impact of spatio-temporal characteristics of the video sequences on FMVRP performance
In this section, the correlation between the spatio-temporal characteristics of the test sequences and the performance of the proposed scheme is investigated. For the spatio-temporal characterization of the video, we use two indicators, namely the spatial information (SI) and temporal information (TI) content of the sequence, as defined in ITU-T P.910 [15]. SI is an indicator of the spatial complexity of the frame: high SI indicates high spatial complexity and vice versa. TI is an indicator of motion content: high TI indicates high motion and vice versa. To investigate the impact, the performance of the LP configuration (as the LP configuration provides the highest average gain) is compared with the SI/TI of the video sequences. From the results of Table 6, it can be observed that, for sequences with high SI and low TI (BQSquare, BQTerrace, PartyScene), the performance of the proposed scheme is significantly better compared to HEVC. Conversely, for sequences with high TI and/or low SI (Kimono, RaceHorses), the gain is low. The reason is that sequences with high spatial complexity require fine MV resolution. Moreover, when the sequence has low TI, the magnitude of the MV is generally very small. As our proposed scheme uses fine MV resolution when the MVD is small, it exploits the spatio-temporal characteristics of such content well. However, it should be noted that the class E sequences (FourPeople, Johnny, KristenAndSara) are video-conferencing sequences with different characteristics (static background) from all the other classes of sequences, which is also reflected in their extremely low TI values.
To validate the aforementioned finding, we have also calculated the correlation between the SI/TI values of the sequences and the performance of FMVRP. The SI and TI values of all sequences are normalized using eqn. 4.

\[ SI_{norm} = \frac{SI - SI_{min}}{SI_{max} - SI_{min}}, \qquad TI_{norm} = \frac{TI - TI_{min}}{TI_{max} - TI_{min}} \tag{4} \]
\[ Index_{ST} = SI_{norm} + (1 - TI_{norm}) \tag{5} \]

Here, SI_max and SI_min are respectively the maximum and minimum values of SI in the set of video sequences (using Table 6). Similarly, for TI, the corresponding values are TI_max and TI_min. Furthermore, a spatio-temporal index Index_ST is defined, as described in eqn. 5. According to the definition, sequences with high SI and low TI will have a comparatively higher value of Index_ST. The Pearson correlation coefficient (PCC) and Spearman rank order correlation coefficient (SROCC) between Index_ST and the FMVRP gain are found to be 0.7 and 0.46, respectively. However, when the class E sequences are excluded (due to their different characteristics) from the correlation, PCC and SROCC are found to be 0.86 and 0.81, respectively. In Fig. 5, the performance of FMVRP is plotted against Index_ST. These results imply that the performance of FMVRP is strongly correlated with the spatio-temporal characteristics of the sequences.

Table 6. Performance of the proposed scheme with respect to spatio-temporal characteristics (SI and TI) of video sequences.
Sequence            SI     TI    BD-Rate
Kimono              23     13    -0.1%
ParkScene           47     10    -1.0%
Cactus              63     12    -1.3%
BasketballDrive     53     19    -1.1%
BQTerrace           104    13    -7.8%
BasketballDrill     66     16    -3.4%
BQMall              88     16    -3.4%
PartyScene          103    12    -7.0%
RaceHorses          80     34    -0.6%
BasketballPass      69     10    -1.1%
BQSquare            158    7     -17.3%
BlowingBubbles      72     14    -1.9%
RaceHorses          97     32    -1.0%
FourPeople          68     3     -1.1%
Johnny              54     2     -2.4%
KristenAndSara      73     2     0.0%

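The normalization of eqn. 4, the index of eqn. 5 and the Pearson correlation can be reproduced from the Table 6 values with the short sketch below. The helper names are hypothetical, the gain is taken as the negated BD-rate (an assumption about how the correlation was computed), and the rank correlation is left out for brevity.

```python
import math

# (sequence, SI, TI, BD-rate in %) as listed in Table 6 (LP configuration).
TABLE_6 = [
    ("Kimono", 23, 13, -0.1), ("ParkScene", 47, 10, -1.0),
    ("Cactus", 63, 12, -1.3), ("BasketballDrive", 53, 19, -1.1),
    ("BQTerrace", 104, 13, -7.8), ("BasketballDrill", 66, 16, -3.4),
    ("BQMall", 88, 16, -3.4), ("PartyScene", 103, 12, -7.0),
    ("RaceHorses (C)", 80, 34, -0.6), ("BasketballPass", 69, 10, -1.1),
    ("BQSquare", 158, 7, -17.3), ("BlowingBubbles", 72, 14, -1.9),
    ("RaceHorses (D)", 97, 32, -1.0), ("FourPeople", 68, 3, -1.1),
    ("Johnny", 54, 2, -2.4), ("KristenAndSara", 73, 2, 0.0),
]


def normalize(values):
    """Min-max normalization of eqn. 4."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def spatio_temporal_index(si_values, ti_values):
    """Index_ST of eqn. 5: high for high SI and low TI."""
    return [s + (1 - t)
            for s, t in zip(normalize(si_values), normalize(ti_values))]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


names = [row[0] for row in TABLE_6]
index_st = spatio_temporal_index([row[1] for row in TABLE_6],
                                 [row[2] for row in TABLE_6])
gains = [-row[3] for row in TABLE_6]  # BD-rate saving expressed as a positive gain
pcc = pearson(index_st, gains)
```

With all 16 sequences included, the resulting coefficient should land close to the PCC of 0.7 quoted in the text, and BQSquare receives the highest index, in line with its large gain.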
Fig. 5. Correlation between IndexST and performance of FMVRP.
5. CONCLUSION
In this work, a PU level adaptive MV resolution is proposed to improve the PMVR scheme by using the PU's intrinsic information, namely its size and average gradient. A smarter motion estimation algorithm around multiple MV predictors is also introduced to further exploit this scheme. Experimental results show a performance improvement compared to both the HEVC anchor and PMVR. The relationship between the spatio-temporal characteristics of video sequences and the performance of the proposed scheme is also investigated to provide insights into its behavior. The proposed method could be further enhanced by adapting MV resolution using the spatio-temporal characteristics of the video sequences. Also, the increase in encoding complexity is largely due to the MEMVP scheme, so this scheme could be further optimized to reduce the encoding complexity.
6. REFERENCES
[1] Gary J. Sullivan, Jens-Rainer Ohm, Woo-Jin Han, and Thomas Wiegand, “Overview of the high efficiency video coding (HEVC) standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649–1668, 2012.
[2] Jordi Ribas-Corbera and David L. Neuhoff, “Optimizing motion-vector accuracy in block-based video coding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 4, pp. 497–511, 2001.
[3] Silvia Corrado, Marie Andrée Agostini, Marco Cagnazzo, Marc Antonini, Guillaume Laroche, and Joël Jung, “Improving H.264 performances by quantization of motion vectors,” in Proc. IEEE Picture Coding Symposium, Chicago, USA, 2009, pp. 1–4.
[4] Jizheng Xu, Rajan Joshi, and Robert A. Cohen, “Overview of the emerging HEVC screen content coding extension,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 26, no. 1, pp. 50–62, 2016.
[5] Liwei Guo, Peng Yin, Yunfei Zheng, Xiaoan Lu, Qian Xu, and Joel Sole, “Adaptive motion vector resolution with implicit signaling,” in Proc. IEEE International Conference on Image Processing, Hong Kong, China, 2010, pp. 2057–2060.
[6] Zhao Wang, Juncheng Ma, Falei Luo, and Siwei Ma, “Adaptive motion vector resolution prediction in block-based video coding,” in Proc. IEEE Visual Communications and Image Processing (VCIP), Singapore, 2015, pp. 1–4.
[7] Juncheng Ma, Jicheng An, Kai Zhang, Siwei Ma, and Shawmin Lei, “Progressive motion vector resolution for HEVC,” in Proc. IEEE Visual Communications and Image Processing (VCIP), Kuching, Malaysia, 2013, pp. 1–6.
[8] Jian-Liang Lin, Yi-Wen Chen, Yu-Pao Tsai, Yu-Wen Huang, and Shawmin Lei, “Motion vector coding techniques for HEVC,” in Proc. IEEE 13th International Workshop on Multimedia Signal Processing (MMSP), Hangzhou, China, 2011, pp. 1–6.
[9] Gary J. Sullivan and Thomas Wiegand, “Rate-distortion optimization for video compression,” IEEE Signal Processing Magazine, vol. 15, no. 6, pp. 74–90, 1998.
[10] Karsten Suehring and Karl Sharman, “Common test conditions,” in Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, JCTVC-X1100, 2016, pp. 1–14.
[11] Yeon-Jea Cho, Jung-Hoon Ko, Hyeong-Geun Yu, Jai-Hoon Lee, Dong-Jo Park, and Sung-Ho Jun, “Adaptive motion vector resolution based on the rate-distortion cost and coding unit depth,” in Proc. IEEE International Advance Computing Conference (IACC), Gurgaon, India, 2014, pp. 1000–1003.
[12] Zhao Wang, Shiqi Wang, Jian Zhang, and Siwei Ma, “Adaptive progressive motion vector resolution selection based on rate-distortion optimization,” IEEE Transactions on Image Processing, vol. 26, no. 1, pp. 400–413, 2016.
[13] Il-Koo Kim, Sunil Lee, Min-Su Cheon, Tammy Lee, and Jeong-Hoon Park, “Coding efficiency improvement of HEVC using asymmetric motion partitioning,” in Proc. IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, Seoul, South Korea, 2012, pp. 1–4.
[14] Zhou Wang, Eero P. Simoncelli, and Alan C. Bovik, “Multiscale structural similarity for image quality assessment,” in Signals, Systems and Computers, 2004. Conference Record of the Thirty-Seventh Asilomar Conference, Pacific Grove, CA, USA, 2003, vol. 2, pp. 1398–1402.
[15] ITU-T P.910, “Subjective video quality assessment methods for multimedia applications,” Tech. Rep., ITU, September 1999.