Multimed Tools Appl (2016) 75:1963–1981 DOI 10.1007/s11042-014-2382-7
Fast intra mode decision algorithm for HEVC based on dominant edge assent distribution Yingbiao Yao & Xiaojuan Li & Yu Lu
Received: 24 March 2014 / Revised: 22 September 2014 / Accepted: 10 November 2014 / Published online: 29 November 2014 © Springer Science+Business Media New York 2014
Abstract As the latest video coding standard, high efficiency video coding (HEVC) is the successor to H.264/AVC. To improve the coding efficiency of intra coding, HEVC employs a flexible quad-tree coding block partitioning structure and 35 intra prediction modes. The optimal prediction mode is selected through a rough mode decision (RMD) process followed by a rate distortion optimisation (RDO) process. Because of the huge search space of all possible depth levels (CU sizes) and intra prediction modes, intra coding in HEVC is a very time-consuming and complicated process, which limits the application of HEVC. To reduce the intra coding complexity, we propose a fast mode decision algorithm for HEVC intra prediction that is based on dominant edge assent (DEA) and its distribution. The four DEAs in the directions of 0°, 45°, 90° and 135° are computed first; then, the dominant edge is decided according to the minimum DEA. Next, a subset of prediction modes in accordance with the dominant edge is chosen for the RMD process. The rule is as follows: when the standard deviation of the DEAs is distinctly small, we skip the RMD process and take the direct current (DC) mode and the planar mode as the candidate modes for the RDO process; when the minimum DEA is distinctly small, we select seven modes as the candidate modes for the RMD process; otherwise, we select 11 modes for the RMD process. Lastly, the prediction unit (PU) size-based number of RDO candidate modes is modified according to experimental analysis (3 for PU sizes 4×4 and 8×8 and 1 for the other PU sizes). Experimental results reveal that, compared with HM 9.1 and with two state-of-the-art fast intra mode decision algorithms, Shen’s proposal and da Silva’s proposal, the proposed algorithm saves 36.26, 13.85 and 20.81 % coding time on average, respectively, with a negligible loss of coding efficiency.

Keywords HEVC . Mode decision . Intra prediction . Dominant edge assent distribution
This work was supported in part by the National Natural Science Foundation of China under Grant No.61100044 and Zhejiang Provincial Natural Science Foundation of China under Grant No.LY12F01007.
Y. Yao (*) · X. Li · Y. Lu
College of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, People’s Republic of China
1 Introduction

High efficiency video coding (HEVC) is the latest video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC) [20]. HEVC aims to substantially improve coding efficiency compared with its predecessor H.264/AVC; it employs many state-of-the-art techniques and algorithms, which together help meet the goal of halving the bitrate while keeping video quality comparable to that of H.264/AVC High Profile [14]. The official version of the HEVC standard was issued in February 2013 and still uses a block-based hybrid coding architecture. For intra prediction, it employs a flexible quad-tree coding block partitioning structure and a multi-angle intra prediction method [18].

The quad-tree partitioning structure introduces three block concepts: coding unit (CU), prediction unit (PU) and transform unit (TU), which enable the efficient use of large and multiple sizes of coding, prediction and transform blocks. The CU is the elementary unit of region splitting used for inter/intra prediction; its size can range from 8×8 to 64×64 in the test model of HEVC (HM) [13]. The CU concept allows for recursive splitting into four equal blocks, as shown in Fig. 1. The PU is the elementary unit for prediction and is defined after the last depth level of CU splitting. For intra prediction, two PU sizes are supported at each depth level: 2N×2N and N×N. The possible PU sizes are 64×64, 32×32, 16×16, 8×8 and 4×4. The TU is the unit for transform and quantisation. The multi-angle intra prediction method offers 35 modes, comprising 33 directional modes, a direct current (DC) mode and a planar mode, as shown in Fig. 2. The flexible block structure adapts better to the image content, and the large number of prediction modes greatly improves the prediction precision.

Fig. 1 Illustration of recursive CU structure in HEVC

Fig. 2 Thirty-five intra prediction modes in HEVC

As in its predecessor H.264/AVC, the intra mode decision process in HEVC uses rate distortion optimisation (RDO) to find the mode with the least rate distortion (RD) cost among all of the possible depth levels (CU sizes) and intra prediction modes. The RD cost function (J_RDO) used in HM is evaluated as follows:

$$J_{RDO} = SSE + \lambda \cdot B \qquad (1)$$

where SSE is the sum of squared errors between the current CU and the matching block, λ is the Lagrange multiplier and B specifies the bit cost to be considered for mode decision, which depends on each decision case. The RDO calculation is very complex. Moreover, the flexible block structure and the many prediction modes increase the number of RDO calculations; it is therefore very time-consuming for the HEVC encoder to apply RDO by traversing all combinations of CU, PU and TU sizes and 35 modes for each block. This complexity greatly limits the application of HEVC; thus, reducing the HEVC intra coding complexity while preserving coding performance has become one of the research hotspots in the practical application of HEVC.

In this paper, we propose a fast HEVC intra mode decision algorithm to reduce the complexity of intra encoding. Based on dominant edge assent and its distribution, our algorithm can flexibly choose a different number of prediction modes as candidate modes for the RMD and RDO processes. The major contributions of this paper are as follows:

1) We improve the mode classification method in [6] by taking advantage of the effective edge direction detection method in [19], which is based on dominant edge assent (DEA). Instead of categorising modes into five types and determining the type with five different filters as in [6], our DEA-based method classifies the 35 intra prediction modes into four types at the following degrees: 0, 45, 90 and 135. Each type has 11 intra prediction modes for the RMD process.

2) We further reduce the number of prediction modes for the RMD process by using the distribution of the DEAs, namely their standard deviation and the difference between the second lowest value and the minimum value. According to this difference and the standard deviation, seven or eleven prediction modes are selected as candidate modes for the RMD process; in a particular case, we skip the RMD process.

3) We modify the PU size-based number of modes selected from the RMD process (3 for PU sizes 4×4 and 8×8 and 1 for the other PU sizes); thus, fewer modes are applied in the RDO calculation.

4) We integrate the proposed algorithm into the test model HM 9.1 and conduct an extensive performance evaluation. We compare the △Bitrate, △PSNR, △T, BD-Rate and BD-PSNR of the proposed algorithm with those of HM 9.1 and of two state-of-the-art fast intra mode decision algorithms, Shen’s proposal in [17] and da Silva’s proposal in [6].

The results show that the proposed algorithm can save 36.26, 13.85 and 20.81 % coding time on average with a negligible loss of coding efficiency over HM 9.1, Shen’s proposal and da Silva’s proposal, respectively.

The remainder of this paper is organised as follows. Section 2 reviews related works. Section 3 presents a detailed description of the proposed fast intra mode decision algorithm. Section 4 shows the experimental results. Finally, Section 5 concludes the paper.
2 Related works

Recently, much attention has been paid to fast HEVC intra coding algorithms, which can be classified into three types: simplifying the RD cost function [9, 16], fast block partitioning [5, 10–12, 17, 21] and fast mode decision [3, 4, 6–8, 15, 22]. In this section, we briefly discuss these intra coding algorithms, especially the fast block partitioning and fast mode decision algorithms.

In [16], a Hadamard cost-based RMD process is introduced to choose the N best candidate intra modes. Only these N candidate modes are tested in the RDO process, instead of all of the modes. Since RMD is much less time-consuming than RDO, it substantially reduces encoding time and was adopted in the first test model, HM 1.0. Zhao et al. [22] further reduce the number of candidate modes selected from the RMD process to cut down the RDO calculation; meanwhile, they make full use of the direction information of the neighbouring blocks and add the most probable mode (MPM) of the neighbouring blocks to the candidate mode set for the RDO process. Zhao’s proposal is partly adopted in HM 2.0. In subsequent versions of the test model HM, the intra encoding framework changes little.
The works in [5, 10–12, 17, 21] exploit the coding block structure in fast block partitioning algorithms to speed up the intra coding process. Shen et al. [17] skip the coding process of seldom-used depth levels when partitioning a coding block; they also use the RD cost and prediction mode correlations among different depth levels or spatially nearby CUs to skip prediction modes that are rarely used in the parent CUs of the upper depth levels or in spatially nearby CUs. According to the spatial and temporal correlation between adjacent CUs, Yan et al. [21] predict the depth of a coding block from the optimal depths of adjacent CUs. In [11], Kim et al. analyse the correlation between RD costs and block size. During block partitioning, they terminate the sub-block division by comparing the RD cost with a threshold value obtained from experiments. Furthermore, the authors in [5] analyse the correlation between block size and the Hadamard cost of the RMD process. They skip the unnecessary coding process of large coding blocks and terminate the sub-block partitioning by using the Hadamard cost and the RD cost. Based on the principle that the prediction residual reflects prediction accuracy, Ma et al. [12] terminate block division by comparing the prediction residual with a threshold value obtained from their experiments. Khan et al. [10] consider the correlation between the video texture features and the variance of pixel values to simplify the coding block segmentation process.

Fast mode decision algorithms can be found in [3, 4, 6–8, 15]. As mentioned in [15], since pixels along the direction of a local edge normally have similar values, a good prediction can be achieved by predicting the pixels from the neighbouring pixels that lie in the same direction as the edge. Jiang et al. [8] calculate the gradient directions of the current PU’s pixels and use the gradient-mode histogram to choose a reduced set of candidate prediction directions for the RMD calculation. Chen et al. [4] use a method similar to Jiang’s in the RDO process and reduce the number of candidate prediction modes for RDO as well. Chen et al. [3] introduce the concept of kernel density estimation into the histogram calculation, which improves the robustness of the edge direction statistics. As calculating each pixel’s gradient information requires much time, the fast intra mode decision algorithms in [6, 7, 19] extract the edge information of each encoding block rather than of each pixel. For H.264 intra prediction, the authors in [19] estimate the edge direction inside the block by employing dominant edge assent, which narrows down the prediction modes and reduces the RDO calculation. Da Silva et al. [6] apply five different filters to detect the dominant edge; then, they choose a subset of the prediction modes in accordance with the detected dominant edge for the RMD calculation. Fang et al. [7] exploit the direction energy distribution to select a subset of the prediction modes for the RMD process. Although these algorithms can be used to reduce the computational complexity of HEVC, considering additional features of the current block’s dominant edge statistics can reduce the complexity further.
3 Proposed fast intra mode decision algorithm

Figure 3 describes the intra mode decision algorithm in HM 9.1. There are three steps: the RMD process, the process of adding the MPM into the candidate set and the RDO process. To avoid checking all of the intra modes in the RMD process and to further reduce the candidate modes of the RDO calculation, we propose a fast mode decision algorithm that significantly reduces the encoder complexity. The detailed algorithm is described next.
Fig. 3 Intra mode decision procedure for a depth level in HM 9.1. Flow: RMD (check 35 intra prediction modes by SATD and select N candidates; N is 8 for 4×4 and 8×8, and 3 for 16×16, 32×32 and 64×64), then add MPM into the candidate set, then RDO (select the best intra mode).
3.1 DEA-based mode decision

3.1.1 Prediction mode classification

To reduce the coding complexity by passing fewer modes to the RMD process, we first classify the intra prediction modes. As described in [6], da Silva et al. give five subsets of prediction modes associated with five types of edge (four directional edges at 0°, 45°, 90° and 135°, and one non-directional edge). The dominant edge is detected by using five different filters, and a great deal of time is saved by running the RMD calculation on one subset of prediction modes rather than on all prediction modes. Instead of categorising the edges into five types and detecting the dominant edge with five different filters as in [6], our method uses only the four directional edge types and omits the non-directional type; the edge type is detected by analysing the DEAs of the current block.

Figure 4 shows the candidate directional modes in accordance with the different dominant edge directions. The candidate directional modes are composed of the mode corresponding to the dominant edge direction and the eight adjacent angular modes that are most often selected as the best mode. Modes 14, 22 and 30 each belong to two adjacent types of edge. In order to retain prediction performance in smoother blocks, the DC mode and the planar mode are always chosen for the RMD process. Table 1 shows the four subsets of candidate prediction modes in accordance with the different dominant edge directions. Each subset contains only 11 modes rather than 35 modes for the RMD calculation.

Fig. 4 Modes in accordance with four dominant edges (angular modes 2 to 34, each marked as belonging to the 0°, 45°, 90° or 135° edge type)

Table 1 Subsets of prediction modes in accordance with the detected dominant edge direction
Dominant edge direction    Subset of modes chosen for RMD calculation
0°                         6, 7, 8, 9, 10, 11, 12, 13, 14, 0, 1
45°                        2, 3, 4, 5, 30, 31, 32, 33, 34, 0, 1
90°                        22, 23, 24, 25, 26, 27, 28, 29, 30, 0, 1
135°                       14, 15, 16, 17, 18, 19, 20, 21, 22, 0, 1

3.1.2 Dominant edge assent

In order to determine the prediction mode type, we detect the dominant edge by employing the DEA in [19]. As described in [19], an N×N-size PU is equally divided into four N/2×N/2-size parts, and an additional N/2×N/2-size central part is defined, as shown in Fig. 5. The average pixel values of the five parts are computed as (2)–(6):

$$p_a = \frac{\sum_{i=0}^{N/2-1} \sum_{j=0}^{N/2-1} p_{ij}}{N \times N / 4} \qquad (2)$$

$$p_b = \frac{\sum_{i=N/2}^{N-1} \sum_{j=0}^{N/2-1} p_{ij}}{N \times N / 4} \qquad (3)$$

$$p_c = \frac{\sum_{i=0}^{N/2-1} \sum_{j=N/2}^{N-1} p_{ij}}{N \times N / 4} \qquad (4)$$

$$p_d = \frac{\sum_{i=N/2}^{N-1} \sum_{j=N/2}^{N-1} p_{ij}}{N \times N / 4} \qquad (5)$$

$$p_e = \frac{\sum_{i=N/4}^{3N/4-1} \sum_{j=N/4}^{3N/4-1} p_{ij}}{N \times N / 4} \qquad (6)$$

where p_ij indicates the pixel value at the position (i, j) of the N×N-size PU.
Fig. 5 N×N-size PU partition into five N/2×N/2-size sub-blocks (four quadrants pa, pb, pc and pd, and a central block pe)
After getting p_a, p_b, p_c, p_d and p_e, the four dominant direction assents DEA_1, DEA_2, DEA_3 and DEA_4 at degrees 0, 45, 90 and 135 are given by the following:

$$DEA_1 = |p_b - p_a| + |p_d - p_c| \qquad (7)$$

$$DEA_2 = |p_c - p_e| + |p_e - p_b| \qquad (8)$$

$$DEA_3 = |p_c - p_a| + |p_d - p_b| \qquad (9)$$

$$DEA_4 = |p_d - p_e| + |p_e - p_a| \qquad (10)$$

Theoretically, the direction in accordance with the minimum dominant edge assent (DEA_min) is the dominant edge. DEA_min is obtained by the following:

$$DEA_{min} = \mathrm{Min}\{DEA_1, DEA_2, DEA_3, DEA_4\} \qquad (11)$$
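As a rough illustration of (2)–(11), the following Python sketch computes the five block averages, the four DEAs and the dominant direction for one PU. It is only a sketch under stated assumptions: the function names are hypothetical, the PU is taken as a plain list of rows with `pu[j][i]` holding the pixel at column i and row j, and ties for the minimum are resolved in favour of the first direction listed, which the text does not specify.

```python
def block_means(pu):
    """Average pixel values p_a..p_e of Eqs. (2)-(6).

    Assumption: pu is an N x N list of rows and pu[j][i] is the pixel
    at column i, row j; N is a multiple of 4.
    """
    n = len(pu)
    half, quarter, three_quarters = n // 2, n // 4, 3 * n // 4

    def mean(i0, i1, j0, j1):
        total = sum(pu[j][i] for j in range(j0, j1) for i in range(i0, i1))
        return total / ((i1 - i0) * (j1 - j0))  # each part holds N*N/4 pixels

    pa = mean(0, half, 0, half)
    pb = mean(half, n, 0, half)
    pc = mean(0, half, half, n)
    pd = mean(half, n, half, n)
    pe = mean(quarter, three_quarters, quarter, three_quarters)
    return pa, pb, pc, pd, pe


def dominant_edge_assents(pu):
    """DEA_1..DEA_4 of Eqs. (7)-(10), keyed by edge direction in degrees."""
    pa, pb, pc, pd, pe = block_means(pu)
    return {
        0: abs(pb - pa) + abs(pd - pc),
        45: abs(pc - pe) + abs(pe - pb),
        90: abs(pc - pa) + abs(pd - pb),
        135: abs(pd - pe) + abs(pe - pa),
    }


def dominant_direction(deas):
    """Direction with the minimum DEA, Eq. (11); first key wins on ties."""
    return min(deas, key=deas.get)


# Hypothetical 8x8 PU with horizontal stripes: every row is constant.
stripes = [[10 * row] * 8 for row in range(8)]
print(dominant_edge_assents(stripes), dominant_direction(stripes and dominant_edge_assents(stripes)))
```

For the striped block the 0° assent is zero because pixel values do not change along the horizontal direction, so the 0° subset of Table 1 would be used.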
We determine the mode type from the detected dominant edge direction and then choose the corresponding subset of prediction modes in Table 1 for the RMD process. The few modes selected by the RMD process, together with the MPM, are then applied in the RDO process, so a great deal of coding time can in principle be saved.

3.1.3 Mode decision method verification

In order to verify the accuracy of our method, we employ five sequences of different resolutions with the quantisation parameters 22, 27, 32 and 37. Table 2 shows the percentage of cases in which the optimal prediction mode is included in the candidate set for the RDO process selected by our method. The average percentage is 92.57 %, which supports the accuracy of our method. The theoretical and experimental results show that our modified mode decision method can save coding time while maintaining picture quality.
Table 2 Percentage of the optimal prediction mode being included in the selected candidate set for RDO
Sequences      Resolution (pixels)    Percentage (%)
Traffic        2560×1600              88.14
Kimono1        1920×1080              90.25
Flowervase     832×480                95.67
Mobisode2      416×240                96.95
Four people    1280×720               91.85
Average                               92.57
3.2 DEA distribution-based mode decision

As described in Section 3.1, the DEA-based mode decision algorithm selects 11 modes from all of the prediction modes for the RMD calculation, which saves time compared to applying all of the modes to the RMD process. Considering the obvious properties of orientation and smoothness, the subset modes can be further reduced. Since the calculated DEA indicates the information about edge direction, we can apply its distribution to check the properties of obvious orientation and smoothness.

3.2.1 Obvious directional coding block

On the one hand, we find that if the PU is obviously directional, the minimum DEA (DEA_min) in accordance with the edge direction is distinct and very small; thus, the obvious orientation can be decided by the difference between the second smallest value of DEA (DEA_secmin) and DEA_min. DEA_min can be obtained by (11). DEA_secmin is the smallest value among the DEAs except for DEA_min, defined by function SecMin. DEA_secmin can be obtained by (12):

$$DEA_{secmin} = \mathrm{SecMin}\{DEA_1, DEA_2, DEA_3, DEA_4\} \qquad (12)$$

We define a threshold (Th_min) to decide whether the PU is obviously directional. We assume that the PU is obviously directional if the DEA meets (13):

$$DEA_{secmin} - DEA_{min} > Th_{min} \qquad (13)$$
When the current PU is strongly directional rather than smooth, the DC mode and the planar mode should be removed from the subset of prediction modes. Given the detected strong directional property, the subsets in Table 1 can also be narrowed further by dropping the angular modes farthest from the dominant edge direction. We therefore employ the subsets shown in Table 3 when we detect that the PU is obviously directional. Each subset is composed of the mode corresponding to the dominant direction and six adjacent angular modes.

Table 3 Subsets of prediction modes when the PU is obviously directional
Dominant edge direction    Subset of modes chosen for RMD calculation
0°                         7, 8, 9, 10, 11, 12, 13
45°                        2, 3, 4, 31, 32, 33, 34
90°                        23, 24, 25, 26, 27, 28, 29
135°                       15, 16, 17, 18, 19, 20, 21
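The choice between the 11-mode subsets of Table 1 and the seven-mode subsets of Table 3 can be sketched as follows. This is an illustrative reading of (12) and (13), not the authors' implementation: `rmd_subset` and the table constants are hypothetical names, the DEAs are assumed to arrive as a mapping from direction in degrees to value, and SecMin is taken as the second entry of the sorted DEAs, so a tie for the minimum yields a zero difference and falls back to Table 1.

```python
# Table 1: nine angular modes around the dominant direction plus modes 0 and 1.
TABLE1 = {
    0: [6, 7, 8, 9, 10, 11, 12, 13, 14, 0, 1],
    45: [2, 3, 4, 5, 30, 31, 32, 33, 34, 0, 1],
    90: [22, 23, 24, 25, 26, 27, 28, 29, 30, 0, 1],
    135: [14, 15, 16, 17, 18, 19, 20, 21, 22, 0, 1],
}

# Table 3: seven angular modes, used when the PU is obviously directional.
TABLE3 = {
    0: [7, 8, 9, 10, 11, 12, 13],
    45: [2, 3, 4, 31, 32, 33, 34],
    90: [23, 24, 25, 26, 27, 28, 29],
    135: [15, 16, 17, 18, 19, 20, 21],
}


def rmd_subset(deas, th_min):
    """Return the RMD candidate modes for one PU.

    deas maps direction (0, 45, 90, 135) to its DEA value; th_min is the
    threshold Th_min of Eq. (13).
    """
    ordered = sorted(deas.values())
    dea_min, dea_secmin = ordered[0], ordered[1]  # Eqs. (11) and (12)
    direction = min(deas, key=deas.get)           # dominant edge direction
    if dea_secmin - dea_min > th_min:             # Eq. (13): obviously directional
        return TABLE3[direction]
    return TABLE1[direction]


# Hypothetical DEA values for a PU with a clear horizontal edge.
example = {0: 0.0, 45: 40.0, 90: 80.0, 135: 40.0}
print(rmd_subset(example, th_min=30))
```

With these example values the gap between the two smallest DEAs is 40, so a threshold of 30 selects the seven-mode subset, whereas any threshold of 40 or more keeps the 11-mode subset.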
In order to obtain the value of Th_min, we employ the sequences in Table 2 with the quantisation parameters 22, 27, 32 and 37 to check coding efficiency at different threshold values. An example of the RD curves of Kimono1 (1920×1080) at different threshold values is shown in Fig. 6. We test Th_min values of 30, 40 and 50 by coding 10 frames of the sequence Kimono1. The RD curves at the different Th_min values are close to the curve of HM 9.1. The coding efficiency measured by BD-Rate (%) and BD-PSNR (dB) is shown in Table 4. According to Table 4 and Fig. 6, we can deduce that the threshold Th_min should be around 40. By analysing the extensive experimental results, we set the threshold Th_min as 43. Table 5 shows the percentage of cases in which the optimal prediction mode is included in the candidate set for the RDO process obtained by the above methods. From Table 5, we find that the percentages are similar to those of Table 2, which supports the accuracy of our method.

Fig. 6 RD curves of Kimono1 at different threshold values of Thmin

3.2.2 Obvious smooth coding block

On the other hand, when the PU is extremely smooth, the DC mode or the planar mode is most likely to be the optimal prediction mode. In this case, we skip the RMD process, and only the DC mode, the planar mode and the MPM are applied in the RDO calculation. If the DEAs of the four dominant edge directions are almost equal to one another, we can deduce that the PU is not directional but very smooth. We therefore employ the standard deviation of the four DEAs (DEA_dev) to decide whether the PU is obviously smooth. DEA_dev is given by (14) and (15):

$$DEA_{dev} = \sqrt{\sum_{i=1}^{4} (DEA_i - u)^2 / 2} \qquad (14)$$

$$u = \sum_{i=1}^{4} DEA_i / 4 \qquad (15)$$
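The deviation measure of (14) and (15) can be written in a few lines of Python. This is a minimal sketch: `dea_dev` is a hypothetical name, and the divisor 2 under the square root follows (14) as we read it from the printed equation, whereas a conventional population standard deviation of four values would divide by 4.

```python
import math


def dea_dev(deas):
    """Deviation of the four DEAs, following Eqs. (14) and (15).

    deas is a sequence of the four values DEA_1..DEA_4.
    Assumption: the divisor 2 is taken from Eq. (14) as printed.
    """
    u = sum(deas) / 4                                   # Eq. (15)
    return math.sqrt(sum((d - u) ** 2 for d in deas) / 2)  # Eq. (14)


# Hypothetical DEA sets: a flat block and a block with a clear edge.
print(dea_dev([3.0, 3.0, 3.0, 3.0]))
print(dea_dev([0.0, 40.0, 80.0, 40.0]))
```

A block whose four DEAs are identical gives a deviation of zero, which is the extreme case of the smooth PU described above.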
Table 4 Coding efficiency at different Thmin compared with HM 9.1
Thmin    BD-Rate (%)    BD-PSNR (dB)
30       79.53          −0.0282
40       76.04          −0.0268
50       78.47          −0.0277
We assume that the PU is extremely smooth when DEA_dev meets (16):

$$DEA_{dev} < Th_{dev} \qquad (16)$$
where Th_dev is a threshold that optimises the trade-off between coding time and coding efficiency. To derive the value of Th_dev, we employ the sequence Vidyo1 (1280×720), whose texture is plain, with the quantisation parameters 22, 27, 32 and 37. The statistics of DEA_dev are collected for the cases in which the optimal prediction mode is the DC mode or the planar mode. The mean of DEA_dev is 2.3 and the median of DEA_dev is 0.98. We test coding efficiency and coding time when Th_dev is 2, 1.5, 1.0, 0.95 and 0.9. The results show that the smaller Th_dev is, the better the coding efficiency is, whereas the larger Th_dev is, the more coding time can be saved. Table 6 shows the percentage of cases in which the optimal prediction mode is included in the candidate set for the RDO process when we set the threshold Th_dev as 0.95. The percentages decrease only slightly compared with those in Tables 2 and 5. Considering the trade-off between coding time and coding efficiency, it is suitable to set the threshold Th_dev as 0.95.

Table 5 Percentage of the optimal prediction mode being included in the selected candidate set for the RDO
Sequences      Resolution (pixels)    Percentage (%)
Traffic        2560×1600              88.07
Kimono1        1920×1080              90.24
Flowervase     832×480                95.64
Mobisode2      416×240                96.92
Four people    1280×720               91.83
Average                               92.54

Table 6 Percentage of the optimal prediction mode included in the selected candidate set for the RDO
Sequences      Resolution (pixels)    Percentage (%)
Traffic        2560×1600              87.25
Kimono1        1920×1080              89.48
Flowervase     832×480                94.93
Mobisode2      416×240                95.89
Four people    1280×720               91.00
Average                               91.71

3.3 PU size-based number of RDO candidate modes

With the above method, fewer intra prediction modes, instead of all intra prediction modes, are selected for the RMD calculation. After the RMD process, as Fig. 3 indicates, N prediction modes are selected, and these N modes as well as the MPM are applied in the RDO process to determine the optimal prediction mode. In HM 9.1, N is 3 or 8, according to the PU size. Through experimental analysis, we find that keeping the original value of N does not improve the coding performance greatly, but instead wastes time checking more modes in the RDO process. To further reduce the complexity of the encoder, different settings of N are checked. For the PU sizes of 4×4 and 8×8, the values 6, 4, 3 and 2 are checked. For the PU sizes of 16×16, 32×32 and 64×64, the values 3, 2 and 1 are checked. Finally, considering the trade-off between performance and complexity, we set N as 3 for PU sizes 4×4 and 8×8 and 1 for the other PU sizes, as shown in Table 7. The experimental results show that a smaller N saves a great deal of coding time while keeping a coding performance similar to that of the above algorithms.

Table 7 Number of modes selected from the RMD process for different PU sizes in our algorithm
PU size       N
4×4 or 8×8    3
Other         1

3.4 Overall algorithm

Based on the above analysis, we propose a fast intra mode decision algorithm for HEVC based on dominant edge assent distribution, as shown in Fig. 7.

Step 1 Compute the DEAs and DEA_min based on (7)–(10) and (11), respectively, for a PU block. The direction indicated by DEA_min is the dominant edge direction.

Step 2 Compute DEA_dev based on (14). When DEA_dev meets (16), skip the RMD process and take the DC mode and the planar mode as the candidate modes selected from RMD. Then go directly to Step 6.

Step 3 Compute DEA_secmin based on (12). If the difference between DEA_secmin and DEA_min meets (13), select seven modes, referring to Table 3, as the candidates for the RMD process, and go to Step 5.

Step 4 Select 11 modes, referring to Table 1, as the candidates for the RMD process, and go to Step 5.

Step 5 Do the RMD process. Check the seven or 11 intra prediction modes chosen by Step 3 or Step 4 based on SATD, and select N candidates for the RDO calculation. N is 3 for PU sizes 4×4 and 8×8 and 1 for the other PU sizes; then, go to Step 6.

Step 6 Check whether the MPM is included in the candidates of the RDO process. If the MPM is not included in the candidate set, N+1 modes, comprising the N best modes of RMD and the MPM, will be employed in the RDO process. Otherwise, only the N best modes will be employed in the RDO process.

Step 7 Do the RDO calculation to find the optimal prediction mode from the candidates.
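Steps 2, 5 and 6 above, together with Table 7, can be sketched as one function that turns an RMD subset into the list of modes handed to RDO. This is a hedged illustration only: `rdo_candidates`, `rmd_keep_count` and the `satd_cost` callable are hypothetical names, the RMD subset is assumed to have been chosen already (Steps 3 and 4), and the MPM is passed as a list so that one or several most probable modes can be appended. The indices 0 for planar and 1 for DC follow the HEVC intra mode numbering.

```python
PLANAR, DC = 0, 1  # HEVC intra mode indices of the planar and DC modes


def rmd_keep_count(pu_size):
    """Table 7: N is 3 for 4x4 and 8x8 PUs and 1 for the other PU sizes."""
    return 3 if pu_size in (4, 8) else 1


def rdo_candidates(pu_size, dea_dev, th_dev, rmd_subset, satd_cost, mpms):
    """Build the RDO candidate list for one PU (Steps 2, 5 and 6).

    rmd_subset: modes chosen from Table 1 or Table 3 (Steps 3 and 4).
    satd_cost:  hypothetical callable returning the SATD-based cost of a mode.
    mpms:       list holding the most probable mode(s).
    """
    if dea_dev < th_dev:
        # Step 2: smooth PU, skip RMD and keep only planar and DC.
        candidates = [PLANAR, DC]
    else:
        # Step 5: rank the subset by SATD cost and keep the N best modes.
        n = rmd_keep_count(pu_size)
        candidates = sorted(rmd_subset, key=satd_cost)[:n]
    # Step 6: append each MPM that is not already a candidate.
    for mode in mpms:
        if mode not in candidates:
            candidates.append(mode)
    return candidates


# Toy cost that favours modes close to the horizontal mode 10.
toy_cost = lambda mode: abs(mode - 10)
print(rdo_candidates(8, 5.0, 0.95, [7, 8, 9, 10, 11, 12, 13], toy_cost, [26]))
```

Passing the cost as a callable keeps the sketch independent of how the encoder actually evaluates SATD; in a real encoder this would be the Hadamard-based cost of the RMD stage.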
4 Experimental results

4.1 Efficiency analysis

In order to analyse the efficiency of our proposed algorithm, we report the average number of RMD and RDO candidate modes for five sequences of different resolutions with the quantisation parameters 22, 27, 32 and 37. Table 8 shows the average number of RMD candidates of HM 9.1 and of our proposed algorithm. Table 9 shows the average number of RDO candidates of HM 9.1 and of our proposed algorithm. We find that our proposed algorithm greatly reduces the number of modes in both the RMD calculation and the RDO calculation. Because computing the DEAs and their distribution is much simpler than the RMD and RDO calculations, the proposed algorithm can save much encoding time and reduce the computational complexity significantly.
Fig. 7 Mode decision flow chart (the procedures in grey blocks represent our proposed algorithm). Flow: compute DEA and its distribution; if DEAdev < Thdev, take the DC and planar modes as the candidate modes for the RDO process; otherwise, if DEAsecmin − DEAmin > Thmin, select seven modes into the candidate set, else select 11 modes into the candidate set; RMD: check seven or 11 intra prediction modes by SATD and select N candidates for the RDO process (N is 3 for 4×4 and 8×8, and 1 for 16×16, 32×32 and 64×64); add MPM into the candidate set; RDO: select the best intra mode.
Table 8 Average number of RMD candidates
Sequences      HM 9.1    Proposed method
Traffic        35        7.48
Kimono1        35        7.60
Flowervase     35        3.19
Mobisode2      35        1.98
Four people    35        6.00
Average        35        5.24
4.2 Performance analysis

To verify the performance of the proposed fast intra mode decision, we implemented it in the test model HM 9.1 of HEVC. Experiments are carried out on sequences coded with I-frames only. The recommended video sequences specified by [2] in four resolutions (416×240, 832×480, 1920×1080 and 2560×1600) are used, with quantisation parameter values of 22, 27, 32 and 37. Table 10 shows the comparison results between the proposed algorithm and HM 9.1 in terms of coding efficiency and computational complexity. The coding efficiency is measured by BD-Rate (%) and BD-PSNR (dB) [1]. Furthermore, the percentage difference in bitrate (△Bitrate), the luminance PSNR difference (△PSNR) and the percentage difference in encoding time (△T) are also used to compare our algorithm with HM 9.1. Positive and negative values represent increments and decrements, respectively. The criteria are calculated as (17)–(19):

$$\Delta Bitrate = \frac{Bitrate_{proposed} - Bitrate_{HM\,9.1}}{Bitrate_{HM\,9.1}} \times 100\% \qquad (17)$$

$$\Delta PSNR = PSNR_{proposed} - PSNR_{HM\,9.1} \qquad (18)$$

$$\Delta T = \frac{T_{proposed} - T_{HM\,9.1}}{T_{HM\,9.1}} \times 100\% \qquad (19)$$
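The three criteria in (17)–(19) are simple enough to check by hand, and a short Python sketch makes the sign convention explicit. The function names and the sample numbers below are illustrative assumptions, not measurements from the experiments.

```python
def delta_percent(proposed, reference):
    """Relative difference in percent, the common form of Eqs. (17) and (19)."""
    return (proposed - reference) / reference * 100.0


def delta_bitrate(bitrate_proposed, bitrate_hm):
    """Eq. (17): percentage difference in bitrate."""
    return delta_percent(bitrate_proposed, bitrate_hm)


def delta_psnr(psnr_proposed, psnr_hm):
    """Eq. (18): luminance PSNR difference in dB."""
    return psnr_proposed - psnr_hm


def delta_t(time_proposed, time_hm):
    """Eq. (19): percentage difference in encoding time."""
    return delta_percent(time_proposed, time_hm)


# Made-up example: an encoder that needs 63.74 s where HM 9.1 needs 100 s.
print(round(delta_t(63.74, 100.0), 2))
```

A negative △T therefore means that the proposed encoder is faster than the reference, and a positive △Bitrate means that it spends more bits.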
Experimental results reveal that the proposed algorithm can save an encoding time of about 36 % on average, whereas the average increase of bitrate is only 0.57 % and the average degradation of PSNR is 0.07 dB over all test sequences. However, the simulation with the Table 9 Average number of RDO candidates Sequences
HM 9.1
Proposed algorithm
Traffic
7.95
3.17
Kimono1
7.88
2.92
Flowervase
7.79
2.69
Mobisode2
7.75
2.48
Four people
7.92
3.10
Average
7.86
2.87
Multimed Tools Appl (2016) 75:1963–1981
1977
Table 10 Comparison of coding performance and complexity reduction with HM 9.1 Sequences
Resolution (pixels) △Bitrate (%) △PSNR (dB) △T (%) BD-Rate (%) BD-PSNR (dB)
Traffic
2560×1600
0.48
−0.07
−35.23 1.72
−0.15
People on street
2560×1600
0.83
−0.07
−41.54 2.11
−0.12
Steam locomotive 2560×1600
0.087
−0.02
−37.08 0.42
−0.11
Park scene Kimono1
1920×1080 1920×1080
−0.39 0.24
−0.06 −0.02
−39.29 0.96 −40.44 0.85
−0.04 −0.03
Cactus
1920×1080
0.62
−0.06
−35.40 2.12
−0.08
BQ terrace
1920×1080
0.81
−0.06
−40.11 1.80
−0.09
Basketball drive
1920×1080
0.83
−0.04
−37.80 2.02
−0.05
Race horses
832×480
0.38
−0.08
−34.71 1.64
−0.11
Party scene
832×480
0.47
−0.15
−26.28 2.48
−0.19
BQ mall
832×480
0.94
−0.09
Blowing bubbles Race horses
416×240 416×240
0.79 1.29
−0.12 −0.09
−34.40 2.57 −34.94 2.81 −34.18 2.71
−0.15 −0.17 −0.18
0.57
−0.07
−36.26 1.86
−0.11
Average
ParkScene (1920×1080) test sequence shows a small decrease in bitrate of 0.39 %, and the simulation with the Kimono1 (1920×1080) test sequence shows over 40 % time saving, while the BD-Rate loss is only 0.85 % and the BD-PSNR loss is only 0.03 dB. For sequences with large resolutions (such as 1920×1080 and 2560×1600), the proposed algorithm performs particularly well, with a 38.36 % coding time reduction on average, whereas the coding efficiency loss is negligible, with an average PSNR loss of 0.05 dB. Over all test sequences, the average BD-Rate loss is 1.86 % and the average BD-PSNR loss is 0.11 dB.
Table 11 shows the performance of our algorithm compared with Shen’s proposal. The proposed algorithm saves 13.85 % coding time on average compared with Shen’s proposal, with a maximum of 21.28 % for the RaceHorses (416×240) sequence and a minimum of 4.44 % for the Kimono1 (1920×1080) sequence.
Table 11 Results of the proposed algorithm compared with Shen’s [17]
| Sequences | Resolution (pixels) | BD-Rate (%) | BD-PSNR (dB) | △T (%) |
|---|---|---|---|---|
| Traffic | 2560×1600 | −0.47 | −0.05 | −12.93 |
| People on street | 2560×1600 | −0.26 | 0 | −19.94 |
| Park scene | 1920×1080 | −1.25 | 0.05 | −13.19 |
| Kimono1 | 1920×1080 | −0.18 | 0 | −4.44 |
| Cactus | 1920×1080 | −0.01 | −0.01 | −11.90 |
| BQ terrace | 1920×1080 | −0.6 | 0.04 | −14.61 |
| Basketball drive | 1920×1080 | −1.02 | 0.01 | −6.00 |
| Race horses | 832×480 | 0.2 | −0.03 | −17.81 |
| Party scene | 832×480 | 1.51 | −0.12 | −8.48 |
| BQ mall | 832×480 | 0.51 | −0.03 | −15.80 |
| Blowing bubbles | 416×240 | 1.63 | −0.11 | −19.84 |
| Race horses | 416×240 | 1.66 | −0.11 | −21.28 |
| Average | | 0.14 | −0.03 | −13.85 |
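The summary figures quoted for Table 11 can be cross-checked with a few lines of Python. The sketch below copies the △T column of Table 11 into a dictionary and recomputes the mean, the largest saving and the smallest saving. The names `delta_t` and `summarise` are illustrative only and do not come from the paper or from the HM software.

```python
from statistics import mean

# Delta-T (%) relative to Shen's proposal, copied from Table 11.
# Keys are (sequence, resolution); negative values mean coding time saved.
delta_t = {
    ("Traffic", "2560x1600"): -12.93,
    ("People on street", "2560x1600"): -19.94,
    ("Park scene", "1920x1080"): -13.19,
    ("Kimono1", "1920x1080"): -4.44,
    ("Cactus", "1920x1080"): -11.90,
    ("BQ terrace", "1920x1080"): -14.61,
    ("Basketball drive", "1920x1080"): -6.00,
    ("Race horses", "832x480"): -17.81,
    ("Party scene", "832x480"): -8.48,
    ("BQ mall", "832x480"): -15.80,
    ("Blowing bubbles", "416x240"): -19.84,
    ("Race horses", "416x240"): -21.28,
}


def summarise(values):
    """Return (mean, key of largest saving, key of smallest saving)."""
    avg = mean(values.values())
    best = min(values, key=values.get)   # most negative = largest saving
    worst = max(values, key=values.get)  # least negative = smallest saving
    return avg, best, worst


avg, best, worst = summarise(delta_t)
print(f"average Delta-T: {avg:.2f} %")
print(f"largest saving:  {best} at {delta_t[best]:.2f} %")
print(f"smallest saving: {worst} at {delta_t[worst]:.2f} %")
```

Rounded to two decimals, the recomputed mean agrees with the 13.85 % average reported in the table, and the extremes fall on RaceHorses (416×240) and Kimono1 (1920×1080), as stated in the text.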
Table 12 Results of the proposed algorithm compared with da Silva’s [6]
| Sequences | Resolution (pixels) | BD-Rate (%) | BD-PSNR (dB) | △T (%) |
|---|---|---|---|---|
| People on street | 2560×1600 | −0.19 | 0.01 | −20.00 |
| Steam locomotive | 2560×1600 | −0.08 | −0.11 | −21.70 |
| Park scene | 1920×1080 | 0.26 | −0.01 | −29.87 |
| Kimono1 | 1920×1080 | −0.25 | 0.01 | −9.16 |
| BQ terrace | 1920×1080 | −0.10 | 0.01 | −23.31 |
| Average | | −0.07 | −0.02 | −20.81 |
Table 12 shows the performance of our algorithm compared with da Silva’s proposal. The experimental results show that our proposed algorithm consistently outperforms da Silva’s proposal in coding time. The proposed algorithm saves 20.81 % coding time on average compared with da Silva’s, with a maximum of 29.87 % for the ParkScene (1920×1080) sequence and a minimum of 9.16 % for the Kimono1 (1920×1080) sequence. Moreover, the proposed fast intra mode decision algorithm achieves a slightly better RD performance on average, with a BD-Rate gain of about 0.07 % compared to da Silva’s.
Figure 8 gives more detailed results of our proposed algorithm compared with HM 9.1 for the RaceHorsesC (832×480) and Kimono1 (1920×1080) sequences. Our proposed algorithm achieves almost the same coding efficiency as HM 9.1 from low to high bitrates, while consistently saving coding time.
Fig. 8 Experimental results of RaceHorsesC and Kimono1 under different QPs (22, 27, 32 and 37): (a) RD curves of RaceHorsesC; (b) time saving curve of RaceHorsesC compared to HM 9.1; (c) RD curves of Kimono1; (d) time saving curve of Kimono1 compared to HM 9.1
5 Conclusion
In order to reduce the intra coding complexity of HEVC, this paper proposes a fast mode decision algorithm based on the DEA and its distribution. We use the DEA and its distribution to identify the features of the current PU. If the PU is obviously directional, we select seven modes from all of the prediction modes for the RMD calculation; if the PU is extremely smooth, we skip the RMD process, and only the DC mode, planar mode and MPM are applied in the RDO calculation; otherwise, we select 11 modes from all of the prediction modes for the RMD calculation. In addition, we modify the PU size-based number of modes selected from the RMD process. The experimental results show that the proposed algorithm saves 36.26, 13.85 and 20.81 % coding time on average, with a negligible loss of coding efficiency, compared with HM 9.1, Shen’s proposal and da Silva’s proposal, respectively.
Acknowledgments We thank the anonymous reviewers for their helpful comments and insights, which improved this manuscript significantly. This work was supported in part by the National Natural Science Foundation of China (61100044).
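The three-way decision rule summarised in the conclusion can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors’ implementation: the threshold values, the use of the population standard deviation, the strict "less than" comparisons and the function names are all hypothetical. Only the structure is taken from the text: skip RMD for smooth PUs, use seven modes for clearly directional PUs and 11 modes otherwise, and pass three RDO candidates for 4×4 and 8×8 PUs and one for other sizes.

```python
from statistics import pstdev

# Hypothetical thresholds, chosen only so that the example runs.
STD_THRESHOLD = 1.0   # "standard deviation of DEA is distinctly small"
MIN_THRESHOLD = 2.0   # "minimum DEA is distinctly small"

DIRECTIONS = (0, 45, 90, 135)  # degrees, one DEA value per direction


def select_candidates(dea, std_threshold=STD_THRESHOLD,
                      min_threshold=MIN_THRESHOLD):
    """Classify a PU from its four DEA values.

    Returns (stage, n_modes, dominant_direction):
      ("RDO", None, None)     smooth PU: RMD is skipped and the DC, planar
                              and MPM modes go straight to RDO
      ("RMD", 7, direction)   clearly directional PU
      ("RMD", 11, direction)  all other PUs
    """
    if len(dea) != len(DIRECTIONS):
        raise ValueError("expected one DEA value per direction")
    # Assumption: population standard deviation over the four values.
    if pstdev(dea) < std_threshold:
        return ("RDO", None, None)
    # The dominant edge is the direction with the minimum DEA.
    idx = min(range(len(dea)), key=lambda i: dea[i])
    if dea[idx] < min_threshold:
        return ("RMD", 7, DIRECTIONS[idx])
    return ("RMD", 11, DIRECTIONS[idx])


def rdo_candidate_count(pu_size):
    """RMD survivors passed to RDO: 3 for 4x4 and 8x8 PUs, 1 otherwise."""
    return 3 if pu_size in (4, 8) else 1


print(select_candidates([5.0, 5.0, 5.0, 5.0]))    # smooth block
print(select_candidates([0.5, 10.0, 12.0, 9.0]))  # strong 0-degree edge
print(select_candidates([8.0, 10.0, 3.0, 9.0]))   # weaker 90-degree edge
print(rdo_candidate_count(16))
```

The function returns only the number of RMD candidates and the dominant direction. Which seven or 11 modes are associated with each direction is defined by the paper’s mode subsets and is deliberately not reproduced in this sketch.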
References
1. Bjøntegaard G (2001) Document VCEG-M33: Calculation of average PSNR differences between RD-curves. In: ITU-T VCEG Meeting, Austin, Texas, USA, Tech. Rep.
2. Bossen F (2012) Common HM test conditions and software reference configurations. Document: JCTVC-K1100, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Shanghai
3. Chen G, Liu Z, Ikenaga T, Wang D (2013) Fast HEVC intra mode decision using matching edge detector and kernel density estimation alike histogram generation. IEEE Int Symp Circ Syst (ISCAS), 53–56. doi:10.1109/ISCAS.2013.6571780
4. Chen G, Pei Z, Sun L, Liu Z, Ikenaga T (2013) Fast intra prediction for HEVC based on pixel gradient statistics and mode refinement. IEEE China Summit Int Conf Signal Inf Process (ChinaSIP), 514–517. doi:10.1109/ChinaSIP.2013.6625393
5. Cho S, Kim M (2013) Fast CU splitting and pruning for suboptimal CU partitioning in HEVC intra coding. IEEE Trans Circ Syst Video Technol 23(9):1555–1564. doi:10.1109/TCSVT.2013.2249017
6. da Silva T L, Agostini L V, da Silva Cruz L A (2012) Fast HEVC intra prediction mode decision based on EDGE direction information. Proc Eur Signal Process Conf (EUSIPCO), 1214–1218
7. Fang C-M, Chang Y-T, Chung W-H (2013) Fast intra mode decision for HEVC based on direction energy distribution. IEEE Int Symp Consum Electron (ISCE), 61–62. doi:10.1109/ISCE.2013.6570252
8. Jiang W, Ma H, Chen Y (2012) Gradient based fast mode decision algorithm for intra prediction in HEVC. Int Conf Consum Electron Commun Netw (CECNet), 1836–1840. doi:10.1109/CECNet.2012.6201851
9. Johar S, Alwani M (2013) Method for fast bits estimation in rate distortion for intra coding units in HEVC. IEEE Consum Commun Netw Conf (CCNC), 721–724. doi:10.1109/CCNC.2013.6488534
10. Khan M U K, Shafique M, Henkel J (2013) An adaptive complexity reduction scheme with fast prediction unit decision for HEVC intra encoding. In ICIP, 1578–1582. doi:10.1109/ICIP.2013.6738325
11. Kim J, Choe Y, Kim Y-G (2013) Fast coding unit size decision algorithm for intra coding in HEVC. IEEE Int Conf Consum Electron (ICCE), 637–638. doi:10.1109/ICCE.2013.6487050
12. Ma S, Wang S, Wang S, Zhao L, Yu Q, Gao W (2013) Low complexity rate distortion optimization for HEVC. IEEE Data Compression Conf (DCC), 73–82. doi:10.1109/DCC.2013.15
13. McCann K, Bross B, Han W-J, Kim I-K, Sugimoto K, Sullivan G-J (2012) High efficiency video coding (HEVC) test model 9 (HM 9) Encoder description. Document: JCTVC-K1002, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Shanghai
14. Ohm J, Sullivan G-J, Schwarz H, Tan T-K, Wiegand T (2012) Comparison of the coding efficiency of video coding standards, including high efficiency video coding (HEVC). IEEE Trans Circ Syst Video Technol 22(12):1669–1684. doi:10.1109/TCSVT.2012.2221192
15. Pan F, Lin X, Rahardja S, Lim K-P, Li Z-G, Wu D, Wu S (2005) Fast mode decision algorithm for intra prediction in H.264/AVC video coding. IEEE Trans Circ Syst Video Technol 15(7):813–822. doi:10.1109/TCSVT.2005.848356
16. Piao Y, Min J, Chen J (2010) Encoder improvement of unified intra prediction. Document: JCTVC-C207, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Guangzhou
17. Shen L, Zhang Z, An P (2013) Fast CU size decision and mode decision algorithm for HEVC intra coding. IEEE Trans Consum Electron 59(1):207–213. doi:10.1109/TCE.2013.6490261
18. Sullivan G-J, Ohm J, Han W-J, Wiegand T (2012) Overview of the high efficiency video coding (HEVC) standard. IEEE Trans Circ Syst Video Technol 22(12):1649–1668. doi:10.1109/TCSVT.2012.2221191
19. Tsai A-C, Wang J-F, Lin W-G, Yang J-F (2007) A simple and robust direction detection algorithm for fast H.264 intra prediction. IEEE Int Conf Multimedia Expo, 1587–1590. doi:10.1109/ICME.2007.4284968
20. Wiegand T, Ohm J-R, Sullivan G-J, Han W-J, Joshi R, Tan T-K, Ugur K (2010) Special section on the joint call for proposals on high efficiency video coding (HEVC) standardization. IEEE Trans Circ Syst Video Technol 20(12):1661–1666. doi:10.1109/TCSVT.2010.2095692
21. Yan K, Teng G, Hu J, Li G, Zhao H, Wang G (2014) A rapid classification decision algorithm on CU depth based on temporal-spatial correlation. J Optoelectron Laser 25(1):156–162
22. Zhao L, Zhang L, Ma S, Zhao D (2011) Fast mode decision algorithm for intra prediction in HEVC. IEEE Vis Commun Image Process (VCIP), 1–4. doi:10.1109/VCIP.2011.6115979
Yingbiao Yao Associate professor in the college of communication engineering, Hangzhou Dianzi University, Hangzhou, P.R. China. Received the Ph.D. degree in communication and electronic engineering from the Zhejiang University, Hangzhou, China, in 2006. Research interests include hardware/software co-design of embedded systems, parallel processing, multimedia systems and wireless sensor networks.
Xiaojuan Li Master’s degree candidate in the college of communication engineering, Hangzhou Dianzi University, Hangzhou, P.R. China. Research interests include multimedia processing and High Efficiency Video Coding (HEVC).
Yu Lu Received the PhD degree in communication and information system from Shanghai University in 2009, and now is with the college of communication engineering at Hangzhou Dianzi University. Research interests include image segmentation and reconstruction as well as video coding.