1. However, with different approaches and extensive rate distortion optimization (RDO) techniques,. H.264/AVC can offers a significant improvement of about.
Fast Three Step Intra Prediction Algorithm for 4x4 blocks in H.264 Chao-Chung Cheng, Tian-Sheuan Chang Dept. Electronics Engineering, National Chiao-Tung University Hsinchu, Taiwan, R.O.C. {fury, tschang}@twins.ee.nctu.edu.tw Abstract—This paper presents a fast three step algorithm for H.264 4x4 intra prediction. Motivated by the highly correlation of RD cost between different prediction direction, in each step of the algorithm, the proposed algorithm efficiently explores the neighborhood direction around the minimum one and thus skips other unlikely ones. Thus, the proposed algorithm can complete 4x4 intra prediction by only examining 6 modes instead of 9 modes in the full search algorithm. The simulation result shows that the proposed algorithm can maintain the similar PSNR quality to the full search algorithm with only about 1% bit-rate increase.
I.
INTRODUCTION
In the recent years, multimedia application becomes more flexible and more powerful with the development of digital signal processing and communication technology. The Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG develop a new standard for the coding (compression) of natural video images. The new standard [1] is known as H.264 and also MPEG-4 Part 10 Advanced Video Coding, and regarded as the next generation video compression standard.
as inter-frame to determine the block type. It is thus highly desirable to develop fast intra-prediction mode selection. This paper proposes a fast three step algorithm for H.264 4x4 intra prediction. To determine the prediction mode, six modes is required instead of nine modes in the full search method. The simulation show very promising results. The rest of the paper is organized as follows. In Section II, we first briefly introduce the intra prediction used in H.264. Then in Section III, we review the past approaches. In Section IV, we propose the three step algorithm. The efficiency and simulation results are shown in Section V. Finally, conclusions are made in Section VI. Coder Control Input Video Signal
0-7803-8834-8/05/$20.00 ©2005 IEEE.
Transform/ Scal./Quant.
Decoder
Quant. Transf coeffs. Scaling & Inv. Transform Entropy Coding
The overall architecture of H.264/ AVC, the same with previous video coding standard, is also a hybrid coder as shown in Fig. 1. However, with different approaches and extensive rate distortion optimization (RDO) techniques, H.264/AVC can offers a significant improvement of about 50% bit-rate reduction compared with previous video standards under the same PSNR quality. To achieve the highest coding efficiency, RDO techniques in H.264 exhaustively examine all the mode combinations, including different intra and inter prediction modes. As a result, the complexity and computation load of video coding in H.264 increase drastically. Among the huge computational load, intra prediction are the dominate components besides the motion estimation. Direct approach for intra prediction is the full search that exhaustedly searches all possible modes and is adopted in the reference software. Although full search can achieve optimal prediction mode selection, it is computationally expensive. Besides, intra prediction is computed for intra-frame as well
-
Control Data
Intra-frame Prediction MotionIntra/Inter Compensation
De-blocking Filter Output Video Signal Motion Data
Motion Estimation
Fig. 1. Encoding loop of H.264 II.
REVIEW OF INTRA PREDICTION IN H.264
Different from AC/DC prediction in MPEG-4, intra prediction is adopted to H.264/AVC intra frame coding. There are 9 kinds of intra prediction modes for 4x4 intra blocks. The 4x4 intra-prediction is shown in Fig. 2. A 4x4 block contains 16 pixels labeled from a to p. The pixels A to
1509
M are from the neighboring blocks and are assumed to be already reconstructed. A prediction mode is a way to generate 16 predictive pixel values (named a to p) using some or all of the neighboring pixels A to M as shown in Fig. 3. Fig. 4 shows the nine prediction modes designed in a directional manner. Mode 2 is called DC prediction in which all pixels (a to p) are predicted by (A+B+C+D+I+J+K+L)/8. Mode 0 is the vertical prediction mode in which pixels a, e, i, and m are predicted by A. Mode 1 is the horizontal prediction mode in which pixels a, b, c, and d are predicted by I. The other modes are similar except that the directions are different. 8
2 DC
1
4 7
B
Fig. 5. adjacent block of current 4x4 block In addition to the 9 modes of 4x4 intra prediction, the correlation between spatially adjacent blocks is also exploited to encode the prediction mode efficiently. In Fig. 5, A and B are the up and left encoded 4x4 blocks near current block C. The probability of the 9 modes being the optimal intra prediction mode for C is different depending on the prediction modes of the top block A and left block B. A probability list is generated by JVT for each combination of the modes of A and B. Rather than sending the selected mode number, the position of the selected mode in the probability list is sent.
5
1. Generate a 4x4 predicted block according to a mode;
0
2. Calculate sum of absolute difference (SAD) between the original 4x4 block and the predicted block;
Fig. 2. Direction of 9 4x4 intra prediction modes in H.264
M
A
B
C
D
I
a
b
c
d
J
e
f
g
h
K
i
j
k
l
L
m
n
o
p
C
In the reference software [2], a full search (FS) approach is used to examine all the 9 modes to find the one with the smallest cost. The main steps are:
6 3
A
E
F
G
4. Repeat 1 to 3 for all the 9 modes, and choose the one that has the minimum cost.
H
Fig. 3. A 4x4 block and its neighboring pixels
Fig. 4. 9 mode of 4x4 intra prediction
3. Compute Cost of the mode
Since each 4x4 block will go through these steps, the computation load is quite large and becomes the one of computational bottleneck. III.
REVIEW OF
PREVIOUS APPROACHES
Few approaches have been proposed on fast intra prediction algorithm [3][4][5]. In [3][4], they proposed a threshold to early terminate the computation of the most probable mode. If the cost of most probable mode is larger than the threshold, the quartet cost of remaining 8 modes is computed. The mode with minimum quartet cost is chosen among 8. Thus, the required computation cost varies with the contents of video. Besides, their algorithm still needs to determine a threshold, which could affect its efficiency severely. In [5], it assumes high correlation between the edge direction and intra mode, and thus adopts the edge direction to predict the possible mode. However, the assumption of edge direction is not always true. Due to above drawbacks, these previous approaches increase the bit-rate significantly with PSNR loss. For example, approach in [5] increases bitrate by 4.0% with 0.27dB PSNR loss in all I-frame condition. In this paper, we contributed a three step algorithm of intra prediction for intra 4x4 blocks. The modes needed to be examined are constant 6 modes rather than variable number of modes as in the previous approaches. The three step intra prediction algorithm can save 33% computational complexity with only bit-rate loss of about 1%. Besides, the decision method is very simple.
1510
Step1:calculate cost of mode 0,1,2,
From the result, we can find that bit-rate is increased about 1% with almost the same PSNR. We can also find bitrate increase is step up when QP is from low to high. But when QP is high, the bit-rate increase is reduced. The phenomenon may relate to the Intra16 mode, an intra prediction mode for 16x16 blocks. In the high QP case, the opportunity to select Intra16 will also increases since Intra16 mode decision is also using full search algorithm. For high motion and low motion test sequence, the result in bit-rate increase is almost the same. It is because that picture is intra coded without using information of other frames. The comparison with [5] is shown in Table 1 to Table 4 with four different QP values, from 28 to 40. The proposed algorithm outperforms the previous approach.
and select one with minimum cost
mode 0 < mode 1
Yes Step2.1:calculate cost of mode 5,7
mode 5 < mode 7
Yes Step3.1.1: select mode from mode 5,4, and the selected mode in step1
No Step3.1.2: select mode from mode 7, 3, and the selected mode in step1
No Step2.2:calculate cost of mode 6,8
mode 6 < mode 8
Yes Step3.2.1: select mode from mode 4,6, and the selected mode in step1
No Step3.2.2: select mode from mode 8,3, and the selected mode in step1
Table 1.QP = 28, Comparison results
Fig. 6. Flow chart of three step intra algorithm IV.
THREE STEP INTRA ALGORITHM
From our observation, we find the SAD of the modes at the neighborhood of the optimal mode is also small. It means that we can skip some mode after initial search to save computation power. After initial search of some modes, we can examine the two modes neighboring to the selected mode in the first step. In the last step, extra one mode is refined to improve the prediction precision. Fig. 6 shows a flow chart of the algorithm. We first start from the horizontal (mode 1) and vertical (mode 0) and DC mode (mode 2) since these modes occur in the high probability. Then in the second step, we select the neighboring 22.5 degree modes (mode 5 and 7 for vertical direction, or mode 6 and 8 for horizontal direction) based on the smaller one of horizontal or vertical mode. In the third step, we refine the search further by considering the remaining neighbor mode (mode 3 or 4). We compare the best mode from the step 1 and 2 and the neighboring mode of the best one from step 2 (mode 3 or 4), and choose the best one as our final decision. In the algorithm, three modes are initially compared in step1. In step2, two of neighboring modes are examined to determine the refined direction. And in the last step, cost of the refine mode is calculated, and the one of the three modes are compared to make the final decision. So there are constant six modes to be examined by this algorithm. V.
SIMULATION RESULTS AND DISCUSSIONS
Sequence
CHG BIT (%) [5]
CHG BIT (%) Ours
CHG PSNR (dB) [5]
CHG PSNR (dB) Ours
CHG T_I (%) Ours
CHG T_AVG (%) Ours
Container
1.80
0.79
0.039
-0.01
-32.34
-16.03
News
2.56
1.09
0.045
-0.02
-31.05
-15.92
Paris
1.60
0.93
0.043
-0.01
-31.91
-16.62
Tempete
1.58
0.79
0.091
-0.01
-31.33
-16.22
Table 2.QP = 32, Comparison results
Sequence
1511
CHG BIT (%) Ours
CHG PSNR (dB) [5]
CHG PSNR (dB) Ours
CHG T_I (%) Ours
CHG T_AVG (%) Ours
Container
2.64
1.02
0.044
-0.01
-32.94
-16.46
News
3.09
1.30
0.031
-0.02
-31.11
-15.50
Paris
2.43
1.16
0.032
-0.01
-30.29
-15.54
Tempete
2.32
0.96
0.065
-0.01
-31.43
-16.25
Table 3.QP = 36, Comparison results
Sequence
CHG BIT (%) [5]
CHG BIT (%) Ours
CHG PSNR (dB) [5]
CHG PSNR (dB) Ours
CHG T_I (%) Ours
CHG T_AVG (%) Ours
Container
4.06
1.21
0.005
-0.02
-32.82
-16.46
News
4.26
1.20
0.000
-0.02
-31.41
-15.29
Paris
3.25
1.21
0.013
-0.01
-30.45
-15.46
Tempete
3.11
1.00
0.051
-0.02
-31.23
-16.16
Table 4.QP = 40, Comparison results
Sequence
The proposed three step algorithm and the full search are simulated on five CIF sequences, mobile and calendar, foreman, Stefan, news, and coastguard. For each sequence, 300 frames are encoded with intra frame coding. We simulate these sequences with 5 different fixed QP values, from 12 to 44. RD-curve is shown in the Fig. 7 to Fig. 11.
CHG BIT (%) [5]
CHG BIT (%) [5]
CHG BIT (%) Ours
CHG PSNR (dB) [5]
CHG PSNR (dB) Ours
CHG T_I (%) Ours
CHG T_AVG (%) Ours
Container
5.18
1.00
0.001
-0.03
-32.75
-16.21
News
5.31
1.38
0.006
-0.03
-31.64
-15.16
Paris
4.91
1.58
0.003
-0.04
-30.39
-15.23
Tempete
3.67
0.78
0.024
-0.03
-31.17
-16.09
45
45
40
40
PSNR
50
PSNR
50
FS TSS
35
FS TSS
35
30
30
25
25 0
2500
5000
bit-rate
20 0
5000
10000 bit-rate
15000
7500
10000
Fig. 10. RD-cure of news
20000
50
Fig. 7. RD-curve of mobile & calendar
45 PSNR
50 45 40
PSNR
FS
FS
40
TSS
35 30
TSS
25
35
0
5000
30
bit-rate
10000
15000
Fig. 11.RD-curve of coastguard
25 0
4000
bit-rate
8000
VI.
12000
CONCLUSION
In this paper, we propose a three step intra prediction mode selection algorithm. Computation reduction is achieved by examining only six of total intra prediction modes. Simulation results suggest that three step algorithm can achieve similar PSNR as full search and only about 1% of increase on bit-rate.
Fig. 8. RD-curve of foreman
50 45
REFERENCES
40 PSNR
[1]
FS
35
TSS
30
[2] [3]
25 [4]
20 0
4000
8000 bit-rate
12000 [5]
Fig. 9. RD-curve of Stefan
1512
Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264/ ISO/ IEC 14496-10 AVC), Mar. 2003. H.264/AVC reference software JM8.2, Jul. 2004 Meng, B.; Au, O.C, “Fast intra-prediction mode selection for 4x4 blocks in H.264”in Proc. of IEEE Int. Conf. on Acoustics, Speech, and Signal, 2003., vol. 3, 6-10 pp.III - 389-92 ,April2003 Meng, B., Au, O.C., Chi-Wah Wong, Hong-Kwai Lam, “Efficient intra-prediction mode selection for 4x4 blocks in H.264” in Proc. of Int. Conf. on Multimedia and Expo, 2003, vol. 3 , 6-9 Pages:III - 5214, July 2003 Feng PAN, Xiao LIN, Rahardja SUSANTO, Keng Pang LIM, Zheng Guo LI, Ge Nan FENG, Da Jun WU, and Si WU, "Fast Mode Decision for Intra Prediction," JVT-G013, 7th Meeting, Pattaya II, Thailand, 7-14 March, 2003.