Fast H.264 Skip Mode Selection Using an Estimation Framework
*Iain Richardson, ‡Maja Bystrom, *Yafan Zhao
*School of Engineering, The Robert Gordon University, Aberdeen, AB10 1FR, UK
‡ECE Dept., Boston University, 8 St. Mary’s Street, Boston, MA 02215, USA
Abstract. A method for making early mode decisions in an H.264 coder is presented. Significant computation savings can be achieved, since the proposed algorithm can estimate whether the coder would choose to skip, that is, transmit minimal information, for the current macroblock without explicitly coding the macroblock. Estimates are based on coding mode statistics gathered from representative sequences, and results show computation savings of 70-85% for low-activity sequences.
Index Terms— H.264/AVC, fast mode decision, video coding
1. INTRODUCTION
The compression performance achieved by the ITU H.264 Advanced Video Coding standard [1] enables high-quality compressed video on a wide range of platforms. However, this compression is achieved at the expense of significant computational complexity [2]. Each macroblock (MB) can be coded in one of a large number of coding modes, many of which are typically evaluated before a decision on the appropriate mode is made. The increased computational burden is a challenge for power- and/or computation-constrained platforms such as mobile and battery-powered devices. In order to achieve good rate-distortion performance, it is necessary to evaluate the distortion and rate of each candidate mode prior to deciding the mode for the current MB (Rate Distortion Optimized [RDO] mode selection). The process of exhaustively evaluating each mode requires significant computation. In a typical low- to medium-activity sequence, many MBs are skipped (no information is coded other than a skip indication). Methods for making an early skip-versus-code mode decision have been proposed [3, 4]. However, these methods still involve some preliminary computation, such as motion estimation. Other recent methods for fast mode choice have involved estimating and comparing the rate-distortion costs of skipping versus coding an MB [5, 6]; however, these methods do not take into account sequence statistics or interdependencies between MBs.
Our contribution is to incorporate knowledge of mode statistics in the early skip decision process. This is done by generating and modeling probability densities for mode cost differences as functions of sequence activity and quantization factor, and by framing code/skip mode selection as a Bayesian decision process. If the estimated mode cost for the macroblock is below the threshold, the macroblock is skipped; otherwise, coding proceeds as usual and additional coding modes are evaluated until the appropriate mode is determined. In the following sections, the methods for generating and modeling the mode cost differences are presented, the cost difference threshold is discussed, and an overview of the fast mode selection algorithm is given. In Section 3, the performance of the proposed algorithm is compared with that of the methods evaluated in [4]. It is shown that, due to early estimation of skip mode, significant computational savings can be achieved, especially for low-activity sequences. The proposed method significantly reduces the complexity of RDO mode selection whilst maintaining good rate-distortion performance. The use of statistical models enables the algorithm to adapt to a wide range of sequences without the need for experimental tuning.
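To make the idea of a Bayesian code/skip decision concrete, the following is a minimal sketch rather than the algorithm of this paper. It assumes that cost-difference samples have already been collected separately for skipped and for coded macroblocks, and it uses plain normalised histograms in place of fitted density models. The function names, the bin edges, the toy samples, and the prior `p_skip` are all hypothetical.

```python
from bisect import bisect_right


def histogram_pdf(samples, edges):
    """Normalised histogram: probability mass of each bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        i = bisect_right(edges, x) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    total = sum(counts)  # assumes at least one sample falls inside the bins
    return [c / total for c in counts]


def decide_skip(j_d, edges, pdf_skip, pdf_code, p_skip):
    """MAP rule: skip if P(j_d | skip) * P(skip) > P(j_d | code) * P(code)."""
    i = bisect_right(edges, j_d) - 1
    if not 0 <= i < len(pdf_skip):
        return False  # outside the training range: fall back to full mode evaluation
    return pdf_skip[i] * p_skip > pdf_code[i] * (1.0 - p_skip)


# Toy training data (illustrative only, not measured from any sequence).
edges = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
skip_samples = [-3.5, -2.5, -2.2, -1.5, -0.5, 0.5]
code_samples = [-0.5, 0.5, 1.5, 2.5, 2.8, 3.5]
pdf_skip = histogram_pdf(skip_samples, edges)
pdf_code = histogram_pdf(code_samples, edges)
p_skip = 0.6  # assumed prior fraction of skipped macroblocks

skip_low = decide_skip(-2.5, edges, pdf_skip, pdf_code, p_skip)
skip_high = decide_skip(2.5, edges, pdf_skip, pdf_code, p_skip)
```

With these toy numbers, a strongly negative cost difference leads to a skip decision and a strongly positive one does not. Returning `False` for out-of-range values is a conservative design choice: the encoder then simply evaluates the coding modes as usual.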
2. FAST SELECTION OF SKIP MODE
2.1 Mode Cost Calculation
The challenge of estimating the coding mode on the basis of observed prior coding modes, available rate, and known or estimated distortion can be placed in a Bayesian framework. Assuming the rate cost of skipping a macroblock is negligible, the cost of skipping is due solely to the increased distortion, namely $D_{skip}$. In contrast, the rate-distortion cost of coding the macroblock in the appropriate coding mode is $D_{code} + \lambda R_{code}$, where $\lambda$ is a Lagrange multiplier calculated as in [7]. While checking each macroblock for the appropriate coding mode, it is simple to determine $D_{skip}$; however, significant computation would be involved in calculating $D_{code}$, $\lambda$, and $R_{code}$. Since the goal of this work is to minimize computation, we first estimate these parameters from the temporally previous macroblock as in [6]. We then use the cost difference to determine whether it is better to skip or code the macroblock. The observations in the proposed estimation framework are taken to be
$$J_d = D_{skip} - D_{code} - \lambda R_{code}.$$
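As an illustration of the cost difference, the sketch below computes the skip distortion directly and takes the coding distortion, rate, and Lagrange multiplier from values recorded for the temporally previous macroblock. It rests on several assumptions that are not taken from the paper: sum of squared differences (SSD) is used as the distortion measure, and the names `PrevMbStats`, `ssd`, and `cost_difference`, together with the toy 2x2 blocks, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PrevMbStats:
    """Values recorded when the temporally previous macroblock was coded."""
    d_code: float  # distortion after coding
    r_code: float  # bits spent coding the macroblock
    lam: float     # Lagrange multiplier used for that macroblock


def ssd(cur, ref):
    """Sum of squared differences between two equally sized pixel blocks."""
    return sum((c - r) ** 2
               for row_c, row_r in zip(cur, ref)
               for c, r in zip(row_c, row_r))


def cost_difference(d_skip, prev):
    """J_d = D_skip - D_code - lambda * R_code, with the coded terms estimated."""
    return d_skip - prev.d_code - prev.lam * prev.r_code


# Toy 2x2 blocks: `cur` is the current block, `ref` is the block the decoder
# would reconstruct if the macroblock were skipped (illustrative values only).
cur = [[10, 12], [11, 13]]
ref = [[10, 11], [11, 15]]
d_skip = ssd(cur, ref)  # 0 + 1 + 0 + 4 = 5
prev = PrevMbStats(d_code=2.0, r_code=4.0, lam=0.5)
j_d = cost_difference(d_skip, prev)  # 5 - 2.0 - 0.5 * 4.0 = 1.0
```

Only the skip distortion needs pixel-level work here; the remaining three terms are simple lookups, which is where the computational saving comes from.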
We develop pdfs for five selected 50-frame low- to mid-activity sequences. To develop improved decision thresholds that incorporate knowledge of representative sequences’ class-conditional pdfs, we use parametric models to capture these pdfs as functions of F and QP. For tractability, both $P(J_d \mid \mathrm{Skip})$ and $P(J_d \mid \mathrm{Code})$ are modeled with Rayleigh models

Fig. 1. Experimental pdfs for the Mother & Daughter sequence with QP=24,40.

2.2 Estimation Framework
Prior work [6] involved computing $J_d$ on a macroblock-by-macroblock basis and making a skip decision if $J_d$