Adaptive Initial Quantization Parameter Determination for H.264/AVC ...

IEEE TRANSACTIONS ON BROADCASTING, VOL. 58, NO. 2, JUNE 2012


Adaptive Initial Quantization Parameter Determination for H.264/AVC Video Transcoding

Zhenyu Wu, Hongyang Yu, Bin Tang, and Chang Wen Chen, Fellow, IEEE

Abstract—Video adaptation through transcoding can provide both bit-rate reduction and resolution reduction to meet various requirements, from display devices to network links. One important issue in video transcoding is the design of a rate control algorithm that achieves the target bit rate by adjusting certain coding parameters. Among them, proper selection of the initial quantization parameter (QP) has been shown to have a noticeable impact on the performance of a video transcoding scheme. Current approaches to initial QP determination are either too complicated or lack adequate accuracy. This paper presents an adaptive QP initialization for H.264/AVC transcoding. First, we carefully build the R-MSE and QP-PSNR models. Then, we introduce an R-QP model and allocate an optimal target buffer to the first frame by considering its temporal importance. The analysis and the R-QP model lead to a novel scheme that determines the initial QP adaptively and estimates it more accurately. Experiments demonstrate that substantial gains in objective quality measures can be consistently obtained. Without increasing the complexity of the transcoding system, the proposed adaptive initial QP scheme outperforms existing schemes for the various video sequences tested in this research.

Index Terms—Adaptive quantization, rate control, rate-quantization model, transcoding.

I. INTRODUCTION

Contemporary development of multimedia systems has been driving the convergence of consumer electronic devices with mobile devices such as laptops, PDAs and smart phones. However, these two classes of devices are quite different in terms of display resolution, memory, access bandwidth, and so on. It is very challenging to develop enabling technologies for end users to enjoy a smooth flow of multimedia content, especially video, between various devices. Scalable video coding (SVC) and transcoding techniques, as promising technologies for media content conversion, have attracted more and more attention from academia to industry. Compared with SVC, transcoding is more flexible and does not require the replacement of existing decoders to support SVC. An appropriately designed transcoding scheme will not only be able

Manuscript received November 02, 2010; revised July 29, 2011; accepted September 28, 2011. Date of publication February 13, 2012; date of current version May 18, 2012. Z. Wu is with the University of Electronic Science and Technology of China, Chengdu 610054, China and also with SUNY at Buffalo, NY 14260 USA (e-mail: [email protected]; [email protected]). H. Yu and B. Tang are with the University of Electronic Science and Technology of China, Chengdu 610054, China (e-mail: [email protected]; [email protected]). C. W. Chen is with SUNY at Buffalo, NY 14260 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TBC.2011.2182430

to generate suitable spatial resolution for heterogeneous display devices, but also produce a proper frame rate under bandwidth limitations [1]. Numerous transcoding schemes have also been developed to maintain acceptable video quality for given application scenarios [2], [3] and to ensure quality-of-service (QoS) oriented video streaming over resource-limited networks [4]. Review papers by Vetro et al. [5] and Ahmad et al. [6] presented an excellent overview of transcoding technologies, including the open-loop structure, closed-loop structure, spatial domain transcoding structure, and frequency domain transcoding structure. In particular, references [5] through [8] described three typical types of transcoding architecture: (1) FDR (full decoder re-encoder), which cascades a decoder with a full encoder; (2) CPDT (cascade pixel domain transcoder) [7], [8], which removes the motion estimation step of FDR by reusing the motion vectors of the input sequences; and (3) FPDT (fast pixel domain transcoder) [7], [8], which combines motion compensation, DCT and IDCT in both the decoding and encoding parts of the CPDT structure to further reduce the complexity.

In general, in all of the transcoding architectures mentioned above, the quantization parameter (QP) is re-calculated to facilitate an efficient re-encoding process, because QP determination is one of the key components of rate control. The main reason for QP re-calculation is that the characteristics of the video at the rate after transcoding can be quite different from those before transcoding. In the case of transcoding between different coding standards without a change in bit rate, we only need to re-calculate the QP based on the relationship between the quantization of the two standards. In the case of transcoding that results in a substantial bit rate change, a rate control scheme like the one used in video encoding is very much desired.
Since video transcoding among different standards, frame sizes and frame rates is usually accompanied by a substantial bit rate change, rate control plays an important role in all these transcoding cases. Existing transcoding schemes either simply adopt the rate control algorithms of video encoding standards, such as Version 5 of the MPEG-2 video Test Model (TM5), the MPEG-4 Verification Model (VM5), the H.263 Test Model Version 8 (TMN8), and the RC algorithm of the Joint Model (JM) [9] reference software of H.264/AVC, or conveniently adopt simplified models as reported in [10], [11]. Other rate control schemes are based on various models, including the R-Q model for transcoding by K. Seo [12], the ρ-domain R-D model proposed by He et al. [13], [14], the Cauchy-density-based R-D model developed by Kamaci et al. [15], and the R-D model in -domain proposed by Chang et al. [16]. However, many existing transcoding schemes rely on an accurate determination of the initial QP to achieve the desired performance. This is particularly true for transcoding schemes that have to limit the range of QP value changes between consecutive frames. One prime example is the H.264/AVC Joint Model

0018-9316/$26.00 © 2011 IEEE


II. BRIEF REVIEW OF EXISTING APPROACHES

The rate control (RC) algorithm of JVT-G012 [17] is the most classical one and has been commonly used in the H.264/AVC reference software. Its initial quantization parameter determination scheme, which has been implemented in JM, is computed as follows:

(1)

where the argument represents bits per pixel, and the three constants take one set of values for QCIF/CIF and another for image sizes larger than CIF. Obviously, this scheme will produce the same initial quantization parameter for different sequences, as it depends only on the target output bit rate and image size. However, the optimal initial QP should depend not only on the target bit rate and GOP length, but also on the complexity of the video content. Therefore, an initial QP determined in such a fashion cannot produce consistent encoding results.

Zhou et al. [18] proposed an optimal initial QP selection scheme that considers the ratio of the first frame’s source data variance to the average standard deviation of P-frames within a GOP in order to allocate the bit buffer for the first frame. The scheme adopts the following R-Q model to obtain the initial quantization step:

Fig. 1. Effect of initial QP on performance, FDR transcoding structure, 248 frames, all transcoded streams here are QCIF, 30 Hz, GOP structure (IBBP), GOP length 15, the rate control method here is JM.

(JM) based transcoding schemes, which limit the QP value difference (ΔQP) between successive frames and obtain the QP value of every I-frame by averaging the QP values of the P frames in the previous GOP. This well-known mechanism keeps the QP values of the subsequent frames close to the initial QP. Therefore, an improper initial QP will significantly deteriorate the quality of the entire video sequence and may lead to buffer overflow or underflow at the encoder. Fig. 1 clearly illustrates the impact of initial QP selection on the R-D performance of rate control. The initial quantization step size affects the objective quality of transcoding by up to 3.21 dB in PSNR as the initial QP value is varied from 16 to 40.

At present, most H.264/AVC based transcoders still simply adopt the initial quantization parameter method of JM. It has not been fully recognized that transcoding differs considerably from normal encoding in several key aspects: pre-coded information such as the quantization parameter (QP), the bits of the current frame, and the percentage of non-zero coefficients of the current frame can all be extracted from the input video stream. Thus, a proper initial QP, different from the one used in the normal encoding process, needs to be carefully determined to maximize the transcoding performance.

In this paper, we propose an adaptive initial QP determination scheme for H.264/AVC-based video transcoding. This adaptive scheme aims at obtaining a more suitable initial QP value without introducing extra computational needs. We have carried out extensive experiments to demonstrate that the proposed initial QP scheme outperforms the existing schemes discussed in Section II.

The organization of this paper is as follows. Section II provides a brief description of existing approaches to initial QP estimation. Section III explains the proposed adaptive initial QP estimation for the video transcoding system after modeling the R-MSE and QP-PSNR relations. Section IV presents the experimental results that demonstrate the effectiveness of the proposed initial QP estimation. Finally, Section V concludes this paper with a summary.
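The effect of limiting the QP change between successive frames, described earlier in this section, can be illustrated with a minimal Python sketch. The function names and the per-frame limit `max_delta=2` are assumptions made only for this illustration; the text does not state the limit used by JM. The sketch shows how many frames a rate controller needs before it can move from a poorly chosen initial QP to the QP the content actually requires.

```python
def qp_trajectory(initial_qp, ideal_qp, n_frames, max_delta=2):
    """Return the QP of each frame when the per-frame QP change is clamped.

    max_delta is a hypothetical limit on |QP(n) - QP(n-1)|.
    """
    qps = [initial_qp]
    for _ in range(n_frames - 1):
        wanted = ideal_qp - qps[-1]
        # Clamp the requested change to the allowed range.
        step = max(-max_delta, min(max_delta, wanted))
        qps.append(qps[-1] + step)
    return qps


def frames_to_converge(initial_qp, ideal_qp, max_delta=2):
    """Number of frame-to-frame steps needed to reach ideal_qp."""
    gap = abs(ideal_qp - initial_qp)
    return -(-gap // max_delta)  # ceiling division


if __name__ == "__main__":
    # Starting 20 QP levels too low means ten steps at the wrong operating point.
    print(qp_trajectory(16, 36, 12))
    print(frames_to_converge(16, 36))
```

Under these assumptions, every frame encoded before convergence spends bits at a QP that does not match the target rate, which is the situation the buffer overflow and quality loss discussed above arise from.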

(2)

where one term is the standard deviation of the first frame, and a, b and c are statistical parameters obtained by linear regression. However, it has been shown that this scheme is only suitable for low bit rate video coding and causes a delay of one GOP.

Wang et al. [19] developed an initial QP determination method based on both bits per pixel and the content of video sequences. This scheme can be described as:

(3)

where

(4) (5) However, this scheme needs to calculate the probability of occurrence for gray levels and the INTRA 16 DC mode to obtain EI and IM, where

(6) is the maximum gray level; is the probability of occurrence for gray level ; is the number of MBs in a frame; is the pixel value at position of the th MB. is the predicted compensation value obtained from the INTRA 16


DC mode. This scheme is computationally demanding and does not take into consideration the temporal importance of the first frame.

The schemes described above are all for video encoding. Among the few transcoding schemes, Yang et al. [20] proposed an initial QP scheme for MPEG-2 to H.264/AVC transcoding as:

(7)

However, this scheme suffers from the same problem as JVT-G012 [17] because it considers neither the content of the video sequences nor the information that can be obtained from the pre-coded video.
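The content-blindness of a bits-per-pixel lookup, which this section identifies as the weakness of the schemes in [17] and [20], can be sketched in Python as follows. The threshold values, the QP levels and the function names are illustrative placeholders chosen for this sketch, not the constants of (1) or (7), which are not reproduced here. The point is only that any two sequences with the same bit rate, frame rate and frame size receive the same initial QP.

```python
def bits_per_pixel(bitrate_bps, frame_rate, width, height):
    """Target bits available per pixel per frame."""
    return bitrate_bps / (frame_rate * width * height)


def initial_qp_from_bpp(bpp, thresholds=(0.15, 0.45, 0.9),
                        qp_levels=(40, 30, 20, 10)):
    """Pick an initial QP from bpp alone (hypothetical thresholds and levels)."""
    for limit, qp in zip(thresholds, qp_levels):
        if bpp <= limit:
            return qp
    # bpp is above every threshold: use the smallest QP level.
    return qp_levels[-1]


if __name__ == "__main__":
    # QCIF (176x144) at 30 fps: the result is identical for a static
    # news clip and a high-motion sports clip at the same target rate.
    for rate in (100_000, 500_000):
        bpp = bits_per_pixel(rate, 30, 176, 144)
        print(rate, round(bpp, 4), initial_qp_from_bpp(bpp))
```

Because the video content never enters the function, the only way to adapt such a scheme to content complexity is to add inputs, which is what the schemes in [18] and [19] and the proposed scheme do in different ways.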

TABLE I

, VALUES FOR I FRAME

TABLE II

, VALUES FOR P FRAME

TABLE III

III. PROPOSED INITIAL QP SCHEME FOR H.264/AVC BASED TRANSCODING

When re-encoding decoded video sequences fully or partially in a transcoding system, a starting QP value (initial QP) needs to be determined, as in the case of initial video encoding. As described in Section II, the initial QP in existing schemes is usually determined either by bits per pixel only, as described in JVT-G012 [17], [18], or by both bits per pixel and the statistical characteristics of the frame itself, such as [18], [19] and [21]. The schemes in JVT-G012 [17] and [20] are too coarse, while the scheme in [19] lacks the desired accuracy and is very complicated. The scheme in [14] is only suitable for low bit-rate cases, while the scheme proposed in [21] has only been verified for a bit rate of 544 kbps. For transcoding, the schemes developed for encoding cannot be directly applied because transcoding is often carried out at a network gateway with limited resources.

In the proposed initial QP estimation scheme, developed specifically for transcoding, we first analyze R-MSE modeling and then study the relationship between PSNR and QP. Based on this analysis and some experiments, we obtain a more accurate R-QP model to calculate the initial QP. By reusing the information from the pre-coded input video streams, the proposed scheme is expected to achieve the desired performance improvement.

A. R-MSE Modeling

In the earlier years of video coding research, the AC coefficients of the DCT were conjectured to have Gaussian distributions. Later, several other distribution models were investigated, including generalized Gaussian and Laplacian distributions [22]. Among them, the Laplacian distribution has been widely used in practice. According to Viterbi [23], we can represent the well-known R-D model as:

(8)

The popular quadratic R-D model was derived from the above model by expanding it to a second-order Taylor series, taking the quantization step to represent D, and introducing H (the header bits) and MAD (mean absolute difference). However, for the first frame in video transcoding, it is impossible to predict the MAD either from the previously transcoded frame (there is none for the first frame) or from the pre-coded input frame (because of the lack of the original frame). Therefore, we need to establish a new R-D model under such transcoding-specific constraints.

, VALUES FOR B FRAME

By considering the effects of non-DCT coefficients and the integer transform, we propose the following R-D model based on extensive experimental study:

(9)

where MSE represents the mean square error, the two model parameters depend on the video content, as shown in Tables I–III, and C is a constant valued at 0.01654 obtained from experimental study. The remaining term is the unit complexity parameter, defined as follows:

(10)

where

the first quantity is the percentage of non-zero coefficients of the frame and the second is the quantization step in the input stream. The corresponding model parameter can be calculated from R and its corresponding PSNR value.

It is well known that the video content determines the complexity of each frame. For a fixed QP value, it directly affects the percentage of non-zero coefficients. The more complex the video content, the larger the percentage of non-zero coefficients will be for the same quantization step. That is to say, the more complex the video content, the larger the quantization step must be in order to obtain the same percentage of non-zero coefficients. Since the parameter defined in (10) represents the video content complexity to a large extent, we adopt it as the unit complexity measurement in the proposed scheme. Fig. 2(a)–(c) show the linear relation between R and the modeled quantity for several typical video sequences. From these, we can conclude that the proposed model fits the linear assumption quite well.

B. QP-PSNR Modeling

H.264/AVC adopts a new quantization algorithm whose quantization step grows exponentially with QP rather than linearly, as in H.263, MPEG-4 Part 2 and MPEG-2. Following the spirit of Liu’s paper [24], a linear PSNR-QP model can be written as (11), since the PSNR and QP values of coded frames also exhibit an approximately linear relationship.

(11)
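The two ingredients of this subsection can be sketched in Python: the exponential QP-to-step-size mapping of H.264/AVC (the step doubles for every increase of 6 in QP) and a least-squares fit of a straight line to PSNR versus QP. The function names and the sample (QP, PSNR) points are hypothetical and serve only as an illustration; the coefficients of (11) used in the paper come from Tables IV–VI and are not reproduced here.

```python
import math

# Quantization step sizes for QP 0..5 in H.264/AVC.
_BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qp_to_qstep(qp):
    """Map an H.264/AVC QP (0..51) to its quantization step size."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in [0, 51]")
    return _BASE_QSTEP[qp % 6] * (2 ** (qp // 6))


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for 8-bit video, given a positive mean square error."""
    return 10.0 * math.log10(peak * peak / mse)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    print([qp_to_qstep(qp) for qp in (4, 10, 16, 22, 28)])
    # Hypothetical (QP, PSNR) samples from pre-coded frames.
    slope, intercept = fit_line([20, 30, 40], [42.0, 36.0, 30.0])
    print(slope, intercept)
```

Since PSNR is a logarithmic function of MSE and the step size is exponential in QP, an approximately linear PSNR-QP relationship is what one would expect, which is consistent with the linear form adopted in (11).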


x ,x

TABLE IV VALUES FOR I FRAME

x ,x

TABLE V VALUES FOR P FRAME

x ,x

TABLE VI VALUES FOR B FRAME

The two parameters of the QP-PSNR model (11) are source-complexity-dependent, with values given in Tables IV–VI obtained from extensive experimental study.

Fig. 2. The relationship between coded bits per frame (R) and the modeled quantity (R² is the coefficient of determination of the linear fit). (a) I frame. (b) P frame. (c) B frame.

Fig. 3. R-D curves and bitrate accuracy. (a) Mobile_qcif input stream: QP = 15, bitrate = 2268 k, IBBP, and 30 fps. (b) News_qcif input stream: QP = 5, bitrate = 1490.99 k, IBBP, and 30 fps. (c) Paris_qcif input stream: QP = 12, bitrate = 1330.1 k, IBBP, and 30 fps. (d) Football_qcif input stream: QP = 15, bitrate = 2105.97 k, IBBP, and 30 fps.

C. The Proposed Initial QP Scheme

To determine the parameter in the R-MSE model shown in (9), we need to obtain the MSE from the input stream. Normally, the MSE cannot be extracted directly from the coded input stream because of the lack of the original video frames. The schemes reported in [25] and [26] proposed to estimate the MSE from coding parameters. However, these schemes are too complex to be implemented for transcoding applications. Therefore, we establish a relationship between R and QP by combining the R-MSE model and the QP-PSNR model to obtain (12):

(12)

where the first three parameters are the same as in the R-MSE model and the other two are the same as in the QP-PSNR model. The remaining parameter can be calculated from the QP and the corresponding bit size (R) of the frame in the input stream.

Finally, the proposed initial QP scheme can be completed in the following four steps:

1) Extract the QP, the percentage of non-zero coefficients, the frame size, the frame rate and the coded bits of the frame from the first I frame in the input video stream by decoding the first frame.

2) First choose the model parameters according to the complexity of the input video frame. Then, determine the remaining parameter in (12) according to the bit size of the first I frame and its corresponding QP in the input stream.

3) Allocate the target bit buffer for the first frame in the transcoded stream as defined in (13):

(13)

which is expressed in terms of the bits allocated for the first I frame in the input stream, the output bit rate, the input bit rate, and a weighting factor defined in (14):

(14)

In (14), one term represents the coded bits of the I frame in the input stream, and the other denotes the average bits per unit of the input stream (a unit can be a PBB unit in the IBBPBBPBBP coding structure). The ratio of the two indicates the temporal importance of the I frame in the fixed-QP input stream. It can efficiently exclude the large-motion and scene-change cases, whose ratio can be much lower. The proposed scheme allocates more bits to the I frame for video sequences with smaller motion in the relatively low bit-rate case. This improves the transcoding quality, as confirmed by the experiments in Section IV.
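The temporal-importance ratio used in step 3, the coded bits of the I frame divided by the average bits per unit, can be sketched in Python as below. This is only an illustration under stated assumptions: the function name, the `(frame_type, bits)` input format and the grouping of the frames after the I frame into consecutive units of three (one PBB unit each) are choices made for this sketch, and the weighting factor of (14) that is built from this ratio is not reproduced.

```python
def i_frame_to_unit_ratio(frames, unit_size=3):
    """Ratio of I-frame bits to the average bits of a coding unit.

    frames: list of (frame_type, coded_bits) in coding order, starting
    with the I frame. The frames after it are grouped into consecutive
    units of unit_size frames (for example P, B, B).
    """
    if not frames or frames[0][0] != "I":
        raise ValueError("the first frame must be an I frame")
    i_bits = frames[0][1]
    rest = [bits for _, bits in frames[1:]]
    n_units = len(rest) // unit_size
    if n_units == 0:
        raise ValueError("need at least one complete unit after the I frame")
    unit_totals = [sum(rest[k * unit_size:(k + 1) * unit_size])
                   for k in range(n_units)]
    return i_bits / (sum(unit_totals) / n_units)


if __name__ == "__main__":
    # Hypothetical bit counts: a low-motion stream spends few bits on
    # inter frames, so its ratio is high; a high-motion stream's is low.
    low_motion = [("I", 12000)] + [("P", 1500), ("B", 750), ("B", 750)] * 4
    high_motion = [("I", 12000)] + [("P", 6000), ("B", 3000), ("B", 3000)] * 4
    print(i_frame_to_unit_ratio(low_motion))
    print(i_frame_to_unit_ratio(high_motion))
```

A high ratio means the inter-coded frames are cheap relative to the I frame, so the quality of the I frame propagates to many frames at little cost, which is the case in which the scheme gives the first frame a larger share of the bits.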


4) Calculate the initial QP for the transcoding from the target bit buffer by (12). Finally, the initial QP is bounded between 1 and 51.

IV. EXPERIMENTS AND RESULTS

In order to compare the proposed scheme with four other initial QP schemes, [18]–[20] and JVT-G012 [17], at their best performance, the Full Decode and Recode (FDR) transcoding architecture is adopted for the evaluation. Note that the proposed scheme does not require FDR transcoding in practice. We fix the QP value for every frame in the input video streams. The H.264 reference software JM10.2 [9] is adopted in the re-encoder part of the transcoder. The number of reference frames is set to 5, the motion search range is set to 16, and full search is enabled with 1/4-pixel accuracy. In this research, we choose “frame” as the basic unit for rate control. Fifteen standard video sequences with large, medium and small motion are tested. The test frame rate is 30 fps. The GOP structure is IBBP and the GOP length is set to 15. The video quality is evaluated in terms of peak signal-to-noise ratio (PSNR) in dB.

Due to limited space, Fig. 3 only illustrates the R-D performance of the five initial QP algorithms and their impact on rate control accuracy for four typical video sequences: mobile-calendar and football with large motion, Paris with medium motion, and news with small motion. The left column of Fig. 3 shows, as indicated earlier, that Zhou’s scheme [18] performs well in the low bit rate case for all kinds of sequences, but loses up to 2.5 dB in PSNR relative to the proposed scheme in the higher bit rate case. JVT-G012 [17] works well for small and medium motion sequences (news and Paris) but does poorly (with up to 4 dB reduction in PSNR compared with the proposed method) for large motion sequences (mobile-calendar and football).
A significant performance drop is observed for the “mobile-calendar” sequence at target bitrates of 450–700 kbps and for the “football” sequence at target bitrates of 50–700 kbps. These drops are caused by serious frame-by-frame degradation of video quality, because the initial QPs chosen by JVT-G012 through (1) are too small for large motion sequences. Such small initial QPs make the transcoder allocate too many bits to the beginning frames of the video, which must be compensated by continuously increasing the QP up to its maximum value in the following frames. Wang’s scheme [19] shows good performance for large motion sequences, but it drops up to 0.5 dB in PSNR for smaller motion sequences compared with the proposed scheme. Furthermore, Wang’s scheme demands considerable computation to carry out INTRA 16 DC mode coding, prediction and calculation of the gray-level probabilities in order to get the parameters EI and IM. Yang’s scheme [20] is similar to JVT-G012. Its performance is even worse than that of JVT-G012, because it selects smaller initial QPs than JVT-G012 for sequences with large and medium motion.

The right column of Fig. 3 illustrates the effect of the five initial QP schemes on the rate control accuracy of the transcoder. The methods of Wang [19] and Zhou [18] and the proposed method reach the target bitrate accurately, while the schemes of JVT-G012 [17] and Yang [20] perform relatively poorly in the case of large motion. This

Fig. 4. Curve of PSNR per frame (mobile_qcif input stream: QP = 15, bitrate = 2268 k, IBBP, and 30 fps; output bitrate = 100 k).

TABLE VII COMPARISON OF PERCENTAGE OF FRAMES SKIPPED

is because improper initial QP selection causes the encoder bit buffer to overflow. It is clear that only the proposed scheme achieves consistent PSNR gain and consistent bit rate accuracy for a wide variety of video sequences at various bit rates. Fig. 4 displays the PSNR of the transcoded streams frame by frame. The proposed method maintains the most stable per-frame PSNR among all schemes.

To evaluate this initial QP scheme in limited-buffer scenarios, we set the bit buffer size (Bs) relative to the target bit rate. The frame skipping method is implemented for all five transcoding algorithms. When the current buffer fullness exceeds 80% of Bs, the transcoder skips encoding the next frame until the buffer fullness falls below 80% of the buffer capacity. When frame skipping occurs, the receiver displays the previously encoded frame in place of the skipped one. The average ratios of frames skipped at each tested target bit rate for every video sequence are shown in Table VII. The proposed algorithm does not cause the transcoder to skip any frames, while the schemes of [17] and [20] both need their transcoder to skip frames in order to prevent the bit buffer from overflowing, especially in the case of large motion sequences. Fig. 5 compares the bit buffer occupancy in a frame


Fig. 5. Curve of bit buffer occupancy per frame (mobile_qcif input stream: QP = 15, bitrate = 2268 k, IBBP, and 30 fps; output bitrate = 500 k).

TABLE VIII COMPLEXITY COMPARISON OF FIVE INITIAL QP METHODS

by frame fashion among the different initial QP schemes. The proposed scheme performs much better than the existing schemes, keeping the buffer occupancy the most stable among all schemes. Meanwhile, the G012 scheme cannot avoid overflow and Zhou’s scheme causes underflow.

The complexity comparison of these five initial QP schemes is shown in Table VIII. The comparison shows that the schemes of G012 and Yang are the simplest ones. The schemes of Zhou and Wang are much more complicated than the other three schemes, since they must either calculate the standard deviation of the frame or carry out INTRA 16 DC prediction. The proposed scheme is slightly more complicated in terms of computational complexity than the schemes of G012 and Yang. However, the proposed scheme requires less than 2% of the complexity of Wang’s scheme.

V. CONCLUSION

In this paper, we have presented an adaptive initial quantization parameter determination scheme for H.264/AVC based transcoding. The R-MSE and QP-PSNR relationships are analyzed to establish the R-QP model for predicting the initial QP value. The proposed algorithm first utilizes the information extracted from the input streams to obtain the parameters of (12). Then, the scheme optimally allocates the bit buffer for the first I frame by considering the temporal importance of I frames and sets up the R-QP model to obtain the initial QP. The experimental results have demonstrated that the proposed initial QP algorithm outperforms the four well-known existing schemes reported in [17]–[20] in terms of PSNR in the case of an unconstrained bit buffer. The proposed scheme has also been evaluated for the case in which the coding buffer is limited. Because of proper initial QP selection, the proposed scheme does not skip any frame due to


buffer constraint, while two simpler schemes are forced to skip frames due to buffer constraint. We have carried out a complexity analysis of the proposed scheme in comparison with existing schemes. The analysis shows that the proposed scheme is only slightly more complicated than the simplest schemes, while still maintaining improved performance over more complicated schemes at a small fraction of their computational complexity. Furthermore, the adaptive initial QP algorithm we developed for transcoding can also be adopted by H.264/AVC based video encoding, with a GOP pre-coding step to obtain the corresponding parameters, for improved performance in encoder rate control. The additional computation in the case of video encoding may bring substantial gains in quality, especially for video encoding with a limited buffer in low bit rate applications.

REFERENCES

[1] C. T. Hsu, C. H. Yeh, C. Y. Chen, and M. J. Chen, “Arbitrary frame rate transcoding through temporal and spatial complexity,” IEEE Trans. Broadcasting, vol. 55, no. 4, pp. 767–775, Dec. 2009.
[2] R. Feghali, F. Speranza, D. Wang, and A. Vincent, “Video quality metric for bit rate control via joint adjustment of quantization and frame rate,” IEEE Trans. Broadcasting, vol. 53, no. 1, pp. 441–446, Mar. 2007.
[3] S. H. Hong, S. J. Yoo, S. W. Lee, H. S. Kang, and S. Y. Hong, “Rate control of MPEG video for consistent picture quality,” IEEE Trans. Broadcasting, vol. 49, no. 1, pp. 1–13, Mar. 2003.
[4] H. Xiong, J. Sun, S. Yu, J. Zhou, and C. Chen, “Rate control for realtime video network transmission on end-to-end rate-distortion and application-oriented QoS,” IEEE Trans. Broadcasting, vol. 51, no. 1, pp. 122–132, Mar. 2005.
[5] A. Vetro, C. Christopoulos, and H. F. Sun, “Video transcoding architectures and techniques: An overview,” IEEE Signal Proc. Magazine, pp. 18–29, Mar. 2003.
[6] I. Ahmad, X. Wei, Y. Sun, and Y. Zhang, “Video transcoding: An overview of various techniques and research issues,” IEEE Trans. Multimedia, vol. 7, no. 5, pp. 793–804, Oct. 2005.
[7] H. F. Sun, W. Kwok, and J. W. Zdepski, “Architectures for MPEG compressed bit stream scaling,” IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 2, pp. 191–199, Apr. 1996.
[8] D. Lefol, D. Bull, and N. Canagarajah, “Performance evaluation of transcoding algorithms for H.264,” IEEE Trans. Consum. Electron., vol. 52, no. 1, pp. 215–222, Feb. 2006.
[9] “JVT Reference Software Version JM10.2,” [Online]. Available: http://iphome.hhi.de/suehring/tml/download/old_jm/jm10.2.zip
[10] C. Ho, O. C. Au, S. G. Chan, S. Yip, and H. Wong, “Low-complexity rate control for efficient H.263 to H.264/AVC video transcoding,” in ICIP, Oct. 2006.
[11] Y. Xiao, H. Lu, and X. Xue, “Efficient rate control for MPEG-2 to H.264/AVC transcoding,” in ISCAS, May 2005.
[12] K. Seo, S. Heo, and J. Kim, “Rate control algorithm for fast bit-rate conversion transcoding,” IEEE Trans. Consum. Electron., vol. 46, pp. 1128–1136, Nov. 2000.
[13] Z. He and S. K. Mitra, “Optimum bit allocation and accurate rate control for video coding via ρ-domain source modeling,” IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 10, pp. 840–849, Oct. 2002.
[14] Z. He and S. K. Mitra, “A unified rate-distortion analysis framework for transform coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 12, pp. 1221–1236, Dec. 2001.
[15] N. Kamaci, Y. Altunbasak, and R. M. Mersereau, “Frame bit allocation for the H.264/AVC video coder via Cauchy-density-based rate and distortion models,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 8, pp. 994–1006, Aug. 2005.
[16] C. Y. Chang, T. Lin, D. Y. Chan, and S. H. Hung, “A low complexity rate-distortion source modeling framework,” in Proc. Int. Conf. Acoust., Speech, Signal Process. (ICASSP’06), May 2006, pp. 929–932.


[17] Z. Li, F. Pan, K. P. Lim, G. Feng, X. Lin, and S. Rahardja, “Adaptive basic unit layer rate control for JVT,” Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG Document JVT-G012r1, Mar. 2003.
[18] S. Zhou, J. Li, J. Fei, and Y. Zhang, “Improvement on rate-distortion performance of H.264 rate control in low bit rate,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 8, pp. 996–1006, Aug. 2007.
[19] H. Wang and S. Kwong, “Rate-distortion optimization of rate control for H.264 with adaptive initial quantization parameter determination,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 1, pp. 140–144, Jan. 2008.
[20] J. Y. Yang, Q. H. Dai, W. L. Xu, and R. Ding, “A rate control algorithm for MPEG-2 to H.264 real-time transcoding,” in Proc. SPIE, Jul. 31, 2006, vol. 5960.
[21] S. Kwon, S. Lee, and D. Lee, “Improved initial QP prediction method in H.264/AVC,” in Proc. ACM Multimedia, 2007.
[22] E. Y. Lam and J. W. Goodman, “A mathematical analysis of the DCT coefficient distributions for images,” IEEE Trans. Image Processing, vol. 9, no. 10, pp. 1661–1666, Oct. 2000.
[23] A. Viterbi and J. Omura, Principles of Digital Communication and Coding. New York: McGraw-Hill Electrical Engineering Series, 1979.
[24] Y. Liu, Z. G. Li, and Y. C. Soh, “A novel rate control scheme for low delay video,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 1, pp. 68–78, Jan. 2007.
[25] A. Ichigaya, Y. Nishida, and E. Nakasu, “Nonreference method for estimating PSNR of MPEG-2 coded video by using DCT coefficients and picture energy,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 6, pp. 817–826, Jun. 2008.
[26] A. Ichigaya, M. Kurozumi, N. Hara, Y. Nishida, and E. Nakasu, “A method of estimating coding PSNR using quantized DCT coefficients,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 2, pp. 251–259, Feb. 2006.

Zhenyu Wu received the B.S., M.S.E.E., and Ph.D. degrees from the University of Electronic Science and Technology of China in 2000, 2003, and 2011, respectively. She is currently a Lecturer at the School of Electronic Engineering, University of Electronic Science and Technology of China. Her current research interests include image and video compression, media adaptation, video transcoding, image processing, machine learning, and computer vision.

Hongyang Yu received the Ph.D. degree from the University of Electronic Science and Technology of China in 1995. He joined the University of Electronic Science and Technology of China in 1990. His current research interests are in the development of algorithms for 3-D video coding, image recognition and intelligent video analysis.

Bin Tang received the B.S. degree from Chongqing University in 1986, the M.S. degree from the University of Electronic Science and Technology of China in 1991, and the Ph.D. degree from the University of Electronic Science and Technology of China in 1996. He conducted postdoctoral research at the Southwest Petroleum University until August 1998. He is currently a Professor and Doctoral Supervisor at the School of Electronic Engineering, University of Electronic Science and Technology of China. His research interests include ECM systems and technology, ECCM technology for radar, and new-generation communication systems and technology. He has published more than 200 academic papers.

Chang Wen Chen (F’04) received the B.S. degree from the University of Science and Technology of China in 1983, the M.S.E.E. degree from the University of Southern California, Los Angeles, in 1986, and the Ph.D. degree from the University of Illinois at Urbana-Champaign, Urbana in 1992. He is a Professor of Computer Science and Engineering at the State University of New York at Buffalo. Previously, he has been Allen Henry Endow Chair Professor of Electrical and Computer Engineering at the Florida Institute of Technology from 2003 to 2007. He was on the faculty of Electrical Engineering Department at the University of Rochester from 1992 to 1996, on the faculty of the Electrical and Computer Engineering Department at the University of Missouri-Columbia from 1996 to 2003. He also served as the Head of Interactive Media Group at David Sarnoff Research Labs in Princeton, NJ, from 2000 to 2002, managing numerous research projects in video coding standards and wireless video communications.
