IEEE TRANSACTIONS ON BROADCASTING, VOL. 52, NO. 2, JUNE 2006
Video Traffic Prediction Based on Source Information and Preventive Channel Rate Decision for RCBR
Myeong-jin Lee, Member, IEEE
Abstract—In this paper, we address the problem of dynamic bandwidth allocation in real-time video transmission. Firstly, a source traffic prediction method is proposed which is based on the rate-distortion relation of source video. This method can detect changes in the source traffic level before encoding by using source information. Secondly, a preventive channel rate decision algorithm, called PCRD, is proposed. The transmission rate bounds are derived from the constraints of the encoder and decoder buffers based on the predicted bit-rate of video frames. From simulation results, the proposed traffic prediction method is shown effective in detecting scene changes and estimating changed traffic levels. Also, the PCRD method is shown to have low renegotiation cost and high channel utilization without violating delay constraints. Index Terms—Rate-distortion, RCBR, traffic renegotiation, video traffic prediction, video traffic smoothing.
I. INTRODUCTION
Manuscript received July 30, 2004; revised August 15, 2005. This research was supported by the MIC (Ministry of Information and Communication), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute of Information Technology Assessment). The author is with the Department of Electrical and Electronic Engineering, Kyungsung University, 110 Daeyeon-dong, Nam-gu, Busan 608-736, Korea (e-mail: [email protected]). Digital Object Identifier 10.1109/TBC.2005.859234

Compressed video sources show burstiness over multiple time scales, from milliseconds to several seconds. Because of this multiple-time-scale burstiness and the long-term correlation in compressed video, the achievable multiplexing gain can be low in preventive call admission control mechanisms that guarantee Quality of Service (QoS). Thus, dynamic bandwidth allocation during a connection has been considered [1]–[3], where the server requests bandwidth dynamically according to its current bandwidth demand. Reininger [3] proposed a renegotiated VBR scheme, where the source renegotiates bandwidth using a set of existing standard parameters. Zhang and Knightly studied the network performance of a dynamic bandwidth allocation scheme based on the D-BIND model [2]. More recently, Grossglauser presented a simple and efficient dynamic bandwidth allocation model, renegotiated constant bit rate (RCBR) [1], where the source negotiates only peak cell rates according to its short-term average. Compared to the other approaches, the RCBR service model has the key advantage that the signaling cost and network function for renegotiation are as simple as those of CBR service. Previous works on dynamic bandwidth allocation have focused on source rate prediction algorithms for live-video
transmission and smoothing algorithms for bandwidth renegotiation. In [4], [5], adaptive linear prediction methods were used to predict the bandwidth requirements for future frames. In [4], high frequency components of source traffic were not seriously considered; it was assumed that the encoder buffer could absorb such components without causing large delay. In [5], Adas showed that the encoder buffer occupancy is much lower than with the static rate allocation method. However, the end-to-end delay cannot be bounded because these methods consider only the average level of the generated source traffic, or its low frequency component, for dynamic bandwidth renegotiation. Also, the encoder buffer occupancy is not an absolute measure of the buffering delay, but a relative measure dependent on the channel rate. Thus, dynamic bandwidth allocation methods need to consider the end-to-end delay of each video frame by observing and controlling it on the frame time scale.
For smoothing algorithms for bandwidth renegotiation, there have been many approaches to stored video transmission where the transmission scheduler knows the bit-rates of all frames [6]. For online smoothing, video frames are captured, encoded, and transmitted over the network in real time. Thus, online smoothing for real-time video applications typically has limited knowledge of frame sizes and requires strict bounds on the delay. An online smoothing algorithm that buffers sufficient frame data in a smoothing server was proposed in [7]; it considers the delay constraint and the finite server and client buffers. However, because the buffering time in the smoothing server contributes to the end-to-end delay, it cannot be applied to real-time video applications with a strict end-to-end delay requirement. The smaller the required end-to-end delay, the more renegotiations are needed, at shorter renegotiation intervals.
To increase the renegotiation interval, traffic prediction algorithms were adopted by smoothing algorithms [7], [10]. However, these algorithms use the history of generated traffic for prediction, and hence, they cannot efficiently decide the changed traffic level and the renegotiation instant. In this paper, we address the problem of dynamic bandwidth renegotiation for real-time MPEG video applications, especially for low-delay conversational applications. Firstly, we propose a source traffic prediction method based on the rate-distortion relation of source video. Changes in source traffic level and scene changes can be detected before encoding by using the spatial variance of video frames and the estimated number of intracoded macroblocks available in the encoding process. Secondly, we address the channel rate decision problem for real-time video transmission. The transmission rate bounds are derived from the
0018-9316/$20.00 © 2005 IEEE
Fig. 1. Two types of source traffic prediction and dynamic bandwidth request. (a) Channel rate renegotiation based on the NLMS prediction. (b) Channel rate renegotiation based on the picture coding information.
underflow and overflow constraints of the encoder and decoder buffers based on the predicted bit-rate of video frames. Based on the constraints, a preventive channel rate decision algorithm, called PCRD, is proposed.
The rest of this paper is organized as follows. In Section II, we describe our system configuration and bandwidth allocation mechanism for real-time video transmission. Then, we present a source traffic prediction method based on source information. In Section III, we propose the PCRD algorithm for real-time video transmission. In Sections IV and V, we present simulation results and the conclusion, respectively.
II. RATE PREDICTION BASED ON VIDEO CODING INFORMATION
A. System Configuration for Real-Time Video Transmission Under RCBR Service
In conventional video traffic prediction methods, such as those shown in Fig. 1(a) [4], [5], the transmission controller decides on the new channel rate based on the bit-rate history of compressed video; thus, the controller cannot effectively catch and handle changes in source traffic level and scene changes. Channel rate renegotiation reflecting the changed characteristics of source traffic is invoked only after the changed video frames have been encoded. Until the new channel rate is active, large amounts of source traffic may be buffered for video sources with increasing bit-rate, resulting in increased buffering delay. Conversely, channel under-utilization may occur for video sources with a decreasing bit-rate. Also, the transmission controller cannot adapt to the channel rate after a renegotiation failure because it considers only the encoded output of the input video source [4], [5], [7], [8].
Knowledge about the coding complexity of the video source is important for predicting the bit-rates of video frames and for detecting scene changes. Channel rate renegotiation such as that in Fig. 1(b) can rapidly detect the changed characteristics of source traffic and invoke a renegotiation request before encoding.
Thus, the generated traffic of incoming video frames can be served faster than in the conventional system shown in Fig. 1(a), using renegotiated channel rates that reflect the changed characteristics.
We consider an MPEG video encoder that consists of motion compensation, DCT (Discrete Cosine Transform), and other modules. To predict the video traffic accurately even at scene changes, we add a Frame Analyzer between the motion compensation and DCT modules. This analyzer calculates the spatial variance of the video source,¹ detects scene changes, and estimates the output bit-rate based on the rate-distortion relation of the video source. The renegotiation module uses the predicted bit-rate to decide the renegotiation point and the new channel rate.
In this paper, we assume that the renegotiation time, which is the time elapsed between sending a renegotiation request and receiving the acknowledgment from the network, is less than one frame time. This assumption is valid if the distance between the encoder and the decoder is not too large. We are interested in detecting scene changes, the bit-rate estimation of video frames, and the renegotiation cost of open-loop VBR video traffic under the RCBR service. Thus, we assume that renegotiation requests are always accepted by the network. Also, rate control that minimizes the distortion under the constraint of the available channel resource is not considered. For renegotiation failures, the transmission controller should notify the encoder to constrain its output rate to the current channel rate while keeping video quality as constant as possible. The problem of renegotiation failures should be considered seriously and is left for further study.
B. Prediction of Video Traffic Using the Exponential Rate-Distortion Model
General rate control methods cannot handle scene changes appropriately because they assume a stationary video source [15], [16]. To solve the problem of scene changes, Lee [13] and Luo [14] detected scene changes and calculated target bit-rates for constant rate channels by measuring the spatial power of video sources before DCT.
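As a rough illustration of the measurement the Frame Analyzer relies on, the sketch below computes the spatial variance of a block of samples in plain Python. The function name `spatial_variance` and the list-of-rows representation are assumptions made for this example, not part of the paper; for a P-picture the input would be the motion-compensated residual rather than the raw picture.

```python
from typing import Sequence


def spatial_variance(block: Sequence[Sequence[float]]) -> float:
    """Population variance of all samples in a 2-D block.

    Hypothetical helper: `block` is a list of rows of pixel (or residual)
    values. The paper does not specify how the variance is computed.
    """
    samples = [float(v) for row in block for v in row]
    if not samples:
        raise ValueError("block must contain at least one sample")
    mean = sum(samples) / len(samples)
    return sum((v - mean) ** 2 for v in samples) / len(samples)


# A flat block carries no spatial activity; a checkerboard carries a lot.
flat = [[128, 128], [128, 128]]
textured = [[0, 255], [255, 0]]
```

A flat block yields zero variance, while the checkerboard yields a large value, which is the kind of contrast a variance-based complexity measure is meant to capture.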
To predict video traffic from source information, we take an approach similar to that of our previous work [13]. We assume that the rate-distortion relation of the video source shows exponential behavior and is given by [15]

$$R(D) = \frac{1}{\alpha} \ln \frac{\epsilon^2 \sigma^2}{D} \qquad (1)$$

where $\epsilon^2$ is a constant related to picture coding types, and $\sigma^2$ is the spatial variance of the video source. $D$, $R$, and $\alpha$ represent distortion, bits per pixel, and a constant parameter, respectively. We call it the exponential rate-distortion (ERD) relation of the source video.

¹ The spatial variance of a P-picture corresponds to the spatial variance of the residual image after motion compensation.

We predict the bit-rate of the current incoming frame using the ERD relation of recently coded frames. We consider only I- and P-pictures in the MPEG video coding standard for real-time applications. Because the rate-distortion characteristics of I- and P-pictures differ due to different source redundancies [13], we predict the bit-rates of I- and P-pictures separately. Scene changes in P-pictures can be detected from the number of intra-coded macroblocks, e.g., if it is larger than half of the total number of macroblocks in a frame. The number of intra-coded macroblocks can be easily estimated
Fig. 2. Picture level traffic prediction. Multi sequence, Q = 14, = 32. (a) Prediction using source information. (b) Prediction based on normalized LMS (NLMS).
before encoding by comparing the variances of the intra signal and the inter-frame difference signal after motion compensation of each macroblock. For scene-changed P-pictures, we use the ERD relation of I-pictures because most macroblocks are coded in intra mode. Scene changes in I-pictures are not specially treated because they rarely cause abrupt changes in the generated traffic level.
For any picture coding type, it is desirable to keep video quality constant over time; thus, we can predict the bit-rate of the current frame using the ERD relation of the recently coded frame of the same picture coding type. By assuming the distortion to be constant for all video frames, the type-dependent constant cancels and we can relate the current rate $R_c$ to the recently coded rate $R_p$ (both in bits per pixel) as follows.

$$R_c - R_p = \frac{1}{\alpha} \ln \frac{\sigma_c^2}{\sigma_p^2} \qquad (2)$$

where $\sigma_p^2$ and $\sigma_c^2$ are the spatial variances of the recently coded and current video frames of the same picture coding type, respectively. The constant $\alpha$ can be fitted from samples of the encoded bit-rate and the distortion of frames, and the fitted value is used throughout the video sequence. Then, the current bit-rate can be predicted by using the following relation.

$$\hat{R}_c = R_p + \frac{1}{\alpha} \ln \frac{\sigma_c^2}{\sigma_p^2} \qquad (3)$$
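The prediction step and the P-picture scene-change test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes the constant-distortion prediction takes the logarithmic form R_c = R_p + (1/alpha) * ln(var_c / var_p), it assumes a macroblock is counted as intra-coded when its intra variance is below its inter-frame difference variance, and all function and parameter names are hypothetical.

```python
import math
from typing import Sequence


def predict_bits_per_pixel(prev_rate: float, prev_var: float,
                           cur_var: float, alpha: float) -> float:
    """Predict the current frame's rate from the last coded frame of the
    same picture type, assuming constant distortion (assumed ERD form)."""
    if prev_var <= 0 or cur_var <= 0 or alpha <= 0:
        raise ValueError("variances and alpha must be positive")
    predicted = prev_rate + math.log(cur_var / prev_var) / alpha
    # Clamping at zero is an extra safeguard added for this sketch.
    return max(0.0, predicted)


def is_scene_change(intra_vars: Sequence[float],
                    inter_vars: Sequence[float]) -> bool:
    """Flag a P-picture as scene-changed when more than half of its
    macroblocks are estimated to be intra-coded.

    Assumed mode estimate: intra is chosen for a macroblock when its intra
    variance is smaller than its motion-compensated difference variance.
    """
    if len(intra_vars) != len(inter_vars):
        raise ValueError("need one intra and one inter variance per macroblock")
    intra_count = sum(1 for a, b in zip(intra_vars, inter_vars) if a < b)
    return intra_count > len(intra_vars) / 2
```

For a flagged P-picture, the text above suggests feeding the prediction with the reference rate and variance of the most recent I-picture instead of the most recent P-picture.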
C. Performance of the Video Traffic Prediction Method Using the ERD Model
To evaluate the performance of the proposed video traffic prediction method, we performed a simulation for two video sequences, Multi and Bear, in SIF format. Multi is a 900-frame composite sequence consisting of 6 MPEG-1 test sequences, including “Flower Garden” and “Table Tennis.” Bear is extracted from one of the National Geographic video series and is 1200 frames long. We used the MPEG-2 software codec released by the MPEG Simulation Group. For performance comparison, we used the normalized least mean square (NLMS) method [5] to predict encoded bit-rates. For the
Fig. 3. Picture level traffic prediction efficiency between the ERD model and NLMS. Multi sequence, = 32. 1/SNR is the ratio of prediction error power to the signal power.
NLMS method, we used the parameters that gave the best performance with fewer than 20 filter taps. Fig. 2 shows the picture level traffic prediction results of the Multi sequence for the proposed and the NLMS-based prediction methods. There are two scene changes, in the 219th and 249th frames. The NLMS method shows a delayed prediction result in the region after scene changes or fast motion. On the other hand, the proposed method shows quite accurate prediction results, especially for scene changes, which were handled without lag relative to the original trace. Fig. 3 shows the picture level traffic prediction efficiency of the ERD and NLMS methods. As a performance metric, we use the noise-to-signal ratio $1/\mathrm{SNR} = \sum_n (\hat{e}(n) - e(n))^2 / \sum_n e(n)^2$, where $\hat{e}(n)$ and $e(n)$ are the predicted bit-rate and the real bit-rate, respectively. For different quantization parameters and picture coding types, the proposed method outperforms the NLMS method.
III. PREVENTIVE CHANNEL RATE DECISION FOR REAL-TIME VIDEO UNDER RCBR SERVICE
In this section, we review the channel rate constraint for real-time video transmission systems and propose a preventive channel rate decision algorithm for RCBR service.
A. Channel Rate Constraint for Real-Time Video Transmission
In lossless video transmission systems, encoder and decoder buffers should not underflow or overflow [11], [12]. Encoder and decoder buffer overflow can be prevented by providing sufficiently large buffers. Also, encoder buffer underflow does not have any serious effect on the operation of video systems, except for low channel utilization. However, decoder buffer underflow, the condition in which sufficient video frame data has not arrived by its decoding time, may stop the decoding process. Thus, it should be prevented by controlling the encoding process or by renegotiating a new channel rate. In the system of Fig. 1(b), the encoder buffer occupancy at frame time $n$ is given by

$$B^e(n) = B^e(n-1) + e(n) - r(n) \qquad (4)$$

where $e(n)$ and $r(n)$ are the encoder output rate and the effective transmission rate, respectively.² The decoder buffer occupancy is also given by

$$B^d(n) = \begin{cases} B^d(n-1) + r(n), & \text{if } n \le D \\ B^d(n-1) + r(n) - e(n-D), & \text{if } n > D \end{cases} \qquad (5)$$

where $D$ is the end-to-end delay through the video transmission system.
If the channel rate is changed to a new value $R$ at the $(n+1)$th frame time and it is assumed that no encoder or decoder buffer overflow or underflow occurs,³ the accumulated channel rate during $k$ frame intervals from the $(n+1)$th frame time has the constraint which is given by

$$C_L(n,k) \le kR \le C_U(n,k) \qquad (6)$$

where $C_L(n,k)$ and $C_U(n,k)$ are the lower and upper bounds of the accumulated channel rate and, with $B^e_{\max}$ and $B^d_{\max}$ denoting the encoder and decoder buffer sizes, are given by (7) and (8)

$$C_L(n,k) = \max\left\{ \sum_{j=n+1}^{n+k} e(j-D) - B^d(n),\; B^e(n) + \sum_{j=n+1}^{n+k} e(j) - B^e_{\max} \right\} \qquad (7)$$

$$C_U(n,k) = \min\left\{ B^e(n) + \sum_{j=n+1}^{n+k} e(j),\; B^d_{\max} - B^d(n) + \sum_{j=n+1}^{n+k} e(j-D) \right\} \qquad (8)$$
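To make the buffer recursions and the resulting rate bounds concrete, the following sketch simulates both buffers at a constant channel rate and evaluates lower and upper bounds on the accumulated channel rate over a look-ahead window. It is an illustration under assumptions: the decoder is taken to remove frame n-D at frame time n once n exceeds D, the bounds are derived by requiring both buffers to stay within [0, size], and all names (`simulate_buffers`, `rate_bounds`, and their parameters) are hypothetical.

```python
from typing import List, Sequence, Tuple


def simulate_buffers(e: Sequence[float], rate: float,
                     delay: int) -> Tuple[List[float], List[float]]:
    """Encoder/decoder occupancies per frame time for a constant channel rate.

    e[i] holds the bits of frame i+1; `delay` is the end-to-end delay in
    frame times. A negative decoder occupancy indicates underflow.
    """
    be, bd = 0.0, 0.0
    enc, dec = [], []
    for n in range(1, len(e) + 1):
        r = min(be + e[n - 1], rate)   # effective transmission rate
        be = be + e[n - 1] - r         # encoder buffer update
        bd = bd + r                    # bits arriving at the decoder
        if n > delay:
            bd -= e[n - delay - 1]     # frame n - delay is decoded
        enc.append(be)
        dec.append(bd)
    return enc, dec


def rate_bounds(be_n: float, bd_n: float,
                future_e: Sequence[float], future_consumed: Sequence[float],
                be_max: float, bd_max: float) -> List[Tuple[float, float]]:
    """(lower, upper) bounds on k * R for k = 1..K.

    future_e[k-1] is the (predicted) size of frame n+k; future_consumed[k-1]
    is the size of the frame the decoder removes at frame time n+k.
    """
    bounds = []
    produced = consumed = 0.0
    for ek, ck in zip(future_e, future_consumed):
        produced += ek
        consumed += ck
        lower = max(consumed - bd_n, be_n + produced - be_max)
        upper = min(be_n + produced, bd_max - bd_n + consumed)
        bounds.append((lower, upper))
    return bounds


enc, dec = simulate_buffers([100, 300, 100, 100], rate=150, delay=2)
window = rate_bounds(enc[2], dec[2], [100], [300], be_max=1000, bd_max=1000)
```

In this toy trace the large second frame is held in the encoder buffer for several frame times, and `window` gives the range of rates that keeps both buffers valid for the next frame interval.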
² If the channel rate is set to $R$, the effective transmission rate is given by $r(n) = \min\{B^e(n-1) + e(n),\, R\}$.
³ The encoder and decoder buffer constraints are given by $0 \le B^e(n) \le B^e_{\max}$ and $0 \le B^d(n) \le B^d_{\max}$, $\forall n$, where $B^e_{\max}$ and $B^d_{\max}$ are the encoder and decoder buffer sizes, respectively.
Fig. 4 illustrates the constraints of the accumulated channel rate during $k$ frame intervals from the $(n+1)$th frame. The upper and the lower lines, i.e., the upper and lower bounds of (6), represent the cumulative data
Fig. 4. Channel rate decision for real-time video transmission.
transmitted by the encoder and consumed by the decoder, respectively. They do not include the data which may be lost in encoder buffer and decoder buffer overflow states, and the channel rate should be determined to avoid buffer overflow. The lower and the upper line can be represented by line 0 and 0 , respectively.
B. Preventive Channel Rate Decision Algorithm (PCRD) for Real-Time Video Transmission
Because the estimated source traffic cannot be as accurate as the real bit-rate, we argue that the transmission scheduler has to continuously monitor the generated bit-rate and any forthcoming emergency cases, such as encoder buffer overflow and decoder buffer underflow.
PCRD consists of two main algorithms. The first estimates the bit-rate for look-ahead frames through frame analysis, the bit-rate prediction of (3), and a bit-rate update. Scene change detection and accurate bit-rate estimation for scene-changed frames are quite important because large estimation errors may cause large and frequent changes in channel rate. The second decides the channel rate and the time at which to change it. Based on the estimated bit-rate for look-ahead frames, the transmission scheduler continuously monitors any forthcoming emergency cases using the channel rate constraint of (6). If any emergency case is found, the scheduler calculates a new channel rate considering changes in source traffic and the type of emergency.
1) Bit-Rate Estimation for Look-Ahead Frames: The transmission scheduler for real-time video applications typically has limited knowledge of frame bit-rates and requires strict bounds on the delay. For a channel rate decision, the scheduler buffers frame data or obtains sufficient look-ahead by estimating bit-rates from past information [7]–[10]. However, the look-ahead cannot catch abrupt scene changes because information about future video frames is lacking [10].
Also, because the window size for smoothing is limited by the delay constraint, frequent channel rate changes might occur, which increase the renegotiation cost or may not be accepted by the network. To solve these problems of renegotiation cost and scene changes, we use the proposed video traffic prediction method. Generally, traffic characteristics before and after scene changes may be different. Thus, it is necessary to update the reference bit-rates continuously for future I- and P-pictures by analyzing
Fig. 5. Illustration of the PCRD variables for channel rate change points. (a) Channel rate increase (convex point). (b) Channel rate decrease (concave point).
the characteristics of the incoming frame. After the frame analysis, the reference bit-rates for I- and P-pictures are updated considering the characteristics of the currently predicted frame:

(9)

In (9), the reference bit-rates of I- and P-pictures are updated from the bit-rate predicted by (3), the picture coding type, and the scene change flag of the incoming frame. By using the reference bit-rates for I- and P-pictures, the estimated bit-rate of a future frame is given by

(10)

After coding the frame, the reference bit-rate is updated using the coded bit-rate:

(11)

The scene change flag is set only for the incoming scene-changed frame, and it just forces the transmission scheduler to update the reference bit-rates for future frames based on the previous and the predicted bit-rates as in (9) and (11). The estimated values of the buffer occupancies and the channel rate bounds can be calculated using the estimated bit-rates of the look-ahead frames.
2) Description of Variables for PCRD: Before describing the PCRD algorithm, we define the following variables shown in Fig. 5 for increasing and decreasing cases of channel rate.⁴
• Maximum channel rate: the maximum channel rate at which the encoder may transmit over a given interval without decoder buffer overflow or encoder buffer underflow. (12)
• Minimum channel rate: the minimum channel rate at which the encoder may transmit over a given interval without decoder buffer underflow or encoder buffer overflow. (13)
• Latest time at the maximum rate: the latest time at which the decoder buffer is full or the encoder buffer is empty when the encoder transmits at the maximum channel rate over the interval. (14)
• Latest time at the minimum rate: the latest time at which the decoder buffer is empty or the encoder buffer is full when the encoder transmits at the minimum channel rate over the interval. (15)
• Emergency flag: a flag indicating whether any decoder buffer underflow or encoder buffer overflow is expected over the monitoring range, based on the current parameters and the channel rate; it is set when such an emergency is expected and cleared otherwise. (16)
⁴ In the MVS algorithm [6], the change point of the channel rate is said to be convex if the channel rate is increased, and concave if the channel rate is decreased.
To decide the channel rate for real-time video transmission, it is necessary to continuously monitor changes in the upper and the lower bound functions due to scene changes or other variations in scene complexity. A monitoring-range parameter defines the range over which expected emergency cases are monitored. Also, we use the flag to indicate forthcoming encoder buffer overflow and decoder buffer underflow within this range. If the flag is set, the scheduler calculates a new channel rate to avoid the buffer problems. The range should be a trade-off between the computational overhead and the accuracy of the channel rate changing time: the larger the range, the greater the computational overhead and the more accurate the time to change the channel rate. PCRD calculates a new channel rate over the look-ahead interval only when the flag is set or enough time has elapsed after the recent renegotiation.
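The variables described above can be sketched in code once per-interval bounds on the accumulated channel rate, as in (6), are available. The sketch below is an assumption-laden illustration rather than the paper's equations (12) to (16): it takes the minimum and maximum feasible constant rates as the tightest per-frame averages of the lower and upper bounds, treats a violated lower bound as the emergency (decoder buffer underflow or encoder buffer overflow), and uses hypothetical names throughout.

```python
from typing import Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]  # bounds[k-1] = (lower, upper) on k * R


def feasible_rates(bounds: Bounds) -> Tuple[float, float]:
    """Smallest and largest constant rate satisfying every bound in the window.

    If the returned minimum exceeds the maximum, no single rate fits the
    whole window and the rate has to change inside it.
    """
    if not bounds:
        raise ValueError("need at least one look-ahead interval")
    r_min = max(lo / k for k, (lo, _) in enumerate(bounds, start=1))
    r_max = min(up / k for k, (_, up) in enumerate(bounds, start=1))
    return r_min, r_max


def emergency_expected(bounds: Bounds, rate: float, monitor_range: int) -> bool:
    """True if the current rate falls below the lower bound within the
    monitoring range, i.e. decoder underflow or encoder overflow is expected."""
    window = bounds[:monitor_range]
    return any(k * rate < lo for k, (lo, _) in enumerate(window, start=1))


def should_recompute(flag: bool, frames_since_renegotiation: int,
                     reference_interval: int) -> bool:
    """Trigger described in the text: recompute the channel rate only when
    the flag is set or enough time has elapsed since the last renegotiation."""
    return flag or frames_since_renegotiation >= reference_interval
```

With a short monitoring range an emergency a few frames ahead goes unnoticed, whereas a longer range detects it earlier at the cost of checking more intervals per frame, which mirrors the trade-off discussed above.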
TABLE I NOTATIONS
3) Preventive Channel Rate Decision (PCRD) Algorithm: Fig. 6 shows five cases of the new channel rate decision, and Fig. 7 is the pseudo-code of the PCRD algorithm. For the algorithm, we use the notations in Table I. During the allowed look-ahead interval (lines 5 and 18 in Fig. 7), the scheduler calculates and updates the values of the four rate and time variables defined above (lines 6–8).
Fig. 6(a) shows the normal channel rate change where there is no expected encoder or decoder buffer underflow or overflow. If the scheduler changes the channel rate only for an imminent buffer emergency, the difference between neighboring channel rates would be large. Because fluctuation in channel rates may impose a burden on the network, the renegotiation request may be accepted with low probability. Thus, if enough time has elapsed after the recent renegotiation, it is necessary to consider the gradual changes in source traffic level for the channel rate decision. For that purpose, we used the rate change conditions described in [6], which were proven to be optimal in stored video streaming (lines 9–12).
Both Figs. 6(b) and 6(c) show the cases of imminent decoder buffer underflow or encoder buffer overflow, where the scheduler should increase the channel rate. Fig. 6(b) is the case of increasing source traffic (lines 9–12). In this case, only one of the two candidate rates can avoid decoder buffer underflow. Fig. 6(c) is the case of an under-estimated channel rate for a constant level of source traffic (lines 13 and 14). Though both candidate rates can avoid decoder buffer underflow, the one closer to the current channel rate is preferred as the new channel rate because it requires the smaller change in channel rate.
Both Fig. 6(d) (lines 15–17) and Fig. 6(e) (line 11) show the cases of encoder buffer underflow or decoder buffer overflow. If the decoder buffer size is assumed to be large enough to avoid overflow, encoder buffer underflow does not affect the continuous operation of the video transmission system except for lower channel utilization.
In these cases, PCRD invokes renegotiation only when enough time has elapsed after the recent renegotiation time. Though channel under-utilization may occur, the average renegotiation interval can be increased by removing short renegotiation intervals caused by encoder buffer underflow. In Fig. 6(d), the candidate rate that requires the smaller change in channel rate is preferred as the new channel rate. In Fig. 6(e), only one candidate rate can avoid a further buffer emergency.
The transmission scheduler executes PCRD for every frame after the bit-rate estimation, because the characteristics of the video source change continuously over time; however, the main part of PCRD, which determines new channel rates, runs only when the flag is set or enough time has elapsed after the recent renegotiation (line 3). Also, PCRD changes the channel rate if under-estimation (Fig. 6(c)) of the
current channel rate is expected during the look-ahead interval L. This corrects possibly inaccurate bit-rate estimation and prevents decoder buffer underflow.
In online traffic smoothing, there is no need to examine future frames beyond the length of the scene because the channel rate is decided based on the predicted bit-rates of future frames, which may be inaccurate after scene changes. To reflect the changed characteristics of source traffic and to reduce the scheduling overhead, we limit the renegotiation interval to a reference value W from the recent renegotiation time, and the look-ahead interval to L from the current frame time. The larger the value of W, the lower the channel utilization and the larger the average renegotiation interval. That is, there should be a trade-off among channel utilization, scheduling overhead, and average renegotiation interval, considering both the minimum renegotiation interval allowed by the network and the availability of network resources. However, there still exist renegotiations with short renegotiation intervals due to decoder buffer underflow, which can be prevented only by encoding rate control.
IV. EXPERIMENTAL RESULTS
In this section, we show some simulation results for the PCRD algorithm. We used six video sequences in SIF format, including Bear and Multi from Section II-C. The Tita, Poet, CNN, and Soon sequences are 10 000 frames long each and were digitized from “Titanic,” “Dead Poets Society,” “CNN News,” and a Korean situational comedy, respectively. While encoding, we used constant quantization parameters for all video frames. Because we focused on the ability of channel rate renegotiation algorithms to estimate the changed traffic level and on the resulting cumulative distribution of the renegotiation interval, we assumed that the renegotiation request is always accepted by the network.
For performance comparison, we used the MVS algorithm for offline optimal smoothing [6] and a window-based online smoothing algorithm (WOS) [7]. Because the window size for online smoothing is limited by the look-ahead interval in [7], we used window-based online smoothing both with no look-ahead (WOS) and with look-ahead (WOS-L). In [7], the target application is a smoothing server where there may be enough buffer for look-ahead, so that increasing the look-ahead interval increases the end-to-end delay. However, for conversational applications, because the end-to-end delay is fixed, the encoder cannot use the real bit-rate and instead uses the estimated bit-rate for look-ahead frames. Thus, to use look-ahead frames for the WOS-L algorithm, we combined the WOS algorithm [7] and the bit-rate estimation scheme of [10]. The bit-rates in the previous Group of Pictures (GOP) are used for look-ahead frames. The smoothing window construction of each algorithm is depicted in Fig. 14, and the computational complexity of the algorithms will be discussed later.
A. Basic Performance of PCRD
Fig. 8 shows the smoothed traces of VBR video traffic produced by the PCRD and MVS algorithms. The smoothed output of PCRD is quite close to that of MVS. Fig. 9 shows the cumulative distribution of the renegotiation interval for the Soon sequence under PCRD and MVS. While PCRD rarely produces short renegotiation intervals, MVS has more rate changes with shorter renegotiation
Fig. 6. Five cases for new channel rate decision. (a) Normal change. (b) Imminent decoder buffer underflow (increasing source traffic). (c) Imminent decoder buffer underflow (constant level of source traffic). (d) Encoder buffer underflow (case I). (e) Encoder buffer underflow (case II).
Fig. 8. Real-time video transmission using the PCRD algorithm. Soon sequence, D = 5, Q = 20, L = 100 for all algorithms. W = 30 for PCRD.
Fig. 7. Preventive channel rate decision algorithm: PCRD.
intervals. From the results, the smoothing performance of PCRD is shown to be close to that of MVS. The renegotiation cost of PCRD is much less than that of MVS, with a slight decrease in channel utilization. The channel utilizations of PCRD are 0.9763 and 0.8738 for W = 30 and 100, respectively.
Fig. 10 shows the effective bandwidth of the smoothed output and the number of decoder buffer underflows for possible average renegotiation intervals in three online smoothing algorithms. The effective bandwidth statistically estimates the network throughput required to transmit the video through a $b$-byte buffer with a tolerable loss rate of $\epsilon$. For a video with transmission rates of $r_1, r_2, \ldots, r_N$, the effective bandwidth is computed as $\frac{1}{\delta} \log \left( \frac{1}{N} \sum_{j=1}^{N} e^{\delta r_j} \right)$, where $\delta = -\log \epsilon / b$. We explore how these two performance metrics change as a function of the possible average renegotiation interval which is obtained by
Fig. 9. Performance of PCRD. D = 5, Q = 20, L = 30.
Fig. 10. Performance comparison of three online smoothing algorithms. D = 5, Q = 20, L = 30, W = 10 ~ 1600, switch buffer size = 5 kbits, cell loss prob. = 10^{…}. (a) Effective bandwidth. (b) Number of decoder buffer underflows.
Fig. 11. Effect of the reference renegotiation interval in PCRD. Q = 20, L = 60, D = 5. (a) Channel utilization. (b) Average renegotiation interval (in frame times).
simulating each algorithm with various smoothing window sizes for WOS and WOS-L and various reference renegotiation intervals for PCRD. The average renegotiation interval of WOS and WOS-L is shown to be bounded by the end-to-end delay D and the look-ahead interval L. Also, for smoothing window sizes smaller than ten, the effective bandwidth in WOS and WOS-L is close to that of MVS. However, for large smoothing window sizes, the effective bandwidth in WOS-L grows abruptly, and many decoder buffer underflows occur. This is because larger smoothing windows include more look-ahead frames, whose bit-rates must be predicted from the history of the previous GOP. For the PCRD algorithm, there is no decoder buffer underflow, and the effective bandwidth is at nearly the same level as in WOS and MVS. The result shows that PCRD determines the channel rate quite efficiently because it estimates future traffic quite accurately and predicts possible emergency cases in advance, preventing abrupt channel rate changes and frequent renegotiation.
B. Effect of Reference Renegotiation Interval
Though MVS, WOS, and PCRD have very high channel utilization, up to 1.0, they cannot be applied to RCBR applications
directly. This is because they sometimes renegotiate with a short renegotiation interval, e.g., less than 15 frame times, which results in increased renegotiation cost or renegotiation failure. For encoder buffer underflow, PCRD invokes renegotiation only when the reference renegotiation interval has elapsed after the recent renegotiation time. Fig. 11 shows the channel utilization and the average renegotiation interval as a function of the reference renegotiation interval. As expected, channel utilization decreases and the average renegotiation interval increases as the reference renegotiation interval increases. Though source characteristics change due to scene changes or time-varying coding complexity, the transmission scheduler should wait until its reference renegotiation time, except in emergency cases when decoder buffer underflow is imminent. Thus, the average renegotiation interval does not increase linearly with the reference renegotiation interval and saturates at a certain value. Channel utilization and average renegotiation interval are conflicting goals that are traded off by adjusting the reference renegotiation interval. From the simulation results, it may be possible to achieve a channel utilization of over 0.9 while maintaining an average renegotiation interval of over 30 frame times in an average case. Shorter renegotiation requests which
Fig. 12. Effect of the look-ahead interval (L) in PCRD. Q = 20, W = 30, D = 5. (a) Channel utilization. (b) Average renegotiation interval (in frame times).
cannot be supported by the network should be prevented by channel-constrained encoding rate control.
C. Effect of Look-Ahead Interval
PCRD uses look-ahead to keep the renegotiation interval large enough. Because the content of a video sequence changes over time, the bit-rate estimation of (10) is not valid for frames that are far from the current frame. Thus, the look-ahead interval for traffic smoothing should be set in consideration of the average scene length, where scenes are delineated by scene changes. Fig. 12 shows the channel utilization and average renegotiation interval as a function of the look-ahead interval. For a small look-ahead interval, the channel utilization is quite low and the average renegotiation interval is quite large. This is because the rate change condition for the encoder buffer underflow case (Fig. 6(e)) is rarely satisfied for small look-ahead intervals. The encoder buffer underflow therefore continues until the rate change condition is satisfied, which results in undesirably low channel utilization and a large average renegotiation interval. However, once the look-ahead interval exceeds a certain value, it no longer affects the channel utilization or the average renegotiation interval. Because the controller continuously monitors and detects scene changes, the look-ahead interval does not need to be much larger than the average scene length, which may be less than 100 frame intervals for a general video sequence. For specific video applications where scene changes rarely occur, a larger look-ahead interval is expected to be advantageous.
D. Effect of Finite Buffer Sizes
To investigate the effect of finite buffer sizes, we simulate PCRD with different sets of encoder and decoder buffer sizes. Because the decoder processes the compressed bit-stream on a frame basis, we set the minimum requirement for the encoder and decoder buffer sizes to the maximum frame bit-rate of the video sequence.
As performance measures, we adopted channel utilization, average renegotiation interval, coefficient of variation (COV), and peak-to-average ratio (PAR). COV measures the temporal variation of the transmission rate [10] and is defined as the ratio of the standard deviation of the transmission rate to the average transmission rate. PAR measures the worst-case bandwidth requirement and is defined as the ratio of the peak transmission rate to the average transmission rate. Fig. 13 shows the performance of PCRD as a function of encoder buffer size, where the decoder buffer size is the total buffer size minus the encoder buffer size. The channel utilization and the average renegotiation interval remain at nearly the same level for different encoder and decoder buffer sizes that are larger than the maximum frame bit-rate. For small encoder buffers and sufficiently large decoder buffers, the channel utilization decreases and COV and PAR increase because the controller does not have enough buffer space to hold and efficiently smooth the encoded bit-stream. The lack of buffer space can cause renegotiation to higher channel rates because the controller then determines new channel rates based largely on the current buffer fullness and less on the estimated bit-rates of future frames. For small decoder buffers and sufficiently large encoder buffers, there are no performance changes as significant as those in the small encoder buffer case. In this case, because the decoder buffer only needs to hold enough data to avoid decoder buffer underflow, the encoder buffer can hold the generated data and smooth it efficiently. In summary, the encoder buffer size has a more significant effect on the performance of PCRD than the decoder buffer size, provided that both buffers are at least large enough to hold the largest frame. We argue that it is enough for both buffers to hold the data corresponding to the end-to-end delay in an online smoothing environment. In Fig. 13, it is enough for both buffer sizes to be 120 kbits/frame (max) × 5 frames = 600 kbits, where the performance measures are expected to be the best.
E. Computational Complexity
Computational complexity is another important measure for evaluating online smoothing algorithms because the algorithms must run in real time. Fig. 14 shows the smoothing process of three online smoothing algorithms. The window size of each algorithm is bounded by the delay or the look-ahead interval.
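As a rough illustration of the two smoothness measures used in Section IV-D (panels (c) and (d) of Fig. 13), the following Python sketch computes COV and PAR for a transmission-rate schedule and evaluates the delay-based buffer size quoted for Fig. 13. The function names, the sample rate values, and the use of the population standard deviation are assumptions made for illustration; the paper does not specify them.

```python
from statistics import mean, pstdev


def cov(rates):
    """Coefficient of variation: standard deviation of the
    transmission rate divided by the average transmission rate.
    Assumption: population standard deviation is used."""
    return pstdev(rates) / mean(rates)


def par(rates):
    """Peak-to-average ratio: peak transmission rate divided by
    the average transmission rate."""
    return max(rates) / mean(rates)


def delay_based_buffer_bits(max_frame_bits, delay_frames):
    """Buffer size that holds the data corresponding to the
    end-to-end delay, taken here as max frame size times delay."""
    return max_frame_bits * delay_frames


# Hypothetical per-frame-time channel rates (kbits per frame time);
# these values are illustrative and are not taken from the paper.
rates = [40.0, 40.0, 60.0, 60.0]

print("COV:", round(cov(rates), 3))
print("PAR:", round(par(rates), 3))
# 120 kbits maximum frame and D = 5 frames, as quoted for Fig. 13.
print("Buffer (kbits):", delay_based_buffer_bits(120, 5))
```

For this hypothetical schedule the mean rate is 50 and the standard deviation is 10, so COV is 0.2 and PAR is 1.2; a smoother schedule would push COV toward 0 and PAR toward 1.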
Fig. 13. Effect of finite buffers. Q = 20, L = 60, D = 5, W = 30. Minimum requirement of encoder and decoder buffers = 120 kbits. (a) Channel utilization. (b) Average renegotiation interval. (c) Coefficient of variation. (d) Peak to average ratio.
The overhead associated with computing the transmission schedule at each invocation of PCRD is and for the cases of no rate change and rate change, respectively. Then, the total computational overhead for PCRD for smoothing an N-frame video is given by , where is the number of renegotiations in the video sequence. For window-based online smoothing with a hopping interval of , the total computation for smoothing an N-frame video is given by and for WOS and WOS-L, respectively. Because the number of renegotiations is not fixed for any video sequence, we compare the computational overhead using the previous simulation results with specific parameters. For the parameters 5, 15, and 5, WOS and WOS-L have a computational overhead of and 4 , respectively. For the parameters 5, 30, and 3, PCRD has a computational overhead of 3.53 for Soon with of 57.14 and 4.18 for CNN with of 25.45. By decreasing the value of , less computational overhead is expected, at the cost of an increasing probability of decoder buffer underflow. Though the computational overhead of PCRD varies for different sets of smoothing parameters, it is at the same level as that of the WOS/WOS-L algorithms.
Fig. 14. Smoothing process of three online smoothing algorithms.
V. CONCLUSIONS
In this paper, we proposed an adaptive traffic renegotiation method for real-time video transmission. A source traffic prediction method based on a simple rate-distortion relation of video was proposed. Also, a channel rate bound was derived using the underflow and overflow constraints of the encoder and decoder buffers. Based on the traffic prediction method and the rate bound, a preventive channel rate decision (PCRD) algorithm was proposed. The simulation results showed that PCRD achieves high channel utilization and an acceptable average renegotiation interval without decoder buffer underflow. Though the average renegotiation interval can be increased by a few GOP times in PCRD, there may still be renegotiation requests with short renegotiation intervals, as well as renegotiation failures, for some periods of a video sequence. Thus, it is necessary to combine channel rate renegotiation with channel-constrained video coding for real-time video transmission.
REFERENCES
[1] M. Grossglauser et al., “RCBR: a simple and efficient service for multiple time-scale traffic,” IEEE/ACM Trans. Networking, vol. 5, no. 6, pp. 741–755, 1997.
[2] H. Zhang and E. W. Knightly, “RED-VBR: a renegotiation-based approach to support delay-sensitive VBR video,” ACM/Springer-Verlag Multimedia System Journal, vol. 5, no. 3, 1997.
[3] D. Reininger, D. Raychaudhuri, and J. Hui, “Bandwidth renegotiation for VBR video over ATM networks,” IEEE J. Select. Areas Commun., vol. 14, no. 6, pp. 1076–1086, Aug. 1996.
[4] S. Chong, S. Li, and J. Ghosh, “Predictive dynamic bandwidth allocation for efficient transport of real-time VBR video over ATM,” IEEE J. Select. Areas Commun., vol. 13, no. 1, 1995.
[5] A. Adas, “Using adaptive linear prediction to support real-time VBR video under RCBR network service model,” IEEE/ACM Trans. Networking, vol. 6, no. 5, pp. 635–644, 1998.
[6] J. Salehi, Z. Zhang, J. Kurose, and D. Towsley, “Supporting stored video: reducing rate variability and end-to-end resource requirements through optimal smoothing,” IEEE/ACM Trans. Networking, vol. 6, no. 4, Aug. 1998.
[7] S. Sen, J. Rexford, J. Dey, J. Kurose, and D. Towsley, “Online smoothing of variable-bit-rate streaming video,” IEEE Trans. Multimedia, vol. 2, no. 1, pp. 37–48, Mar. 2000.
[8] J. Rexford, S. Sen, J. Dey, W. Feng, J. Kurose, J. Stankovic, and D. Towsley, “Online smoothing of live variable-bit-rate video,” in IEEE NOSSDAV, 1997, pp. 235–243.
[9] K. Joseph and D. Reininger, “Source traffic smoothing and ATM network interfaces for VBR MPEG video encoders,” in IEEE ICC, 1995, pp. 1761–1767.
[10] S. S. Lam, S. Chow, and D. K. Y. Yau, “An algorithm for lossless smoothing of MPEG video,” in ACM SIGCOMM, Aug. 1994.
[11] A. R. Reibman and B. G. Haskell, “Constraints on variable bit-rate video for ATM networks,” IEEE Trans. Circuits Syst. Video Technol., vol. 2, no. 4, pp. 361–372, 1992.
[12] T. V. Lakshman, A. Ortega, and A. R. Reibman, “VBR video: tradeoffs and potentials,” Proceedings of the IEEE, vol. 86, no. 5, pp. 952–973, May 1998.
[13] M. Lee, S. Kwon, and J. Kim, “A scene adaptive bitrate control method in MPEG video coding,” in SPIE Visual Communications and Image Processing, San Jose, CA, Feb. 1997.
[14] L. Luo, C. Zou, and Z. He, “A new algorithm on MPEG-2 target bitnumber allocation at scene changes,” IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 5, pp. 815–819, Oct. 1997.
[15] N. S. Jayant and P. Noll, Digital Coding of Waveforms. Prentice Hall, 1984.
[16] J. Katto and M. Ohta, “Mathematical analysis of MPEG compression capability and its application to rate control,” in IEEE ICIP, 1995, pp. 555–558.
Myeong-jin Lee received the B.S., M.S., and Ph.D. degrees in electrical engineering from Korea Advanced Institute of Science and Technology (KAIST), Daejon, Korea, in 1994, 1996, and 2001, respectively. From 2001 to 2004, he was a Senior Engineer with the System LSI Biz., Samsung Electronics, Gyeonggi, Korea. Since 2004, he has been a Faculty Member of Kyungsung University, Busan, Korea, where he is an Assistant Professor in the Department of Electrical and Electronic Engineering. His current research interests are in the areas of video coding and multimedia communication systems. Prof. Lee received the Samsung Human Tech Thesis Prize in 2001. He is a member of the IEEE and the Korean Institute of Communication Sciences.