International Journal of Innovative Computing, Information and Control Volume 5, Number 7, July 2009

ICIC International ©2009 ISSN 1349-4198 pp. 1-IHMSP07-07

FUZZY JOINT ENCODING AND STATISTICAL MULTIPLEXING OF MULTIPLE VIDEO SOURCES WITH INDEPENDENT QUALITY OF SERVICES FOR STREAMING OVER DVB-H

Mehdi Rezaei¹, Imed Bouazizi² and Moncef Gabbouj¹

¹Department of Signal Processing, Tampere University of Technology, P.O.Box 553, FI-33101 Tampere, Finland
[email protected]; [email protected]

²Nokia Research Center, Tampere, Finland
[email protected]

Received February 2008; revised July 2008

Abstract. A novel fuzzy joint video encoding and statistical multiplexing (StatMux) method for broadcasting over DVB-H (Digital Video Broadcasting for Handhelds) channels is proposed to decrease end-to-end delay in a broadcast system. DVB-H uses a time-sliced transmission scheme to reduce the power consumed by radio reception in handheld receivers. Due to the time-slicing scheme in DVB-H, the channel changing delay, i.e. the delay in changing from one audio-visual service to another, and therefore the end-to-end delay become significant. The proposed video encoding method decreases the buffering delays that constitute the major part of the end-to-end delay by implementing StatMux over video services. Unlike conventional methods, the proposed method allows the multiplexed services to have independent bit rates and quality of services. Moreover, the computational complexity of the proposed method is much lower than that of conventional methods. Although the proposed method has been designed and tested for a DVB-H broadcast system, it can be deployed in other video broadcast systems in which a number of video services are encoded and broadcast simultaneously. Simulation results show that the proposed method can considerably decrease end-to-end delay without any cost in the overall quality of compressed video.

Keywords: Broadcasting, Fuzzy logic control, Rate control, Statistical multiplexing, Streaming, Video coding

1. Introduction. Digital Video Broadcasting for Handheld terminals (DVB-H) is an ETSI specification for delivering broadcast services to battery-powered handheld receivers [1-4]. DVB-H is mainly based on the DVB-T specification for digital terrestrial television. However, it adds a number of features designed to take into account the limited battery life of handheld devices and the particular environments in which such receivers operate [5,6]. Services used in mobile handheld terminals require relatively low bit rates. The estimated maximum bit rate for streaming video using advanced compression technology like H.264/AVC is on the order of a few hundred kilobits per second. A DVB-T transmission system usually provides a bit rate of up to 8 Mbps or more. This makes it possible to significantly reduce the average power consumption of a DVB-H receiver by introducing a scheme based on time division multiplexing, called time-slicing. To reduce the power consumption in handheld terminals, the service data is time-sliced and then sent through the channel as bursts at a significantly higher bit rate than the bit rate of the audio-visual service. Time-slicing enables a receiver to stay active


during only a small fraction of the time, while receiving bursts of a requested service. It significantly reduces the power consumption used for radio reception. DVB-H also employs additional forward error correction to further improve mobile and indoor reception performance of DVB-T.

Channel changing delay in DVB-H refers to the time between the start of switching to a new channel and the start of the media rendering [7]. Channel changing delay includes several parts: Arrival Delay (delay until the arrival of the desired burst), Reception Delay (reception duration of the desired burst), Decapsulation Buffering Delay, Decoder Refresh Delay (delay to the first random access point), and Decoder Buffering Delay (initial buffering period of the Coded Picture Buffer). The decapsulation buffering delay includes two buffering delays, one for the Multi-protocol Decapsulation Buffer (MDB) and one for the RTP (Real-time Transport Protocol) Decapsulation Buffer (RDB). The decapsulation buffering delay is required to compensate for the variations of burst size and the decoder buffering delay is needed to compensate for the variations of bit rate. Moreover, another delay is needed for synchronization between the associated streams (e.g. audio and video).

One of the significant factors in channel changing delay is the arrival delay. The arrival delay depends on the time-slicing parameters that define the power consumption of DVB-H receivers. The lower the receiver power consumption, the higher the arrival delay. Another factor in channel changing delay is the delay required to compensate for the variation in bit rate. For video streaming over a DVB-H system, the advantages of variable bit rate video are exploited. For most video contents, a variable bit rate (VBR) video can provide better visual quality and coding efficiency than a constant bit rate video [8]. Higher quality and compression performance can be obtained by allowing more variation in bit rate, at the cost of a higher buffering delay.
When VBR bit streams are broadcast in DVB-H, statistical multiplexing (StatMux) is beneficial for reducing end-to-end delay and maximizing the utilization of transmission bandwidth. StatMux in DVB-H can be implemented in conjunction with encoding at the encoders, where a number of services are encoded and broadcast simultaneously, and/or in conjunction with time-slicing at a network element called the IP Multiprotocol Encapsulator. Depending on the implementation method, StatMux can affect different parts of the end-to-end delay. Implementation of StatMux at the IP encapsulator is out of the scope of this paper. The proposed joint encoding and StatMux method affects the buffering delay that is required at the IP encapsulator before encapsulation and the decoder initial buffering delay that is required before decoding.

The major problem of joint video encoding and StatMux is how to allocate the available bit budget among the video sources that share the common channel bandwidth and are jointly encoded. The conventional joint encoding methods follow two main approaches: forward analysis and modeling. In forward analysis, preprocessing is performed on the video sources to gather statistics about the coding complexity, and the actual coding process then operates based on these statistics. In the modeling approach, the performance of the video encoder and the coding complexity of the video sources are first modeled, and the bits allocated to the video sources are then controlled based on these models, which are updated during encoding. See the methods proposed in [9-11] as examples of the two approaches. The system presented in [9] consists of several preprocessors and video encoders. Each preprocessor analyzes a video source and derives picture statistics. Using these statistics, a joint rate controller dynamically calculates the bit rate for each encoder based on the relative complexities of the sources. Another bit allocation method for joint coding of multiple video sources is presented in [10]. In this method, the input video sources are divided into Super GOPs (a number of GOPs) and Super Frames (a set of frames, one from each source) and then, the


bit budget is distributed hierarchically between the video super GOPs, super frames and frames according to their relative complexities, while the encoder and decoder buffers are prevented from overflowing and underflowing. Finally, using a rate-distortion model, a quantization parameter is calculated for each frame according to the bits allocated to the frame. An approach similar to that in [10] is presented in [11].

In this paper we propose a novel fuzzy joint encoding and StatMux method that is different from conventional methods. The proposed method does not use any preprocessing or model for controlling. It utilizes a number of fuzzy controllers to control the bit rates of the encoded bit streams simultaneously in real time and to decrease the buffering delays and therefore the end-to-end delay of the broadcast system. The proposed joint video encoding and multiplexing method has been built on a combination and modification of the rate control methods proposed in [12,13] and [14] for independent and joint video encoding, respectively.

The paper is organized as follows: Section 2 presents an overview of the proposed method. Detailed information about the proposed method is presented in Section 3 and Section 4. Simulation results of the proposed method are provided in Section 5. The paper ends with conclusions in Section 6.

Figure 1. Block diagram of proposed joint encoding system

2. Proposed Joint Video Encoding and Statistical Multiplexing Method. The proposed joint encoding and StatMux method is implemented by a joint video rate control system. Figure 1 shows the simplified block diagram of the rate control system. A number of video sources are encoded simultaneously, each by one encoder. The rate control system utilizes a number of fuzzy controllers to control the bit rate of each encoded bit stream and also the bit rate of the aggregate bit stream. The proposed system is a real-time control system without any look-ahead or preprocessing. By utilizing fuzzy controllers, it has a very low degree of complexity in comparison to the conventional methods. Unlike the conventional methods, in the proposed method the multiplexed bit streams can have independent quality of services and different bit rates. The proposed method can be tuned to allow only short-term exchanges of bit budget between bit streams, in which case long-term exchanges of bits between the bit streams are prevented. In this case, the average quality of each encoded bit stream remains the same as in the independent encoding case. Alternatively, it can be tuned to allow long-term exchanges of bit budget between the bit streams, similar to the conventional methods.

According to the proposed method, an independent video rate controller (IRC) is used for encoding each video bit stream to guarantee a VBR bit stream with an average bit rate and a buffering constraint. The encoded bit streams are multiplexed and moved to


a virtual joint buffer. The data is removed from the joint buffer at a constant bit rate appropriate to the target bandwidth of the transmission channel. The occupancy of the joint buffer is used as a feedback signal by a joint rate controller (JRC). The JRC controls the variations in the bit rate of the aggregate bit stream to guarantee a limited buffering delay. Smaller variations in the bit rate of the aggregate bit stream mean smaller buffering delays. The proposed method not only decreases the decoder buffering delay by allocating a variable bandwidth instead of a fixed bandwidth to each bit stream, but also decreases the buffering delay that is required before the start of encapsulation at the IP encapsulator of the DVB-H network. These two delays are related to each other such that any reduction in the decoder buffering delay means a similar reduction in the buffering delay at the IP encapsulator.

The video encoders are controlled by adjusting the Quantization Parameter (QP) on a picture basis. The QP is mainly controlled by the IRCs, while the JRC adds a small positive or negative value to the QP values determined by the IRCs according to the occupancy of the joint buffer and the bit rate of the aggregate bit stream as

$$Q_n = Q_{IRC_n} + \Delta Q_J \qquad (1)$$

where $Q_n$ denotes the QP used by the $n$th encoder, $Q_{IRC_n}$ is the QP calculated by the $n$th independent rate controller, and $\Delta Q_J$ represents the output of the JRC. More details about the IRCs and the JRC are presented in Section 3 and Section 4, respectively.

3. Fuzzy Independent Rate Controller. The IRC controls the bit rate of a bit stream by adjusting the QP on a picture basis. It utilizes a fuzzy rate controller and several other tools to calculate the QP for different types of video pictures. Although only intra-prediction pictures (I-pictures) and reference inter-prediction pictures (P-pictures) are explained here, the algorithm is easily extended to support other types of pictures as well. The IRC utilizes a virtual buffer to impose a buffering constraint on the bit stream. The IRC can be functionally divided into two main parts. The first part utilizes the fuzzy controller to compute the QP of P-pictures. The second part uses other feedback signals from the uncompressed and compressed video to calculate the QP of I-pictures. The I-pictures at scene boundaries are treated differently from the normal I-pictures at the periodic random access points. In VBR, the bit allocation to I-pictures has a remarkable impact on the overall rate-distortion (RD) performance. Therefore, the QP of I-pictures should be computed very carefully. The key point in the proposed IRC is to prevent unnecessary variations in quality while the buffer constraint is observed. Detailed information about the calculation of QP is presented in the sequel.

Figure 2. Block diagram of the fuzzy IRC


3.1. Output of IRC for P-pictures. The output of the IRC is the main part of the QP that is used for encoding a video picture, as in (1). The outputs of the IRC ($Q_{IRC_n}$) for P- and I-pictures are denoted as $Q_P$ and $Q_I$, respectively, in the sequel. The output of the IRC for P-pictures is defined by the fuzzy controller. Figure 2 depicts the block diagram of the proposed rate control system for P-pictures. The fuzzy controller and the virtual buffer (Buf.1 to Buf.N in Figure 1) are the basic elements of the control system. The fuzzy controller attempts to control the bit rate of the encoded bit stream by adjusting the variations of QP, while it has been optimized to prevent unnecessary fluctuations of QP. In the computation of QP, it is assumed that consecutive video frames have a similar degree of coding complexity (except at scene cuts). Therefore, the coding complexity of the previously encoded picture is used as an estimate of the coding complexity of the picture currently being encoded, and the QP of the current picture is computed from the QP of the previously encoded picture with a small variation defined by the fuzzy controller. The fuzzy controller uses two feedback signals, one about the buffer fullness and one about the bit rate. Furthermore, a low pass filter (LPF) smooths the feedback signal about the bit rate in order to smooth the variations in the output of the fuzzy controller. The output of the IRC for the current P-picture ($Q_P$) is the sum of the QP used for encoding the previous picture and the output of the fuzzy controller ($\Delta Q_F$):

$$Q_P(i) = Q_P(i-1) + \Delta Q_F(i) \qquad (2)$$

From the system point of view, the main part of the IRC output is the delayed version of the previous output, and the variation of the IRC output is adjusted by the fuzzy controller. In this approach, the RD performance of the previously encoded pictures is in fact used as a reference point for encoding the next picture and, if necessary, a small adjustment relative to the reference point is computed. The main advantage of this approach is that in the small range around the reference point, all the nonlinear functions that exist in the system can be assumed to be linear without losing computational accuracy.

3.2. Virtual buffer. The virtual buffer used by the IRC simulates the buffering process of the decoder at the receiver side. Although it utilizes a simple model, it is nearly identical to the hypothetical reference decoder models used in different video coding standards. The occupancy of the virtual buffer is updated after encoding each video picture as follows:

$$O_B(i+1) = O_B(i) - B(i) + (R_T / F) \qquad (3)$$

where $O_B(i)$ denotes the occupancy of the buffer before encoding the $i$th picture, $B(i)$ is the number of bits consumed by the $i$th encoded picture (P or I), $R_T$ indicates the target average bit rate for the bit stream, and $F$ represents the video frame rate. Note that the virtual buffer models the decoder buffer at the receiver side. Therefore, the inputs to this buffer correspond to the outputs from a buffer that operates at the encoder side.

3.3. Fuzzy controller. A study of the conventional rate control approach shows that many heuristic functions coexist with nonlinear RD models in the rate control process. See the rate controller proposed in [15] as an example for VBR video. As a new approach, a fuzzy controller is selected for the proposed system because the nonlinear functions and the complexities that exist in the rate control task can be simply included in the fuzzy rules and fuzzy membership functions (MSFs). Generally, a fuzzy controller can be designed based on expert experience or it can learn from examples. Therefore, a fuzzy controller is a good option for making use of the many heuristic results for video rate control. Moreover, according to the block diagram shown in Figure 2, a controller is required to define a small quantized value based on rough measurements of the bit rate and buffer fullness. These properties make the task well suited to a fuzzy controller.
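The recursions in (2) and (3) can be illustrated with a minimal Python sketch. The function names, the clamping of QP to the range 0 to 51 (the H.264/AVC QP range) and the numeric values are assumptions made for illustration; they are not taken from the paper.

```python
def update_virtual_buffer(occupancy_bits, picture_bits, target_bitrate, frame_rate):
    """Apply (3): O_B(i+1) = O_B(i) - B(i) + R_T / F."""
    return occupancy_bits - picture_bits + target_bitrate / frame_rate


def next_p_picture_qp(previous_qp, delta_qf, qp_min=0, qp_max=51):
    """Apply (2): Q_P(i) = Q_P(i-1) + dQ_F(i).

    Clamping to [qp_min, qp_max] is an assumption added here for safety.
    """
    return max(qp_min, min(qp_max, previous_qp + delta_qf))


# Illustrative numbers: 300 kb/s at 15 fps gives a budget of 20000 bits per picture.
occupancy = 100_000.0
for bits in (30_000, 15_000, 15_000):
    occupancy = update_virtual_buffer(occupancy, bits, 300_000, 15)

qp = next_p_picture_qp(previous_qp=30, delta_qf=2)
```

A picture that consumes more than the per-picture budget lowers the modeled decoder buffer occupancy, and a cheaper picture raises it again.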


The fuzzy controller has two input signals that are normalized values of the buffer occupancy and the bit rate of P-pictures. The buffer occupancy is normalized by the buffer size and the bit rate of P-pictures is normalized by the target bit rate for P-pictures. In VBR, the bit budget consumed by P-pictures can be very different from that consumed by I-pictures; depending on the frequency of I-pictures in the bit stream, the target bit rate of P-pictures can therefore differ considerably from the overall target bit rate. For this reason, a precise value for the target bit rate of P-pictures is estimated and used for normalization. The fuzzy inputs are defined as

$$x_1 = O_B / S_B \qquad (4)$$

$$x_2 = \frac{B_P F}{R_T}\left(1 + \frac{X_{IP} - 1}{I_I}\right) \qquad (5)$$

where $S_B$ is the buffer size and $B_P$ denotes the bit budget consumed by the previously encoded P-picture. $I_I$ stands for the interval of periodic I-pictures in the bit stream in terms of the number of pictures. $X_{IP}$ indicates the coding complexity of I-pictures relative to P-pictures and is computed as

$$X_{IP} = \bar{B}_I / \bar{B}_P \qquad (6)$$

where $\bar{B}_I$ and $\bar{B}_P$ denote the average bit budgets consumed by the encoded I-pictures and P-pictures, respectively, in the current scene. If the previously encoded picture is an intra picture, the value of $B_P$ in (5) is reset to $B_I / X_{IP}$. To suppress the fluctuations of QP that result from short-term variations in the complexity of video pictures, the low pass filter (LPF) smooths the variation of $B_P$ before it is input to the fuzzy controller. The transfer function of the LPF is

$$H(z) = \frac{m}{m + 1 - z^{-1}} \qquad (7)$$

where $m$ is a constant; good results are obtained with $m = 1.2$. All the fuzzy rules used are summarized in Table 1. The content of the table specifies the output of the fuzzy controller. The letters H, L, M and V correspond to the fuzzy descriptors High, Low, Medium and Very, respectively. The descriptor Very is repeated to make new descriptors, and the number before V shows the number of repetitions. As an example from the table: if $x_1$ is VL and $x_2$ is H, then the output is 3VH (Very Very Very High). The input signals are specified by their fuzzy membership functions. Nine and seven membership functions have been used for the two inputs $x_1$ and $x_2$, respectively. The fuzzy rules and membership functions have been designed based on experience from our previous rate control algorithms presented in [12-17]. The asymmetric structures in the table of fuzzy rules and in the fuzzy MSFs are related to a number of facts that affect the operation of the rate controller. The nonlinearity of the RD function and the difference between the bit budgets of I- and P-pictures are two key points that cause the asymmetry in the structures. The other key point is that the gain of the control loop is a function of the buffer conditions. A more aggressive control is required when the buffer fullness is close to critical conditions, to prevent underflow and overflow, and a looser control is preferred when the buffer fullness is far from the critical conditions, to prevent unnecessary variations in the quality of the encoded video. After the preliminary design of the fuzzy system, an optimization process was performed to fine tune the fuzzy membership functions. In the optimization process several parameters, including average bit rate, average PSNR, average QP, and standard deviation of PSNR, were considered. The final shapes of the membership functions are shown in Figure 3. The desired central values for the output of the fuzzy system, corresponding to the fuzzy rules in Table 1, are given in Table 2.
A well-known and simple fuzzy system with two inputs using a product inference engine, singleton fuzzifier, and center-average defuzzifier, as in [18], was used.
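As an illustration of how the controller inputs are prepared, the normalization in (4)-(5) and the smoothing filter in (7) can be sketched in Python. The class and function names and the example numbers are hypothetical. The difference equation is obtained from $H(z)$ as $(m+1)\,y[n] - y[n-1] = m\,x[n]$, which has unity gain for a constant input.

```python
class LowPassFilter:
    """First-order recursive filter with H(z) = m / (m + 1 - z^-1)."""

    def __init__(self, m=1.2, initial=0.0):
        self.m = m
        self.state = initial  # previous output y[n-1]

    def step(self, x):
        # (m + 1) * y[n] - y[n-1] = m * x[n]
        self.state = (self.m * x + self.state) / (self.m + 1.0)
        return self.state


def fuzzy_inputs(occupancy, buffer_size, bp_smoothed, frame_rate,
                 target_bitrate, x_ip, i_interval):
    """Return (x1, x2) as defined in (4) and (5)."""
    x1 = occupancy / buffer_size
    x2 = (bp_smoothed * frame_rate / target_bitrate) * (1.0 + (x_ip - 1.0) / i_interval)
    return x1, x2


# Hypothetical example: 300 kb/s, 15 fps, an I-picture every 15 pictures,
# and I-pictures costing four times as many bits as P-pictures.
lpf = LowPassFilter(m=1.2, initial=16_000.0)
bp = lpf.step(18_000.0)
x1, x2 = fuzzy_inputs(occupancy=50_000, buffer_size=100_000, bp_smoothed=bp,
                      frame_rate=15, target_bitrate=300_000, x_ip=4.0, i_interval=15)
```

With these definitions, $x_2 = 1$ means that P-pictures consume exactly their share of the target bit rate once the extra cost of periodic I-pictures has been accounted for.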

$$f(x_1, x_2) = \frac{\sum_{i_1=1}^{N_1} \sum_{i_2=1}^{N_2} y^{i_1 i_2}\, \mu_{A_1^{i_1}}(x_1)\, \mu_{A_2^{i_2}}(x_2)}{\sum_{i_1=1}^{N_1} \sum_{i_2=1}^{N_2} \mu_{A_1^{i_1}}(x_1)\, \mu_{A_2^{i_2}}(x_2)} \qquad (8)$$

where $f(x_1, x_2)$ denotes the approximated output and $\{A_i^1, A_i^2, \cdots, A_i^{N_i}\}_{i=1,2}$ are fuzzy sets with membership functions $\{\mu_{A_1^{i_1}}(x_1)\}_{1 \le i_1 \le N_1}$ and $\{\mu_{A_2^{i_2}}(x_2)\}_{1 \le i_2 \le N_2}$ defined for the inputs $x_1$ and $x_2$, respectively. The desired central outputs are denoted by $y^{i_1 i_2}$. More information about the derivation steps of the fuzzy system is presented in [18] and [19]. The output of the fuzzy system is passed through a gain control block that adaptively tunes the gain of the feedback loop according to the buffer size and the video content properties as

$$\Delta Q_F = \alpha \times R_T / S_B \times f(x_1, x_2) \qquad (9)$$

where $\alpha$ is a coefficient which can be used for fine tuning of the rate control algorithm (RCA) according to the video content properties.
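The inference in (8) and the gain scaling in (9) can be sketched in Python as follows. This is only a sketch under stated assumptions: triangular membership functions are assumed here for simplicity, whereas the tuned, asymmetric shapes used in the paper are those of Figure 3. The value of $\alpha$ and the two-set example are hypothetical.

```python
def triangular(x, left, center, right):
    """Triangular membership value (an assumed shape, not the tuned MSFs)."""
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)


def fuzzy_output(x1, x2, msfs1, msfs2, centers):
    """Evaluate (8): product inference with center-average defuzzification.

    msfs1 and msfs2 are lists of (left, center, right) triples for x1 and x2;
    centers[i1][i2] is the desired central output of the corresponding rule.
    """
    num = 0.0
    den = 0.0
    for i1, p1 in enumerate(msfs1):
        mu1 = triangular(x1, *p1)
        for i2, p2 in enumerate(msfs2):
            weight = mu1 * triangular(x2, *p2)
            num += centers[i1][i2] * weight
            den += weight
    # Returning 0 when no rule fires is an assumption of this sketch.
    return num / den if den > 0.0 else 0.0


def delta_qf(x1, x2, msfs1, msfs2, centers, alpha, target_bitrate, buffer_size):
    """Evaluate (9): scale the fuzzy output by alpha * R_T / S_B."""
    return alpha * target_bitrate / buffer_size * fuzzy_output(x1, x2, msfs1, msfs2, centers)


# Tiny hypothetical system with two sets per input covering [0, 1].
msfs = [(-1.0, 0.0, 1.0), (0.0, 1.0, 2.0)]
centers = [[2.0, 0.0], [0.0, -2.0]]
step = delta_qf(0.0, 0.0, msfs, msfs, centers,
                alpha=0.5, target_bitrate=300_000, buffer_size=150_000)
```

Because the output is a weighted average of the rule centers, it varies smoothly between neighboring rules instead of jumping from one table entry to the next.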

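Table 1. Summary of the IF-THEN fuzzy rules for IRCs (rows: $x_2$, columns: $x_1$)

| x2 \ x1 | XL | VVL | VL | L | ML | M | MH | H | VH |
|---|---|---|---|---|---|---|---|---|---|
| VH | 6VH | 5VH | 4VH | 3VH | VVH | VH | H | MH | M |
| H | 5VH | 4VH | 3VH | VVH | VH | H | MH | M | ML |
| MH | 4VH | 3VH | VVH | VH | H | MH | M | ML | L |
| M | 3VH | VVH | VH | H | MH | M | ML | L | VL |
| ML | VVH | VH | H | MH | M | ML | L | VL | VVL |
| L | VH | H | MH | M | ML | L | VL | VVL | 3VL |
| VL | H | MH | M | ML | L | VL | VVL | 3VL | 4VL |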

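Table 2. Desired central values of fuzzy output in IRCs (rows: $x_2$, columns: $x_1$)

| x2 \ x1 | 3VL | VVL | VL | L | ML | M | MH | H | VH |
|---|---|---|---|---|---|---|---|---|---|
| VH | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| H | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | −1 |
| MH | 6 | 5 | 4 | 3 | 2 | 1 | 0 | −1 | −2 |
| M | 5 | 4 | 3 | 2 | 1 | 0 | −1 | −2 | −3 |
| ML | 4 | 3 | 2 | 1 | 0 | −1 | −2 | −3 | −4 |
| L | 3 | 2 | 1 | 0 | −1 | −2 | −3 | −4 | −5 |
| VL | 2 | 1 | 0 | −1 | −2 | −3 | −4 | −5 | −6 |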
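The entries of Table 2 follow a regular pattern: each step toward VH in $x_1$ or toward VL in $x_2$ lowers the central value by one. A short Python sketch of this observation is given below. The closed-form expression is inferred from the table and is not stated in the paper, and the label lists and function name are hypothetical.

```python
# Column labels (x1) and row labels (x2) in the order used by Table 2.
X1_LABELS = ["3VL", "VVL", "VL", "L", "ML", "M", "MH", "H", "VH"]
X2_LABELS = ["VH", "H", "MH", "M", "ML", "L", "VL"]


def central_value(x1_label, x2_label):
    """Desired central output for the rule (x1_label, x2_label) of Table 2."""
    return 8 - X2_LABELS.index(x2_label) - X1_LABELS.index(x1_label)


# The rule quoted in the text: x1 is VL and x2 is H.
example = central_value("VL", "H")
```

For the rule quoted in the text (x1 is VL, x2 is H, output 3VH), the corresponding entry of Table 2 is 5, which is what the function returns.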

3.4. Output of IRC for I-pictures. The output of the IRC for an I-picture is computed based on the picture coding complexity and scene cut information. There are two types of I-pictures in the bit stream: periodic I-pictures, which are placed at a constant frequency, and I-pictures which are inserted at the beginning of scenes (scene cuts). The output of the IRC for both types of I-pictures is formulated as

$$Q_I = Q_R + \Delta Q_X \qquad (10)$$


where $Q_I$ denotes the output and $Q_R$ is a reference value for $Q_I$ that is computed differently for the two types of I-pictures. The variation of $Q_I$ around the reference $Q_R$ is imposed by a controlling signal $\Delta Q_X$. The controlling signal adapts the QP of I-pictures according to the coding complexity of the video frame. While $Q_R$ defines a reference value for the QP, the controlling signal makes small variations around the reference value. More details about the controlling signal and the reference $Q_R$ are presented in the sequel.

Figure 3. Membership functions of the IRCs fuzzy inputs

3.4.1. Reference value. The reference value ($Q_R$) is handled differently for the two types of I-pictures. For the periodic I-pictures, where the subsequent pictures have a high degree of correlation in terms of content and complexity, the idea is to have I-pictures with a quality as close as possible to the quality of the neighboring pictures. Applying a low pass filter similar to (7) to the QP of the previously encoded pictures gives a local average value which can be used as the reference value for the current I-picture. However, using a similar QP for encoding the I-picture and the neighboring P-pictures results in a higher quality for the I-picture than for the P-pictures. This difference is acceptable and it is useful for overall quality. The low pass filter prevents larger differences that might otherwise exist between the quality of the I-picture and the P-pictures.

The I-pictures at scene cuts may or may not be correlated in terms of complexity and/or content with the previously encoded pictures. Therefore, an estimate that ignores the previously encoded frames, or one that relies only on them, may waste bit budget or degrade quality. From this point of view, estimating a suitable QP for an I-picture at a scene cut is quite challenging. As a simple solution, the reference value for the first I-picture at a scene cut is calculated as

$$Q_R = (\bar{Q} + Q_m)/2 \qquad (11)$$

where $\bar{Q}$ is a local average as for periodic I-pictures and $Q_m$ is a constant QP in the middle range, e.g. (26-34) for H.264/AVC, as a global average over various video contents. The local average $\bar{Q}$ keeps the quality of the I-picture close to that of the previously encoded pictures when there is some correlation between the two consecutive scenes in terms of content. $Q_m$ guarantees the allocation of a bit budget in the middle range if there is no correlation between consecutive scenes in terms of coding complexity.
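A minimal Python sketch of the reference value selection and of (10)-(11) is given below. The function names are hypothetical, and the default $Q_m = 30$ is an assumed value chosen from the quoted 26-34 range. The adaptation term $\Delta Q_X$ is simply passed in as a number.

```python
def reference_qp(local_average_qp, at_scene_cut, q_mid=30.0):
    """Reference QP for an I-picture.

    Periodic I-pictures use the local average of previous QPs; the first
    I-picture of a scene averages it with a mid-range constant, as in (11).
    """
    if at_scene_cut:
        return (local_average_qp + q_mid) / 2.0
    return local_average_qp


def i_picture_qp(q_ref, delta_qx):
    """Apply (10): Q_I = Q_R + dQ_X."""
    return q_ref + delta_qx


periodic = i_picture_qp(reference_qp(36.0, at_scene_cut=False), delta_qx=0.0)
scene_cut = i_picture_qp(reference_qp(36.0, at_scene_cut=True), delta_qx=0.0)
```

In this example the scene-cut I-picture is pulled from a local average of 36 toward the mid-range value, giving 33, while the periodic I-picture keeps the local average.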


3.4.2. Coding complexity adaptation. The complexity adaptation signal $\Delta Q_X$ controls the output of the IRC for the I-picture around the reference value according to the coding complexity of the picture. To compute the small variations around the reference value with low complexity, it is accurate enough to use a simple first-order RD model,

$$R = X/D \qquad (12)$$

where $R$ and $D$ denote the rate and distortion, respectively, and $X$ stands for the coding complexity. For small variations of QP around the reference point, an approximately linear relation between QP and distortion can be assumed, and the RD model above can be rewritten as

$$R = X/Q \qquad (13)$$

Using the RD model (13) and considering the average values of QP and complexity of I-pictures as a reference point, the complexity adaptation signal based on a drift from the reference can be derived as

$$\Delta Q_X = A_X \bar{Q}_I \left(\frac{X}{\bar{X}} - 1\right) \qquad (14)$$

where $\bar{Q}_I$ denotes the average QP of all encoded I-pictures. $A_X$ is an experimentally defined constant (typically about 0.3) called the complexity adaptation factor. $X$ denotes the coding complexity of the I-picture and $\bar{X}$ stands for the average coding complexity over all encoded I-pictures. Various criteria, such as variance, can be used to estimate the coding complexity. An accurate measure of the coding complexity of I-pictures was proposed in our previous work [16].

4. Fuzzy Joint Rate Controller. The JRC produces an output that is added to the IRC outputs to compute the QPs used by the encoders. The output of the JRC modifies the QPs to control the variations in the bit rate of the aggregate bit stream. It utilizes a fuzzy controller with feedback from the occupancy of the joint buffer and from the bit rate of the aggregate bit stream. More details about the JRC are presented in the sequel.

4.1. Virtual joint buffer. The joint virtual buffer operates similarly to the individual buffers used by the IRCs, but it is used for the aggregate bit stream. The buffer occupancy is updated after encoding a series of corresponding pictures (the $m$th picture of each source) as

$$O_{JB}(m) = O_{JB}(m-1) - \sum_{i=1}^{N} B_i + \sum_{i=1}^{N} R_i / F \qquad (15)$$

where $O_{JB}$ denotes the joint buffer occupancy and $B_i$ represents the bits consumed by the encoded picture of the $i$th source. $R_i$ indicates the target bit rate of the $i$th bit stream and $F$ stands for the frame rate of the bit streams.

4.2. Joint rate controller. The output of the JRC is defined by a fuzzy controller. Since each bit stream uses an independent controller with a buffer constraint, the multiplexed bit stream is, without any other control, constrained by a joint buffer with a size equal to the sum of the sizes of the individual buffers used by the IRCs. The idea is to use a virtual buffer with a size $S_{JB}$ as small as possible and to let the JRC operate only when the buffer condition is critical. The fuzzy controller has been designed in such a way that the JRC has a non-zero output only when the buffer state is critical. This


minimizes the interaction between encoders and also minimizes the variations in the quality of each bit stream. The fuzzy controller has two input signals:

$$y_1 = O_{JB} / S_{JB} \qquad (16)$$

$$y_2 = \frac{F \sum_{i=1}^{N} B_i}{\sum_{i=1}^{N} R_i} \qquad (17)$$

where $S_{JB}$ denotes the size of the joint buffer.
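The joint buffer update in (15) and the JRC inputs in (16)-(17) can be sketched in Python as follows. The function names and the numeric values are hypothetical; the example uses four streams at 300 kb/s and 15 fps only because that matches the simulation setup.

```python
def update_joint_buffer(occupancy, picture_bits, target_bitrates, frame_rate):
    """Apply (15) for one set of corresponding pictures, one from each source."""
    return occupancy - sum(picture_bits) + sum(target_bitrates) / frame_rate


def joint_inputs(occupancy, joint_buffer_size, picture_bits, target_bitrates, frame_rate):
    """Return (y1, y2) as defined in (16) and (17)."""
    y1 = occupancy / joint_buffer_size
    y2 = frame_rate * sum(picture_bits) / sum(target_bitrates)
    return y1, y2


rates = [300_000] * 4                     # target bit rates R_i in bits per second
bits = [25_000, 18_000, 22_000, 15_000]   # bits B_i of the m-th picture of each source
occupancy = update_joint_buffer(200_000.0, bits, rates, frame_rate=15)
y1, y2 = joint_inputs(occupancy, 400_000, bits, rates, frame_rate=15)
```

Here the four pictures together consume exactly the aggregate budget of 80000 bits, so the occupancy is unchanged and $y_2 = 1$, even though the individual streams deviate from their own per-picture budgets.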

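Table 3. Summary of the IF-THEN fuzzy rules for the JRC (rows: $y_2$, columns: $y_1$)

| y2 \ y1 | VL | L | ML | M | MH | H | VH |
|---|---|---|---|---|---|---|---|
| H | VH | H | MH | M | M | M | L |
| M | H | MH | M | M | M | ML | L |
| L | MH | M | M | M | ML | ML | VL |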

All the fuzzy rules are summarized in Table 3. The content of Table 3 specifies the output of the controller. The letters H, L, M and V correspond to the fuzzy descriptors High, Low, Medium and Very, respectively. As an example from the table: if $y_1$ is VL and $y_2$ is M, then the output is H. Seven and three MSFs have been used for the two inputs $y_1$ and $y_2$, respectively. The linguistic fuzzy rules and MSFs were designed based on theoretical and experimental results, as for the IRCs. Furthermore, an optimization process was performed for fine tuning of the fuzzy MSFs. The final shapes of the MSFs are shown in Figure 4. The desired central values for the output of the fuzzy system corresponding to VL, L, ML, M, MH, H and VH in Table 3 are −3, −2, −1, 0, 1, 2 and 3, respectively. A fuzzy system similar to (8) was used for the JRC. Moreover, the output of the fuzzy system is tuned adaptively according to the buffer size as

$$\Delta Q_J = \beta \times R / S_{JB} \times f(y_1, y_2) \qquad (18)$$

where $\beta$ is a constant coefficient, typically about 0.3. When the number of multiplexed services is small and the size of the joint buffer is also relatively small, it is useful to apply a time shift between the periodic IDR pictures across the bit streams to prevent unnecessary variations in QP and quality.

5. Simulation Results. To evaluate the performance of the proposed joint video encoding and multiplexing method, a set of simulations was performed. Four long (60-second) video sequences with a frame rate of 15 fps, QVGA picture format and different contents were encoded by two methods for a target bit rate of 300 kb/s each. First, the sequences were encoded independently by independent rate controllers such as the IRCs used in the proposed system. Second, they were encoded with the proposed joint rate control system. Then, encapsulation, transmission and reception in DVB-H were simulated on the two sets of encoded bit streams for a constant bit rate channel with a bandwidth of 1200 kb/s. The required decoder buffer size, the decoder buffering delay and the PSNR of the luminance component were measured for the two sets. The simulation results are presented in Table 4. The proposed method provides a 38% reduction in the required decoder buffer size and a 62% reduction in the decoder buffering delay at the expense of a 0.02 dB degradation in quality. Due to the symmetric operation of the decoder buffer and


the IP encapsulator buffer, the same percentage of reduction in the buffering delay of the IP encapsulator is expected.

Figure 4. Membership functions of the JRC fuzzy inputs

Table 4. Simulation results on 4 video sequences (S1, S2, S3 and S4), 300 kb/s for each, 15 fps, QVGA, 1200 kb/s channel bandwidth

Sequence S1 S2 S3 S4 Average

Independent Encoded PSNR Delay Buffer “dB” “s” “kbit” 38.86 0.56 269 41.41 0.37 190 37.02 0.39 211 40.08 1.25 477 39.35 0.65 287

Joint Encoded PSNR Delay Buffer “dB” “s” “kbit” 38.83 0.23 170 41.41 0.24 180 37.00 0.25 176 40.07 0.27 184 39.33 0.25 178
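The reductions quoted in the text follow directly from the Average row of Table 4. The short Python sketch below reproduces that arithmetic; the dictionary keys and the helper name `percent_reduction` are illustrative choices, not names from the paper.

```python
# Average row of Table 4 for independent and joint encoding.
AVG_INDEPENDENT = {"psnr_db": 39.35, "delay_s": 0.65, "buffer_kbit": 287}
AVG_JOINT = {"psnr_db": 39.33, "delay_s": 0.25, "buffer_kbit": 178}


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


buffer_reduction = percent_reduction(
    AVG_INDEPENDENT["buffer_kbit"], AVG_JOINT["buffer_kbit"]
)
delay_reduction = percent_reduction(
    AVG_INDEPENDENT["delay_s"], AVG_JOINT["delay_s"]
)
psnr_loss_db = AVG_INDEPENDENT["psnr_db"] - AVG_JOINT["psnr_db"]

print(
    f"buffer: {buffer_reduction:.1f}%  "
    f"delay: {delay_reduction:.1f}%  "
    f"PSNR loss: {psnr_loss_db:.2f} dB"
)
```

The buffer reduction comes out at roughly 38%, the delay reduction at about 61.5% (62% after rounding), and the PSNR loss at 0.02 dB, in agreement with the figures reported above.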

Sample graphical simulation results are depicted in Figure 5, which shows the frame size and the PSNR of a bit stream for the two cases: independent encoding and joint encoding. The PSNR graphs of the two cases are very similar; the bit rate and the PSNR change only when the buffer state is critical. The overall results show that the proposed method can control the bit rate of the aggregated bit stream without considerably affecting the video quality. Figure 6 shows sample reconstructed video frames encoded by the proposed method. The video sequences used for the simulations include many scene cuts with very different contents in terms of coding complexity and motion, which makes them very challenging for rate control. Notably, the well-known video sequences used in the standardization process cannot be used to evaluate the proposed method because they are short and have homogeneous contents.

Figure 5. Simulation results of independent and joint video encoding

Figure 6. Sample reconstructed video frames encoded by the proposed method

To evaluate the proposed method from the computational complexity point of view, the computational complexity of the IRC was compared with that of the rate control algorithm in the Joint Model (JM) of the H.264/AVC standard [20]. The first 100 video frames of four well-known video sequences, namely Foreman, Carphone, News and Hall, were encoded with the two rate control algorithms. The processing times consumed by the controllers were measured with high accuracy using the processor clock (Intel Pentium 4, 2.8 GHz). To minimize the measurement error caused by time sharing of the processor, the encoding was repeated 10 times and the minimum measured value was selected for each sequence. The measured results are shown in Table 5. Averaged over the video sequences, the JM rate controller consumes about 384 μs per frame, while the IRC consumes about 15 μs per frame, roughly 25 times less. These numbers can be scaled by the number of bit streams to estimate the computational complexity of the whole rate control system. According to these results, there is a big difference between the computational complexity of the proposed joint rate control method and that of similar conventional methods. The overall simulation results show that the proposed fuzzy joint encoding and multiplexing method can considerably decrease the end-to-end delay of a DVB-H broadcast system at a negligible cost in service quality (0.02 dB in average PSNR) and with very low computational complexity.

Table 5. Comparison of the IRC and JM rate controller: processing time in microseconds

| Sequence               | JM    | IRC  |
|------------------------|-------|------|
| Foreman                | 39327 | 1493 |
| Carphone               | 38136 | 1506 |
| News                   | 38116 | 1504 |
| Hall                   | 38379 | 1508 |
| Average Over Sequences | 38489 | 1503 |
| Average Per Frame      | 384   | 15   |
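The measurement procedure (repeat the run and keep the minimum) and the per-frame figures derived from Table 5 can be sketched in Python as follows. This is only an illustration under stated assumptions: the paper read the processor clock directly, whereas the sketch uses `time.perf_counter_ns` as a stand-in, and the names `min_elapsed_us` and `rate_control_cost_us` are hypothetical. The scaling covers only the per-stream controller cost reported in Table 5.

```python
import time


def min_elapsed_us(fn, repeats=10):
    """Run `fn` several times and return the smallest elapsed time in microseconds.

    Taking the minimum suppresses inflation caused by the processor
    being shared with other tasks during some of the runs.
    """
    best = None
    for _ in range(repeats):
        start = time.perf_counter_ns()
        fn()
        elapsed_us = (time.perf_counter_ns() - start) / 1000.0
        if best is None or elapsed_us < best:
            best = elapsed_us
    return best


# Totals for 100 frames, averaged over the four sequences (Table 5).
JM_TOTAL_US = 38489
IRC_TOTAL_US = 1503
FRAMES = 100

jm_per_frame_us = JM_TOTAL_US / FRAMES    # reported as about 384
irc_per_frame_us = IRC_TOTAL_US / FRAMES  # reported as about 15
jm_to_irc_ratio = JM_TOTAL_US / IRC_TOTAL_US


def rate_control_cost_us(per_frame_us, num_streams):
    """Scale a per-frame, per-stream cost to a multiplex of `num_streams`."""
    return per_frame_us * num_streams
```

For example, `rate_control_cost_us(15, 4)` estimates the controller cost for the four-stream multiplex used in the simulations at 60 μs per frame interval, and `jm_to_irc_ratio` evaluates to roughly 25.6.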

6. Conclusions. Utilizing fuzzy controllers, a method for joint video encoding and statistical multiplexing of multiple video bit streams was proposed. It decreases the end-to-end delay in a broadcast system in which a number of services are encoded and broadcast simultaneously. The proposed method exploits the advantages of statistical multiplexing while allowing the broadcast services to have independent bit rates and quality of service. It has low computational complexity, and it can decrease and control the buffering delays of the broadcast system at a negligible cost in the overall quality of the compressed video. Since delay and bandwidth are resources that can be traded against each other in a broadcast system, the proposed method can also be used to decrease the overall bandwidth consumed by the broadcast services.

Acknowledgment. This work was supported by Nokia and the Academy of Finland, Finnish Center of Excellence Program 2006-2011, under Project 213462.

REFERENCES
[1] ETSI, Digital video broadcasting (DVB): Transmission systems for handheld terminals, ETSI Standard, EN302304 V1.1.1, 2004.
[2] M. Kornfeld, DVB-H: The emerging standard for mobile data communication, Proc. of the IEEE Symp. on Consumer Electronics, pp.193-198, 2004.
[3] G. May, The IP datacast system-overview and mobility aspects, Proc. of the IEEE Symp. on Consumer Electronics, pp.509-514, 2004.
[4] G. Faria, J. A. Henriksson, E. Stare and P. Talmola, DVB-H: Digital broadcast services to handheld devices, Proc. of the IEEE, vol.94, no.1, 2006.
[5] ETSI, Digital video broadcasting (DVB): Framing structure, channel coding and modulation for digital terrestrial television, ETSI Standard, EN300744 V1.5.1, 2004.
[6] U. Ladebusch and C. A. Liss, Terrestrial DVB (DVB-T): A broadcast technology for stationary portable and mobile use, Proc. of the IEEE, vol.94, no.1, pp.183-193, 2006.
[7] M. Rezaei, I. Bouazizi, V. K. M. Vadakital and M. Gabbouj, Optimal channel changing delay for mobile TV over DVB-H, Proc. of the IEEE Conf. on Portable Information Devices, Orlando, USA, pp.1-5, 2007.
[8] T. V. Lakshman, A. Ortega and A. R. Reibman, VBR video: Tradeoffs and potentials, Proc. of the IEEE, vol.86, no.5, pp.952-973, 1998.
[9] L. Boroczky, A. Y. Ngai and E. F. Westermann, Joint rate control with look-ahead for multiprogram video coding, IEEE Transactions on Circuits and Systems for Video Technology, vol.10, no.7, pp.1159-1163, 2000.


[10] L. Wang and A. Vincent, Bit allocation and constraints for joint coding of multiple video programs, IEEE Transactions on Circuits and Systems for Video Technology, vol.9, no.6, pp.949-959, 1999.
[11] H. Xiong, J. Sun, S. Yu, C. Luo and J. Zhou, Design and implementation of multiplexing rate control in broadband access network TV transmission system, IEEE Transactions on Consumer Electronics, vol.50, no.3, pp.849-855, 2004.
[12] M. Rezaei, M. Gabbouj and I. Bouazizi, Delay constrained fuzzy rate control for video streaming over DVB-H, Proc. of the IEEE Conf. on Intelligent Information Hiding and Multimedia Signal Processing, Pasadena, California, USA, pp.223-227, 2006.
[13] M. Rezaei, M. M. Hannuksela and M. Gabbouj, Semi-fuzzy rate controller for variable bit rate video, IEEE Transactions on Circuits and Systems for Video Technology, vol.18, no.5, pp.633-645, 2008.
[14] M. Rezaei, I. Bouazizi and M. Gabbouj, Fuzzy joint encoding and statistical multiplexing of multiple video sources with independent quality of services for streaming over DVB-H, Proc. of the IEEE Conf. on Intelligent Information Hiding and Multimedia Signal Processing, Kaohsiung, Taiwan, pp.542-545, 2007.
[15] M. Rezaei, S. Wenger and M. Gabbouj, Video rate control for streaming and local recording optimized for mobile devices, Proc. of the IEEE Symp. on Personal Indoor and Mobile Radio Communications, Berlin, vol.4, pp.2284-2288, 2005.
[16] M. Rezaei, S. Wenger and M. Gabbouj, Analyzed rate distortion model in standard video codecs for rate control, Proc. of the IEEE Workshop on Signal Processing Systems, Athens, Greece, pp.550-555, 2005.
[17] M. Rezaei, M. M. Hannuksela and M. Gabbouj, Low-complexity fuzzy video rate controller for streaming, Proc. of the IEEE Conf. on Acoustic, Speech and Signal Processing, Toulouse, France, vol.2, pp.897-900, 2006.
[18] L. X. Wang, Adaptive Fuzzy Systems and Control: Design and Stability Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1994.
[19] L. X. Wang, Stable adaptive fuzzy control of nonlinear systems, IEEE Trans. Fuzzy Systems, vol.1, no.2, pp.146-155, 1993.
[20] G. Sullivan, T. Wiegand and K. P. Lim, Joint model reference encoding methods and decoding concealment methods, Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG, Document JVT-I049, San Diego, USA, 2003.
