Robust Video Source Coding for Noisy Channels

Qianfu Jiang and Steven D. Blostein

Department of Electrical and Computer Engineering Queen's University, Kingston, Ontario, K7L 3N6, Canada Tel: (613) 533-6561

Fax: (613) 533-6615

Email: [email protected], [email protected]

Abstract: In video communication systems based on motion-compensated prediction coding, transmission errors cause spatial and temporal distortion propagation during the reconstruction of the video sequence at the receiver. Two commonly used techniques to stop error propagation in the reconstructed sequence are periodic refreshing of the image by intra-frame coding and retransmission. However, the high bitrate (or low compression ratio) of intra-frame coding makes frequent intra-frame refreshing very expensive, if not impossible, in band-limited applications such as wireless video transmission. Retransmission increases the bitrate and adds delay, which is intolerable in real-time applications. In this paper, we present a novel coding mode for video transmission, which we call the transmitter receiver identical reference frame (TRIRF) based inter-frame coding mode. Assuming that a feedback channel exists, TRIRF-frame coding constructs a new type of reference frame from the correctly received data; this frame is identical at the receiver and the transmitter. Motion estimation and compensation are based on the TRIRF-frame. Simulations show that TRIRF-based inter-frame coding can prevent error propagation as effectively as intra-coding while improving compression efficiency.

1 Introduction

Present-day video coding techniques are very efficient in compressing data, but they are also highly sensitive to transmission errors. To exploit the temporal redundancy in a video image sequence, predictive coding is used: the current frame is coded as a difference signal with respect to the previous frame. Transmission errors occurring in one frame propagate to the following frames because the predicted values of the pixels in the current frame are changed. If motion compensation is used in prediction, errors occurring in one position can result in errors in other positions, which means that errors propagate both spatially and temporally in the reconstructed image sequence. This problem becomes severe for video transmission over wireless channels, which have higher error rates than wireline channels. Fig. 1 shows an example of spatial-temporal error propagation in motion-compensated prediction coding. The shaded areas denote corrupted pixels.

Figure 1: An example of spatial-temporal error propagation in motion-compensated prediction coding (frames t, t+1 and t+2).

This work was supported by Communications and Information Technology Ontario (CITO).

To stop error propagation, a common technique [1][2] is to periodically switch from motion-compensated prediction coding to intra-frame coding to refresh the image plane. Unfortunately, intra-frame coding typically requires many more bits than inter-frame coding because no temporal correlation is exploited. This makes frequent intra-frame refreshing very expensive, if not impossible, in band-limited applications such as wireless video transmission.

When a feedback channel is available and the receiver can detect errors (using header information, synchronization codes, forward error correction codes, etc.), the locations of the corrupted regions can be detected and sent back to the transmitter. The corresponding lost data is then retransmitted [4]. However, retransmission increases the bitrate and causes additional delay, which is intolerable in applications such as interactive video and live broadcasting, especially over channels with large transmission delay.

In [3], the feedback information is used by the transmitter to reconstruct the spatial and temporal error propagation that occurs in the decoding process at the receiver. The regions in the current frame that are affected by transmission errors in previous frames are then determined. To avoid retransmission and to reduce the bit rate, intra-coding is performed only in those regions with severe visual distortion, while the other regions in the current frame are still inter-frame coded.

In this work, we propose a novel coding mode for video transmission, which we call the transmitter receiver identical reference frame (TRIRF) based inter-frame coding mode. TRIRF-frame coding performs motion estimation and compensation on a new type of reference frame, which is identical at the receiver and the transmitter and is updated based on the correctly received data. Experiments show that TRIRF-frame coding effectively prevents transmission error propagation and considerably reduces the bitrate compared to intra-frame coding.

2 TRIRF-frame Coding

2.1 Construction of the TRIRF-frame

Conventional motion-compensated prediction assumes error-free transmission: the reference frame used for motion estimation and compensation at the encoder closely matches the reference frame used for motion compensation at the decoder. This assumption, of course, is not valid when channel errors occur. Error-correcting codes can provide limited error protection but cannot guarantee error-free bit streams at the receiver. Intra-frame coding stops error propagation but results in a much higher bitrate. We propose TRIRF-frame coding, which provides a better trade-off between error resilience and compression efficiency.

Instead of using the previous frame as the reference frame for motion estimation and compensation, we propose to construct a new type of reference frame from the correctly received data, as indicated by the information sent through a feedback channel. The objectives are to keep the reference frames at the transmitter and the receiver identical and to exploit the temporal correlation between the correctly received data and the current frame as much as possible for predictive coding. The main idea is to keep updating this new reference frame on both sides at the same time according to the feedback information.

There are two basic assumptions for TRIRF-frame construction. The first is that a feedback channel exists and that the information sent by the receiver arrives at the transmitter without error. The second is that the receiver is capable of detecting transmission errors and can provide the locations of the corrupted regions in the reconstructed image. Depending on the error detection capability available, the location reported could be that of a slice (a group of blocks) or of a single block.

The TRIRF-frame is constructed as follows. We denote the round-trip delay by an integer $n_{rd}$, which equals one plus the number of frames encoded between the time when a frame is sent and the time when the feedback information about this frame comes back. In other words, at the time when the encoder is just about to code frame $t$, the feedback information about frame $t - n_{rd}$ has arrived and the feedback information about frame $t - n_{rd} + 1$ has not. The feedback information gives the locations of the regions (slices or blocks) in frame $t - n_{rd}$ that were correctly received. These regions then replace the spatially corresponding regions in the TRIRF-frame. The other regions in the TRIRF-frame remain as before. Once the TRIRF-frame is updated, motion estimation and compensation are performed on the TRIRF-frame, instead of on frame $t-1$ as in conventional coding schemes. Fig. 2 shows this process, where the frames are represented by line segments.
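The update rule above can be illustrated with a minimal Python sketch. It is not the authors' implementation: frames are modeled as lists of opaque slice labels rather than pixel data, the names `update_trirf` and `simulate` are hypothetical, and both sides are assumed to start from the same error-free initial frame. The sketch runs a transmitter and a receiver side by side and applies the feedback about frame $t - n_{rd}$ just before frame $t$ is coded.

```python
from collections import deque


def update_trirf(trirf, frame, corrupted):
    """Copy correctly received slices of `frame` into the TRIRF-frame.

    Slices whose index is in `corrupted` keep their previous TRIRF content.
    """
    return [old if i in corrupted else new
            for i, (old, new) in enumerate(zip(trirf, frame))]


def simulate(n_rd, num_frames, errors, num_slices=4):
    """Run transmitter and receiver side by side; return both TRIRF-frames.

    `errors` maps a frame index to the set of slice indices corrupted in it.
    """
    # Assumption: both sides start from the same error-free initial frame.
    tx_trirf = [f"init{i}" for i in range(num_slices)]
    rx_trirf = list(tx_trirf)
    tx_pending, rx_pending = deque(), deque()  # frames awaiting feedback

    for t in range(num_frames):
        if t >= n_rd:
            # Feedback about frame t - n_rd is available on both sides.
            bad = errors.get(t - n_rd, set())
            tx_trirf = update_trirf(tx_trirf, tx_pending.popleft(), bad)
            rx_trirf = update_trirf(rx_trirf, rx_pending.popleft(), bad)
        # Frame t would be predicted from the TRIRF-frame at this point.
        sent = [f"f{t}s{i}" for i in range(num_slices)]
        received = ["??" if i in errors.get(t, set()) else s
                    for i, s in enumerate(sent)]
        tx_pending.append(sent)
        rx_pending.append(received)

    return tx_trirf, rx_trirf, len(tx_pending)


tx, rx, stored = simulate(n_rd=3, num_frames=8, errors={1: {2}, 4: {0, 3}})
print(tx == rx, tx, stored)
```

In this toy run the two TRIRF-frames stay identical even though the receiver's copies of frames 1 and 4 contain corrupted slices, because corrupted slices are never copied into the reference. Each side holds $n_{rd}$ frames awaiting feedback in addition to the TRIRF-frame itself.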

Figure 2: TRIRF-frame updating and motion estimation and compensation for frame $t$. (Frames $t - n_{rd}$, $t - n_{rd} + 1$, ..., $t-1$ and $t$ are drawn as line segments; frame $t - n_{rd}$ updates the TRIRF-frame, on which motion estimation and compensation for frame $t$ are performed.)

Note that TRIRF-frame updating at the receiver should be based on the error information of frame $t - n_{rd}$ before decoding frame $t$, rather than on that of the just-received frame $t-1$. Otherwise, the reference frames on the two sides will not be the same. Since motion-compensated prediction is based on the TRIRF-frame, which is constructed at both the transmitter and the receiver and updated at the same time using the same information, synchronization between the two sides is maintained. Transmission errors occurring in one frame will not propagate to the following frames.

2.2 Performance of TRIRF-frame Coding

The TRIRF-frame contains regions from different reconstructed frames, which are at least $n_{rd}$ frames old. However, because of the transmission delay, these regions are the most recent data confirmed by the receiver on which the encoder can perform predictive coding. The efficiency of motion compensation is determined by the correlation between the TRIRF-frame and the current frame, which depends on the video content, the transmission delay and the channel conditions. For channels with large transmission delay (large $n_{rd}$), such as GEO-satellite channels whose round-trip delay is about one-quarter second, there may be much less correlation between the TRIRF-frame and the current frame than in conventional inter-frame prediction, and the compression efficiency may be considerably lower. However, since TRIRF-frame coding uses the successfully received information for prediction, the coding efficiency will still be higher than that of intra-frame coding, which does not exploit any of the available information at all. For channels with small transmission delay (small $n_{rd}$), the coding efficiency loss may be small. For video sequences with small or regular motion, e.g. head-and-shoulder images in videophone and video conference applications, the loss of coding efficiency may also be quite small even for large transmission delays.

In short, TRIRF-frame coding can achieve a better trade-off between compression efficiency and error resilience than conventional inter-frame coding and intra-frame coding. As a disadvantage, TRIRF-frame coding requires storing $n_{rd} + 1$ frames. To save memory, some frames can be stored in compressed form and a separate decoder can be used to obtain TRIRF reconstructed blocks.
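As a rough worked example of how $n_{rd}$ and the storage requirement scale with delay, the sketch below counts the frames encoded while feedback is in flight. It rests on assumptions not stated in the text: frames are encoded at uniform intervals, processing time is negligible, and the delay is not an exact multiple of the frame interval. The function names are hypothetical.

```python
import math


def round_trip_frames(round_trip_delay_s, frame_rate_fps):
    """Estimate n_rd: one plus the frames encoded while feedback is in flight.

    Assumes one frame is encoded every 1 / frame_rate_fps seconds.
    """
    frames_in_flight = math.floor(round_trip_delay_s * frame_rate_fps)
    return 1 + frames_in_flight


def frames_to_store(n_rd):
    """Frame buffers needed: n_rd frames awaiting feedback plus the TRIRF-frame."""
    return n_rd + 1


n_rd = round_trip_frames(0.25, 10)  # quarter-second delay, 10 frames/sec
print(n_rd, frames_to_store(n_rd))
```

Under these assumptions, a quarter-second round trip at 10 encoded frames per second gives $n_{rd} = 3$ and four stored frames; a higher frame rate over the same link increases both numbers.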

3 Increasing Compression Efficiency by Using Multi-Mode Coding

In practical video communication systems, the traditional inter-frame coding mode can be replaced by the TRIRF-based inter-frame mode to enhance error resilience while maintaining coding efficiency for slow-motion video transmission over short-delay channels. However, for video communications over long-delay channels, the correlation between the TRIRF-frame and the current frame decreases considerably because of the large temporal gap between them, and we may have a large loss of compression efficiency compared to conventional inter-frame coding.

With the feedback channel available, which is a basic assumption of TRIRF-frame coding, the error propagation in image sequence decoding at the receiver can be reconstructed at the transmitter. In [3], intra-frame coding is applied to the regions in the current frame affected by errors in the previous frame that have propagated from the frames before it. The erroneous regions in the previous frame are detected using a dependency table which specifies the temporal dependency among the blocks in consecutive frames. If the motion-compensated prediction of a block in the current frame, which is a displaced block in the previous frame, is severely damaged, the block is intra-frame coded. In this way, the propagation effect of the errors in frame $t - n_{rd}$ is eliminated in frame $t$, where $n_{rd}$ is defined in the previous section. Note that if $n_{rd} > 2$, there is still error propagation in $n_{rd} - 1$ frames.

We use a similar scheme for video transmission over long-delay and time-varying channels, such as GEO-satellite channels. However, while [3] switches to intra-frame coding for damaged areas, we propose to update the TRIRF-frame and maintain inter-frame coding. Fig. 3 shows how we select between conventional inter-frame coding and TRIRF-frame coding. When the feedback information about frame $t - n_{rd}$ arrives at the transmitter, the affected areas in frame $t-1$ are determined based on the motion information. The unaffected areas in frame $t-1$ are assumed error-free, and conventional inter-frame motion estimation and compensation can be performed on these areas. (Of course, the transmission errors in frame $t - n_{rd} + 1$ could propagate to these regions, but we cannot know this at this moment.) TRIRF-frame coding, due to its lower coding efficiency, is activated only when the conventional inter-frame motion estimation involves the erroneous areas.

Figure 3: Hybrid inter-frame/TRIRF-frame motion estimation and compensation for frame $t$ based on error propagation reconstruction. (Diagram labels: transmission error; error propagation reconstruction based on motion vectors; erroneous region due to error propagation; unaffected areas; updating; motion estimation and compensation for the affected areas on the TRIRF-frame; frames $t - n_{rd}$, $t - n_{rd} + 1$, $t-1$ and $t$.)
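The mode selection described in this section can be sketched at block level as follows. This is an illustrative simplification, not the authors' codec: each frame's motion information is assumed to be summarized as a dependency table mapping a block to the set of previous-frame blocks its prediction overlaps, and the names `propagate_errors` and `select_modes` are hypothetical.

```python
def propagate_errors(corrupted, dependency_tables):
    """Trace corrupted blocks of frame t - n_rd forward to frame t - 1.

    corrupted: set of block ids reported as corrupted in frame t - n_rd.
    dependency_tables: one dict per later frame (t - n_rd + 1, ..., t - 1),
        mapping each block id to the set of blocks in the preceding frame
        that its motion-compensated prediction overlaps. A block that is not
        predicted from the preceding frame maps to an empty set.
    """
    affected = set(corrupted)
    for table in dependency_tables:
        affected = {blk for blk, refs in table.items() if refs & affected}
    return affected


def select_modes(candidate_refs, affected):
    """Choose 'TRIRF' for blocks whose inter prediction touches affected areas."""
    return {blk: ("TRIRF" if refs & affected else "INTER")
            for blk, refs in candidate_refs.items()}


# Toy example with four blocks and n_rd = 3: block 1 of frame t-3 is corrupted.
tables = [
    {0: {0}, 1: {1}, 2: {1, 2}, 3: {3}},    # frame t-2
    {0: {0}, 1: set(), 2: {2}, 3: {2, 3}},  # frame t-1 (block 1 TRIRF-coded)
]
affected = propagate_errors({1}, tables)
modes = select_modes({0: {0, 1}, 1: {1}, 2: {2}, 3: {3}}, affected)
print(sorted(affected), modes)
```

In this example the error moves from block 1 to blocks 2 and 3 of frame $t-1$, so only those two blocks of frame $t$ fall back to TRIRF-frame prediction while the others keep conventional inter-frame prediction. Errors in frames later than $t - n_{rd}$ are not yet known to the transmitter and are therefore not modeled.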

4 Simulations

The main purpose of our simulations is to confirm the claims in Section 2 that the proposed TRIRF-frame coding prevents spatial-temporal error propagation as well as intra-frame coding does, but outperforms intra-frame coding in compression efficiency. We also demonstrate the performance of the source coding scheme presented in Section 3 under a variety of conditions.

We assume that the combination of channel coding, channel and channel decoding can be modeled by a memoryless binary symmetric channel (BSC) with a given bit error rate (BER). The channel is described by $y_n = x_n \oplus e_n$, $n = 1, 2, \ldots$, where $x_n$ denotes the output bit stream from the source encoder, $e_n$ denotes the error process, and $y_n$ denotes the input bit stream to the source decoder.

Two video codecs based on the two schemes described in Section 3 were implemented in software. The basic coding elements (motion estimation and compensation, transform coding, quantization and entropy coding) are similar to those of the video coding standard H.263 [2]. One codec, based on [3], is denoted by Inter/Intra in the simulation results. The other codec, based on the proposed TRIRF-frame coding, is denoted by Inter/TRIRF. Synchronization words are inserted at the beginning of each slice (one block row) to enable the decoder to resume synchronization. For error concealment, both decoders discard the damaged slices and replace them with the corresponding data from the previously reconstructed frame.
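The channel model $y_n = x_n \oplus e_n$ can be sketched in a few lines of Python. This is only an illustration of a memoryless BSC, not the simulation code used for the results; the function name `bsc` and the fixed seed are assumptions made for the example.

```python
import random


def bsc(bits, ber, rng):
    """Pass bits through a memoryless binary symmetric channel.

    Each error bit e_n is 1 with probability `ber`, independently of the
    others, and the output is y_n = x_n XOR e_n.
    """
    errors = [1 if rng.random() < ber else 0 for _ in bits]
    received = [x ^ e for x, e in zip(bits, errors)]
    return received, errors


rng = random.Random(0)  # fixed seed so that a run is repeatable
x = [rng.getrandbits(1) for _ in range(100_000)]
y, e = bsc(x, ber=1e-3, rng=rng)
flipped = sum(e)
print(flipped, flipped / len(x))  # empirical BER should be close to 1e-3
```

Running the channel repeatedly with different seeds yields different bit error sequences for the same encoded stream, which is the role of the Monte-Carlo runs in the experiments.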

The receiver sends back the slice identification numbers of the corrupted regions as the feedback information. Standard test sequences in CCIR 601 QCIF format ($176 \times 144$ resolution) were tested. The original frame rate is 30 frames/sec. The sequences were coded at a frame rate of 10 frames/sec, which means that one out of every three frames in the original sequence was encoded. For each test sequence, 25 Monte-Carlo simulations were run with different bit error sequences. We assume a round-trip frame delay of $n_{rd} = 3$ (encoded frames). Table 1 shows the performance comparison averaged over 25 runs of the entire sequence for a BER of $10^{-3}$.

Table 1: PSNR and bitrate comparisons averaged over 25 runs of the entire sequence. BER = $10^{-3}$.

Sequence     Lum-PSNR (dB)               Bitrate (kbps)
             Inter/TRIRF   Inter/Intra   Inter/TRIRF   Inter/Intra
Carphone     27.8          27.3          63.0          94.0
Foreman      24.3          23.6          84.7          115.7
Miss-Amer    35.3          35.4          19.5          31.2
Mthr-Dotr    31.5          31.4          29.1          51.8
Salesman     30.7          30.7          25.6          58.9
Suzie        30.0          30.0          42.0          52.9

Fig. 4 shows bitrate comparisons for the Mother and Daughter sequence for the same reconstruction quality (PSNR). Fig. 5 shows bitrate comparisons for the Carphone sequence for the same reconstruction quality (PSNR). From Table 1 and Figs. 4 and 5, Inter/TRIRF coding outperforms Inter/Intra coding: it requires a lower bitrate at comparable PSNR for every test sequence in Table 1, which confirms what we predicted in the previous sections.

Figure 4: Average simulation results for Mother and Daughter sequence (BER = $10^{-3}$; PSNR = 31.4 dB). Bit rate (kbits/sec) versus frame number for Inter/TRIRF and Inter/Intra.

Figure 5: Average simulation results for Carphone sequence (BER = $10^{-3}$; PSNR = 27.8 dB). Bit rate (kbits/sec) versus frame number for Inter/TRIRF and Inter/Intra.

5 Conclusions

In this paper, we present a novel coding mode for video transmission, called TRIRF-frame coding. Assuming that a feedback channel exists, TRIRF-frame coding constructs a new type of reference frame from the correctly received data, which is identical at the receiver and the transmitter. Motion estimation and compensation are based on the TRIRF-frame. We compare the compression and distortion recovery performance of TRIRF-frame based coding to that of the scheme proposed in [3]. Simulations show that TRIRF-based inter-frame coding can prevent error propagation as effectively as intra-coding while improving compression efficiency.

References

[1] Motion Picture Expert Group (JTC1/SC29/WG11) and Experts Group on ATM Video Coding (ITU-T SG15), "Generic coding of moving pictures and associated audio MPEG-2", Draft International Standard 13818, ISO/IEC, Nov. 1994.

[2] Draft ITU-T Recommendation H.263, "Video coding for low bitrate communication", Draft, Dec. 1995.

[3] Eckehard Steinbach, Niko Färber, and Bernd Girod, "Standard Compatible Extension of H.263 for Robust Video Transmission in Mobile Environments", IEEE Trans. Circuits and Systems for Video Technology, Vol. 7, No. 6, pp. 872-881, Dec. 1997.

[4] S. Lin, D.J. Costello, and M.J. Miller, "Automatic repeat error control schemes", IEEE Commun. Mag., Vol. 22, pp. 5-17, 1984.