
Robust Subband Video Coding with Leaky Prediction

Arild Fuldseth, Tor A. Ramstad
Department of Telecommunications, Norwegian University of Science and Technology (NTNU)
N-7034 Trondheim, Norway
Telephone: +47 73 59 26 50. Fax: +47 73 59 26 40.
email: ffuldseth,[email protected].

ABSTRACT

The use of leaky prediction in a video transmission system designed for noisy channels is investigated. For the proposed system, graceful degradation is achieved by finding good index maps between the quantized source coder parameters and the amplitude levels of a QAM signal constellation. The contribution of this paper, however, is to investigate the use of leaky prediction to further increase the robustness to channel noise. It is shown that by using a temporal prediction coefficient slightly less than one, large improvements can be achieved with only a small degradation for the noiseless case. The performance of the proposed system is comparable to a reference system based on the H.263 video coder for high CSNR values, and degrades far more gracefully for low CSNR values.

1. INTRODUCTION

Recently, there has been much interest in combined source coding and channel coding/modulation for transmission over noisy channels. In [1], the index assignment problem was addressed by using simulated annealing and assuming a binary symmetric channel (BSC). The idea behind this method was to assign binary codewords to the quantizer indices so as to minimize the effects of channel errors. In [2, 3, 4], the simulated annealing method was modified to optimize the index assignment (index map) with multilevel modulation instead of a BSC, and with the transmitted power as an additional constraint. In particular, the method was used to design a robust system for still image transmission by optimizing the index maps from the source coder parameter space to the amplitude levels of a quadrature amplitude modulation (QAM) signal constellation (e.g. 64-QAM). Furthermore, in [5] the techniques in [2, 3, 4] were extended to video coding.

In this work, we address the problem of channel error propagation due to the temporal prediction feedback loop of the video coder structure. In particular, the robustness of the video transmission system of [5] is increased further by using leaky interframe prediction. This reduces the propagation of channel errors to subsequent frames at the cost of less prediction gain. Thus, for a given channel signal-to-noise ratio (CSNR), there is an optimal value of the interframe prediction coefficient $\alpha$ corresponding to the best possible trade-off between the amount of error propagation and the prediction gain. In this paper, approximate optimal values of $\alpha$ for various CSNR values are found experimentally. Furthermore, for the cases where the CSNR is not known at the encoder side, the effects of channel mismatch are addressed. Finally, the proposed system is compared to a traditional video communication system based on the H.263 video coder.
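The error propagation side of this trade-off can be illustrated with a first-order recursive decoder. The following is only a minimal sketch under simplifying assumptions, not the system described in this paper: it uses a scalar decoder loop without motion compensation or subbands, and the names `decode`, `alpha`, and `impulse` are hypothetical.

```python
def decode(residuals, alpha):
    """Leaky first-order prediction decoder: x[n] = alpha * x[n-1] + e[n]."""
    out, prev = [], 0.0
    for e in residuals:
        prev = alpha * prev + e
        out.append(prev)
    return out

# A single channel error of unit size in frame 0, and no further errors.
impulse = [1.0] + [0.0] * 100

for alpha in (1.0, 0.99, 0.9):
    leftover = decode(impulse, alpha)[-1]
    print(f"alpha={alpha}: error remaining after 100 frames = {leftover:.4f}")
```

In this toy model the error left after k frames is alpha to the power k, so any coefficient below one makes a channel error fade, whereas a coefficient of exactly one keeps it in the loop indefinitely.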

2. SYSTEM DESCRIPTIONS

2.1. Reference System

The reference system, which is based on the H.263 video coder and QAM, is illustrated in Figure 1. The H.263 coder was implemented according to [6] without any of the optional coding techniques, and without the chrominance information. In addition to the specifications given in [6], resynchronization based on the group of blocks (GOB) header as well as error detection and error concealment techniques are used in the decoder. Error detection is based on detection of illegal codewords, while error concealment is performed by repeating information from the previous GOB, or from the previous frame, whenever an error is detected. Furthermore, in order to reduce the propagation of errors through subsequent frames, $N_I$ intrablocks are introduced systematically in each frame at the encoder side. Note that the effects of introducing intrablocks systematically are similar to the effects of leaky prediction, since the error propagation for noisy channels is reduced at the cost of less coding efficiency for a noiseless channel. The output bit stream from the H.263 coder is mapped to a 16-QAM signal constellation by traditional Gray coding. Note that since the H.263 coder uses variable length coding, it is not possible to take the significance or the meaning of the individual bits into account when choosing the mapping.

Figure 1. Reference system (transmitter only). Blocks shown: video frames, H.263 encoder, bit stream, Gray coding, QAM symbols.

2.2. Proposed System

The proposed system, illustrated in Figure 2, uses subband decomposition and temporal prediction with motion compensation as in H.263. The subband signals are coded by subdividing each subband into blocks of 4 × 4 samples. Each block is classified into one of four classes according to its mean squared value, and the subband samples belonging to classes 1-3 are normalized using the corresponding class mean squared value. Next, the normalized subband samples are quantized using vector quantizers (VQs) of three different rates (measured in bits per source sample), with the highest rate for the class corresponding to the highest mean squared value. The blocks having the smallest mean squared value (class 0) are neither quantized nor transmitted.

Figure 2. Proposed system (transmitter only). Blocks shown: analysis filterbank, classification (class 0 and VQ1-VQ3), index maps (MAP, MAP1-MAP3), QAM symbols, synthesis filterbank, motion compensation, motion estimation, frame delay; signals shown: video frames, block classification, quantizer scale factors, motion vectors. The normalization and renormalization operations are assumed to be included in VQ1-VQ3.

Next, the VQ output indices are mapped to the amplitude levels of the QAM channel symbols by the index maps. The index maps were designed to have the following two properties: First, in order to minimize the effects of channel errors, points that are close in the QAM signal constellation (channel space) should correspond to points (codebook vectors) that are close in the source space. Second, in order to maximize the minimum distance, $d_{min}$, of the signal constellation for a fixed transmitted power, the most probable codebook vectors should be mapped to the QAM points having the smallest amplitude. In this work, simulated annealing was used to design the index maps as described in [5].

For the proposed system, the choices of source vector dimension (L), codebook size (N), and channel space dimension (K) for each of the three classes are listed in Table 1. The QAM signal constellations are chosen with an odd number of amplitudes in each dimension. This allows for power efficient transmission by assigning the codebook vector having the highest probability to a channel symbol with zero amplitude. As an example, consider class 2. The subband samples belonging to this class are divided into vectors of length L = 3, and quantized using a VQ codebook of size N = 81. The codebook indices are then mapped to an 81-QAM signal constellation (K = 2) by the corresponding index map.

class   L   N    K
1       2   9    1
2       3   81   2
3       1   15   1

Table 1. Source vector dimension (L), codebook size (N), and channel space dimension (K) for each class. The case K = 1 corresponds to the real or imaginary axis of a QAM constellation.

The side information parameters consisting of the motion vectors, the block classification table, and the quantizer scale factors are particularly sensitive to channel noise. Also, robust mappings are difficult to design in this case, since the neighbor properties for these parameters are not well defined. In order to reduce the error rate for the side information, a sparse subset of an 81-QAM signal constellation is used. This reduces the error rate significantly for a given noise level. In addition, power efficient transmission is achieved by mapping the most probable values of the side information parameters to the QAM symbols having the smallest amplitudes.
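The rates implied by Table 1 and the power-efficient part of the index assignment can be sketched as follows. This is an illustrative sketch only: it implements just the second design property (most probable indices on the lowest-energy points) by simple sorting, and does not reproduce the simulated annealing design of [5] or the neighbor-preserving first property. The function names and the example probabilities are hypothetical.

```python
import math

# (L, N, K) per class, copied from Table 1.
TABLE_1 = {1: (2, 9, 1), 2: (3, 81, 2), 3: (1, 15, 1)}

def rates(L, N, K):
    """Return (bits per source sample, channel dimensions per source sample)."""
    return math.log2(N) / L, K / L

def grid(N, K):
    """Odd-sized integer amplitude grid centred on zero: N points in K dimensions."""
    side = round(N ** (1 / K))
    assert side ** K == N and side % 2 == 1
    levels = range(-(side // 2), side // 2 + 1)
    pts = [()]
    for _ in range(K):
        pts = [p + (a,) for p in pts for a in levels]
    return pts

def power_efficient_map(probs, K):
    """Assign the most probable codebook indices to the lowest-energy points."""
    pts = sorted(grid(len(probs), K), key=lambda p: sum(a * a for a in p))
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    return {idx: pt for idx, pt in zip(order, pts)}

for cls, (L, N, K) in TABLE_1.items():
    bits, dims = rates(L, N, K)
    print(f"class {cls}: {bits:.2f} bits/sample, {dims:.2f} channel dims/sample")

# Hypothetical index probabilities for a 9-level, one-dimensional map (class 1).
example_probs = [0.05, 0.10, 0.40, 0.15, 0.02, 0.08, 0.12, 0.05, 0.03]
example_map = power_efficient_map(example_probs, K=1)
```

With the odd-sized grid, the most probable index (index 2 in the hypothetical example) lands on the zero-amplitude point, which is the property the text attributes to constellations with an odd number of amplitudes per dimension.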

3. LEAKY PREDICTION

As mentioned above, the optimal value of the leaky prediction coefficient $\alpha$ is the result of a trade-off between limiting the channel noise propagation and maximizing the interframe prediction gain. In [7], a similar optimization problem for a DPCM system with scalar quantization was studied. In the case of a 1st order predictor, it was shown that for a temporal correlation coefficient $\rho$, the normalized reconstruction error can be expressed as

$$\epsilon = \left[\epsilon_q + \epsilon_m + \frac{\epsilon_n}{1-\alpha^2}\right]\left[1 + \alpha^2 - 2\alpha\rho\right] \tag{1}$$

where $\epsilon_q$ and $\epsilon_n$ correspond to the quantization and the channel noise respectively, and $\epsilon_m$ is a mutual term. Thus, the optimal value of $\alpha$ must satisfy the following fifth order equation

$$(1-\alpha^2)^2(\alpha-\rho)(\epsilon_q+\epsilon_m) - (\rho\alpha^2 - 2\alpha + \rho)\,\epsilon_n = 0 \tag{2}$$

Assuming that the quantizer as well as the source and channel statistics are known, the optimal value of the prediction coefficient, $\alpha_{opt}$, can be found numerically. A special case is for very noisy channels, i.e. $\epsilon_n \gg \epsilon_q + \epsilon_m$. In this case we have [7]

$$\alpha_{opt} \simeq \frac{1 - \sqrt{1-\rho^2}}{\rho} \tag{3}$$

A similar theoretical analysis can be applied to the proposed system, but some modifications are needed to take the subband decomposition and the classification scheme into account. This will be a topic for further research. In this paper the optimal values of $\alpha$ were estimated experimentally.
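The numerical solution mentioned above can be sketched for the scalar DPCM model of [7]. The sketch assumes the error model of equation (1) with a channel noise term divided by one minus the squared prediction coefficient, treats the sum of the quantization and mutual terms as a constant that does not depend on the prediction coefficient (a simplifying assumption), and uses plain bisection on equation (2). All function and parameter names are hypothetical.

```python
import math

def alpha_noisy_limit(rho):
    """Equation (3): optimum when the channel noise term dominates."""
    return (1.0 - math.sqrt(1.0 - rho * rho)) / rho

def total_error(alpha, rho, e_qm, e_n):
    """Equation (1), with e_qm = eps_q + eps_m treated as a constant."""
    return (e_qm + e_n / (1.0 - alpha ** 2)) * (1.0 + alpha ** 2 - 2.0 * alpha * rho)

def stationarity(alpha, rho, e_qm, e_n):
    """Left-hand side of equation (2)."""
    return ((1.0 - alpha ** 2) ** 2 * (alpha - rho) * e_qm
            - (rho * alpha ** 2 - 2.0 * alpha + rho) * e_n)

def alpha_opt(rho, e_qm, e_n, iters=200):
    """Bisection on equation (2) between the equation (3) limit and rho."""
    lo, hi = alpha_noisy_limit(rho), rho  # left-hand side is <= 0 at lo, >= 0 at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if stationarity(mid, rho, e_qm, e_n) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

rho = 0.99
print("noisy-channel limit:", round(alpha_noisy_limit(rho), 4))
for e_n in (0.001, 0.1, 10.0):  # hypothetical channel noise levels
    print(e_n, round(alpha_opt(rho, 1.0, e_n), 4))
```

The bracket used for the bisection follows from equation (2): its left-hand side is non-positive at the equation (3) value and non-negative at the correlation coefficient itself, so in this model the optimum moves from the correlation coefficient towards the equation (3) limit as the channel noise grows.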

4. EXPERIMENTS

All our experiments were performed by coding 400 luminance frames of the QCIF Foreman sequence, and by simulating transmission over an additive white Gaussian noise (AWGN) channel with maximum likelihood detection on the receiver side. For the reference system, the simulations were conducted by assuming error free transmission of the picture header, ensuring correct resynchronization of the bit pattern at the beginning of each frame. Furthermore, the H.263 coder was implemented without rate control, using the average channel rate as a reference. All our experiments were conducted by averaging over 20 separate simulations of the entire sequence using different random noise sequences.
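The channel part of such a simulation can be sketched as follows. This is a simplified illustration rather than the simulation code used for the experiments: it draws equiprobable symbols from a square 16-QAM grid, and it assumes that the CSNR is the average symbol power divided by the total noise power over both dimensions. The function names, symbol count, and seed are hypothetical.

```python
import math
import random

def qam_points(side):
    """Square QAM constellation with `side` amplitude levels per dimension."""
    levels = [2 * i - (side - 1) for i in range(side)]  # e.g. -3, -1, 1, 3
    return [(a, b) for a in levels for b in levels]

def ml_detect(rx, points):
    """On an AWGN channel, ML detection picks the point nearest to rx."""
    return min(points, key=lambda p: (p[0] - rx[0]) ** 2 + (p[1] - rx[1]) ** 2)

def symbol_error_rate(csnr_db, side=4, n_symbols=20000, seed=0):
    rng = random.Random(seed)
    points = qam_points(side)
    power = sum(a * a + b * b for a, b in points) / len(points)
    # Assumption: CSNR = average symbol power / total noise power (both dims).
    sigma = math.sqrt(power / (10 ** (csnr_db / 10)) / 2)
    errors = 0
    for _ in range(n_symbols):
        tx = rng.choice(points)
        rx = (tx[0] + rng.gauss(0, sigma), tx[1] + rng.gauss(0, sigma))
        errors += ml_detect(rx, points) != tx
    return errors / n_symbols

for csnr in (12, 16, 20, 24):
    print(csnr, "dB:", symbol_error_rate(csnr, n_symbols=5000))
```

Averaging over several noise realizations, as done in the experiments, would correspond to calling the function with different seeds and averaging the results.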

In the first experiment, the reference system was simulated with a quantizer step size of 12 (QP = 6) and with various values of $N_I$ (the number of intra blocks per frame), resulting in various bit rates and channel symbol rates. The average number of bits per pixel (bpp) and QAM symbols per pixel (spp) for each value of $N_I$ are shown in Table 2. In Figure 3, the resulting peak signal-to-noise ratio (PSNR) of the reconstructed sequence is evaluated for various values of $N_I$ and the CSNR. As can be seen from the figure, the robustness to channel noise increases slightly with an increased value of $N_I$.

N_I   bpp     spp
0     0.338   0.085
1     0.348   0.087
3     0.367   0.092
9     0.425   0.106

Table 2. Average bpp and spp for various values of $N_I$.

Next, the proposed system was simulated with various values of the prediction coefficient $\alpha$ at a channel rate of 0.092 spp. The results are illustrated in Figure 4. From the figure, we observe that the robustness to channel noise can be improved significantly by reducing $\alpha$ from 1.0 to e.g. 0.99. On the other hand, the corresponding performance loss for high CSNR values is small. Furthermore, optimal values of $\alpha$ can be estimated as follows. At CSNR values above 21 dB, $\alpha$ should be above 0.99, approaching 1.0 as the CSNR increases. As the CSNR is reduced from 21 dB to 16 dB, $\alpha_{opt}$ decreases to about 0.90. Note that $\alpha_{opt}$ does not approach zero as CSNR decreases to 0. This can be seen from equation 3. Typically, the temporal correlation coefficient $\rho$ could be as high as 0.99 or higher. The resulting value of $\alpha_{opt}$ is on the order of 0.9 (equation 3). Although equation 3 does not apply directly to the problem at hand, it indicates an optimal value of $\alpha$ that is much closer to 1.0 than to 0.0 as CSNR approaches zero.

Finally, the proposed system was compared to the reference system. The reference system was simulated with $N_I = 3$, resulting in a QAM symbol rate of 0.092 spp. The proposed system was simulated at the same spp and with $\alpha = 0.99$. The equivalent bit rate in terms of bpp for the proposed system is higher than for the reference system, since the power efficient index maps allow for larger signal constellations (e.g. 81-QAM vs. 16-QAM). This is necessary in order to compensate for the less efficient fixed length coding strategy of the proposed system. Note, however, that since the performances of the two systems are compared for the same channel symbol rate and CSNR, the equivalent bpp is irrelevant. The results are illustrated in Figure 5. The performances of the two systems are quite close at CSNR values down to about 20 dB. Below this point, the reference system degrades abruptly, while the proposed system degrades gracefully.
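The relation between the bpp and spp columns of Table 2 can be checked with a few lines. The sketch assumes only that each Gray-coded 16-QAM symbol of the reference system carries four bits; the names `TABLE_2` and `symbols_per_pixel` are hypothetical.

```python
import math

BITS_PER_16QAM_SYMBOL = math.log2(16)  # 4 bits per symbol

# (N_I, bpp, spp) rows copied from Table 2.
TABLE_2 = [(0, 0.338, 0.085), (1, 0.348, 0.087), (3, 0.367, 0.092), (9, 0.425, 0.106)]

def symbols_per_pixel(bpp, bits_per_symbol=BITS_PER_16QAM_SYMBOL):
    """Channel symbol rate for a given source bit rate."""
    return bpp / bits_per_symbol

for n_i, bpp, spp in TABLE_2:
    print(f"N_I={n_i}: bpp/4 = {symbols_per_pixel(bpp):.4f}, tabulated spp = {spp}")
```

Each tabulated spp value agrees with bpp divided by four to within rounding, which is why fixing the symbol rate at 0.092 spp corresponds to the reference system row with three intra blocks per frame.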

Figure 3. PSNR (dB) vs. CSNR (dB) for various values of $N_I$: $N_I = 0$ (solid line), $N_I = 1$ (– – –), $N_I = 3$ (–––), $N_I = 9$ (dotted line).

Figure 4. PSNR (dB) vs. CSNR (dB) for various values of the prediction coefficient $\alpha$ (from top to bottom at CSNR = 24 dB: $\alpha$ = 1.0, 0.99, 0.98, 0.95, 0.9, 0.85, 0.80).

5. CONCLUSIONS

For the proposed video transmission system we have demonstrated how the robustness to channel noise can be improved by using leaky interframe prediction matched to the channel statistics. For the cases where the channel statistics are not known on the encoder side, the robustness can be increased significantly at the cost of a small quality reduction for noiseless channels. Finally, the proposed system compares favorably to a reference system based on the H.263 video coder due to graceful degradation for poor channel conditions. The inherent robustness to channel noise makes the proposed system particularly useful for video transmission over time varying wireless channels such as in mobile and broadcasting applications. The ideas behind the proposed system might also be of interest for the functionalities of the new MPEG-4 video coding standard for applications requiring error resilience.

Figure 5. PSNR (dB) vs. CSNR (dB): reference system, $N_I = 3$ (solid line); proposed system, $\alpha = 0.99$ (- - -).

6. ACKNOWLEDGEMENT

The authors want to thank Gisle Bjøntegaard at Telenor Research for suggesting the error detection and concealment techniques for the H.263 video coder, making it a realistic and challenging reference.

REFERENCES

[1] N. Farvardin, "A study of vector quantization for noisy channels," IEEE Trans. Inform. Theory, vol. 36, pp. 799-809, July 1990.

[2] J. M. Lervik, A. Fuldseth, and T. A. Ramstad, "Combined image subband coding and multilevel modulation for communication over power- and bandwidth limited channels," in Proc. Workshop on Visual Signal Processing and Communications, (New Brunswick, NJ, USA), pp. 173-178, IEEE, Sept. 1994.

[3] A. Fuldseth and J. M. Lervik, "Combined source and channel coding for channels with a power constraint and multilevel signaling," in Proc. ITG Conference, (München, Germany), pp. 429-436, ITG, Oct. 1994.

[4] J. M. Lervik, A. Grøvlen, and T. A. Ramstad, "Robust digital signal compression and modulation exploiting the advantages of analog communication," in Proc. IEEE GLOBECOM, (Singapore), pp. 1044-1048, IEEE, Nov. 1995.

[5] A. Fuldseth and T. A. Ramstad, "Combined video coding and multilevel modulation," in Proc. Int. Conf. on Image Processing, (Lausanne, Switzerland), Sept. 1996, to appear.

[6] Telecommunication Standardization Sector, Study Group 15, Working Party 15/1, Expert's Group on Very Low Bitrate Videophone, Video Codec Test Model, TMN5, Jan. 1995. Source: Telenor Research.

[7] K. Y. Chang and R. W. Donaldson, "Analysis, optimization, and sensitivity study of differential PCM systems operating on noisy communication channels," IEEE Trans. Commun., vol. COM-20, pp. 338-350, June 1972.
