OPTIMIZED SOURCE AND CHANNEL CODING FOR VIDEO TRANSMISSION OVER ADSL
Nicola Franchi, Marco Fumagalli and Rosa Lancini
CEFRIEL – Politecnico di Milano, via R. Fucini, 2 – 20135 Milano, Italy. E-mail:
[email protected]
ABSTRACT In this paper we tackle the problem of video transmission over an ADSL channel. Because of the delay constraint on the transmission, Shannon's theorem is not strictly applicable, and it is challenging to evaluate the contribution of joint source/channel coding. The ADSL system is standardized to work at a Bit Error Rate (BER) of 10^-7. In this work we relax this constraint in order to understand whether it (suitable for data transmission) is also optimal for video transmission. The aim of this work is twofold: first, to jointly optimize source and channel coding in order to find the best working point for the overall system; second, to compare the single and multiple description video coding approaches in this scenario.
1. INTRODUCTION In video transmission over a noisy channel, both source and channel coding are used. Shannon's Separation Principle states that these two parts can be designed independently without loss in performance [5]. However, this theorem rests on two assumptions: an infinite block length and perfect knowledge of the channel behavior. Both assumptions are difficult to satisfy for source data with real-time constraints (e.g., a video signal) and for time-variant channels. In this paper we tackle the problem of video transmission over an ADSL channel. The transmission has delay constraints (even though not very tight), which limit us to codes of moderate block length. Under this limitation Shannon's theorem is not strictly applicable and there is room for joint source/channel coding. Moreover, a video signal has a certain implicit robustness against data loss, especially if multiple description (MD) coding is applied [1], [3], [4]. The ADSL system is standardized to work at a BER of 10^-7. In this work we relax this constraint, allowing the system to work at any BER between 10^-7 and 10^-2 while guaranteeing performance equal to or better than that obtained at the standard operating point. This amounts to asking
0-7803-7965-9/03/$17.00 ©2003 IEEE
whether the standard BER constraint (suitable for data transmission) is also optimal for video transmission. The aim of this work is twofold: first, to jointly optimize source and channel coding in order to find the best working point for the overall system; second, to compare the performance of the single description (SD) and multiple description (MD) video coding approaches. This paper is organized as follows. In Section 2 we analyze each part of the ADSL video transmission system. Simulation results and concluding remarks are given in Sections 3 and 4, respectively.
2. ANALYSIS OF THE SYSTEM To formalize the problem we need to model the three main parts of the system: the transmission channel over which the video signal is delivered, the channel coding used to make the transmission reliable, and the video source coding. According to the ADSL standard, the available downstream bandwidth (with echo cancellation) extends from 12.938 kHz up to 1099.6875 kHz. The bandwidth is divided into subbands (4.3125 kHz each, with a useful bandwidth of 4.044 kHz) according to Discrete MultiTone (DMT) modulation. If S is the received power and Q the transmitted one, H is defined as the Transfer Function (TF) of the transmission channel:
$$S(f) = Q(f) \cdot |H(f)|^2 \tag{1}$$
H depends on both the working frequency f and the distance between the sender and the receiver, i.e., the distance between the Central Office (CO) and the Subscriber or Remote Terminal (RT). H can be expressed as in (2), where d is the distance in miles and k_i (i = 1, 2, 3) are parameters whose values, listed in Table 1, depend on the diameter of the wire (gauge).
$$H(d, f) = e^{-d\,(k_1 \sqrt{f} + k_2 f)} \, e^{-j d k_3 f} \tag{2}$$
The transmission channel is afflicted by noise. In particular, the ADSL Working Group points out three different causes that generate noise: crosstalk, background and impulse noise.
ICME 2003
Table 1. Values of gauge parameters
Gauge           22       24       26
k1 (×10^-3)     3.0      3.8      4.8
k2 (×10^-8)     0.035    -0.541   -1.709
k3 (×10^-5)     4.865    4.883    4.907
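As an illustration of how (1), (2) and Table 1 fit together, the following minimal Python sketch evaluates the channel power gain |H(d, f)|^2 and the received power spectral density. It assumes that f is expressed in Hz and d in miles, and that the square-root form of (2) applies; the function names and the default gauge of 26 are illustrative choices, not taken from the paper.

```python
import math

# Gauge parameters (k1, k2, k3) from Table 1, with the powers of ten
# from the table header already applied.
GAUGE_PARAMS = {
    22: (3.0e-3, 0.035e-8, 4.865e-5),
    24: (3.8e-3, -0.541e-8, 4.883e-5),
    26: (4.8e-3, -1.709e-8, 4.907e-5),
}


def channel_gain(d_miles, f_hz, gauge=26):
    """Return |H(d, f)|^2 according to (2).

    The phase factor exp(-j*d*k3*f) has unit magnitude, so only k1 and k2
    contribute to the power gain. Units (miles, Hz) are an assumption.
    """
    k1, k2, _k3 = GAUGE_PARAMS[gauge]
    return math.exp(-2.0 * d_miles * (k1 * math.sqrt(f_hz) + k2 * f_hz))


def received_psd_dbm_hz(q_dbm_hz, d_miles, f_hz, gauge=26):
    """Apply (1) in the dB domain: S = Q * |H|^2."""
    return q_dbm_hz + 10.0 * math.log10(channel_gain(d_miles, f_hz, gauge))


if __name__ == "__main__":
    # Transmitted PSD of -40 dBm/Hz over 2 miles, as in the simulations.
    for f in (100e3, 500e3, 1e6):
        s = received_psd_dbm_hz(-40.0, 2.0, f)
        print(f"f = {f / 1e3:6.0f} kHz  ->  S = {s:7.2f} dBm/Hz")
```

Working in dB keeps the attenuation easy to compare against the noise levels quoted in dBm/Hz later in this section.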
Crosstalk noise is due to the electromagnetic coupling between adjacent wires: in Near End CrossTalk (NEXT), different sources at the user side disturb the transmission, while in Far End CrossTalk (FEXT) the channel is disturbed by the simultaneous transmissions of other users. The power spectral densities of the NEXT and FEXT noise are given in (3) and (4), where N is the number of interfering sources, f the frequency in Hz, d the distance in ft and k = 8·10^-20.
$$N_{NEXT} = \left(\frac{N}{49}\right)^{0.6} \cdot \frac{1}{1.134 \cdot 10^{13}} \cdot f^{3/2} \tag{3}$$
$$N_{FEXT} = \left(\frac{N}{49}\right)^{0.6} \cdot k \cdot d \cdot f^{2} \cdot |H(f)|^{2} \tag{4}$$
The background noise is modeled as white noise with an assumed level of about -140 dBm/Hz. Finally, electronic equipment introduces impulse noise, which is often neglected in numerical simulations. Let us define the channel capacity (expressed in bit/s) given the signal-to-noise ratio (SNR). According to Shannon's theory, and in order to achieve a certain BER, the maximum channel capacity (C_QAM) in the presence of a given SNR S/N is expressed in (5) for QAM modulation. B is the physical bandwidth and the parameter ∆ is related to the BER according to (6).
$$C_{QAM} = B \cdot \log_2\left(1 + \frac{3}{\Delta^2} \cdot \frac{S}{N}\right) \tag{5}$$
$$BER = \frac{2}{\sqrt{2\pi}} \int_{\Delta}^{\infty} e^{-x^2/2}\,dx \tag{6}$$
In order to model the state of the channel as realistically as possible, the ADSL Noise Environment is applied. It consists of two alternative scenarios: the first (scenario A) proposes 24 wires with NEXT disturbers, 24 with FEXT disturbers and the addition of a background noise of -140 dBm/Hz; the second (scenario B) proposes 10 wires with NEXT disturbers, 10 with FEXT disturbers and the same background noise. For our simulations we use a constant transmitted power of -40 dBm/Hz and a distance of 2 miles. Figure 1 presents the channel capacity in bit/s as a function of the requested BER: the two lines refer to scenarios A and B. As expected, the channel capacity increases when the BER increases, according to (5): for the given noise profile, at a BER of 10^-7 the capacity is about 1.4 Mbit/s, while at BER = 10^-2 it is more than 2.1 Mbit/s for scenario A. The challenge is to guarantee, by introducing a suitable channel coding and/or a different source coding, a video quality at least as good as in the standard environment. Let us give the equations of the channel coding used in this work. We consider BCH block codes. Given an integer m, (7) gives the relationship between n (block length), k (number of information bits) and t (correcting power, i.e., the number of correctable bit errors per block).
$$n = 2^m - 1, \qquad k = 2^m - 1 - m \cdot t \tag{7}$$
We consider that the source data is transmitted in packet format. This assumption could simulate IP transmission directly over an ADSL channel or, more realistically, ATM packetization under the IP level. With D denoting the size of a packet, the Packet Loss Rate (PLR) is expressed in (8), where n and t have the same meaning as in (7).
$$PLR = 1 - \left[\sum_{x=0}^{t} \binom{n}{x} (1 - BER)^{n-x} \, BER^{x}\right]^{D/n} \tag{8}$$
$$PLR = 1 - [1 - BER]^{D} \tag{9}$$
$$C_U = C_{QAM} \cdot \frac{k}{n} \tag{10}$$
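The relationship between the target BER and the capacity in (5) and (6) can be illustrated with a short Python sketch. It inverts (6) using the Gaussian quantile function, since BER = 2·Q(∆) with Q the Gaussian tail probability, and then evaluates (5) for a single subband. The function names and the 30 dB SNR used in the demonstration are hypothetical; the 4044 Hz useful subband bandwidth is the value quoted in the text.

```python
import math
from statistics import NormalDist


def delta_from_ber(ber):
    """Invert (6): BER = 2*Q(delta), where Q is the Gaussian tail function.

    Q(delta) = BER/2 is equivalent to delta = -Phi^{-1}(BER/2), with Phi the
    standard normal cumulative distribution function.
    """
    return -NormalDist().inv_cdf(ber / 2.0)


def capacity_qam(bandwidth_hz, snr_linear, ber):
    """Evaluate (5) for one subband at the requested BER."""
    delta = delta_from_ber(ber)
    return bandwidth_hz * math.log2(1.0 + 3.0 * snr_linear / delta**2)


if __name__ == "__main__":
    snr = 10.0 ** (30.0 / 10.0)  # hypothetical 30 dB SNR in one subband
    for ber in (1e-7, 1e-5, 1e-3, 1e-2):
        delta = delta_from_ber(ber)
        gap_db = 10.0 * math.log10(delta**2 / 3.0)  # SNR penalty inside the log
        rate = capacity_qam(4044.0, snr, ber)
        print(f"BER={ber:g}  delta={delta:.3f}  gap={gap_db:.2f} dB  "
              f"rate={rate:.0f} bit/s")
```

The factor ∆^2/3 acts as an SNR penalty inside the logarithm, which is why relaxing the BER target raises the achievable rate.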
If we consider in (8) the case t = 0, which means no channel protection, it is easy to obtain expression (9); the case of no channel coding is thus a special case of the more general expression (8). The insertion of channel coding has two opposite effects: on the one hand it reduces the loss probability, permitting a more reliable transmission; on the other hand it increases the required bandwidth. Formula (10) gives the relationship between the available channel capacity (C_QAM) and the bit-rate useful for the source data (C_U). One of the tasks of this paper is to find the trade-off that best balances these two opposing trends.
In this work we consider two different source-coding approaches: single description (SD) video coding (e.g., MPEG and H.26x) and multiple description (MD) video coding [3], [4]. We propose two simple empirical models to describe the performance (video quality) of these coders in an error-prone environment. In particular, the models provide the Peak-SNR (PSNR) of the decoded video stream given two parameters: the PLR on the transmission channel and the PSNR value in an error-free environment (PSNR_SD0), provided by the classical rate-distortion curve modeled as in [2]. We stress that the proposed models are empirical and are not derived analytically. We focus on the input-output behavior of the video codec and favor simplicity and usability over a complete theoretical description.
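The trade-off described by (7) to (10) can be sketched in a few lines of Python. The sketch assumes that the bracket in (8) is raised to the power D/n, which is the exponent that makes (8) reduce to (9) when t = 0; the packet size of 1000 bits and the capacity value are illustrative placeholders, and all function names are hypothetical.

```python
import math


def bch_params(m, t):
    """Return (n, k) of a BCH code according to (7)."""
    n = 2**m - 1
    k = n - m * t
    if k <= 0:
        raise ValueError("correcting power t is too large for this m")
    return n, k


def block_failure_prob(ber, n, t):
    """Probability that a block of n bits contains more than t bit errors."""
    return sum(
        math.comb(n, x) * ber**x * (1.0 - ber) ** (n - x)
        for x in range(t + 1, n + 1)
    )


def packet_loss_rate(ber, n, t, packet_bits):
    """Evaluate (8), assuming the exponent D/n on the bracketed sum."""
    p_fail = block_failure_prob(ber, n, t)
    if p_fail >= 1.0:
        return 1.0
    # 1 - (1 - p_fail)^(D/n), written with log1p/expm1 so that very small
    # loss rates are not rounded to zero.
    return -math.expm1((packet_bits / n) * math.log1p(-p_fail))


def useful_rate(c_qam, n, k):
    """Evaluate (10): the share of the channel rate left for source data."""
    return c_qam * k / n


if __name__ == "__main__":
    ber, c_qam, packet_bits = 1e-3, 2.0e6, 1000  # placeholder values
    for t in range(0, 9, 2):
        n, k = bch_params(8, t)
        plr = packet_loss_rate(ber, n, t, packet_bits)
        print(f"t={t}  k={k}  C_U={useful_rate(c_qam, n, k):.0f} bit/s  "
              f"PLR={plr:.3e}")
```

Sweeping t in this way traces one curve in the (PLR, C_U) plane for a fixed BER: each additional unit of correcting power lowers the PLR while giving up m bits of payload per block.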
Figure 1. Channel capacity (bit/s) versus BER in an ADSL channel.
Figure 2. SD coder: H.263 performance for the sequence ‘Foreman’ at 64 kbit/s.
Figure 3. SD coder: H.263 performance for the sequence ‘Foreman’ at 144 kbit/s.
Figure 4. MD coder: MDTC performance for the sequence ‘Foreman’ at 64 kbit/s.
Figure 5. MD coder: MDTC performance for the sequence ‘Foreman’ at 144 kbit/s.
Figure 6. C_U versus PLR varying the correcting power t.
The video quality of an SD coder decreases hyperbolically as the PLR rises, as in (11). The parameters A and α are measured empirically to fit the experimental results (the fitting minimizes the mean squared error between the model and the measured points). Figure 2 and Figure 3 show the experimental results for the H.263 coder for the sequence "Foreman" at 64 and 144 kbit/s respectively, together with the model with A = 20 dB and α = 0.6. The fit appears to be quite accurate. An MD video coder (in our case we used the results reported in [4]) instead shows a stronger robustness against losses, due to the insertion of a certain amount of redundancy. The video quality decreases logarithmically as in (12), where the first term is the gap between the SD and MD coders in an error-free environment caused by the redundancy insertion. The parameter β, in the form β = 1 + C · (PSNR_SD − D), compensates for the slope increasing as the bit-rate rises.
$$PSNR_{SD} = A + \frac{PSNR_{SD0} - A}{1 + PLR^{\alpha}} \tag{11}$$
PSNRMD = PSNRSD0 2 − B β − [log 2 (1 + PLR )] PSNRSD0 (12)
Figure 4 and Figure 5 show the experimental results for the MDTC coder [4] for the "Foreman" sequence and the model with B = 100 dB, D = 20 dB, and C = 0.055. The fit is not as accurate as in the SD case, but it is more conservative. We tested the above models on several sequences and bit-rates and found consistent results.
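The SD model in (11) is simple enough to evaluate directly. The Python sketch below assumes that the denominator of (11) reads 1 + PLR^α and uses the fitted values A = 20 dB and α = 0.6 quoted above; the error-free quality of 32 dB is a made-up input for illustration, and the function name is hypothetical.

```python
def psnr_sd(psnr_sd0, plr, a=20.0, alpha=0.6):
    """Empirical SD model (11), read as A + (PSNR_SD0 - A) / (1 + PLR^alpha).

    psnr_sd0 is the error-free PSNR in dB and plr the packet loss rate
    expressed as a fraction in [0, 1].
    """
    return a + (psnr_sd0 - a) / (1.0 + plr**alpha)


if __name__ == "__main__":
    psnr0 = 32.0  # hypothetical error-free quality, not a value from the paper
    for plr in (0.0, 1e-7, 1e-5, 1e-3, 1e-2, 1e-1):
        print(f"PLR={plr:g}  PSNR_SD={psnr_sd(psnr0, plr):.2f} dB")
```

Under this reading the model returns PSNR_SD0 when there are no losses and moves halfway toward the floor A when PLR = 1.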
3. SIMULATION RESULTS As we saw in the previous section, the decoded video quality depends on a large number of elements. This section proposes a simple framework to maximize the PSNR at the decoder side by optimizing the main elements of the ADSL video transmission system. From the BER/channel capacity behavior of the transmission channel (given, e.g., in Figure 1), each value of BER corresponds to an available bandwidth. Figure 6 shows, for several values of BER, the relationship between the PLR and the useful bit-rate C_U as the correcting power t of the channel code varies (the figure is plotted using expressions (7)-(10)). Each point of these curves is a potential working point of the system for a given physical behavior of the channel. A generic point P is better than another if and only if it provides a higher PSNR. For each point P (characterized by its coordinates (PLR_P, C_UP) on the plane of Figure 6), we measure the PSNR_SD0 identified by the C_UP value
Figure 7. SD coder. PSNR performance vs. correcting power.
Figure 8. MD coder. PSNR performance vs. correcting power.
on the rate-distortion curve in the error-free environment: this value is the video quality in the case of no losses. In order to evaluate the impact of packet loss, we use the proposed model of the SD or MD video coder. As a result, Figure 7 presents the PSNR of the decoded video using the SD coder model in (11), evaluated at the point PLR_P. Similarly, Figure 8 shows the results obtained with the MD coder modeled in (12), evaluated at the same point PLR_P. The best working point for each coder is the one with the highest PSNR. As can be seen, in the best case the SD coder outperforms the MD coder by about 0.5 dB. Hence the first result is that, even in a situation that is not ideal for Shannon's theorem, the optimum choice for ADSL video transmission is the SD video coder, which means the independent design of source and channel coding. This result is not surprising, because ADSL is not a true packet channel (one where losses are due to drops at routers), which is where theory predicts better performance for MD coding. The second result is that a BER of 10^-7 does not seem to be the best working point for the ADSL transmission channel. Figure 7 shows that working at a BER of 10^-3 to 10^-5 yields a gain in video quality: the highest PSNR obtained at a BER of 10^-7 is lower than those obtained at 10^-3 to 10^-5 with the appropriate channel coding. We believe this result is due to the behavior of the video signal in an error-prone environment; in fact, there is almost no difference in video performance between working at PLR = 10^-7 and at PLR = 10^-3. We made some assumptions for these simulations: the block length n of the channel code is equal to 255. Analogous tests with different values of n show that, in accordance with Shannon's theory, increasing the block length improves the overall performance; n = 255 is a reasonable compromise between performance and delay. The packet dimension D is such that each frame is split into an equal number G of packets (in the figures G is equal to 16).
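The search over working points described in this section can be sketched as a small grid search in Python. Everything numerical in the demonstration is a placeholder: `toy_capacity` interpolates between the two approximate scenario A values quoted in Section 2, `toy_rd_curve` is a made-up monotone rate-distortion curve (not the model of [2]), the packet size is arbitrary, and (8) and (11) are used under the readings D/n and 1 + PLR^α. Only the SD model is included, and the function names are hypothetical.

```python
import math


def packet_loss_rate(ber, n, t, packet_bits):
    """PLR from (8), assuming the exponent D/n on the bracketed sum."""
    p_fail = sum(
        math.comb(n, x) * ber**x * (1.0 - ber) ** (n - x)
        for x in range(t + 1, n + 1)
    )
    if p_fail >= 1.0:
        return 1.0
    return -math.expm1((packet_bits / n) * math.log1p(-p_fail))


def psnr_sd(psnr_sd0, plr, a=20.0, alpha=0.6):
    """SD model (11), read as A + (PSNR_SD0 - A) / (1 + PLR^alpha)."""
    return a + (psnr_sd0 - a) / (1.0 + plr**alpha)


def best_working_point(bers, capacity, rd_curve, m=8, packet_bits=1000,
                       max_t=20):
    """Scan (BER, t) pairs and return the tuple with the highest PSNR.

    capacity(ber) -> C_QAM in bit/s and rd_curve(rate) -> error-free PSNR
    in dB are supplied by the caller. Returns (psnr, ber, t, c_u, plr).
    """
    n = 2**m - 1
    best = None
    for ber in bers:
        c_qam = capacity(ber)
        for t in range(max_t + 1):
            k = n - m * t                      # BCH payload bits, from (7)
            if k <= 0:
                break
            c_u = c_qam * k / n                # useful rate, from (10)
            plr = packet_loss_rate(ber, n, t, packet_bits)
            psnr = psnr_sd(rd_curve(c_u), plr)
            if best is None or psnr > best[0]:
                best = (psnr, ber, t, c_u, plr)
    return best


def toy_capacity(ber):
    """Placeholder: log-linear interpolation between roughly 1.4 Mbit/s at
    BER = 1e-7 and 2.1 Mbit/s at BER = 1e-2 (approximate values from the text)."""
    frac = (math.log10(ber) + 7.0) / 5.0
    return (1.4 + 0.7 * frac) * 1e6


def toy_rd_curve(rate_bps):
    """Placeholder rate-distortion curve, NOT the model of [2]."""
    return 30.0 + 3.0 * math.log2(rate_bps / 1e6)


if __name__ == "__main__":
    psnr, ber, t, c_u, plr = best_working_point(
        [1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2], toy_capacity, toy_rd_curve)
    print(f"best: BER={ber:g}, t={t}, C_U={c_u:.0f} bit/s, "
          f"PLR={plr:.2e}, PSNR={psnr:.2f} dB")
```

Because the capacity and rate-distortion functions are passed in as callables, the same loop can be rerun with measured curves, or with an MD quality model in place of `psnr_sd`.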
4. CONCLUSION In this paper we tackled the problem of video transmission over an ADSL channel. We relaxed the constraint of working at BER = 10^-7 in order to find the optimal working point for video transmission. Because of the delay constraint, Shannon's theorem is not strictly applicable, and we optimized the source and channel coding while comparing the single and multiple description video coding approaches. We found two main results. First, even in a situation that is not ideal for Shannon's theorem, the optimum choice for ADSL video transmission was the SD video coder rather than the MD one. Second, a BER of 10^-7 is not the best working point for the ADSL transmission channel: working at a BER of 10^-3 to 10^-5, it is possible to gain in video quality. Because of space constraints, we presented only a restricted part of the experimental results achieved. Further and more accurate investigations on video coder modeling and packetization strategy are in progress.
5. REFERENCES
[1] H. Coward, R. Knopp and S.D. Servetto, "On the Performance of a Natural Class of Joint Source-Channel Codes Based on Multiple Descriptions," EPFL Technical Report, Draft, August 20, 2001.
[2] K. Stuhlmuller, N. Farber, M. Link and B. Girod, "Analysis of Video Transmission over Lossy Channels," IEEE Journal on Selected Areas in Communications, vol. 18, no. 6, June 2000.
[3] M. Caramma, M. Fumagalli and R. Lancini, "Polyphase Down-Sampling Multiple Description Coding for IP Transmission," in SPIE 2001 – Visual Communications and Image Processing, San Jose, CA, USA.
[4] A. Reibman, H. Jafarkhani, Y. Wang, M. Orchard and R. Puri, "Multiple-Description Video Coding Using Motion-Compensated Temporal Prediction," IEEE Trans. on CSVT, vol. 12, pp. 193–204, Mar. 2002.
[5] T.M. Cover and J.A. Thomas, "Elements of Information Theory," New York, Wiley, 1991.