PROCEEDINGS OF THE IEEE, VOL. 95, NO. 6, JUNE 2007


Joint Source-Channel Turbo Techniques for Discrete-Valued Sources: from Theory to Practice

Xavier Jaspar, Graduate Student Member, IEEE, Christine Guillemot, Senior Member, IEEE, and Luc Vandendorpe, Fellow, IEEE

Abstract— The principles that have prevailed so far for designing communication systems rely on Shannon's source and channel coding separation theorem [1]. This theorem states that the optimum performance bounds of source and channel coding can be approached as closely as desired by designing the source and channel coding strategies independently. However, the theorem holds only under asymptotic conditions, where both codes are allowed infinite length and complexity. If the design of the system is constrained in terms of delay and complexity, if the sources are not stationary, or if the channels are non-ergodic, the separate design and optimization of the source and channel coders can be largely suboptimal. For practical systems, joint source-channel (de)coding may reduce the end-to-end distortion. It is one of the aspects covered by the term cross-layer design, meaning a re-thinking of the layer separation principle. This article focuses on recent developments in joint source-channel turbo coding and decoding techniques, which are described in the framework of normal factor graphs. The scope is restricted to lossless compression and discrete-valued sources. The presented techniques can be applied to the quantized values of a lossy source codec, but the quantizer itself and its impact are not considered.

Index Terms— Channel code; cross-layer design; iterative decoding; joint source-channel code; source code; turbo principle

I. INTRODUCTION

EFFICIENT multimedia communication over time-varying mobile and/or wireless channels remains a challenging problem. A communication system is in general composed of a source coder, which maps the source symbols into a compressed binary representation, and of a channel coder, which maps the binary sequence into coded bits or waveforms for transmission. Similarly, the receiver is composed of a channel decoder and a source decoder. The source coder aims at reducing the bit rate to save bandwidth, while the channel coder reintroduces some redundancy in order to protect the transmitted bits against potential channel noise. The two steps are thus quite antagonistic. Techniques where the source coder-decoder pair and the channel coder-decoder pair are designed and optimized independently are called tandem techniques [1]. The source code design in this case assumes that the error probability at the output of the channel decoder is zero. A non-zero error probability at the output of the channel decoder, which is unavoidable for finite-length channel codes, may have a strong effect on the end-to-end source distortion or source symbol error rate. Joint source-channel techniques, as the name suggests, try to optimize the transmission performance jointly, i.e., without a strict separation. Intuitively, since this optimization is a less constrained mathematical problem, its solution can only be better than or equal to that of tandem techniques. Joint source-channel techniques may hence reduce the source distortion and achieve better performance in the case of practical systems with constrained delay and complexity.

Manuscript received August 25, 2006; revised February 26, 2007. This work was supported in part by the Federal Office for Scientific, Technical and Cultural Affairs, Belgium, under IAP contract No P5/11. The work of X. Jaspar is supported by the F.N.R.S., Belgium. X. Jaspar and L. Vandendorpe are with the Communications and Remote Sensing Laboratory, Université Catholique de Louvain, B-1348 Louvain-la-Neuve, Belgium (e-mail: [email protected]; [email protected]). C. Guillemot is with the IRISA-INRIA, F-35042 Rennes Cedex, France (e-mail: [email protected]).

This paper focuses on joint source-channel turbo techniques [2]–[24] and attempts to give a unified overview of this field. We have chosen normal factor graphs [25] for the presentation of the different state models and decoding algorithms. These graphs are detailed in this special issue, in [26]. Together with the Sum-Product algorithm, they indeed form [25]–[28] a unifying framework to present turbo and turbo-like techniques. A factor graph is a graphical representation of a factorization, e.g., the factorization of the joint probability of all random variables involved in a transmission chain. As such, it is a powerful tool to analyze graphically the structure of stochastic dependencies and constraints between variables, as well as to read out conditional independence relations in a model. In a sense, a factor graph is a substitute for equations and provides a clear overview of the transmission chain. As for the Sum-Product algorithm, its execution on factor graphs provides efficient (iterative) estimation algorithms. Actually, many decoding algorithms from the turbo/soft literature can be "re-discovered" with it and new ones can be developed. On Markov chains, for example, it is equivalent to the BCJR algorithm [29], historically discovered before.

The scope of the paper is restricted to discrete-valued sources. The presented techniques can still be applied to quantized values, but the quantizer itself and its impact are not considered. Continuous-valued sources and quantizers in turbo systems are studied in [3], [8], [13], [22]. Considering initially no channel coder, several classical soft source estimation algorithms are presented, with the maximum a posteriori (MAP) and minimum mean square error (MMSE) criteria. These algorithms are described first with fixed length codes (FLCs) as source code. FLCs indeed make the problem much simpler, since the symbol boundaries in the noisy bit stream are known a priori. The general case of sources encoded with variable length codes (VLCs) and arithmetic codes (ACs) is then considered.


With VLCs and ACs, the segmentation of the received bit stream into the source symbols is random. Capturing this randomness requires state models of higher dimensions. Optimal and near-optimal decoding techniques, making use of different state models, are described for these codes. A range of mechanisms, e.g., soft synchronization markers [7] or the forbidden symbol [30], can also be added to further limit the decoder desynchronization phenomenon. Source codes with improved error recovery properties, e.g., reversible VLCs [31], [32], or with inherent synchronization properties [33], [34], are also interesting alternatives to classical VLCs for transmission over erroneous channels. Note that recovering a transmitted sequence of source symbols from noisy measurements is equivalent to inferring the sequence of states of the transmission chain model. This problem has already been addressed for Hidden Markov Models in non-turbo systems in the past, for memoryless [35] and Markov [36] sources.

In the full transmission chain, the bits produced by the source coder are generally further encoded by a channel coder. The transmitted bit stream can then be seen as produced by transitions on a global state model, the product of the state models of the source and the channel coders. Optimal decoding could be performed on this product model [37], [38]. However, the state space dimension of this model explodes in practical cases. The concatenation of the two models has to be considered instead, leading to a cyclic factor graph. It was observed with turbo codes [39], [40] that efficient approximate estimators can be obtained by running a belief propagation algorithm on a cyclic factor graph, provided the cycles are long enough. It was also observed that the simple introduction of an interleaver between two codes can make short cycles become long [41]. An approximate iterative estimator can thus be designed, working alternately on the source coder model and the channel coder model, with a significant gain in complexity. This is the turbo principle, which can be applied to other source estimation problems beyond source-channel decoding, such as the estimation of multiple source descriptions [42], [43].

The remainder of the paper is organized as follows. The transmission chain used throughout the paper is described in Section II. Soft source decoding is introduced in Section III for a simple source code, an FLC. This is a necessary building block of joint source-channel turbo decoding, whose principle is presented in Section IV. Soft source decoding is further elaborated in Section V for other source codes, and practical complexity issues are discussed. Section VI summarizes a few techniques and source codes able to improve the resiliency, along with some analysis results. Some techniques of great interest related to joint source-channel turbo decoding are presented in Section VII. Performance illustrations are provided in Section VIII for both theoretical and practical sources, notably real images and video signals.

II. TURBO SOURCE-CHANNEL TRANSMISSION CHAIN

Let us consider the transmission chain depicted in Fig. 1. Note that, in the following, capital letters indicate random variables and lowercase letters realizations of these. A sub-sequence of variables Zi is denoted by Zm:n ≜ (Zm, Zm+1, . . . , Zn).


Fig. 1. Joint source-channel coder. A forward error correcting code is chosen as channel code. Examples of source codes are given in Section V. The word "joint" refers to the joint design of the source and channel codes.

Fig. 2. Example of a "joint source-channel turbo decoder".

The expectation of Z is denoted by E{Z} and the indicator function by I{.}, i.e., I{a} equals 1 if a is true and 0 otherwise. The probability P(Z = z) is abbreviated as P(z). For the sake of clarity, both the source and the channel considered are simple elements: a source of discrete symbols, with or without memory, and a memoryless additive white Gaussian noise (AWGN) discrete channel. The source outputs sequences of K symbols Sk in each packet, where k is the time index with a symbol clock. The symbols take their values in a finite alphabet A. In the source coder, the sequence S1:K is coded into a sequence of bits U1:N. Examples of source codes will be given in Section V. The source coder can be concatenated with the channel coder either via a serial or a parallel turbo structure. In the sequel, we will focus on a serial turbo structure as depicted in Fig. 1. However, a parallel structure has also been considered in the literature [14]. The bits U1:N produced by the source coder are thus permuted through the interleaver Π to give the bits U′1:N. The role of the interleaver is twofold. Firstly, it increases the global code spreading, the so-called interleaving gain, hence the global code performance, under certain conditions on the source and channel codes. Secondly, it allows the use of iterative decoding as a low complexity approach to nearly optimal decoding [44]. In the channel coder, a forward error correcting code is used to protect the sequence of interleaved bits before transmission across the channel. Among the various codes available, let us draw attention to convolutional codes, low density parity check codes and turbo codes. All of them are covered in this special issue. The coded bits C1:T are finally sent across the channel.


At the receiver side, in Fig. 2, the joint decoder aims at recovering the transmitted sequence S1:K of source symbols from the noisy measurements Y1:T at the output of the channel. We suppose in the following that both N and K are known at the receiver, unless otherwise stated. As explained in the introduction, to keep the decoding complexity within a tractable range, the optimal joint decoder is approximated by an iterative decoder with, as inputs, the noisy measurements Y1:T of the coded bits C1:T. The channel and source decoders are run iteratively. Refined information (probabilities) on the bits U1:N or on U′1:N is successively exchanged between the two decoders. As explained later, each decoder is a Soft-In/Soft-Out module based on, e.g., the BCJR algorithm [29] or the Sum-Product algorithm [26]. After a certain number of iterations, the source decoder computes an estimate Š1:K of the symbols S1:K. If the number of iterations is 1, the decoder is roughly equivalent to a tandem decoder. The turbo decoder is further detailed in Section IV.
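To fix ideas, the following minimal simulation (our own sketch, not from the paper) runs the chain of Fig. 1 once with hard tandem decoding. All numerical choices are assumptions for the example, and a rate-1/2 repetition code stands in for the convolutional, LDPC or turbo codes discussed above, only to keep the sketch short.

```python
# Minimal sketch (ours) of the chain of Fig. 1: memoryless 4-ary source,
# FLC (00, 01, 10, 11), random interleaver Pi, a toy rate-1/2 repetition
# "channel code", BPSK over AWGN, and hard tandem decisions at the receiver.
import numpy as np

rng = np.random.default_rng(0)
K, M = 1000, 2                          # K symbols, M bits per FLC codeword
p = np.array([0.5, 0.25, 0.15, 0.1])    # assumed stationary P(S_k)

s = rng.choice(4, size=K, p=p)                            # symbols S_{1:K}
u = ((s[:, None] >> np.array([1, 0])) & 1).reshape(-1)    # FLC: symbol -> bits
pi = rng.permutation(u.size)                              # interleaver Pi
u_perm = u[pi]                                            # U'_{1:N}
c = np.repeat(u_perm, 2)                                  # coded bits C_{1:T}

EbN0_dB, rate = 2.0, 0.5
sigma2 = 1.0 / (2 * rate * 10 ** (EbN0_dB / 10))          # noise variance
y = (1 - 2.0 * c) + rng.normal(scale=np.sqrt(sigma2), size=c.size)

llr = 2 * y / sigma2                    # channel LLRs on the coded bits
llr_u = llr[0::2] + llr[1::2]           # combine repetitions -> LLRs on U'
hard_perm = (llr_u < 0).astype(int)
u_hat = np.empty_like(hard_perm)
u_hat[pi] = hard_perm                   # deinterleave back to U_{1:N}
s_hat = (u_hat.reshape(K, M) * np.array([2, 1])).sum(axis=1)
print("BER:", np.mean(u_hat != u), "SER:", np.mean(s_hat != s))
```

In the turbo decoder of Fig. 2, the hard decisions above are replaced by soft information exchanged iteratively between the channel and source decoders.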


III. SOFT SOURCE DECODING: PRINCIPLES

At the receiver, the goal is to minimize the distortion of the decoded/estimated signal w.r.t. the original signal. In practice, several distortion measures may be used. With them are associated different decision criteria, different decoding algorithms and different decoding complexities. This section summarizes these aspects, which are the foundations of Soft-In/Soft-Out source decoding, as used in turbo decoding.

A. Different distortion measures - different decision criteria

In the tandem receiver, the channel decoder generally aims at minimizing either the frame error rate (FER) or the bit error rate (BER), and the source decoder assumes no error in the decoded bits. In the joint source-channel receiver, the source and channel decoders are jointly optimized to minimize the desired distortion together. Also, the distortions are not limited to the BER or FER. The source symbol error rate (SER) and the mean square error (MSE) are measures which better reflect the quality of the reconstructed sequence of source symbols, especially when a VLC is used, in which case a single bit error can have a dramatic effect on the SER.

$$\mathrm{FER} \triangleq E\{I\{\check{S}_{1:K} \neq S_{1:K}\}\} \;\rightarrow\; \check{s}_{1:K} = \operatorname*{argmax}_{s_{1:K}} P(s_{1:K} \mid y_{1:T}) = \operatorname*{argmax}_{s_{1:K}} P(s_{1:K}, y_{1:T}), \qquad (1)$$

$$\mathrm{SER} \triangleq \frac{1}{K} \sum_{k=1}^{K} E\{I\{\check{S}_k \neq S_k\}\} \;\rightarrow\; \check{s}_k = \operatorname*{argmax}_{s_k} P(s_k \mid y_{1:T}) = \operatorname*{argmax}_{s_k} P(s_k, y_{1:T}), \qquad (2)$$

$$\mathrm{BER} \triangleq \frac{1}{N} \sum_{n=1}^{N} E\{I\{\check{U}_n \neq U_n\}\} \;\rightarrow\; \check{u}_n = \operatorname*{argmax}_{u_n} P(u_n \mid y_{1:T}) = \operatorname*{argmax}_{u_n} P(u_n, y_{1:T}), \qquad (3)$$

$$\mathrm{MSE} \triangleq \frac{1}{K} \sum_{k=1}^{K} E\{|\check{S}_k - S_k|^2\} \;\rightarrow\; \check{s}_k = \sum_{s_k} s_k\, P(s_k \mid y_{1:T}) = \frac{\sum_{s_k} s_k\, P(s_k, y_{1:T})}{\sum_{s_k} P(s_k, y_{1:T})}. \qquad (4)$$

These distortions, FER, SER, BER and MSE, can be minimized by running respectively the frame-MAP (maximum a posteriori), symbol-MAP, bit-MAP and minimum mean square error (MMSE) estimations, as summarized in (1)–(4). Another interesting distortion measure is the symbol error rate computed with the Levenshtein distance (SER_L), see [45]. This distortion is less sensitive than the SER to symbol deletion or insertion, which makes it useful for some audio applications, e.g., where a symbol deletion is acceptable because it is considered as a simple time shifting of the original signal.
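To make the difference between the criteria concrete, the following toy computation (our own illustration, with an assumed posterior) applies the symbol-MAP rule (2) and the MMSE rule (4) to the same distribution.

```python
# Toy illustration (ours): symbol-MAP vs. MMSE decisions from the same
# posterior P(s_k | y_{1:T}) over an alphabet of integer-valued symbols.
import numpy as np

alphabet = np.array([0, 1, 2, 3])
posterior = np.array([0.05, 0.40, 0.35, 0.20])   # assumed P(s_k | y_{1:T})

s_map = alphabet[np.argmax(posterior)]           # eq. (2): argmax of posterior
s_mmse = float(np.dot(alphabet, posterior))      # eq. (4): posterior mean

print(s_map)    # 1   -> minimizes the SER
print(s_mmse)   # 1.7 -> minimizes the MSE (need not be a valid symbol)
```

Note that the MMSE estimate need not belong to the alphabet, which is why it is mostly meaningful for numerically-valued symbols.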

B. State Models and Graphical Interpretation with FLC

In the following, we use normal (Forney-style) factor graphs [25] for the presentation, as explained in the introduction. Good tutorials about these can be found in [26], [28]. Consider the factor graph in Fig. 3(a). It represents the transmission chain of Fig. 1 with an FLC as source code and no channel code. Each edge represents a variable and each node a function involving the variables connected to it. The only variables here are the symbols Sk at the top of the graph and the bits Un at the bottom. Note that the symbols Sk are represented by three edges, instead of one, plus an equality constraint (a node with an equality sign at the center). This is a simple artifice from Forney's factor graphs to link a variable to more than two functions: Sk is linked to fk−1, fk and gk. Let us review the coding stage step by step, considering the simple case of an FLC. Each symbol is associated with a codeword and all codewords have the same fixed length — an example is given in Fig. 4. In Fig. 3(a), the value of the first symbol S1 has a probability given by P(S1). That probability is represented by the function cb. Then, the FLC maps S1 to a codeword of length M composed of the bits U1:M. This mapping is modeled by the function g1 = P(U1:M|S1). Each bit Un is then measured as yn at the output of the AWGN channel. This is the black node hn = P(yn|Un). Given S1, the next symbol S2 has a probability given by P(S2|S1) — we assume that the memory of the source is modeled by a Markov chain. That probability is represented by the function f1. And the process is repeated for each symbol. All the node definitions are summarized on the first row of Tab. I. This factor graph uses a symbol clock, i.e., the horizontal dependencies between the states Sk are defined symbol by symbol.

A second model of the same transmission chain is possible and is depicted in Fig. 3(b), using a bit clock this time. The dependencies between the states Xn are defined bit by bit. Instead of considering a source of symbols which is transcoded into bits, we consider now a Markov chain producing a bit at each time instant and a symbol every M bits. If the source has no memory, the different values of the Markov states Xn (where n is the bit time index) are the internal nodes of the FLC code tree, i.e., Xn = xn where xn takes its values in Tn, the set of possible internal nodes at time n. An example of a code tree is given in Fig. 4. Note that P(xn+1|xn) can be deduced from the code tree. In the case of a Markov source (with memory), the states Xn are extended with the previous symbol, i.e., the Xn are the pairs (Tn, Sk) where Tn ∈ Tn and k is the integer division n/M. The function definitions are given on the second row of Tab. I. The initial constraint cb in Tab. I indicates that we begin in the root node of the code tree, i.e., X0 = R. One particularity is the strict emission of exactly one source symbol Sk every M bit positions, hence a different expression of fn when n is a multiple of M. Either model, symbol clock or bit clock, can be used almost equivalently, but the latter may allow simplifications and complexity reductions in some cases, as explained later.

For Markov chains, another interesting graph is the trellis. An example is given at the top of Fig. 5, using the bit clock model of Fig. 3(b) and the FLC of Fig. 4. There is a node for each possible realization xn of each random variable/state Xn, and there is a branch or edge linking two nodes if a transition from one node to the other is possible, i.e., if it has a non-zero probability. Note the differences and similarities with factor graphs. During decoding, a metric is evaluated and associated with each branch, here P(xn+1, yn+1|xn), measuring the joint probability of observing the available channel measure yn+1 with the current transition xn → xn+1. A small sketch of this code-tree/trellis construction follows.
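The sketch below (our own illustration) derives, for a memoryless source, the bit-clock states and transitions of Fig. 5 directly from a prefix code tree such as those of Fig. 4; the codeword set and probabilities are assumptions for the example.

```python
# Sketch (ours): bit-clock trellis transitions P(x_{n+1} | x_n) derived from
# a prefix code tree, as in Figs. 4 and 5. States are the internal nodes
# (proper codeword prefixes, root = ""); reaching a leaf emits a symbol and
# returns to the root.
def trellis(codewords, probs):
    states = {w[:i] for w in codewords for i in range(len(w))}  # internal nodes
    mass = lambda pfx: sum(p for w, p in zip(codewords, probs) if w.startswith(pfx))
    trans = {}  # (state, bit) -> (next state, emitted symbol index or None, prob)
    for t in states:
        for b in "01":
            ext = t + b
            if mass(ext) == 0:
                continue                  # branch absent from the code tree
            if ext in codewords:          # leaf: emit a symbol, back to the root
                trans[(t, b)] = ("", codewords.index(ext), mass(ext) / mass(t))
            else:                         # still inside a codeword
                trans[(t, b)] = (ext, None, mass(ext) / mass(t))
    return trans

vlc = ["00", "10", "11", "010", "011"]    # the VLC of Fig. 4
p = [0.4, 0.25, 0.15, 0.1, 0.1]           # assumed symbol probabilities
for (t, b), (nxt, sym, pr) in sorted(trellis(vlc, p).items()):
    print(f"state '{t}' --{b}--> '{nxt}'  symbol={sym}  P={pr:.3f}")
```

Running it on the FLC (00, 01, 10, 11) instead reproduces the top trellis of Fig. 5, with a symbol emitted every M = 2 bits.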

Fig. 3. Normal factor graph of a source with memory, different codes and different clock models: (a) fixed length coding, symbol clock; (b) fixed length coding, bit clock; (c) variable length coding, symbol clock; (d) variable length coding, bit clock; (e) arithmetic coding, symbol clock; (f) arithmetic coding, bit clock. The function definitions are given in Tab. I.

TABLE I
Details on the different factor graphs of Fig. 3.

Fig. 3(a): cb = P(S1); fk = P(Sk+1 | Sk); gk = P(U(k−1)M+1:kM | Sk); hn = P(yn | Un).

Fig. 3(b): cb = I{X0 = R}; fn≠kM = P(Xn | Xn−1) P(Un | Xn, Xn−1), fn=kM = P(Xn | Xn−1) P(Un | Xn, Xn−1) · P(Sk | Un, Xn−1); hn = P(yn | Un).

Fig. 3(c): cb = P(S1) · I{N1 = l(S1)}; ce = I{NK = N}; fk = P(Sk+1 | Sk) I{Nk+1 = Nk + l(Sk+1)}; gk = P(UNk−1+1:Nk | Nk, Sk); hn = P(yn | Un).

Fig. 3(d): cb = I{K0 = 0} · I{X0 = R}; ce = I{KN = K} · I{XN = beginning of a codeword}; fn = P(Xn | Xn−1) P(Kn | Kn−1, Xn) · P(Un | Xn, Xn−1) · P(SKn−1+1:Kn | Kn, Un, Xn−1); hn = P(yn | Un).

Fig. 3(e): cb = P(S1) P(N1 | S1) · I{X1 = ∅}; ce system specific; fk = P(Sk+1 | Sk) I{Xk+1 = Xk ∪ {Sk}} · P(Nk+1 | Nk, Sk+1, Xk+1); gk = P(UNk−1+1:Nk | Nk, Sk, Xk); hn = P(yn | Un).

Fig. 3(f): cb = I{K0 = 0} · I{X0 = ∅}; ce system specific; fn = P(Xn | Xn−1) P(Kn | Kn−1, Xn) · I{Un ∈ Xn \ Xn−1} · P(SKn−1+1:Kn | Kn, Un, Xn−1); hn = P(yn | Un).

Fig. 4. Code trees of the FLC (00, 01, 10, 11) on the left and of the VLC (00, 10, 11, 010, 011) on the right. The internal nodes of the FLC are R, I1 and I2. The internal nodes of the VLC are R, I1, I2 and I3.

Fig. 5. Example of the first five sections of two trellises, for an FLC on top and a VLC at the bottom. The codes and code trees are given in Fig. 4. These trellises correspond respectively to the factor graphs of Fig. 3(b) and 3(d), without the counter Kn and for a source without memory, so that xn ∈ Tn.

C. Decoding algorithms

In equations (2), (3) and (4), the symbol-MAP, the bit-MAP and the MMSE all make use of the same a posteriori or joint probabilities, either on the symbols Sk or on the bits Un. These probabilities are efficiently, and elegantly, computed by the Sum-Product Algorithm (SPA) [26] on the factor graph of the transmission chain.

1) Sum-Product algorithm: The SPA is based on the Sum-Product rule (SPR), which can be stated as follows: the message out of some node/function g(x, y1, . . . , yn) along the edge/variable x is the function

$$\mu_{g \to x}(x) \triangleq \sum_{y_1} \cdots \sum_{y_n} g(x, y_1, \ldots, y_n)\, \mu_{y_1 \to g}(y_1) \cdots \mu_{y_n \to g}(y_n), \qquad (5)$$

where µyk→g(yk) is the message that arrives at g along the edge/variable yk. Notice that the definition of the SPR is recursive: the SPR of µg→x requires computing the SPR of µyn→g. The SPA is actually the recursive application of the SPR onto all variables/edges in the graph. The order of the edges processed by the SPA is called [25], [28] the message-passing schedule. If the graph is acyclic, we can compute, at the end of the recursion,

$$P(x, y_{1:N}) = \mu_{f_1 \to x}(x)\, \mu_{f_2 \to x}(x) \qquad (6)$$

for each variable/edge x in the graph, where f1, f2 are the only two functions/nodes connected to x. We can notably compute P(un, y1:N) and P(sk, y1:N), whatever factor graph [Fig. 3(a) or 3(b)] is used. The complexity of the SPA depends on the number of executions of the SPR (i.e., on the factor graph size and structure), on the state space of each variable/edge x and on the complexity of each function/node in the graph. In this paper, each message on a given variable/edge x in (5) is actually a probability related to x (or a local value of a probability density function), as illustrated by the BCJR equations below. Hence, each message on x represents some piece of information on that very x, called soft information in contrast to the hard decisions exchanged in classical decoding. As they exchange soft information at both their input and output, the decoders considered here are commonly called Soft-In/Soft-Out decoders.

An important property of the SPA is that its message-passing schedule can be chosen freely. On acyclic graphs, the cheapest one in number of operations is to process the graph in two phases, generally a centripetal phase from the leaf nodes of the graph to the root nodes and a centrifugal phase from the root nodes to the leaf nodes. On Markov chains, as in Fig. 3, the common practice is to perform the forward/backward propagation: a forward phase α from left to right and a backward phase β from right to left.

2) BCJR algorithm: This forward/backward propagation is exactly the BCJR algorithm [29] on the corresponding trellis. The decoding complexity can be evaluated from the factor graph or from the trellis. Based on the trellis, it is proportional to the number of nodes and branches. Here is a detailed example of the BCJR algorithm in four lengthy equations. Consider a Markov source encoded with an FLC and the factor graph of Fig. 3(b). The two phases are given by the application in both directions of the SPR onto the edges/variables xn, i.e., they are given by the messages µfn→xn and µfn+1→xn. If we denote αn(xn) = µfn→xn(xn) and βn(xn) = µfn+1→xn(xn), the SPR gives

$$\alpha_n(x_n) = P(x_n, y_{1:n}) = \sum_{x_{n-1}} \underbrace{P(x_{n-1}, y_{1:n-1})}_{\alpha_{n-1}(x_{n-1})} P(x_n \mid x_{n-1}) \times \sum_{u_n = 0,1} P(u_n \mid x_n, x_{n-1})\, P(y_n \mid u_n), \qquad (7)$$

$$\beta_n(x_n) = P(y_{n+1:N} \mid x_n) = \sum_{x_{n+1}} \underbrace{P(y_{n+2:N} \mid x_{n+1})}_{\beta_{n+1}(x_{n+1})} P(x_{n+1} \mid x_n) \times \sum_{u_{n+1} = 0,1} P(u_{n+1} \mid x_{n+1}, x_n)\, P(y_{n+1} \mid u_{n+1}), \qquad (8)$$

initialized with α0(x0) = I{x0 = R} and βN(xN) = 1. Notice, in these relations, the factors from Tab. I. We can then evaluate (6) on un and sk to get the marginals required by (2)–(4), which can be developed as, using the SPR again,

$$P(u_n, y_{1:N}) = \mu_{f_n \to u_n}(u_n)\, \mu_{h_n \to u_n}(u_n) = \sum_{x_{n-1}} \sum_{x_n} \alpha_{n-1}(x_{n-1})\, \beta_n(x_n) \times P(x_n \mid x_{n-1})\, P(u_n \mid x_n, x_{n-1})\, P(y_n \mid u_n), \qquad (9)$$

$$P(s_k, y_{1:N}) = \mu_{f_{kM} \to s_k}(s_k) = \sum_{x_{kM-1}} \sum_{x_{kM}} \alpha_{kM-1}(x_{kM-1})\, \beta_{kM}(x_{kM}) \times P(x_{kM} \mid x_{kM-1}) \sum_{u_{kM}} P(u_{kM} \mid x_{kM}, x_{kM-1}) \times P(s_k \mid u_{kM}, x_{kM-1})\, P(y_{kM} \mid u_{kM}). \qquad (10)$$
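As a concrete companion to (7)–(10), the sketch below (our own, under assumptions: a memoryless source, no channel code, BPSK over AWGN so that P(yn | un) is Gaussian) runs the forward/backward recursions on a generic bit-clock trellis. The example trellis reuses the VLC of Fig. 4 with the symbol probabilities assumed in the earlier sketch; states 0–3 stand for the four internal nodes of the code tree.

```python
# Sketch (ours) of the BCJR recursions (7)-(9) on a generic bit-clock trellis.
# Transitions are tuples (from_state, to_state, bit, prob) standing for
# P(x_n, u_n | x_{n-1}); channel factor P(y_n | u_n) is Gaussian (BPSK+AWGN).
import numpy as np

def bcjr_bit_posteriors(trans, n_states, y, sigma2, root=0):
    N = len(y)
    lik = lambda n, u: np.exp(-(y[n] - (1 - 2 * u)) ** 2 / (2 * sigma2))
    alpha = np.zeros((N + 1, n_states)); alpha[0, root] = 1.0  # eq. (7) init
    beta = np.zeros((N + 1, n_states)); beta[N, :] = 1.0       # eq. (8) init
    for n in range(1, N + 1):                                  # forward phase
        for i, j, u, p in trans:
            alpha[n, j] += alpha[n - 1, i] * p * lik(n - 1, u)
        alpha[n] /= alpha[n].sum()                             # scaling only
    for n in range(N - 1, -1, -1):                             # backward phase
        for i, j, u, p in trans:
            beta[n, i] += beta[n + 1, j] * p * lik(n, u)
        beta[n] /= beta[n].sum()
    post = np.zeros((N, 2))                                    # eq. (9) on U_n
    for n in range(N):
        for i, j, u, p in trans:
            post[n, u] += alpha[n, i] * p * lik(n, u) * beta[n + 1, j]
    return post / post.sum(axis=1, keepdims=True)

trans = [(0, 1, 0, 0.6), (0, 2, 1, 0.4), (1, 0, 0, 2 / 3), (1, 3, 1, 1 / 3),
         (2, 0, 0, 0.625), (2, 0, 1, 0.375), (3, 0, 0, 0.5), (3, 0, 1, 0.5)]
y = np.array([0.9, -1.1, 0.2, -0.7])       # assumed channel measurements
print(bcjr_bit_posteriors(trans, 4, y, sigma2=0.5))
```

The per-section normalization only rescales the messages; the bit posteriors are renormalized at the end, as usual in fixed-point BCJR implementations.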

3) Viterbi algorithm: The frame-MAP, in eq. (1), is a special case which is tackled differently. Essentially, the frame-MAP is the problem of finding in the trellis the path with the highest accumulated metric P(s1:K, y1:N), or equivalently P(x0:N, y1:N), during a single forward phase. This can be formulated as a shortest path problem on the trellis. It is solved efficiently by the Viterbi algorithm [46] and requires fewer computations than the BCJR at the receiver (a single forward phase instead of two forward/backward phases). For the trellises in Fig. 5, it recursively computes

$$P(x_{0:n}, y_{1:n}) = P^*(x_{0:n-1}, y_{1:n-1})\, P(x_n \mid x_{n-1}) \times P(y_n \mid x_n, x_{n-1}), \qquad (11)$$

$$P^*(x_{0:n}, y_{1:n}) = \max_{x_{n-1}} P(x_{0:n}, y_{1:n}), \qquad (12)$$

initialized with P*(x0) = I{x0 = R}. The quantity P*(x0:n, y1:n) is then the highest accumulated metric among all paths up to state xn.
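A compact sketch of (11)–(12) follows (ours, reusing the toy trellis and Gaussian likelihoods assumed above): one survivor is kept per state, with metrics in the log domain, and the frame-MAP path is recovered by backtracking.

```python
# Sketch (ours) of the Viterbi recursion (11)-(12): one survivor per state,
# log-domain metrics for numerical stability, then backtracking.
import numpy as np

def viterbi(trans, n_states, y, sigma2, root=0):
    N = len(y)
    loglik = lambda n, u: -(y[n] - (1 - 2 * u)) ** 2 / (2 * sigma2)
    metric = np.full((N + 1, n_states), -np.inf); metric[0, root] = 0.0
    back = np.zeros((N + 1, n_states), dtype=int)
    bit = np.zeros((N + 1, n_states), dtype=int)
    for n in range(1, N + 1):
        for i, j, u, p in trans:
            m = metric[n - 1, i] + np.log(p) + loglik(n - 1, u)
            if m > metric[n, j]:              # eq. (12): keep the best survivor
                metric[n, j], back[n, j], bit[n, j] = m, i, u
    x = int(np.argmax(metric[N]))             # best terminal state
    path = []
    for n in range(N, 0, -1):                 # backtrack the surviving path
        path.append(bit[n, x]); x = back[n, x]
    return path[::-1]

trans = [(0, 1, 0, 0.6), (0, 2, 1, 0.4), (1, 0, 0, 2 / 3), (1, 3, 1, 1 / 3),
         (2, 0, 0, 0.625), (2, 0, 1, 0.375), (3, 0, 0, 0.5), (3, 0, 1, 0.5)]
y = np.array([0.9, -1.1, 0.2, -0.7])
print(viterbi(trans, 4, y, sigma2=0.5))       # most likely bit sequence
```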

IV. TURBO DECODING

To resist the multiple distortions potentially caused by the channel, the bits produced by the source coder are protected, after interleaving, by a channel coder in Fig. 1. The factor graph of the complete transmission chain is depicted in Fig. 6.

Fig. 6. Factor graph of the transmission chain of Fig. 1, with an FLC as source code [bit clock model, Fig. 3(b)] and a systematic convolutional code as channel code. Only the main variables are indicated. Channel measures are available on the systematic bits U′n and parity bits Rn, as represented by the black functions/nodes hn and in, respectively.

For this example, the source code on the top is an FLC, modeled with the bit clock Markov chain of Fig. 3(b). The channel code at the bottom is a recursive systematic convolutional code (RSCC), also modeled as a Markov chain.

A. Interleaver and cycles in the factor graph

As explained in the introduction, one of the roles of the interleaver is to allow the use of iterative decoding as a low complexity approach to nearly optimal decoding — another role, code spreading, is explained later. Without the interleaver, indeed, either the SPA behaves badly [28] due to the many short cycles in the factor graph between the two Markov chains (of the FLC and of the RSCC), or the two Markov chains have to be merged into a new chain of generally intractable complexity. With the interleaver, the short cycles become long and, instead of computing, as in the ideal cycle-free case, the exact probabilities in (6) for any edge/variable in the factor graph, we get approximations of these. In many cases, the longer the cycles, the better the reliability of the approximations will be. The SPA can thus be used as a good (approximate) alternative to the optimal estimation, as the success of turbo techniques has attested so far. Still, there is one inevitable effect of the (short and long) cycles on the SPA: the cycles introduce infinite algorithmic loops in the recursion of the SPR. As a consequence, the algorithm becomes iterative [28] and the resulting joint source-channel decoder belongs to the family of turbo decoders.

B. Turbo decoder structure

The message-passing schedule of the SPA can be chosen freely, as explained in the previous section. One common practice for each iteration is to consider first the forward/backward propagation on the RSCC Markov chain, then the SPR on the edges (U′1:N) going through the interleaver Π, next a second forward/backward propagation on the FLC Markov chain, and back again the SPR on the edges (U′1:N) through Π in the reverse direction. Theoretically, the decoding is iterated an infinite number of times. But in practice, it is repeated until some custom stopping condition is satisfied. In terms of decoding modules, this schedule leads to the (usual) structure illustrated in Fig. 2.

The turbo decoder is composed of two modules/decoders, one for the FLC (source decoder) and one for the RSCC (channel decoder). Both modules are based on the SPA (forward/backward) or on the BCJR algorithm applied to their corresponding Markov chain. As inputs, they take the piece of soft information (probabilities) on U′1:N provided previously by the other module — exactly the messages µfn→un or µ=→u′n. As outputs, they produce "refined" information on U′1:N (ideally independent of the inputs), which is sent to the other module through the interleaver (Π or Π−1) — the messages µ=→u′n or µfn→un, respectively. And the turbo decoder iterates between the two modules. At the last iteration (after some stopping criterion), the source decoder performs one of the estimations of the original signal presented in Section III. A sketch of this iteration loop is given below. The success of joint source-channel turbo decoders has been shown in different contexts by many contributions in the literature [2]–[24]. A few selected simulation results corroborate this success in Section VIII.
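The following skeleton (ours) shows only the scheduling of Fig. 2. The two soft-in/soft-out functions are placeholders — in a real decoder each would be a full BCJR pass on its Markov chain — and their dummy updates exist solely so that the loop, the interleaving and the stopping condition can be run as written.

```python
# Sketch (ours) of the turbo-decoder schedule of Fig. 2. The SISO modules
# below are stand-ins for BCJR passes on the RSCC and FLC chains; they just
# blend their inputs so that the scheduling logic is executable.
import numpy as np

def siso_channel(apriori_perm, llr):
    """Placeholder for a BCJR pass on the RSCC chain (interleaved domain)."""
    return 0.5 * (1.0 / (1.0 + np.exp(llr))) + 0.5 * apriori_perm

def siso_source(apriori_nat):
    """Placeholder for a BCJR pass on the FLC chain (natural bit order)."""
    return np.clip(0.9 * apriori_nat + 0.05, 0.0, 1.0)

def turbo_decode(llr, pi, n_iter=10, tol=1e-4):
    inv = np.argsort(pi)                  # pi maps natural -> interleaved order
    ext_src = np.full(llr.size, 0.5)      # P(U'_n = 1): no a priori at start
    for _ in range(n_iter):
        ext_ch = siso_channel(ext_src, llr)           # channel decoder
        ext_nat = siso_source(ext_ch[inv])            # Pi^{-1}, source decoder
        ext_new = ext_nat[pi]                         # back through Pi
        if np.max(np.abs(ext_new - ext_src)) < tol:   # custom stopping condition
            break
        ext_src = ext_new
    return ext_src[inv]                   # final soft information on U_{1:N}

rng = np.random.default_rng(2)
pi = rng.permutation(8)
print(turbo_decode(rng.normal(size=8), pi))
```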

C. Convergence of turbo decoding

With an iterative process is always associated the notion of convergence. Intuitively, the convergence of the turbo decoder depends on the amount of redundancy present on both sides of the interleaver. If, indeed, one module/decoder has neither a priori information (redundancy) nor channel measures available, then the iterations are almost useless and the turbo decoder does not perform much better than a tandem decoder. The convergence of turbo decoding was analyzed in [47] for turbo codes with the introduction of the extrinsic information transfer (EXIT) chart. The reliability of the estimation of a bit is measured by the mutual information between that estimation and the original bit. More precisely, we have at the outputs of the source decoder and of the channel decoder, respectively,

$$I_S = I\left\{U_n;\, \log \frac{\mu_{f_n \to u_n}(0)}{\mu_{f_n \to u_n}(1)}\right\} \in [0, H_U], \qquad (13)$$

$$I_C = I\left\{U_n;\, \log \frac{\mu_{= \to u_n}(0)}{\mu_{= \to u_n}(1)}\right\} \in [0, H_U], \qquad (14)$$

where I{A; B} is the mutual information between A and B. The maximum value HU is the stationary entropy of a bit, i.e., HU = Hb(P(Un = 1)) (assumed independent of n), where Hb(.) is the binary entropy function. If IS and IC are close to HU (resp. 0), it means that the estimation is highly (resp. is not) correlated with the original signal, i.e., the estimation is good (resp. bad). The decoder starts with no estimation, I = 0. The channel decoder outputs a first estimation of reliability IC1. With that estimation, the source decoder is able to produce IS1. At the j-th iteration, we have ICj and ISj. We are then interested in having/designing a system such that the sequence (IC1, IC2, IC3, . . . ) or (IS1, IS2, IS3, . . . ), or both, converge to a value close to HU with as few iterations as possible. In the literature, the analysis and/or optimization of joint source-channel turbo decoders has been carried out with EXIT charts notably in [4], [12], [17], [21]–[23].
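As a rough illustration of the quantities in (13)–(14) (ours, not from the paper), the mutual information between bits and their soft log-ratio estimates can be estimated from samples by binning; the consistent-Gaussian log-ratio model below is an assumption commonly used to exercise such estimators.

```python
# Sketch (ours): histogram estimate of the mutual information I(U_n; L_n)
# between equiprobable bits and their soft log-ratios, the quantity tracked
# on the axes of an EXIT chart.
import numpy as np

def mutual_information(bits, llrs, n_bins=50):
    edges = np.linspace(llrs.min(), llrs.max(), n_bins + 1)
    hist_all, _ = np.histogram(llrs, bins=edges)
    p_l = hist_all / hist_all.sum()
    mi = 0.0
    for u in (0, 1):
        pu = np.mean(bits == u)
        hist_u, _ = np.histogram(llrs[bits == u], bins=edges)
        p_l_u = hist_u / max(hist_u.sum(), 1)
        mask = (p_l_u > 0) & (p_l > 0)
        mi += pu * np.sum(p_l_u[mask] * np.log2(p_l_u[mask] / p_l[mask]))
    return mi  # in bits; approaches H_U = 1 for reliable estimates

rng = np.random.default_rng(1)
u = rng.integers(0, 2, 100000)
for snr in (0.5, 2.0, 8.0):            # increasingly reliable soft estimates
    llr = snr * (1 - 2 * u) + rng.normal(scale=np.sqrt(2 * snr), size=u.size)
    print(round(mutual_information(u, llr), 3))
```

The printed values increase toward HU = 1, which is exactly the trajectory one reads on an EXIT chart across iterations.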

V. SOFT DECODING OF DIFFERENT SOURCE CODES

One of the main advantages of FLCs, in Section III-B, is their robustness thanks to their inherent synchronization: the symbol positions in the bit stream are perfectly known (and fixed) a priori. For arbitrary source distributions, however, the level of residual redundancy in FLCs may be high; it can, in general, be reduced with variable length source codes, at the cost of some sensitivity to desynchronization at the decoder. Only the state models and the corresponding factor graphs are described below. The decoding algorithms can be derived by running the SPA on the graphs, see Section III-C — a detailed decoding example is given for FLCs in (7)–(10).

A. Variable Length Code (VLC)

VLCs, also known as prefix-free VLCs or Huffman-like VLCs [48], assign a variable length codeword to each symbol. To achieve compression, shorter codewords are associated with more frequent symbols — in this tutorial, only the stationary probabilities P(Sk) are considered for compression, not P(Sk|Sk−1). Unfortunately, VLCs are sensitive to desynchronization: a single bit error can generate a burst of several symbol errors before resynchronization.

VLCs are an extension of FLCs and so are the associated factor graphs. Because we output a variable number of bits per symbol with the symbol clock model in Fig. 3(c), the bit position of the current symbol is not known without storing the number Nk of bits encoded up to the current symbol Sk. So the Markov states are now the pairs (Sk, Nk), see the detailed model in Tab. I. While this does not increase the complexity at the encoder, it really does at the decoder under soft source decoding. Besides, notice that this affects the graph in a very unusual way: its structure, i.e., the number of edges and their interconnection, is now random and depends on N1:K. With the bit clock model in Fig. 3(d), we have the symmetrical problem: at each bit position, either no symbol is output or only one, so the number of symbols is variable. Therefore, the symbol position of the current bit un is not known without storing the number Kn of symbols output up to the current bit position n. So the Markov states are now the pairs (Xn, Kn) (see the function definitions in Tab. I). The states Xn correspond to those of the Balakirsky trellis [49]: the different values of Xn are the internal nodes of the VLC code tree, extended with the source memory if necessary (as with FLCs). An example of a code tree and the corresponding trellis are given on the right of Fig. 4 and at the bottom of Fig. 5. Note that for a source without memory, the final constraint ce can be written as ce = I{KN = K}I{XN = R} — the node R is indeed related to the beginning of a new codeword in the code tree. Finally, the bit clock model may be simplified in the case of frame-MAP and bit-MAP estimations, as explained in Section V-D.

B. Arithmetic Code (AC)

While VLCs and FLCs assign a codeword to each symbol, arithmetic coding [50] encodes the entire symbol sequence S1:K into a single number x between 0.0 and 1.0. The binary representation of x is the sequence of transcoded bits U1:N. Optimal AC proceeds by successively subdividing a probability interval [lowk, upk[, initialized to [low0, up0[ = [0.0, 1.0[, according to the symbol probabilities: each value sk of the current symbol Sk uniquely defines a sub-interval [lowk+1, upk+1[ of the current interval [lowk, upk[, proportional to P(sk): (upk+1 − lowk+1)/(upk − lowk) = P(sk). The number x is any fraction falling in the final interval. At the decoder, x identifies the final interval, from which all symbols can be reconstructed unambiguously. AC allows near-optimal compression for a given set of symbols and probabilities but is, however, very sensitive to desynchronization. Except in some rare cases, almost no re-synchronization is actually possible (contrary to VLCs) and a single bit error generally invalidates the rest of the frame.

AC not only outputs a variable number of bits per symbol, but that number can be zero (if Nk+1 = Nk). In addition, the internal state of the arithmetic coder, hence the coded bits, depend on all previously encoded symbols. As a consequence, w.r.t. VLCs, there is an additional dimension to the model: we have to store the whole sequence of symbols coded up to the current symbol. For the symbol clock model, depicted in Fig. 3(e), the Markov states are now the triplets (Sk, Nk, Xk) where Xk = {S1, S2, . . . , Sk−1}. Note that Nk is unnecessary here since it can be deduced from (Sk, Xk); however, we keep it so as to ease the comparison with VLCs. The model dependencies are summarized on the fifth row of Tab. I. Note that the final constraint ce is system specific: it depends on the way x is chosen within the final interval. The bit clock model can be constructed in a similar manner, as an extension of the VLC bit clock model [see Fig. 3(f) and Tab. I]. Instead of storing the internal nodes of the VLC code tree, i.e., the bits of the current codeword, the state Xn stores the whole sequence of bits output up to the current bit position n, Xn = {u1, u2, . . . , un}. The Markov states are the pairs (Xn, Kn). This model is affected by a high increase in complexity, like the symbol clock model.

These two simple models, bit clock and symbol clock, illustrate the very high complexity of optimal decoding of ACs. The state space of both models grows exponentially in the sequence length. A direct application of the SPA and BCJR algorithm is thus intractable. One has to rely instead on suboptimal decoding techniques, such as sequential decoding with pruning (see Section V-D), to keep the complexity within a tractable range. An interesting alternative to AC, avoiding these complexity problems and presented below, is the quasi-arithmetic code.

In practice, the bit clock and symbol clock models presented above are not used because they require infinite precision. Instead, probability intervals are considered as Markov states. Though these intervals do not really diminish the decoding complexity, they solve the numerical precision problem. As soon as all real numbers within the current interval share a common binary representation, that representation is output and the interval is rescaled accordingly. Further rescaling is also possible when the interval falls into [0.25, 0.75[.
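The interval subdivision described above can be made concrete in a few lines (our sketch, in plain floating point rather than the rescaled arithmetic of a production coder; symbols and probabilities are assumptions):

```python
# Sketch (ours) of arithmetic encoding by interval subdivision. A real coder
# uses the integer/rescaling tricks described above to avoid the floating
# point precision limit; decoding reverses the same subdivision.
def ac_encode(symbols, probs, sequence):
    low, up = 0.0, 1.0                       # [low_0, up_0[ = [0.0, 1.0[
    for s in sequence:
        width = up - low
        i = symbols.index(s)
        low = low + width * sum(probs[:i])   # new sub-interval, with width
        up = low + width * probs[i]          # proportional to P(s_k)
    return (low + up) / 2                    # any fraction x in the final interval

symbols = ["a", "b", "c"]
probs = [0.6, 0.3, 0.1]                      # assumed stationary probabilities
x = ac_encode(symbols, probs, "aabac")
print(x)   # a single number identifying the whole symbol sequence
```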

C. Quasi-Arithmetic (QA) Code

A reduced precision implementation of arithmetic coding, called quasi-arithmetic (QA) coding, has been proposed in [51]. The QA coder operates integer subdivisions of an integer interval [0, Q[. The integer interval subdivisions lead to some approximation of the source distribution. The parameter Q controls both the state space dimension, hence the decoding complexity, and the source distribution approximation. It has been shown in [52] that, for a binary source, the variable Q can be limited to a small value (down to 4) at a small cost in terms of compression efficiency. The number of possible subdivisions of the interval [0, Q[ being finite, QA codes can be modeled as finite state machines. Trellis decoding, using, e.g., the SPA or the BCJR algorithm, can thus be applied [16] with a reasonable complexity. As with AC, probability intervals can be considered as Markov states: [lowk, upk[ for the symbol clock model, [lowUn, upUn[ and [lowSKn, upSKn[ for the bit clock model. But the different probability intervals define integer segments of the interval [0, Q[ rather than fractional segments of the interval [0, 1[.

D. Complexity issues and suboptimal decoding

1) Bit-level trellis and bit/symbol clock models: Both the symbol clock and the bit clock models can be used for the bit, symbol and frame optimal estimations presented in Section III-A. However, for VLCs, these models in Fig. 3(c) and 3(d) have a complexity growing as a quadratic function of the sequence length, and even higher for ACs. That complexity is actually not tractable for typical sequence lengths. Some simplifications are possible with the bit clock model in the case of frame-MAP or bit-MAP estimations if the number K of symbols is not known at the receiver. In that case, all the symbols Sk and variables Kn can be removed from the factor graphs [Fig. 3(d) and 3(f)]. For VLCs, these simplifications allow an optimal frame-MAP or bit-MAP decoding with a complexity growing approximately linearly with the product of K and the alphabet size |A|. The corresponding trellis is often called the "bit-level" trellis in the literature and was originally proposed in [49] — an example of such a trellis is given at the bottom of Fig. 5 for a source without memory. Unfortunately, in the case of symbol-MAP or MMSE, or if the pair (K, N) is known and used as a termination constraint, the complexity hurdle remains for optimal estimation. In order to overcome it, most authors consider suboptimal methods. They (a) either use the bit-level trellis, (b) or apply suboptimal estimation methods such as sequential decoding [53] to the bit or symbol clock models. Method (a), together with the SPA or BCJR algorithm, amounts to optimally estimating with a suboptimal Hidden Markov Model. Method (b) processes a suboptimal estimation on an optimal model which fully represents the whole transmission chain.

2) Aggregated State Model: An aggregated state model is introduced in [54], based on the bit clock model (Xn, Kn). It is defined by both the internal state Xn of the VLC decoder (i.e., the internal node of the VLC code tree) and Mn = Kn mod T, the rest of the Euclidean division of the symbol clock Kn by a fixed parameter T, such that 1 ≤ T ≤ K. The state model is thus defined by the set of tuples (Xn, Mn).

The transitions which trigger a symbol, i.e., those which terminate in the state nε, modify the modulo Mn as Mn = Mn−1 + 1 mod T. Therefore, the model consists in aggregating the states of the bit/symbol trellis which are T symbol clocks apart. If T = 1, the resulting trellis is equivalent to the usual bit-level trellis proposed in [49]. If T is greater than or equal to the symbol sequence length K, the trellis is equivalent to the bit/symbol trellis. The intermediate values of this parameter allow complexity to be traded against estimation accuracy. The intuition behind this state aggregation is that a desynchronization error will be detected if the difference between the numbers of transmitted and decoded symbols is not a multiple of T. The choice of the parameter T is motivated by the capability of the considered VLC to quickly resynchronize. We come back to this issue in Section VI.

The aggregation above deals with Kn. A similar principle has also been developed for Xn in [55] by compacting the VLC table. The codewords are grouped/aggregated into a minimum number of classes. The decoding algorithm may then work on a reduced number of classes, hence on a reduced number of states X′n, instead of working on the whole set of codewords. Note that these two state aggregations, Kn → Mn and Xn → X′n, are complementary and can be combined.

3) Pruning: Simplifying the model as above is sometimes not sufficient, especially with ACs. In such cases, suboptimal decoding has to be considered, as in [15], [17], [53]. Several techniques suggested originally for convolutional codes and based on the Viterbi algorithm can be used, often called sequential decoding. A few of them are covered here, and a one-section sketch of the M-algorithm follows below. To find in the trellis the path with the highest accumulated metric, the Viterbi algorithm recursively implements an exhaustive search by maintaining one best path, called survivor, in each state of the current trellis section. Each survivor is then extended in (11) with the transitions of the next trellis section and only the best resulting paths are kept in (12) in each state. Unfortunately, the number of survivors to keep in memory may be too large and lead to intractable complexity. To avoid that, part of the survivors have to be pruned (discarded) in some way. The M-algorithm [56] keeps only the M best survivors at each trellis section and the T-algorithm [57] keeps only the survivors whose likelihood is above a given threshold T, both algorithms according to a breadth-first strategy. The stack algorithm [58] (metric first), and similarly the Fano algorithm [59] (depth first), always expand the best survivor only. For the T-algorithm and the stack algorithm, the number of survivors is further limited by a predefined maximum so as to bound the complexity. The stack algorithm has been extended in [60] to make it suitable for iterative decoding. Note finally that the same pruning principles can be applied to the SPA and to the BCJR algorithm. This is roughly equivalent to keeping only the dominant terms in the SPR in (5).
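The sketch below (ours; the survivor representation and metrics are deliberately simplified) shows one trellis section of the M-algorithm: every survivor is extended by every admissible transition, then only the M best extensions are kept.

```python
# Sketch (ours) of one section of the M-algorithm (breadth-first pruning):
# extend every survivor, then keep only the M best extended paths.
def m_algorithm_step(survivors, extend, M):
    """survivors: list of (metric, path); extend: path -> [(dmetric, step)]."""
    candidates = [(m + dm, path + [step])
                  for m, path in survivors
                  for dm, step in extend(path)]
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:M]

# Toy usage with an assumed binary extension whose metrics favor bit 0:
extend = lambda path: [(-0.1, 0), (-0.7, 1)]
survivors = [(0.0, [])]
for _ in range(5):
    survivors = m_algorithm_step(survivors, extend, M=4)
print(survivors[0])   # best surviving path after five sections
```

The T-algorithm replaces the fixed-size cut by a metric threshold; the stack and Fano algorithms instead reorder the exploration so that only the current best survivor is extended.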

4) Complexity-Performance Issues: Let us briefly review the complexity of the optimal estimations (1)–(4). They can be deduced from Fig. 3 and Tab. I. For FLCs, all estimations have a complexity growing linearly with K, the number of symbols. For VLCs, they grow quadratically with K, except in one case: if K is not known at the receiver (Section V-D.1), the frame- and bit-MAP grow linearly with K — compared to FLCs, the decoding complexity is then multiplied by the codeword length of the FLC. For ACs, all estimations grow exponentially with K; only suboptimal estimations are affordable. In all cases, the complexity can of course be reduced by using suboptimal decoding techniques such as those described above.

Concerning the choice of the estimation, the symbol-MMSE and symbol-MAP are usually the most interesting ones for the source distortion, while the bit- and frame-MAP have the lowest complexity. But such a comparison is simplistic. For example, the main concern at the receiver to reduce the distortion with VLCs and ACs is usually the decoder desynchronization, Section VI. And that desynchronization is already handled by the frame-MAP decoder, even if not optimally. Another example is when the application tolerates a time shifting of the source signal. Such a possible shifting is not taken into account by any of the estimations (1)–(4). In that case, no estimation is a priori better than the others. Finally, concerning the performance of the codes, a comparison is beyond the scope of this paper. Let us simply note that there is no universal solution. On the one hand, FLCs are more robust and not sensitive to desynchronization. On the other hand, VLCs and ACs save bandwidth, which can be used to improve their robustness against channel errors.

VI. SYNCHRONIZATION AND ROBUSTNESS OF ENTROPY CODES

In the case of VLCs and ACs, the decoder desynchronization problem dominates the performance of the end-to-end chain. Different solutions can be considered to fight this phenomenon and to increase robustness, e.g., by introducing more redundancy in the bit stream or by better structuring the stream.

A. Synchronization mechanisms

When soft decoding is used for VLCs or ACs, several mechanisms can be incorporated in the (de)coding process to help the decoder re-synchronize in the presence of errors.

1) Termination constraint: If the numbers of symbols and/or bits transmitted are known by the decoder, termination constraints can be incorporated in the decoding process: e.g., one can ensure that the decoder produces the right number of symbols (KN = K), if known. All the paths in the trellis which do not lead to a valid sequence length are suppressed. The termination constraints, cb and ce in Fig. 3, allow the decoding to be synchronized at both ends of the sequence.

2) Soft synchronization: Extra bits (small patterns), often called synchronization markers, can be incorporated at some known positions in the symbol stream to help achieve a proper segmentation of the received noisy bit stream into segments corresponding to the transmitted symbols. This extra information can take the form of dummy symbols (in the spirit of the techniques described in [30], [61]) or of dummy bit patterns, which are inserted in the symbol or bit stream, respectively, at some known symbol clock positions. The procedure amounts to extending symbols at known positions with a predetermined suffix. These suffixes favor the likelihood of correctly synchronized sequences (i.e., paths in the trellis) and penalize the others.
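At the encoder side, the soft synchronization mechanism of 2) is a simple stream transformation; the sketch below (ours, with an arbitrary marker and period) extends every T-th symbol with a known suffix.

```python
# Sketch (ours): inserting dummy bit patterns (synchronization markers) at
# known symbol-clock positions. The marker value and period are example
# assumptions; the decoder exploits the known values/positions to penalize
# desynchronized paths in the trellis.
def insert_markers(codewords, marker="01", period=4):
    out = []
    for k, cw in enumerate(codewords, start=1):
        out.append(cw)
        if k % period == 0:
            out.append(marker)       # known suffix at a known symbol position
    return "".join(out)

encoded = ["00", "10", "010", "11", "00", "011", "10", "00"]
print(insert_markers(encoded))       # bit stream with embedded markers
```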

3) Forbidden symbol and cyclic code: To detect and prune erroneous paths in soft AC decoding, the authors in [61] use a reserved interval corresponding to a so-called forbidden symbol. All paths hitting this interval are considered erroneous and pruned. This technique is analyzed in [62]. Similarly, the authors in [10] append a cyclic redundancy check code to the VLC stream. At the receiver, a list Viterbi decoding is applied and all paths that do not respect the cyclic code constraint are considered erroneous and pruned.

B. Resynchronization properties of VLC

The respective performance of the different mechanisms depends on the synchronization or error recovery properties of the VLC codes being used. Error recovery properties of VLCs were first studied in [63], where a method is proposed to compute the so-called expected error span Es, i.e., the expected number of source symbols over which a single bit error propagates. For a given VLC, the lower Es, the better the error resilience of this code when hard decoding is applied at the decoder side. Later, this method was adapted in [64] to compute the so-called synchronization gain/loss, i.e., the probability that the numbers of symbols in the transmitted and decoded sequences differ by a given amount ∆S when a single bit error occurs during the transmission. In [65], it is shown that the probability mass function (p.m.f.) of the synchronization gain/loss is a key feature of a VLC to analyze the error resilience of such codes when soft decoding with a length constraint is applied at the decoder side. Transfer functions defined on an error state diagram allow the estimation of the probability that the numbers of symbols in the transmitted and decoded sequences differ by a given amount ∆S. In particular, the probability that the VLC decoder does not resynchronize in a strict sense (or equivalently P(∆S ≠ 0)) is almost not altered by the state aggregation. Also, and surprisingly, the codes offering the best error resilience under soft decoding with a length constraint are not those having the highest resynchronization probability, i.e., the highest P(∆S = 0).

C. Error correction properties of VLC

Besides the possibility to resynchronize after an error, VLCs may also exhibit error correction capabilities. These capabilities are obtained through an increased Hamming distance between the binary representations of any two different symbol sequences, at the cost of a compression loss (more redundancy in the bit stream). In other words, several bit errors must occur before the decoder selects a wrong path. Such VLCs are sometimes called variable length error correcting codes (VLECCs). The analysis of their correction properties was originally developed in [45] and is based on distance spectra, as with error correcting codes. That analysis was then generalized and extended to turbo schemes in [66] to provide relatively tight performance bounds. One promising property of VLECCs in turbo systems is the possibility to obtain the so-called interleaving gain, or to increase it, under certain assumptions [66]. The interleaving gain [44] is a key property of parallel turbo codes, which results in a decrease of the BER (bit error rate) as N−1, where N is the sequence length. Interleaving gains on the SER and FER are possible as well.

D. Recent advances in source coding

1) Reversible Variable Length Codes (RVLC): RVLCs [31] are a special case of VLECCs; they increase the resiliency at a small cost in compression and are used in recent audio and video standards, such as AAC [67] and MPEG-4 [68]. RVLCs are both prefix- and suffix-free. They can thus be decoded in both directions. This helps the hard decoder to recover from a short error burst: as soon as an error is detected, we can restart decoding in the reverse direction from the end of the sequence and recover as much information as possible. In turbo systems, most RVLCs can show additional resiliency capabilities thanks to the interleaving gain (Section VI-C) — even when their free distance is df = 1. In particular, the probability of desynchronization of the decoder can tend to zero with long interleavers, i.e., with long sequences. Indeed, with any RVLC, a single bit error leads only to a single symbol error. In other words, at least two bit errors are necessary to desynchronize the decoder and to produce a burst of symbol errors. The probability of such two-bit-error events diminishes [66] at least as N−2 for most RVLCs if we take, e.g., a parallel turbo code as channel code in Fig. 1. Then, the probabilities of symbol error bursts (N−2) and of symbol errors (N−1) tend to zero with long sequences (large N), which is a useful resiliency property.

2) Multiplexed Codes: Effort has been dedicated in [33], [34] to designing codes less sensitive to decoder desynchronization while at the same time approaching the source entropy. A family of codes has been introduced, called multiplexed codes. They are appropriate when we have to encode two (or more) sources of information with different levels of priority. This relies on the fact that lossy compression systems of real signals very often generate such sources, e.g., texture and motion information for a video signal. The risk of "desynchronization" is then confined to the low priority information, so as to make the high priority information insensitive to that desynchronization. A high priority source and a low priority source are considered, referred to respectively as SH and SL. The idea consists in creating an FLC of N = 2c codewords for the SH source and in partitioning the set of codewords into subsets (or classes) Ci, i = 1 . . . Ω, associated with the symbols ai of the alphabet A. Each class Ci contains Ni codewords. A symbol St = ai of the flow SH can be encoded with any c-bit codeword ci,q belonging to the class Ci. The redundancy inherent to each class is then used to represent information of the low priority source SL. With the realization of the sequence of symbols SH, one associates a sequence of Ni-valued variables indexing the codewords in the classes. In order to be multiplexed with symbols of SH, the lower priority bit stream b must be mapped into the sequence of Ni-valued variables associated with the realization of the high priority source SH [34]. To reach the source entropy, the cardinalities of the classes Ci must be chosen such that Ni = µi2c, where µi denotes the stationary probability of the symbol ai. Multiplexed codes can also be designed to approach higher order entropies by conditioning the construction of the different classes on the realization of the previous symbols or on a given context [34].
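The class-partition idea can be sketched as follows (ours; the alphabet, probabilities, codeword length c and index mapping are toy assumptions): the high priority symbol selects a class, and low priority information selects the codeword within that class.

```python
# Sketch (ours) of a multiplexed code: 2^c fixed-length codewords are
# partitioned into classes C_i of cardinality N_i ~ mu_i * 2^c, one class per
# high-priority symbol a_i. The low-priority stream picks the index inside
# the class, so desynchronization can only corrupt the low-priority data.
c = 4                                    # codeword length: N = 2^c codewords
mu = {"a": 0.5, "b": 0.25, "c": 0.25}    # assumed P(S_H) for the HP source
classes, start = {}, 0
for sym, p in mu.items():
    n_i = int(p * 2**c)                  # class cardinality N_i = mu_i * 2^c
    classes[sym] = [format(w, f"0{c}b") for w in range(start, start + n_i)]
    start += n_i

def mux_encode(sym_hp, lp_index):
    return classes[sym_hp][lp_index % len(classes[sym_hp])]

print(mux_encode("a", 5), mux_encode("b", 2))   # '0101' and '1010'
```

A full implementation would additionally convert the low priority bit stream into the sequence of Ni-valued indices, as described in [34]; the modulo above is only a stand-in for that mapping.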


Fig. 7. Performance of sequential decoding of arithmetic codes with JPEG-2000 coded images (courtesy of [15]), AWGN channel with Eb/N0 = 5 dB. The PSNR of the coded image is 37.41 dB. (a) Classical decoding, PSNR = 16.43 dB. (b) Sequential decoding, PSNR = 31.91 dB.

VII. RELATED AREAS

Surrounding joint source-channel turbo techniques, various subjects of interest have been developed in the literature. In this section, we select and briefly present four of them: continuous-valued sources, source statistics estimation, multiple descriptions and joint source-channel turbo coding.

So far, we have considered only discrete-valued sources. Several works have also studied and integrated continuous-valued sources in turbo systems, see [3], [8], [13], [22] and references therein. The quantization step then becomes part of the problem of improving the end-to-end distortion. As source code, these works have considered a fixed length code, so there is no desynchronization problem such as described in Section VI. The optimization of the code is made through the optimization of the bit mapping at the output of the quantizer, with the goal of improving the iterative decoding convergence, either by EXIT chart optimization [22] or by an error analysis under a perfect a priori assumption [13]. Great performance gains are obtained while the decoding complexity is kept low.

We have also implicitly assumed, so far, that the source statistics are known by the receiver. Most studies make this assumption. In many practical situations, however, the source statistics must be estimated from the noisy channel measurements. This is then a joint problem of source-channel decoding and source statistics estimation. If the source can be modeled with a few parameters, then only those parameters need to be estimated. Otherwise, the source statistics P(Sk) and P(Sk+1|Sk) are to be estimated, which is a bit more complex. An estimation based on the Baum-Welch algorithm and integrated into a turbo decoder has been proposed in [9], [69]. The increase in complexity is kept quite small by using only byproducts of the SPA or BCJR algorithm.

On especially bad channels, when the channel layer cannot handle the level of data corruption, it is sometimes very useful to consider particular source transformations that introduce much redundancy at the source level. Multiple description coding, for example, creates several correlated representations of the signal and transmits them across different channels.

Basically, the setup can be optimized to achieve a sufficient quality of the reconstructed signal when a single representation is received, i.e., when a single channel has worked, and to maximize the quality when all representations are received. This coding technique is easily integrated in the factor graph of the transmission chain, for example in [42] for soft source decoding. Moreover, if each description is followed by an interleaver and a channel coder, the SPA leads to a turbo decoder iterating between the different descriptions, providing great performance gains over tandem receivers, as in [43].

Finally, let us mention that compression techniques, using as source code either the syndrome former of an error correcting code or the parity bits of a high rate error correcting code, have been developed recently with low density parity check (LDPC) codes and turbo codes; see [70], [71] and references therein. These compression techniques either have built-in support for joint source-channel coding or are extensible to it.

VIII. PERFORMANCE ILLUSTRATIONS

A few scenarios are selected in this section to illustrate some of the performance improvements that can be provided by both soft decoding and joint source-channel turbo decoding.

A. Soft source decoding of real signals

The first attempts to use soft decoding of VLCs in practical image and video coding systems were made with Huffman codes and RVLCs. The authors in [72] and [73] show the benefits of MAP decoding of RVLC and VLC encoded texture information in an MPEG-4 compressed video stream. The authors in [74] also apply sequential decoding, with both soft and hard channel values, to the decoding of start codes and overhead information in an MPEG-4 compressed video stream. The predominance of ACs in emerging systems has led the community to address the problem of soft decoding of these codes in actual compression systems. For example, sequential soft decoding of ACs has been introduced in the JPEG-2000 decoder in [15] and is included in an informative annex of JPEG-2000 part 11, dedicated to wireless applications.


[Fig. 8 plot: PSNR (0 to 40 dB) versus picture index (0 to 70); curves: Error free (63.63 kbps), Hard Decoding (63.63 kbps), Soft Decoding (61.93 kbps), Soft Decoding + Interleaving (61.93 kbps).]

Fig. 8. Performance of hard and soft sequential decoding of arithmetic codes with H.264 [68] and IEEE 802.11b error traces. For soft decoding, forbidden interval of 0.15 and W = 32.

Fig. 9. Picture number 15 from the sequence FOREMAN. Quality obtained with WIFI traces, with hard decoding on the left and soft decoding (forbidden interval of 0.09, W = 32) on the right.

Fig. 7 shows the decoding results obtained with the Lena image encoded at 0.5 bpp and transmitted over an AWGN channel with a signal-to-noise ratio Eb/N0 of 5 dB. The standard JPEG-2000 decoder is compared against the sequential decoding technique with W = 20 surviving paths. Soft sequential arithmetic decoding has also been considered for resilient decoding of H.264 video streams encoded with the CABAC (context-based adaptive binary arithmetic coding) algorithm [75]. The approach is used together with the data partitioning mode of the extended profile of H.264. This mode separates the elements of a slice into three partitions according to their sensitivity to bit errors: the first one, referred to as data partition A (DP-A), contains header and motion vector information, which is highly sensitive to bit errors, while the two others are less sensitive and contain residual data: transform coefficients of intra-coded blocks for DP-B and coefficients of inter-coded blocks for DP-C. Partitions B and C are thus decoded only if partition A has been received correctly. Fig. 8 shows the PSNR values obtained with the Foreman sequence with traces of bit errors induced by the IEEE 802.11b physical layer. The error patterns corresponding to a 2 Mbps data rate, considered in [76], have been used. The average PSNR as well as the visual quality of the decoded sequence (Fig. 9) remain much better with soft decoding than with hard decoding.


B. JSC turbo decoding of theoretical sources and real signals

The benefits of soft decoding w.r.t. hard decoding have just been exemplified. When a channel code is used, these benefits can be further increased by JSC turbo decoding, first envisaged in [2]. The good results obtained by testing different VLCs with a memoryless source, a convolutional code and an AWGN channel rapidly motivated further research in the literature. To summarize the most related contributions: binary sources are considered in [9], [19], [24]; sources with memory in [7]–[9], [12], [24]; and source semantics in [6], [20]. As source code, FLCs are used in [8], [22], VLCs in [2], [5], [7], [14], resilient VLCs in [4], [10]–[12], [18], [23], [66] and ACs in [15]–[17].

[Fig. 10 plots: symbol error rate (from 10^0 down to 10^-6) versus channel SNR; left panel: same channel code rate, different bandwidth, Es/N0 from −2.5 to −0.5 dB; right panel: same global code rate, same bandwidth, Eb/N0 from 0.5 to 2.5 dB; curves: HVLC tandem soft, HVLC JSC turbo, RVLC tandem soft, RVLC JSC turbo.]

Fig. 10. Performance comparison of JSC turbo decoding w.r.t. tandem decoding. Note that the channel SNR (signal-to-noise ratio) is measured as Es/N0 on the left and Eb/N0 on the right, where Es is the energy per coded bit, Eb the energy per entropy bit and N0 the noise spectral density.

As channel code, capacity approaching turbo codes are considered in [5], [6], [11], with no interleaver between the source code and the turbo code (unlike Fig. 1). Turbo codes with an interleaver and LDPC codes are suggested in [18], [23]; the source and channel modules are then better separated in the turbo decoder and the decoding complexity is lower. A parallel concatenation with a convolutional code is proposed in [14]. Suboptimal decoding algorithms, for lower receiver complexity, are proposed in [5], [12], [15], [17]. An analysis and/or optimization with EXIT charts is provided in [4], [12], [17], [21], [22]. Performance analysis and prediction with distance spectra are given in [66].

Let us first consider a theoretical source of independent symbols, based on the 26 letters of the English alphabet as in [11]. The symbols are encoded by a VLC. The resulting bits are framed into blocks of at most N = 4000 bits, interleaved and protected by a rate-1/2 parallel systematic turbo code before transmission across an AWGN channel. Let us have a look at how changing only the redundancy of the source code (VLC) may affect the global performance. This is illustrated on the left of Fig. 10, for two different VLCs: an HVLC (Huffman VLC) with a code rate rs = 0.99, i.e., 1% of redundancy, and an RVLC (reversible VLC) with rs = 0.95, i.e., 5% of redundancy, and with a free distance df = 2.
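As a side note, the source code rates quoted above follow the usual convention rs = H(S)/E[L], the ratio between the source entropy and the mean codeword length, so that 1 − rs measures the fraction of redundant bits left in the stream by the source code. A minimal sketch, assuming a memoryless source (the names are ours):

```python
import math

def source_code_rate(pmf, lengths):
    """r_s = H(S) / E[L] for a memoryless source and a given VLC;
    1 - r_s is the residual redundancy left by the source code."""
    entropy = -sum(p * math.log2(p) for p in pmf if p > 0)
    mean_len = sum(p * l for p, l in zip(pmf, lengths))
    return entropy / mean_len

# Huffman code 0/10/11 for p = (0.5, 0.25, 0.25): E[L] = H(S) = 1.5 bits,
# so r_s = 1 and no redundancy is left by the source code.
print(source_code_rate([0.5, 0.25, 0.25], [1, 2, 2]))
```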

[Fig. 11 plot: image rate (from 10^0 down to 10^-3) versus Eb/N0 from 0.6 to 1 dB; curves: VLC + turbo-code with tandem decoding, RVLC + turbo-code with JSC decoding.]

Fig. 11. Solid: image error rate, i.e., rate of images with a PSNR degradation above 0 dB. Dashed: rate of images with a PSNR degradation above 3 dB.

Under tandem decoding, i.e., under iterative decoding of the turbo code and separate soft decoding of the source/VLC, the RVLC provides some improvement over the HVLC, as expected. This is the reason why RVLCs are used in standards with error resiliency needs, such as AAC and MPEG-4. Under JSC turbo decoding, the resiliency improvement of the RVLC becomes huge: two orders of magnitude in SER. This is the consequence of an increased interleaving gain in the global code (RVLC + turbo code) that the JSC turbo decoder is able to exploit (Section VI-D.1, [66]). However, as the RVLC has 4% more redundancy, its transmission scheme needs more bandwidth than the HVLC scheme.

Therefore, we are also interested in the influence of the VLC when the global code rate r = rs rc = 1/2 is fixed, i.e., when the bandwidth and the transmission power are kept identical. To keep r constant, the channel code must be adjusted so that its code rate, hence its redundancy, becomes rc = r/rs. In such a comparison setup, a redundancy allocation problem exists between source and channel coding, whose optimal solution is not obvious for a given family of source and channel codes. In our example, as illustrated on the right of Fig. 10, the RVLC still provides the lowest error rates under JSC turbo decoding, but only for channel SNRs (signal-to-noise ratios) above 1.2 dB. There is thus a trade-off whose solution depends on the application needs: in our example, the RVLC scheme is the better candidate, except for applications working at critically low channel SNRs.

An important comment must be made. In both plots of Fig. 10, the tandem receiver performs quite badly on the RVLC scheme w.r.t. the JSC turbo receiver while, on the HVLC scheme, it performs very close to the JSC turbo receiver. This is related to the respective levels of redundancy in the RVLC and in the HVLC, as already noted by Shannon in [1]: if some redundancy is left or intentionally introduced at the source code level (as in the RVLC scheme here), and if its distribution is known to the receiver, the tandem receiver can suffer a huge performance loss w.r.t. the JSC receiver. On the contrary, if there is no redundancy left, the two receivers perform equally (as in the HVLC case).

Finally, the transmission of images across an AWGN channel is considered in Fig. 11, with the parallel turbo code of [39] (interleavers of length N = 65536).


Note that using another turbo code would change or improve the error floors, but not the general trends and the conclusions hereafter. This simulation illustrates the potential of JSC techniques with a capacity approaching error correcting code. Each image is processed by a DCT (discrete cosine transform) based compression, with a compression ratio similar to the JPEG standard. The DC component is coded with an FLC and decoded by an MMSE estimation (4). The AC components are run-length coded, with HVLCs in the tandem system and with RVLCs in the JSC system, and decoded by a frame-MAP estimation (1). RVLCs have been chosen for the JSC system in order to benefit from the near-zero desynchronization property described in Section VI-D.1, a property that is useless with the tandem decoder. As we can see, the JSC system provides not only a better image error rate, but also a much smaller rate of images with a PSNR distortion above 3 dB; actually, most erroneous images of the JSC system have a PSNR almost unaffected by the error(s).

IX. CONCLUSIONS

The success of joint source-channel turbo decoding, attested by many contributions in the literature, benefits greatly from the usual suboptimality, in practical contexts, of both source and channel codes. Indeed, combining the residual redundancy left by the source code with the imperfect error correcting power of the channel code generally offers great advantages in terms of end-to-end distortion. In this tutorial, much attention has been devoted to presenting the subject in a unified way, using the extensible framework of factor graphs. Based on this framework, the turbo principle can be easily introduced and detailed in the joint source-channel context, with different well-known source codes and several optimal soft source decoding algorithms. Some source codes, however, raise specific difficulties, such as high decoding complexity or decoder desynchronization, for which suboptimal decoding solutions and ways to improve resiliency exist. To summarize, the literature has addressed many problems in this field. One of the next challenges is the adoption of such techniques in more standards, notably with a stronger cross-layer interaction between source and channel.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their valuable and constructive comments.

REFERENCES

[1] C. E. Shannon, “A mathematical theory of communication,” Bell System Tech. Journal, vol. 27, pp. 379–423/623–656, July/Oct. 1948.
[2] R. Bauer and J. Hagenauer, “Symbol-by-symbol MAP decoding of variable length codes,” in Proc. ITG SCC, Munich, Germany, Jan. 2000, pp. 111–116.
[3] R. Perkert, M. Kaindl, and T. Hindelang, “Iterative source and channel decoding for GSM,” in Proc. IEEE ICASSP, Salt Lake City, UT, USA, May 2001.
[4] J. Hagenauer and R. Bauer, “The turbo principle in joint source channel decoding of variable length codes,” in Proc. IEEE ITW, Cairns, Australia, Sept. 2001, pp. 128–130.
[5] L. Guivarch, J.-C. Carlach, and P. Siohan, “Joint source-channel soft decoding of Huffman codes with turbo-codes,” in Proc. IEEE DCC, Snowbird, USA, Mar. 2000, pp. 88–92.


[6] Z. Peng, Y.-F. Huang, and D. J. Costello, “Turbo codes for image transmission - a joint channel and source decoding approach,” IEEE J. Select. Areas Commun., vol. 18, no. 6, pp. 868–879, June 2000.
[7] A. Guyader, E. Fabre, C. Guillemot, and M. Robert, “Joint source-channel turbo decoding of entropy-coded sources,” IEEE J. Select. Areas Commun., vol. 19, pp. 1680–1696, Sept. 2001.
[8] N. Görtz, “On the iterative approximation of optimal joint source-channel decoding,” IEEE J. Select. Areas Commun., vol. 19, pp. 1662–1670, Sept. 2001.
[9] J. Garcia-Frias and J. D. Villasenor, “Joint turbo decoding and estimation of hidden Markov sources,” IEEE J. Select. Areas Commun., vol. 19, pp. 1671–1679, Sept. 2001.
[10] A. Hedayat and A. Nosratinia, “List decoding of variable length codes with application in joint source channel coding,” in Proc. Asilomar Conf. on Signals, Systems and Computers, Nov. 2002.
[11] K. Laković and J. Villasenor, “Combining variable length codes and turbo codes,” in Proc. IEEE VTC, Birmingham, USA, May 2002, pp. 1719–1723.
[12] R. Thobaben and J. Kliewer, “On iterative source-channel decoding for variable-length encoded Markov sources using a bit-level trellis,” in Proc. IEEE SPAWC, Rome, Italy, June 2003, pp. 50–54.
[13] N. Görtz, “Optimization of bit mappings for iterative source-channel decoding,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Brest, France, Sept. 2003.
[14] J. Kliewer and R. Thobaben, “Parallel concatenated joint source-channel coding,” IEE Elect. Letters, vol. 39, no. 23, pp. 1664–1666, Nov. 2003.
[15] T. Guionnet and C. Guillemot, “Soft decoding and synchronization of arithmetic codes for image transmission over error-prone channels,” IEEE Trans. Image Processing, vol. 12, no. 12, pp. 1599–1609, Dec. 2003.
[16] ——, “Soft and joint source-channel decoding of quasi-arithmetic codes,” Eurasip Journal on Applied Signal Processing, Mar. 2004.
[17] M. Grangetto, B. Scanavino, and G. Olmo, “Joint source-channel iterative decoding of arithmetic codes,” in Proc. IEEE ICC, vol. 2, Paris, France, June 2004, pp. 886–890.
[18] X. Jaspar and L. Vandendorpe, “New iterative decoding of variable length codes with turbo codes,” in Proc. IEEE ICC, vol. 5, Paris, France, June 2004, pp. 2606–2610.
[19] G.-C. Zhu, F. Alajaji, J. Bajcsy, and P. Mitran, “Transmission of nonuniform memoryless sources via nonsystematic turbo codes,” IEEE Trans. Commun., vol. 52, no. 8, pp. 1344–1354, Aug. 2004.
[20] H. Nguyen and P. Duhamel, “Iterative joint source-channel decoding of variable length encoded video sequences exploiting source semantics,” in Proc. IEEE ICIP, Singapore, Oct. 2004, pp. 3221–3224.
[21] X. Jaspar and L. Vandendorpe, “Performance and convergence analysis of joint source-channel turbo schemes with variable length codes,” in Proc. IEEE ICASSP, vol. 3, Philadelphia, PA, USA, Mar. 2005, pp. 485–488.
[22] M. Adrat and P. Vary, “Iterative source-channel decoding: Improved system design using EXIT charts,” Eurasip Journal on Applied Signal Processing, no. 6, pp. 928–947, May 2005.
[23] C. Poulliat, D. Declercq, C. Lamy-Bergot, and I. Fijalkow, “Analysis and optimization of irregular LDPC codes for joint source-channel decoding,” IEEE Commun. Lett., vol. 9, pp. 1064–1066, Dec. 2005.
[24] G.-C. Zhu and F. Alajaji, “Joint source-channel turbo coding for binary Markov sources,” IEEE Trans. Wireless Commun., vol. 5, no. 5, pp. 1065–1075, May 2006.
[25] G. D. Forney, “Codes on graphs: Normal realizations,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 520–548, Feb. 2001.
[26] H.-A. Loeliger, J. Dauwels, J. Hu, S. Korl, L. Ping, and F. R. Kschischang, “The factor graph approach to model-based signal processing,” Proc. IEEE, vol. 95, no. 6, pp. 1295–1322, June 2007.
[27] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 498–519, Feb. 2001.
[28] H.-A. Loeliger, “An introduction to factor graphs,” IEEE Signal Processing Mag., vol. 21, no. 1, pp. 28–41, Jan. 2004.
[29] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Trans. Inform. Theory, vol. 20, pp. 284–287, Mar. 1974.
[30] C. Boyd, J. G. Cleary, S. A. Irvine, I. Rinsma-Melchert, and I. H. Witten, “Integrating error detection into arithmetic coding,” IEEE Trans. Commun., vol. 45, no. 1, pp. 1–3, Jan. 1997.
[31] Y. Takishima, M. Wada, and H. Murakami, “Reversible variable length codes,” IEEE Trans. Commun., vol. 43, no. 2/3/4, pp. 158–162, Feb.-Apr. 1995.


[32] K. Laković and J. Villasenor, “On design of error-correcting reversible variable length codes,” IEEE Commun. Lett., vol. 6, no. 8, pp. 337–339, Aug. 2002.
[33] H. Jégou and C. Guillemot, “Robust multiplexed codes for compression of heterogeneous data,” IEEE Trans. Inform. Theory, vol. 51, no. 4, pp. 1393–1403, Apr. 2005.
[34] ——, “Error-resilient first-order multiplexed source codes: performance bounds, design and decoding algorithms,” IEEE Trans. Signal Processing, vol. 54, no. 4, pp. 1483–1493, Apr. 2006.
[35] K. Sayood and J. Borkenhagen, “Use of residual redundancy in the design of joint source/channel coders,” IEEE Trans. Commun., vol. 39, no. 6, pp. 838–846, June 1991.
[36] D. J. Miller and M. Park, “A sequence-based approximate MMSE decoder for source coding over noisy channels using discrete hidden Markov models,” IEEE Trans. Commun., vol. 46, no. 2, pp. 222–231, Feb. 1998.
[37] A. H. Murad and T. E. Fuja, “Robust transmission of variable-length encoded sources,” in Proc. IEEE WCNC, New Orleans, LA, USA, Sept. 1999.
[38] K. Laković, J. Villasenor, and R. Wesel, “Robust joint Huffman and convolutional decoding,” in Proc. IEEE VTC, Amsterdam, The Netherlands, Sept. 1999, pp. 2551–2555.
[39] C. Berrou and A. Glavieux, “Near optimum error correcting coding and decoding: Turbo-codes,” IEEE Trans. Commun., vol. 44, no. 10, pp. 1261–1271, Oct. 1996.
[40] B. Frey and D. MacKay, “A revolution: belief propagation in graphs with cycles,” in Proc. of the Neural Inform. Processing Systems Conf., Dec. 1997.
[41] X. Ge, D. Eppstein, and P. Smyth, “The distribution of loop lengths in graphical models for turbo decoding,” IEEE Trans. Inform. Theory, vol. 47, pp. 2549–2553, Sept. 2001.
[42] T. Guionnet, C. Guillemot, and E. Fabre, “Soft decoding of multiple descriptions,” in Proc. IEEE ICME, Lausanne, Switzerland, Aug. 2002.
[43] J. Barros, J. Hagenauer, and N. Görtz, “Turbo cross decoding of multiple descriptions,” in Proc. IEEE ICC, New York, USA, Apr. 2002.
[44] S. Benedetto and G. Montorsi, “Unveiling turbo codes: some results on parallel concatenated coding schemes,” IEEE Trans. Inform. Theory, vol. 42, no. 2, pp. 409–428, Mar. 1996.
[45] V. Buttigieg, “Variable-length error-correcting codes,” Ph.D. dissertation, Department of Electrical Engineering, University of Manchester, England, 1995.
[46] G. D. Forney, “The Viterbi algorithm,” Proc. IEEE, vol. 61, pp. 268–278, Mar. 1973.
[47] S. ten Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Trans. Commun., vol. 49, no. 10, pp. 1727–1737, Oct. 2001.
[48] D. A. Huffman, “A method for the construction of minimum-redundancy codes,” in Proc. IRE, vol. 40, 1952, pp. 1098–1101.
[49] V. B. Balakirsky, “Joint source-channel coding with variable length codes,” in Proc. IEEE ISIT, Ulm, Germany, July 1997, p. 491.
[50] J. J. Rissanen, “Generalized Kraft inequality and arithmetic coding,” IBM J. Res. Develop., vol. 20, pp. 198–203, May 1976.
[51] P. G. Howard and J. S. Vitter, Image and Text Compression. Kluwer Academic Publishers, 1992, pp. 85–112.
[52] ——, “Design and analysis of fast text compression based on quasi-arithmetic coding,” in Proc. IEEE DCC, Snowbird, Utah, Mar. 1993, pp. 98–107.
[53] M. Bystrom, S. Kaiser, and A. Kopansky, “Soft source decoding with applications,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 10, pp. 1108–1120, Oct. 2001.
[54] H. Jégou, S. Malinowski, and C. Guillemot, “Trellis state aggregation for soft decoding of variable length codes,” in Proc. IEEE Workshop on Signal Processing Systems (SIPS), Nov. 2005.
[55] G. Mohammad-Khani, C.-M. Lee, M. Kieffer, and P. Duhamel, “Simplification of VLC tables with application to ML and MAP decoding algorithms,” IEEE Trans. Commun., vol. 54, pp. 1835–1844, Oct. 2006.
[56] J. Anderson and S. Mohan, “Sequential coding algorithms: A survey and cost analysis,” IEEE Trans. Commun., vol. 32, pp. 169–176, Feb. 1984.
[57] S. J. Simmons, “Breadth-first trellis decoding with adaptive effort,” IEEE Trans. Commun., vol. 38, pp. 3–12, Jan. 1990.
[58] F. Jelinek, “Fast sequential decoding algorithm using a stack,” IBM Journal Research and Devel., vol. 13, pp. 675–685, Nov. 1969.
[59] R. M. Fano, “A heuristic discussion of probabilistic decoding,” IEEE Trans. Inform. Theory, vol. 9, pp. 64–74, Apr. 1963.
[60] J. Hagenauer and C. Kuhn, “The List-Sequential (LISS) algorithm and its application,” IEEE Trans. Commun., vol. 55, pp. 918–928, May 2007.


[61] B. Pettijohn, M. Hoffman, and K. Sayood, “Joint source/channel coding using arithmetic codes,” IEEE Trans. Commun., vol. 49, no. 5, May 2001.
[62] S. Ben-Jamaa, C. Weidmann, and M. Kieffer, “Asymptotic error-correcting performance of joint source-channel schemes based on arithmetic coding,” in Proc. IEEE MMSP, Victoria, Canada, Oct. 2006.
[63] J. Maxted and J. Robinson, “Error recovery for variable length codes,” IEEE Trans. Inform. Theory, vol. IT-31, no. 6, pp. 794–801, Nov. 1985.
[64] P. F. Swaszek and P. DiCicco, “More on the error recovery for variable length codes,” IEEE Trans. Inform. Theory, vol. IT-41, no. 6, pp. 2064–2071, Nov. 1995.
[65] S. Malinowski, H. Jégou, and C. Guillemot, “On the link between the synchronization recovery and soft decoding of variable length codes,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Nov. 2006.
[66] X. Jaspar and L. Vandendorpe, “Design and performance analysis of joint source-channel turbo schemes with variable length codes,” in Proc. IEEE ICC, vol. 1, Seoul, Korea, May 2005, pp. 526–530.
[67] Information technology – Generic coding of moving pictures and associated audio information – Part 7: Advanced Audio Coding (AAC), ISO/IEC 13818-7.
[68] Information technology – Coding of audio-visual objects, ISO/IEC 14496.
[69] C. Weidmann and P. Siohan, “Décodage conjoint source-canal avec estimation en ligne de la source,” in Proc. CORESA, Lyon, France, Jan. 2003 (in French).
[70] G. Caire, S. Shamai, and S. Verdú, “Noiseless data compression with low-density parity-check codes,” in Advances in Network Information Theory, ser. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, P. Gupta, G. Kramer, and A. J. van Wijngaarden, Eds. American Mathematical Society, 2004, vol. 66, pp. 263–284.
[71] N. Dütsch, G. Sebastian, J. Garcia-Frias, and J. Hagenauer, “Source model aided lossless turbo source coding,” in Proc. Int. Symp. on Turbo Codes and Related Topics, Munich, Germany, Apr. 2006.
[72] K. Subbalakshmi and Q. Chen, “Joint source-channel decoding for MPEG-4 coded video over wireless channels,” in Proc. IASTED Wireless and Optical Communications, July 2002, pp. 617–622.
[73] L. Perros-Meilhac and C. Lamy, “Huffman tree based metric derivation for a low complexity sequential soft VLC decoding,” in Proc. IEEE ICC, May 2002, pp. 783–787.
[74] A. Kopansky and M. Bystrom, “Sequential decoding of MPEG-4 coded bitstreams for error resilience,” in Proc. Information Sciences and Systems, Mar. 1999.
[75] M. Jeanne, C. Guillemot, T. Guionnet, and F. Pauchet, “Error resilient decoding of context-based adaptive binary arithmetic codes,” Signal, Image and Video Processing, vol. 1, no. 1, pp. 77–87, Apr. 2007.
[76] S. Khayam, S. Karande, H. Radha, and D. Loguinov, “Performance analysis and modeling of errors and losses over 802.11b LANs for high-bitrate real-time multimedia,” Signal Processing: Image Communication, vol. 18, no. 7, pp. 575–595, Aug. 2003.

Xavier Jaspar was born in Ottignies-Louvain-la-Neuve, Belgium, in 1980. He worked on joint source-channel coding in 2002 as a visiting scientist at the Electrical and Computer Engineering Department of McGill University, Montreal, Canada. He received the Engineering degree in Applied Mathematics (summa cum laude) from the Université catholique de Louvain (UCL), Louvain-la-Neuve, Belgium, in 2003. He is currently working toward the Ph.D. degree in the Communications and Remote Sensing Laboratory of UCL. His research interests include turbo(-like) techniques and joint source-channel coding/decoding.


Christine Guillemot is currently ‘Directeur de Recherche’ at INRIA, in charge of the TEMICS research group dealing with image modelling, processing, video communication and watermarking. She holds a Ph.D. degree from ENST (Ecole Nationale Supérieure des Télécommunications), Paris. From 1985 to October 1997, she was with FRANCE TELECOM/CNET, where she was involved in various projects in the domain of coding for TV, HDTV and multimedia applications, and coordinated several of them (e.g., the European RACE-HAMLET project). From January 1990 to mid 1991, she worked at Bellcore, NJ, USA, as a visiting scientist. Her research interests are signal and image processing, video coding, and joint source and channel coding for video transmission over the Internet and over wireless networks. From 2000 to 2003, she served as Associate Editor for the IEEE Transactions on Image Processing, and she is currently Associate Editor for the IEEE Transactions on Circuits and Systems for Video Technology. She is a member of the IEEE IMDSP and IEEE MMSP technical committees.

Luc Vandendorpe was born in Mouscron, Belgium, in 1962. He received the Electrical Engineering degree (summa cum laude) and the Ph.D. degree from the Université catholique de Louvain (UCL), Louvain-la-Neuve, Belgium, in 1985 and 1991, respectively. Since 1985, he has been with the Communications and Remote Sensing Laboratory of UCL. In 1992, he was a Visiting Scientist and Research Fellow at the Telecommunications and Traffic Control Systems Group of the Delft Technical University, The Netherlands. He is presently Full Professor and head of the EE department of UCL. He is mainly interested in digital communication systems: equalization, joint detection/synchronization for CDMA, OFDM (multicarrier), MIMO and turbo-based communication systems (UMTS, xDSL, WLAN, etc.), and joint source/channel (de)coding. In 1990, he was co-recipient of the Biennial Alcatel-Bell Award from the Belgian NSF. In 2000, he was co-recipient (with J. Louveaux and F. Deryck) of the Biennial Siemens Award from the Belgian NSF. He is or has been a TPC member for IEEE VTC Fall 1999, the IEEE Globecom 2003 Communications Theory Symposium, the 2003 Turbo Symposium, IEEE VTC Fall 2003, IEEE SPAWC 2005 and IEEE SPAWC 2006. He was co-technical chair (with P. Duhamel) of IEEE ICASSP 2006. He is an elected member of the Sensor Array and Multichannel Signal Processing Technical Committee of the Signal Processing Society and Associate Editor of the IEEE Transactions on Signal Processing. He is a Fellow of the IEEE.
