Digital Fountain Codes System Model and Performance over AWGN and Rayleigh Fading Channels

Weizheng Huang, Student Member, IEEE, Huanlin Li, and Jeffrey Dill, Member, IEEE
The School of Electrical Engineering and Computer Science, Ohio University, Athens, OH 45701, USA

ABSTRACT

Digital fountain codes were originally packet-level forward erasure codes, but researchers have found that they are also good at correcting bit errors. Modern LDPC and turbo codes have been shown to approach the Shannon limit; however, they are fixed-rate codes and cannot well serve Internet services that encounter variable loss rates or asynchronous data access. At both the packet and bit levels, fountain codes share the same belief propagation decoding mechanism with LDPC and turbo codes. We propose a general fountain codes system model over noisy channels. Conventional LT codes work poorly on AWGN and Rayleigh fading channels, but Raptor codes offer near-capacity performance if their pre-code is a WiMAX LDPC code or our left-regular LDPC code. The code performance can be improved by selecting the more reliable received symbols for decoding. Our simulation results support the model design.

Keywords: fountain code, LDPC code, Tanner graph, belief propagation, soft decoding, degree distribution.

1. INTRODUCTION

The "digital fountain" concept was first published in 1998 [2]. The basic idea behind the name is that different receivers can reconstruct the same source data from any sufficiently large subset of collected symbols, with little or no need for retransmission. Moreover, data access and transmission can be initiated at random times, without knowledge of the channel state and potentially with no limit on the user population. LT (Luby Transform) codes, the first class of practical fountain codes, were initially designed for the binary erasure channel (BEC). Although LT codes are known for simplicity, good performance and optimality, they do not support fast encoding or decoding for large dimensions. Raptor codes, a later class of fountain codes, achieve linear coding complexity and vanishing probability of error. Today digital fountain codes are applied to multimedia communications, especially reliable multicast/broadcast and point-to-multipoint services, which often encounter asynchronous user access, random reception pauses or interruptions, and variable or unknown loss rates. Some researchers have demonstrated the performance of LT and Raptor codes on noisy channels [3]-[9]; however, as far as we know, not much has been published about fountain code applications on additive white Gaussian noise (AWGN) or fading channels. In this work we give a picture of fountain codes on noisy channels. We present binary Raptor codes whose pre-codes are WiMAX LDPC (low-density parity-check) codes or LDPC codes created by our new method, which guarantees H matrices of girth at least 6. In Section II we recall hard-decoding LT codes. In Section III we review Raptor codes. In Section IV we introduce our general system model for fountain codes over noisy channels. In Section V we present simulation results and analysis. We conclude in Section VI.

2. LT CODES

LT codes [10], the first fountain codes, are proven practical on the BEC. An LT code generates a limitless stream of encoded symbols, which are independent and identically distributed (i.i.d.), on the fly or in advance. The LT decoder can recover the source data from arbitrarily collected encoded symbols with small decoding overhead. This coding method suits the Internet environment, which is characterized by variable or unknown packet loss rates. LT encoding [10] proceeds as follows: draw a degree from a degree distribution, randomly choose that number of input symbols as the neighbors of an encoded symbol, and output the encoded symbol as the XOR of its neighbors. At the receiver, the neighboring information of each collected encoded symbol is used to construct a Tanner graph [11], and in each iteration the hard decoder propagates the value of every degree-1 encoded symbol in this bipartite graph to its neighbors, until all source symbols are recovered [10] [12]. If there is no degree-1 encoded symbol before recovery is complete, decoding fails, and the user may guess the undetermined source symbols or collect more symbols and decode again. Note that the decoder is assumed to know the degree and neighbor indices of each received symbol. The model of hard-decoding LT codes on the BEC is as follows. The encoder uses a k×m generator matrix G to create encoded symbols, where k is the code dimension and m is unlimited. The encoded symbols are produced on the fly, which is equivalent to building G by generating i.i.d. columns, as many as needed, whose Hamming weights are determined by the Robust Soliton distribution proposed by Luby [10] and whose weight positions are uniformly distributed. The encoded symbols are transmitted once requested, but some of them are erased. At the receiver, with the n received symbols, the decoder builds a k×n generator matrix G′, made by deleting the columns of G corresponding to the erased symbols.
Apparently, the n received symbols are a random and variable subset of the m transmitted symbols. It is the Robust Soliton distribution that guarantees, with small overhead, that all source symbols are covered by encoded symbols and that there is at least one degree-1 encoded symbol in each decoding iteration. Although these hard-decoding LT codes are very capable of correcting erasures regardless of the loss rate, their performance can be poor if the decisions on some received symbols are wrong, because each error is propagated throughout the message-passing decoding like a rolling snowball, causing disastrous degradation. A remedy is stated in [23] to improve LT codes on AWGN channels (AWGNC) by setting a threshold at the decoder to erase received bits that are likely unreliable, but this hard-decoding method cannot achieve significant coding gain.
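The LT encoding and hard (peeling) decoding just described can be sketched in a few lines of Python. This is a minimal illustration under our own naming, not the paper's implementation; the Robust Soliton parameters c and delta are illustrative defaults.

```python
import math
import random

def robust_soliton(k, c=0.1, delta=0.5):
    """Robust Soliton degree distribution over degrees 0..k (weight 0 at degree 0)."""
    s = c * math.log(k / delta) * math.sqrt(k)
    rho = [0.0] + [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
    tau = [0.0] * (k + 1)
    pivot = int(round(k / s))
    for d in range(1, pivot):
        tau[d] = s / (k * d)
    tau[pivot] = s * math.log(s / delta) / k
    z = sum(rho) + sum(tau)          # normalizing constant
    return [(rho[d] + tau[d]) / z for d in range(k + 1)]

def lt_encode(source, dist, rng):
    """Emit one encoded symbol: (neighbor index set, XOR of the neighbors)."""
    k = len(source)
    degree = rng.choices(range(len(dist)), weights=dist, k=1)[0]
    neighbors = rng.sample(range(k), degree)
    value = 0
    for i in neighbors:
        value ^= source[i]
    return frozenset(neighbors), value

def lt_hard_decode(k, received):
    """Peeling decoder on the BEC: repeatedly release degree-1 encoded symbols."""
    received = [(set(nb), val) for nb, val in received]
    recovered = {}
    progress = True
    while progress and len(recovered) < k:
        progress = False
        for nb, val in received:
            if len(nb) == 1:                     # a degree-1 encoded symbol
                (i,) = nb
                if i not in recovered:
                    recovered[i] = val
                    progress = True
        # subtract every recovered source symbol from the remaining encoded symbols
        for idx, (nb, val) in enumerate(received):
            for i in list(nb):
                if i in recovered:
                    nb.discard(i)
                    val ^= recovered[i]
            received[idx] = (nb, val)
    return recovered if len(recovered) == k else None
```

If the ripple of degree-1 symbols ever empties before all k source symbols are recovered, `lt_hard_decode` returns `None`, mirroring the decoding failure described above.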

3. RAPTOR CODES

Raptor codes are an extension of LT codes with linear encoding and decoding complexity, developed independently by Shokrollahi [13] and Maymounkov [14]. A Raptor code is a concatenated code with an outer pre-code, which is a fixed-rate erasure code such as an LDPC code, and an inner LT code. The pre-code can be multistage. The inner LT code, also called a "weakened LT code" [25], usually cannot recover all of its input symbols, but the pre-code corrects the remaining erasures. The degree of each LT output symbol is chosen from a distribution proposed by Shokrollahi. Raptor codes have been adopted by the 3rd Generation Partnership Project (3GPP) [15] for reliable data delivery in mobile broadcast and multicast.

Why are Raptor codes concatenated with the weakened LT code rather than the conventional LT code? The reason lies in coding speed, or computation cost, which can be measured by the degree of an encoded symbol: to calculate an encoded bit of degree i, i bitwise additions are required, so the cost is i. As shown in [10], due to the Robust Soliton distribution and the minimum number of collected symbols suggested by Luby, the LT code is no longer efficient for large dimension k, since on average an encoded symbol needs a number of bitwise operations on the order of ln(k/δ), where 0 < δ < 1. Raptor codes solve this problem by achieving linear encoding and decoding complexity: Shokrollahi proved that the Raptor coding cost is on the order of ln(1/ε) per symbol, where ε > 0.

Raptor codes with soft decoding have been reported to offer good error-correcting performance on noisy channels [3] [6] [16] [17], and they obtain more coding gain and significantly lower error floors than Luby's LT codes.

4. SYSTEM MODEL

Using the system designs in [4] and [16] for reference, we give a wired and/or wireless source-to-multi-destination system: fountain-encoded symbols are sent from a single transmitter to multiple synchronous or asynchronous receivers over a noisy channel. Acknowledgement feedback is permitted, so data transmission can be terminated when no user requests remain. All synchronizations are perfect. Handoffs and communication losses occur at arbitrary times. This system is shown in Figure 1.

Figure 1. System Model (block diagram: source u → pre-code encoder → intermediate bits I → LT encoder → s → modulator and transmitter → t → channel with gain g and AWGN → matched filter sampled at t = jT → receiver pattern b → received symbols r → LT decoder exchanging messages m_{v,I} and m_{I,v} with the pre-code decoder → hard decision → user f)

This system is general for linear block codes, LT codes and Raptor codes on AWGN and/or fading channels. Consider a source file u of length k symbols requested by multiple users. When a Raptor code is adopted, u is pre-encoded into an intermediate sequence I = (I_1, ..., I_N) of fixed length N. If only an LT code is applied, the pre-code encoder and decoder are not needed, that is, I = u, and the LT decoder outputs u directly. Similarly, in the case of pre-code only (PCO), the LT encoder and decoder are removed, which yields a model for linear block codes. The intermediate symbols are turned into a limitless stream s = (s_1, s_2, ...) by the LT code. The vector t, a modulated form of s, is sent to the multiple users. A requester accesses the data at an arbitrary time and collects encoded symbols without needing the start of the source file.

In this model, the transmitted symbols are modified by channel gain and/or AWGN. The channel gain is a non-negative real vector g = (g_1, g_2, ...) whose elements are, in order, the gains of the symbols of t. The gain vector can be a constant or can approximate any fading model. The Gaussian noise, with zero mean and two-sided power spectral density (PSD) N_0/2, is also in vector form: n = (n_1, n_2, ...). We assume the channel gain and the noise are constant over the entire slot of the i-th transmitted symbol. At the receiver, the output of the matched filter is sampled at the end of the symbol duration; at this point the Gaussian noise is still zero mean and its variance is N_0/2. Before input to the decoder, these samples are "selected" by the receiver pattern b = (b_1, b_2, ...), with b_i ∈ {0, 1} standing for erasure or communication loss (b_i = 0) or symbol acceptance (b_i = 1), depending on the particular channel model. Note that, for error-free transmission and reception, the sampled value is +1 (-1) for a transmitted bit 1 (0).

The fountain decoder utilizes the classic belief propagation (BP) technique on Tanner graphs, as standardized in LDPC decoding [18]-[20]. For simplicity, all symbols in this system are bits and they are transmitted with BPSK (binary phase shift keying) modulation. The belief propagated in the decoding procedure is the log-likelihood ratio (LLR) [21] of a binary random variable X ∈ {±1} in GF(2), given by

L(X) = ln[ P(X = +1) / P(X = -1) ]    (1)

where ln(·) is the natural logarithm and P(X = x) is the probability that X takes on the value x. Assume bits 0 and 1 are equally likely. Then, according to the derivation in [19], the observed LLR from the channel for the i-th matched-filter sample y_i is

L(y_i) = 4 g_i y_i / N_0    (2)

Consider a Raptor code of rate k/n at an arbitrary user f, illustrated in Figure 2. The Tanner graph of the LT generator matrix is built on the fly. The received output bits r_1, ..., r_n are the check nodes (c-nodes) of the LT Tanner graph. The pre-code C is a rate k/N LDPC code of dimension k, and its H matrix is known by the decoder in advance. The variable nodes (v-nodes) I_1, ..., I_N of the LT graph and of the pre-code graph are the same intermediate bits, and the two decoders exchange updated belief with each other in every decoding iteration. The nodes c_1, ..., c_{N-k} are the c-nodes of the Tanner graph of the pre-code's H matrix. All edges are bidirectional message-passing paths between v-nodes and c-nodes.

Figure 2. Raptor Decoder Configuration (received bits r_1, ..., r_n as c-nodes of the LT graph; intermediate bits I_1, ..., I_N as v-nodes shared by the two graphs, exchanging messages m_{I,v} and m_{v,I}; check nodes c_1, ..., c_{N-k} of the pre-code's H matrix)

[16] gives a concise and complete description of Raptor decoding. Decoding starts at the LT decoder by passing LLRs from the v-nodes to their neighboring c-nodes; at the end of each LT decoding iteration, the LT decoder forwards its updated LLRs m_{v,I} to the LDPC decoder. Taking the newly incoming messages as observed LLRs for I_1, ..., I_N, the LDPC decoder propagates belief for one iteration and then makes a hard decision on I_1, ..., I_N. If the decision does not match any valid codeword, the LDPC decoder forwards its LLRs m_{I,v} to the LT decoder for the next iteration, and each v-node adds them to the LLRs received from its c-nodes in the previous LT iteration. This propagation is repeated until a valid codeword is decided from I_1, ..., I_N or the assigned maximum number of iterations is exhausted.

Here we need to point out that, within this model, soft decoding (noisy channel) and hard decoding (BEC) are the same in nature. Take the LT decoder as an example. Since the v-nodes bear no LLRs at the beginning of decoding, they send zero messages to their neighbors, so the first half iteration is trivial; at this moment every edge still passes zero LLRs in both directions. In the second half of the first iteration, only degree-1 encoded bits can send non-zero LLRs to the v-nodes. One can verify this by looking at the equation computing the message sent from a received output bit r to a v-node I in the l-th iteration [16],

m_{r,I}^(l) = 2 tanh^{-1}( tanh(L_r / 2) · ∏_{I'≠I} tanh(m_{I',r}^(l-1) / 2) )    (3)

where tanh(·) and tanh^{-1}(·) are the hyperbolic tangent function and its inverse, L_r is the observed LLR of the received encoded bit r, and the product runs over the messages m_{I',r} sent to r from all its neighboring v-nodes except I. Therefore, in the first iteration some v-nodes receive new LLRs, and in the second iteration they send updated LLRs to their neighbors while the other v-nodes still send zero. According to Eq. (3), an erased encoded bit is useless for the LT decoder because it always forces its outgoing messages to zero. The decoder repeats this recursive process: more and more edges carry non-zero LLRs, so more and more v-nodes are granted LLRs, and these messages keep being updated iteration after iteration. For the BEC alone, however, it is much more convenient to propagate bit values than LLRs. The reason lies in error-free reception: according to Eq. (1), the only message propagated on the Tanner graph is ±∞, which is absolute certainty. It is easy to see that, on the same LT generator Tanner graph, the path of bit-value propagation for hard decoding is the same as the route of non-zero LLR flow for soft decoding.

5. FOUNTAIN CODES PERFORMANCE

We investigate binary soft-decoding LT and Raptor codes on the AWGNC and the memoryless Rayleigh fading channel. Luby's LT codes suffer high complexity and large-deviation bit error performance, but Raptor codes achieve impressive coding gains if the pre-code is a rate-1/2 LDPC code standardized by WiMAX [22] or created by our own approach based on the splitting-and-filling technique [26]. If we choose the more reliable received symbols for decoding, the fountain codes yield further coding gains.

We propose two remedies for LT codes: (1) discard less reliable received bits; (2) lower the code rate. The first scheme sets a threshold at the matched filter output to reject all output samples whose absolute values are smaller than the threshold; recall that error-free sample values are ±1 for source bits 1 and 0. At low signal-to-noise ratio (SNR), the performance of both methods is very poor. Although they can achieve some coding gain at high SNR, the performance is unstable and the second method bears very high complexity. As stated in Section IV, in the first decoding iteration only degree-1 encoded bits can send out useful belief. Due to the Robust Soliton distribution, the fraction of degree-1 encoded bits is very small, so the probability that a v-node has more than one degree-1 neighbor is also very small. If a v-node has only one degree-1 neighbor and that neighbor's observed LLR does not match the corresponding source bit, the v-node will propagate this error to other nodes.

Raptor codes are a good solution for both coding gain and complexity. We design rate-0.1 Raptor codes with two kinds of rate-1/2 pre-codes of dimension 1152. The first kind is a WiMAX LDPC code with a systematic and irregular 1152×2304 H matrix; this code is well constructed and proven to achieve near-capacity performance on the AWGNC. The second kind is a nonsystematic LDPC code with a left-regular 1153×2304 H matrix whose column Hamming weight is 4. This pseudorandom H matrix was constructed by our own splitting-and-filling approach; it has girth at least 6 and stopping distance 6. As the column degree of the matrix is even, its rank is 1152 in modulo-2 arithmetic, so its corresponding generator matrix is of size 1152×2304. The performance of these two LDPC codes on the AWGNC is shown in Figure 3.

Figure 3. LDPC Codes on AWGNC (30 decoding iterations; BER versus Eb/N0 from 0 to 5 dB for uncoded BPSK, the WiMAX code and the left-regular code)

According to Shokrollahi's proposed degree polynomial [13], for the LT output bits we set a degree distribution Ω whose coefficients are 0.3077, 0.3462, 0.1154, 0.0577, 0.0231, 0.0165, 0.0124, 0.0346, 0.0096, 0.0077 and 0.0692.    (4)

Rate-0.1 Raptor codes on the AWGNC with the above two kinds of pre-codes were simulated for 1152 source bits, and the result of 100 trials is illustrated in Figure 4. The left side shows that, after 15 decoding iterations, the bit error rate (BER) curves of the two Raptor codes are close at low SNR. At a BER of 10^-3, the Raptor code with our left-regular LDPC pre-code has about 0.5 dB more coding gain than this LDPC code alone. To obtain more coding gain, we set a threshold of 0.6 at the decoder to erase less reliable incoming bits (the overall code rate is still 0.1). The results after 8 iterations are illustrated on the right of Figure 4; after 15 iterations the BER can be as low as 10^-3 at 0 dB. In the case of memoryless Rayleigh fading on the AWGNC, the two Raptor codes achieve very good and close bit error performance, as illustrated in Figure 5.
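The channel LLR of Eq. (2) and the tanh-rule message update of Eq. (3) in Section IV can be sketched as follows. This is a minimal illustration with our own function names, not the paper's implementation.

```python
import math

def channel_llr(y, gain, n0):
    """Observed LLR of a BPSK matched-filter sample y with channel gain
    and two-sided noise PSD N0/2, as in Eq. (2): L = 4*g*y/N0."""
    return 4.0 * gain * y / n0

def check_to_var(l_check, incoming):
    """Tanh-rule message from a received (check) bit to one v-node, Eq. (3):
    combine the check bit's own LLR with the messages from all OTHER
    neighboring v-nodes."""
    prod = math.tanh(l_check / 2.0)
    for m in incoming:                # LLRs from the other neighboring v-nodes
        prod *= math.tanh(m / 2.0)
    # clamp for numerical safety before the inverse hyperbolic tangent
    prod = max(min(prod, 1.0 - 1e-12), -(1.0 - 1e-12))
    return 2.0 * math.atanh(prod)
```

With `incoming` empty (a degree-1 encoded bit) the message equals the channel LLR, while any zero incoming message annihilates the product. This matches the observation in Section IV that only degree-1 encoded bits release belief in the first iteration and that erased bits (LLR 0) are useless to the LT decoder.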

Figure 4. Raptor Codes on AWGNC (BER versus Eb/N0 from 0 to 3 dB for uncoded BPSK and the Raptor codes with the WiMAX and left-regular pre-codes; left: no erasure, right: erasure threshold 0.6)

Figure 5. Raptor Codes on Rayleigh Fading Channel (BER versus Eb/N0 from 2 to 5 dB for the Raptor codes with the WiMAX and left-regular pre-codes)
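The symbol-selection remedy used for the Figure 4 experiment (erasing matched-filter samples whose magnitude falls below a threshold, 0.6 in our simulations) can be sketched as follows; the function name and return shape are our own illustration.

```python
def apply_receiver_pattern(samples, threshold=0.6):
    """Mark each matched-filter sample as accepted (b=1) or erased (b=0).
    Error-free samples are +/-1, so a small |y| signals low reliability."""
    pattern = [1 if abs(y) >= threshold else 0 for y in samples]
    kept = [y for y, b in zip(samples, pattern) if b == 1]
    return pattern, kept
```

The returned pattern plays the role of the receiver pattern b in the system model; erased positions contribute LLR 0 and therefore no belief to the decoder.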

6. CONCLUSION AND FUTURE WORK

We have briefly described LT codes and Raptor codes, focusing on soft-decoding Raptor codes over the AWGN channel. Our simulation results show that, within our system model, the rate-1/2 WiMAX LDPC code and our own left-regular LDPC code are very good pre-codes for a Raptor code on the AWGNC. A proper symbol acceptance threshold helps achieve better performance. Future work includes the investigation of optimal degree distributions for soft-decoding LT codes, high-rate Raptor codes, and fountain codes on other fading channels. Code performance can be sensitive to the symbol acceptance threshold, so threshold design could also be interesting.

REFERENCES

[1] D. J. Costello and G. D. Forney, "Channel coding: the road to channel capacity," Proc. IEEE, vol. 95, pp. 1150-1177, 2007.
[2] J. W. Byers, M. Luby, M. Mitzenmacher, and A. Rege, "A digital fountain approach to reliable distribution of bulk data," SIGCOMM Comput. Commun. Rev., vol. 28, no. 4, pp. 56-67, 1998.
[3] R. Palanki and J. S. Yedidia, "Rateless codes on noisy channels," in Proc. IEEE Int. Symp. Inf. Theory, June 27-July 2, 2004.
[4] T. Stockhammer, H. Jenkac, T. Mayer, and W. Xu, "Soft decoding of LT-codes for wireless broadcast," in Proc. IST Mobile, Dresden, Germany, 2005.
[5] R. Y. S. Tee, T. D. Nguyen, L. L. Yang, and L. Hanzo, "Serially concatenated Luby Transform coding and bit-interleaved coded modulation using iterative decoding for the wireless internet," in IEEE VTC'06 Spring, Melbourne, Australia, May 7-10, 2006.
[6] Y. Ma, D. Yuan, and H. Zhang, "Fountain codes and applications to reliable wireless broadcast system," in IEEE Information Theory Workshop, Chengdu, China, Oct. 22-26, 2006, pp. 66-70.
[7] T. D. Nguyen, L. L. Yang, and L. Hanzo, "Systematic Luby Transform codes and their soft decoding," in IEEE SiPS'07, Shanghai, China, Oct. 17-19, 2007.
[8] D. T. Nguyen and L. Hanzo, "An optimal degree distribution design and a conditional random integer generator for the systematic Luby Transform coded wireless internet," in IEEE WCNC'08, Las Vegas, NV, USA, March 31-April 3, 2008.
[9] W. Yao, L. Chen, H. Li, and H. Xu, "Research on fountain codes in deep space communication," in Congress on Image and Signal Processing, vol. 2, May 27-30, 2008, pp. 219-224.
[10] M. Luby, "LT codes," in Proc. 43rd Annual IEEE Symp. on Foundations of Computer Science, Nov. 16-19, 2002, pp. 271-282.
[11] R. Tanner, "A recursive approach to low complexity codes," IEEE Trans. Inf. Theory, vol. 27, pp. 533-547, Sept. 1981.
[12] D. J. C. MacKay, Information Theory, Inference, and Learning Algorithms, version 7.0, Cambridge University Press, 2004.
[13] A. Shokrollahi, "Raptor codes," IEEE Trans. Inf. Theory, vol. 52, no. 6, pp. 2551-2567, June 2006.
[14] P. Maymounkov, "Online codes," Technical Report TR2002-833, New York University, Nov. 2002.
[15] The official home page of 3GPP, http://www.3gpp.org/
[16] B. Sivasubramanian and H. Leib, "Fixed-rate Raptor code performance over correlated Rayleigh fading channels," in Canadian Conf. on Electrical and Computer Engineering, April 22-26, 2007, pp. 912-915.
[17] W. Yao, L. Chen, H. Li, and H. Xu, "Research on fountain codes in deep space communication," in 2008 Congress on Image and Signal Processing, vol. 2, May 27-30, 2008, pp. 219-224.
[18] A. Shokrollahi, "LDPC codes: an introduction," Digital Fountain, Inc., April 2003.
[19] W. E. Ryan, "An introduction to LDPC codes," in CRC Handbook for Coding and Signal Processing for Recording Systems (B. Vasic, ed.), CRC Press, 2004.
[20] X.-Y. Hu, E. Eleftheriou, D.-M. Arnold, and A. Dholakia, "Efficient implementations of the sum-product algorithm for decoding LDPC codes," in Proc. IEEE Globecom, Nov. 2001, pp. 1036-1036E.
[21] J. Hagenauer, E. Offer, and L. Papke, "Iterative decoding of binary block and convolutional codes," IEEE Trans. Inf. Theory, vol. 42, pp. 429-445, March 1996.
[22] T. W. Gerken, "Implementation of LDPC codes using the IEEE 802.16e standard," problem report, West Virginia University, Morgantown, WV, 2008.
[23] H. Wang, "Hardware designs for LT coding," MSc thesis, Dept. of Electrical Engineering, Delft University of Technology, 2004, pp. 19-21.
[24] O. Etesami and A. Shokrollahi, "Raptor codes on binary memoryless symmetric channels," IEEE Trans. Inf. Theory, vol. 52, pp. 2033-2051, May 2006.
[25] D. J. C. MacKay, "Fountain codes," IEE Proc. Communications, vol. 152, pp. 1062-1068, Dec. 2005.
[26] H. Li, W. Huang, and J. Dill, "Construction of irregular LDPC codes with low error floors," 2010 International Conf. on Computing, Communications and Control Technologies, accepted.