2013 13th Canadian Workshop on Information Theory
Improved Systematic Fountain Codes in AWGN Channel Khaled F. Hayajneh, Student Member, IEEE
Shahram Yousefi, Senior Member, IEEE
Dept. of Electrical & Computer Eng. Queen’s University Kingston, Ontario, Canada Email:
[email protected]
Dept. of Electrical & Computer Eng. Queen’s University Kingston, Ontario, Canada Email:
[email protected]
Abstract—Fountain codes are typically defined solely based on the right degree distributions which are Soliton or Soliton-like in most cases. In this paper, we consider Fountain encoders for the Gaussian channel and improve their performance by shaping the resulting left degree distributions away from Poisson. The proposed Fountain encoders achieve lower left degree variance at the same average degree. This in turn improves the growth of the so-called ripple in the BP decoding. The new encoders outperform the traditional ones in terms of overhead, error rate, and decoding complexity.
I. INTRODUCTION
Fountain codes were introduced to transfer information over lossy channels such as the Internet [1]. Due to lost packets and buffer overflows, and with the aid of other codes such as cyclic redundancy check (CRC) codes, one can assume that packets arriving at the receiver are error-free. In such erasure channel models, packets are either lost or received perfectly. Luby-transform (LT) codes are the first realization of the digital fountain paradigm. This is a fully scalable approach for multicast or broadcast of large amounts of data with unprecedented efficiency and complexity. In a digital fountain, a transmitter generates a potentially limitless number of encoded^1 symbols from k input symbols^2 according to a carefully designed degree distribution. A receiver recovers the k input symbols from any set of n = (1 + ε)k received coded bits, where ε is the overhead. LT codes are shown to be asymptotically optimal and universal in the sense that they achieve the capacity of any binary erasure channel (BEC) no matter what the erasure rate is. In other words, they transmit the data reliably at minimal overhead [2].
The degree distribution is at the heart of the design of LT codes. Luby originally introduced the ideal soliton distribution (ISD). The special design of this distribution makes it impractical for finite k, where the decoder fails prematurely in most cases. The robust soliton distribution (RSD) was then introduced by modifying the ISD to improve decodability [2]. These distributions are only optimal on the BEC, and ever since their introduction, much research has been devoted to the design and analysis of similar codes over noisy channels such as the binary symmetric channel (BSC), the binary-input additive white Gaussian noise (BIAWGN) channel, and fading channels. The other research direction that has received much attention is the design and analysis of improved codes for finite lengths over the BEC.
1 In the literature, coded, encoded, codeword or output bits represent those of a digital fountain sent through the channel. We use these terms interchangeably. 2 Similarly, the terms input, information, message, or source bits are intended to refer to those of a data source which are to be coded for transmission.
978-1-4799-0634-5/13/$31.00 ©2013 IEEE
RSD is not optimal for finite and practical values of k, and improvements in terms of realized rate, R = k/n (or overhead), complexity, and error performance (bit error rate: BER) are possible and have been reported for many cases. Fountain codes present the decoder with a linear system of equations with k unknowns which can be solved optimally with a Gaussian elimination (GE) decoder. The cubic complexity of GE is a deterrent: we resort to the sub-optimal maximum-a-posteriori (MAP) decoding based on the belief or probability propagation (BP/PP) algorithm. This is typically accomplished on a generator-based (G-based) Tanner/factor graph (TG) representation of the code. The complexity of such decoding is proportional to the edge complexity of the bipartite graph used. This comes at the expense of decoding failures, even at times when the system of equations would return a solution with the optimal GE. In most cases, the failure is due to the fact that the decoder is stuck in a subgraph such as a stopping set, trapping set, or absorbing set, depending on the channel model. Although based on the common foundation of greedy local message passing on a graph, the BP decoder is different for different channels. Different message update rules on both sides of the bipartite graph make the process progressively more expensive from the BEC to the BSC and BIAWGN channels [3]–[6].
A large subset of the degree distributions designed for Fountain coding share many properties with the RSD. These schemes are carefully designed to provide an appropriate number of degree-1 and degree-2 nodes and a maximum or spike degree dmax. These mostly fit under the umbrella of Soliton-like codes as introduced by Liau et al. [7]. Degree-1 output nodes of the graph (coded bits equal to a single source bit) are required to kick off the BP decoding right from the onset and are also needed to keep the decoder progressing in the following iterations of BP. Such nodes help construct the ripple of the decoder.
They pass their confident values to their only information node neighbour decoding it and in turn transforming some degree-2 output nodes to become degree-1 in the subsequent iteration. The decoder fails if the ripple is empty at any BP iteration before we have recovered all the k information bits. The notion of ripple was originally coined for BEC decoding as explained above but can be extended for other channels as we will discuss later. This is in fact what gave their name to Soliton codes thanks to their resemblance to undying Soliton waves in optics. In ISD, theoretically and asymptotically, the BEC ripple has a unitary size which is ideal: it keeps the decoding on at the lowest complexity and redundancy, yet is clearly very risky in practice as the smallest variation would cause the ripple to
die. The degree-1 nodes play equally important roles in noisy channels such as the BIAWGN channel considered in this article. In the context of classical coding, these nodes are simply referred to as systematic coordinates. In classical codes, such coordinates (i.e., systematic codes) lend themselves to lower-complexity decoders. Given their special role in the BP decoding, it is not surprising that systematic LT codes have received special attention. A number of works have shown how systematic codes, with their naturally higher number of degree-1 coded nodes, can outperform nonsystematic codes at various rates and channel conditions [5], [8], [9]. Van et al. show that the desired number of degree-1 nodes depends further on the number of BP iterations used [10]. In particular, as the number of iterations increases, the percentage of systematic bits needed decreases. The other benefit of using systematic bits is that when a decoder fails prematurely due to a stopping/trapping set, we are more likely to have extracted a large fraction of the data.
A great deal of previous work has focused on the design and analysis of codes and decoders in the light of the above-mentioned graph anomalies. Hussain et al. modify the nonsystematic LT encoding protocol to reduce the error floor caused by them [11]. The work of MirRezaei et al. is an example of a recent work on improving both the code structure and the decoder via handling the trapping/absorbing sets [12]. In all of the above works, and in fact almost everywhere in the previous literature, the focus has been solely on the right degree distribution (RDD), that is, on the distribution of degrees for coded bits/nodes. In LT coding, for any given degree d picked from the RSD, d information nodes are sampled uniformly at random from all k available, without memory. This lack of control means that the resulting left degree distribution (LDD) is Poisson [2].
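The Poisson-like shape of the LDD under memoryless selection can be checked empirically. The following is a minimal sketch, not the authors' simulation: the function names and the toy list of right degrees are hypothetical, and the only point illustrated is that uniform, memoryless selection gives left degrees whose variance is close to their mean, as expected for a Poisson distribution.

```python
import random

def left_degrees_uniform(k, output_degrees, rng):
    """Left (source-node) degrees when every output node of degree d picks
    d distinct source nodes uniformly at random, with no memory."""
    left = [0] * k
    for d in output_degrees:
        for i in rng.sample(range(k), d):
            left[i] += 1
    return left

def mean_and_variance(values):
    """Population mean and variance of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v

rng = random.Random(1)
k = 1000
# Hypothetical right degrees, for illustration only (this is not the RSD).
output_degrees = [rng.choice([1, 2, 2, 2, 3, 4, 8]) for _ in range(1500)]
left = left_degrees_uniform(k, output_degrees, rng)
mean, var = mean_and_variance(left)
print(f"mean left degree = {mean:.2f}, variance = {var:.2f}")
```

The average left degree is fixed by the right degrees alone (total edges divided by k), so any reshaping of the LDD can only change its spread, not its mean.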
The Poisson LDD is left intact mostly to protect the optimality and universality of the RSD. Our attention here is on practical codes for finite k: the optimality and universality do not hold here anyway. In fact, it has been proven that universal codes do not exist for noisy channels [13]. In this work, we propose a new Fountain coding protocol aimed at improving the LDD. In an attempt to improve the ripple behaviour in the BIAWGN channel, we shape the LDD to outperform the original RSD. It should be noted that the proposed encoding scheme can be used in conjunction with any RDD to improve the realized rate/overhead and/or the BER. There have been a number of improvements to the RSD ever since the original work of Luby. We do not need a comparison with these schemes, as similar improvements can be reported. Rather, we focus on showing the potential of our approach for the generic systematic LT code as an example.
The rest of this paper is organized as follows. In Section II, we provide the system model and briefly review the traditional LT coding over the BIAWGN channel. We propose the new Fountain encoding in Section III. Simulation results are provided in Section IV, followed by concluding remarks in Section V.
II. SYSTEM MODEL
We consider the transmission of a block of information of size B bits through a BIAWGN channel. The set {0, 1} is mapped to ±1 levels with antipodal signalling for transmission in zero-mean white Gaussian noise with a two-sided power
spectral density of $\sigma^2 = N_0/2$ W/Hz. The entire data is first parsed into generations each of size k bits: we will have $l = \lceil B/k \rceil$ generations, each requiring its own session. In what follows, we review the systematic LT coding and decoding over BIAWGN. Further details can be found in [5], [8], [10].
A. Encoding Systematic LT Codes
For a given generation with k source bits u = (u1, u2, ..., uk), the LT encoder generates a potentially limitless output stream c = (c1, c2, ...). For the systematic implementation, the encoder first relays (u1, u2, ..., uk) followed by the parity bits encoded as follows:
1) choose a degree d by sampling the RDD Ω(d),
2) choose d source symbols uniformly at random from the k available,
3) perform bitwise XOR of the d source symbols chosen above: this is the output bit to be transmitted,
$$c_i = u_{i_1} \oplus u_{i_2} \oplus \cdots \oplus u_{i_d}. \tag{1}$$
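The three encoding steps and equation (1) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the dictionary representation of the RDD, and the toy distribution `toy_rdd` are hypothetical and are not taken from the paper. The RDD is assumed to have no mass on degrees larger than k.

```python
import random

def sample_degree(rdd, rng):
    """Draw a degree d from an RDD given as {degree: probability}."""
    degrees = sorted(rdd)
    weights = [rdd[d] for d in degrees]
    return rng.choices(degrees, weights=weights, k=1)[0]

def systematic_lt_encode(u, n, rdd, rng):
    """Return n coded bits and, for each one, the source indices it covers."""
    k = len(u)
    coded = list(u[:n])                          # systematic part: c_i = u_i
    neighbours = [[i] for i in range(min(k, n))]
    while len(coded) < n:
        d = sample_degree(rdd, rng)              # step 1: sample the RDD
        idx = rng.sample(range(k), d)            # step 2: uniform, memoryless
        bit = 0
        for i in idx:                            # step 3: XOR, equation (1)
            bit ^= u[i]
        coded.append(bit)
        neighbours.append(idx)
    return coded, neighbours

rng = random.Random(0)
u = [1, 0, 1, 1]
toy_rdd = {1: 0.1, 2: 0.5, 3: 0.3, 4: 0.1}       # hypothetical RDD, not the RSD
c, nb = systematic_lt_encode(u, 8, toy_rdd, rng)
```

Returning the neighbour lists alongside the coded bits mirrors the assumption, made later in the text, that the decoder knows the composition of every output symbol.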
By repeating the above steps in an i.i.d. (independent and identically distributed) manner, the output symbols are generated one at a time. The output sequence will be c = (u1, u2, ..., uk, ck+1, ck+2, ...) up to a permutation. The received sequence at the output of the BIAWGN channel is y = (y1, y2, ...), where yi = ci + ηi is the ith noisy bit, in which $\eta_i \sim \mathcal{N}(0, \sigma^2)$.
B. LT decoding over AWGN
The BP decoder of a Fountain code in general involves multiple attempts. The first attempt, in the most optimistic scenario, is clearly when n = k output symbols have been received (or slightly more than k). This depends on the channel model: for instance, in a cleaner BEC, n = k is reasonable, while in fading, one has to look at the accumulated mutual information as more bits are received and try the first decoding attempt when a threshold related to capacity is met. As the BP decoder of an LT code in AWGN progresses, it is not as straightforward as in the case of BP decoding on the parity-check-matrix-based (H-based) TG to decide how many BP iterations are enough. In H-based codes such as low-density parity-check (LDPC) codes, the stopping criterion is having all the check equations satisfied or reaching a maximum number of iterations. The former is not possible on a G-based graph unless an external CRC code is utilized to do error detection on top. This is in fact a common component of practical systems based on Fountain codes. We further assume that the decoder is synchronized with the encoder and that the same TG seen at the encoder is available on the decoder side for a BP implementation. This is a simple assumption which can be realized in a number of ways [2].
The decoder TG is first initialized with all-zero log-likelihood ratios (LLRs). Then the channel state information (CSI), in the form of initial LLRs, is fed onto the coded nodes (right nodes). These are:
$$\mathrm{LLR}(c_i) = \log \frac{\Pr(x = 1 \mid y_i)}{\Pr(x = 0 \mid y_i)} \tag{2}$$
where x is the transmitted symbol and yi is the received signal of encoded symbol i.
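For the Gaussian channel, the LLR in (2) has a simple closed form. The sketch below assumes equiprobable bits and, since the text does not say which bit maps to which level, assumes that bit 1 is sent as +1 and bit 0 as −1; with the opposite mapping the sign of the result flips. The function name `channel_llr` is hypothetical.

```python
def channel_llr(y, sigma2):
    """LLR(c_i) = log Pr(x=1|y_i) / Pr(x=0|y_i) for equiprobable bits.

    Assumption: bit 1 -> +1 and bit 0 -> -1, noise variance sigma2.
    """
    # Gaussian log-likelihoods; the common factor 1/sqrt(2*pi*sigma2) cancels.
    log_p1 = -((y - 1.0) ** 2) / (2.0 * sigma2)
    log_p0 = -((y + 1.0) ** 2) / (2.0 * sigma2)
    return log_p1 - log_p0      # algebraically equal to 2*y/sigma2

print(channel_llr(0.5, 0.25))   # 2 * 0.5 / 0.25
```

Under these assumptions the LLR is linear in the received value, so a sample near zero carries almost no information while a sample far from zero is a confident observation.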
[Figure 1: source symbols u1, ..., u4 (top), output symbols c1, ..., cn (middle), replicated output symbols holding the initial LLRs (bottom).]
Fig. 1: The TG of a systematic LT code with k = 4: upper nodes represent the source (left) bits; the middle stack represents the output (right) bits; and the lower stack is simply a replication of the output bits to hold the initial LLRs.
[Figure 2: output node ct with channel input LLR(ct), connected to source nodes ut,1, ut,2, ..., ut,d.]
Fig. 2: Output symbol to source symbol message update for an output node ct.
These are shown in Fig. 1 at the bottom stack via the replicated nodes (to hold the CSI for all the iterations). The decoder needs to know the composition of output symbols. Luby suggests this information can be carried in the header of each packet. There are also a variety of ways to share this information between transmitter and receiver, which are beyond the scope of this paper. The LT decoder soft values are set to values corresponding to the modulator soft outputs such as log-likelihood ratios (LLR). Each output symbol ci takes on the initial LLR value as in equation (2).
Then, the decoding iterations begin. Every right and left message passing counts as one BP iteration. The message updates are based on the following:
1) Right to left, or output symbol to source symbol, messages: in this step, the processed LLRs or beliefs are sent from output symbols to source symbols according to:
$$L_{OS}(u_{t,i}) = 2\,\operatorname{artanh}\!\left(\tanh\!\left(\frac{\mathrm{LLR}(c_t)}{2}\right) \prod_{j=1,\, j \neq i}^{d} \tanh\!\left(\frac{L_{SO}(u_{t,j})}{2}\right)\right) \tag{3}$$
where t is between 1 and n and d is the degree of the output node ct as shown in Figure 2.
2) Left to right, or source symbol to output symbol, messages: in this step, the processed LLRs are sent from source symbols to output symbols according to:
$$L_{SO}(u_{t,i}) = \sum_{j=1,\, j \neq i}^{d} L_{OS}(u_{t,j}) \tag{4}$$
where t is between 1 and k and d is the degree of the source symbol ut as shown in Figure 3.
[Figure 3: source node ut connected to output nodes ct,1, ct,2, ..., ct,d.]
Fig. 3: Source symbol to output symbol message update for a source node ut.
At any point in the iterations, the estimated a-posteriori probability (APP) for each source node in the LLR domain is simply the sum of its incoming LLR messages:
$$\sum_{j=1}^{d} L_{OS}(u_{t,j}). \tag{5}$$
Thus, an estimate of the corresponding source symbol ut is determined by performing a hard limit on the above LLR. The decoder continues the message updates according to the above two equations until a codeword is found (using a CRC) or the maximum number of iterations is reached. There exist more sophisticated stopping criteria which are beyond the scope of this work [12]. If the stopping criterion is met while the decoder has not been successful, T more noisy coded bits are collected and added to the graph, and the next decoding attempt starts in the same manner: n → n + T. Generally, there are two approaches to continue with the decoding from one attempt to the next. In message-reset decoders, the messages are set back to zero, while in incremental decoders, we continue with the latest LLRs from the last iteration of the previous attempt.
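The message-passing schedule of equations (3) to (5) can be sketched in a few lines. This is a minimal illustration, not the authors' decoder: the function name, the edge dictionaries, and the clipping constants are hypothetical, the CRC-based stopping test is omitted, and a fixed number of iterations is run instead. The sketch assumes LLRs in the sign convention log Pr(bit = 0)/Pr(bit = 1), for which the tanh product needs no parity-dependent sign correction; LLRs defined with the opposite ratio would be negated before being passed in.

```python
import math

def bp_decode(neighbours, llr_ch, k, max_iter=20, clip=20.0):
    """Soft BP on the G-based Tanner graph of an LT code.

    neighbours[t]: source indices XOR-ed into output bit t.
    llr_ch[t]:     channel LLR of output bit t, assumed to be
                   log Pr(bit=0)/Pr(bit=1).
    Returns the hard decisions and the a-posteriori LLRs of the source bits.
    """
    n = len(neighbours)
    # One message per edge (t, i), initialised to zero in both directions.
    l_so = {(t, i): 0.0 for t in range(n) for i in neighbours[t]}
    l_os = dict(l_so)
    app = [0.0] * k
    for _ in range(max_iter):
        # Output -> source update, equation (3).
        for t in range(n):
            for i in neighbours[t]:
                prod = math.tanh(llr_ch[t] / 2.0)
                for j in neighbours[t]:
                    if j != i:
                        prod *= math.tanh(l_so[(t, j)] / 2.0)
                # Keep the artanh argument strictly inside (-1, 1).
                prod = max(min(prod, 1.0 - 1e-12), -1.0 + 1e-12)
                l_os[(t, i)] = max(min(2.0 * math.atanh(prod), clip), -clip)
        # A-posteriori LLR of each source bit, equation (5).
        app = [0.0] * k
        for (t, i), msg in l_os.items():
            app[i] += msg
        # Source -> output update, equation (4): total minus own contribution.
        for (t, i) in l_so:
            l_so[(t, i)] = app[i] - l_os[(t, i)]
    hard = [0 if a >= 0 else 1 for a in app]
    return hard, app

# Toy systematic code with k = 4; the third systematic bit is unreliable (LLR 0).
nbrs = [[0], [1], [2], [3], [0, 1], [1, 2], [2, 3], [0, 3]]
llrs = [-4.0, 4.0, 0.0, -4.0, -4.0, -4.0, 4.0, 4.0]   # source bits 1, 0, ?, 1
decoded, app_llr = bp_decode(nbrs, llrs, k=4)
print(decoded)
```

In this toy example the unreliable source bit only acquires a nonzero a-posteriori LLR in the second iteration, once the degree-1 nodes have delivered their beliefs and the parity nodes can combine them.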
III. A NEW FOUNTAIN ENCODING PROTOCOL
Designing good RDDs must consider two issues:
• Some output symbols must have high degrees to ensure that there are no unconnected source symbols: this helps guarantee graph connectivity.
• Some output symbols must have low degrees (some unity) in order to keep the encoding and decoding complexity as small as possible, while we also guarantee that the decoding process can get started and keep going.
From equation (3), we can see that at the first iteration, only degree-1 output symbols pass their LLRs to the left successfully. These LLRs propagate through the graph immediately. If we do not have any degree-1 output symbols, we cannot start the decoding: this is identical to the BEC case. In systematic LT codes, due to the large number of degree-1 output symbols, the LLRs pass through the graph faster and the decoder needs fewer iterations in general. Given this, we can define the AWGN ripple as follows:
Definition: The AWGN ripple of an LT Tanner graph in each iteration includes the coded nodes whose outgoing messages to the left are nonzero.
As the iterations proceed, a good graph will allow the cardinality of the ripple to grow from 0 to the maximum of n quickly and efficiently. The sooner we arrive at that point, the more effective the message passing will be. That will allow for a faster convergence towards the decoder output. On average, for nonsystematic LT codes at k = 100, the decoder takes around 10 iterations to reach the maximum ripple size, while in the case of systematic codes, the process is faster and the decoder takes 2-3 iterations to reach the maximum ripple size. In nonsystematic codes, as k increases, the number of iterations needed to reach the maximum ripple size increases very quickly, while in systematic codes, the number of iterations required essentially remains the same. Our RDD is kept the same as in the traditional cases [2].
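The growth of the AWGN ripple depends only on the structure of the graph, which makes it easy to track without running the numerical decoder. The sketch below is an illustration under stated assumptions, not the authors' procedure: the function name is hypothetical, channel LLRs are assumed nonzero, exact cancellations in the sums are ignored, and an output node is counted in the ripple as soon as at least one of its outgoing messages is nonzero (one possible reading of the definition).

```python
def awgn_ripple_sizes(neighbours, k, max_iter=20):
    """Ripple cardinality after each BP iteration, tracked structurally.

    neighbours[t]: source indices connected to output node t.
    A message is flagged nonzero when every term feeding it is nonzero.
    """
    n = len(neighbours)
    so_nonzero = {(t, i): False for t in range(n) for i in neighbours[t]}
    sizes = []
    for _ in range(max_iter):
        # Output -> source: nonzero iff all other incoming messages are nonzero
        # (the tanh product vanishes as soon as one factor is zero).
        os_nonzero = {
            (t, i): all(so_nonzero[(t, j)] for j in neighbours[t] if j != i)
            for t in range(n) for i in neighbours[t]
        }
        sizes.append(sum(any(os_nonzero[(t, i)] for i in neighbours[t])
                         for t in range(n)))
        # Source -> output: nonzero iff some other output node feeds the source.
        feeders = [0] * k
        for (t, i), nonzero in os_nonzero.items():
            feeders[i] += nonzero
        so_nonzero = {(t, i): feeders[i] - os_nonzero[(t, i)] > 0
                      for (t, i) in os_nonzero}
        if sizes[-1] == n:      # every output node is in the ripple
            break
    return sizes

systematic = [[0], [1], [2], [3], [0, 1], [1, 2], [2, 3], [0, 3]]
chain = [[0], [0, 1], [1, 2], [2, 3]]          # nonsystematic toy graph
print(awgn_ripple_sizes(systematic, 4))
print(awgn_ripple_sizes(chain, 4))
```

On these toy graphs the systematic example fills its ripple in two iterations, while the nonsystematic chain adds one node per iteration, which is the qualitative behaviour described above.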
The ISD is given as:
$$\rho(d) = \begin{cases} \dfrac{1}{k}, & \text{for } d = 1 \\[4pt] \dfrac{1}{d(d-1)}, & \text{for } d = 2, 3, \ldots, k. \end{cases} \tag{6}$$
The RSD is then introduced to increase the BEC ripple size from unity to $S = c \ln(k/\delta)\sqrt{k}$ for a constant c and a decoding failure rate of δ. The distribution is superimposed on the positive function
$$\tau(d) = \begin{cases} \dfrac{S}{k}\,\dfrac{1}{d}, & \text{for } d = 1, 2, \ldots, (k/S) - 1 \\[4pt] \dfrac{S}{k}\log(S/\delta), & \text{for } d = k/S \\[4pt] 0, & \text{for } d > k/S \end{cases} \tag{7}$$
to form the RSD, Ω(d):
$$\Omega(d) = \frac{\rho(d) + \tau(d)}{\beta} \tag{8}$$
where $\beta = \sum_{d} \big(\rho(d) + \tau(d)\big)$ is the normalization constant.
Our new encoding protocol aims at improving the AWGN ripple dynamics while still using carefully designed RDDs such as the RSD or any other Soliton-like scheme. This can be done by not necessarily choosing the source symbols in an i.i.d. manner. In fact, we propose to introduce memory in this process. The only cost to the encoder is a stack keeping track
of all instantaneous source node degrees as we proceed with the encodings. We would like to introduce memory in our bit selections to increase the rate of change in the cardinality of the AWGN ripple as much as possible. This will improve the decoding, allowing for better realized rates, lower BERs, and lower decoding complexities due to success at lower iterations. We set out the following optimization problem: find the best 1 ≤ d ≤ dmax for which one of the following is performed: (A) we choose d nodes from the lowest-degree source nodes, or (B) we choose d nodes from the highest-degree source nodes. Extensive optimization for various values of k with the RSD showed that the optimal degree is d = 2 with scenario (A). The results are not surprising, as degree-2 nodes constitute the most probable nodes and thus have the highest impact. Also, scenario (A) can be shown to result in a smaller variance in the LDD: this can improve the performance up to a point where negative phenomena cancel the positive returns. We do know that a uniform LDD is not a good choice, but a reduction in the variance from that of a Poisson while keeping the average intact is a good direction. The analytical results of this process are outside the scope of this short conference presentation.
In summary, we propose the following LDD-shaping protocol (LDDSP). For each parity bit:
1) Randomly choose a degree d from the desired Soliton-like RDD Ω(d),
2) • If d = 2, choose the source symbols with the lowest degrees to form the coded bit. In the case of ties, choose uniformly at random.
   • If d ≠ 2, choose d source symbols uniformly at random among all the source symbols.
3) Transmit the bitwise XOR of the d source symbols through the channel,
$$c_i = u_{i_1} \oplus u_{i_2} \oplus \cdots \oplus u_{i_d}. \tag{9}$$
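Steps 1) to 3) for a single parity bit, together with the RSD of equations (6) to (8), can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the natural logarithm is assumed in the spike term, the spike position k/S is rounded to the nearest integer and simply dropped when it falls beyond k, and the degree stack is modelled as a plain list `left_deg` that the encoder updates in place.

```python
import math
import random

def robust_soliton(k, c, delta):
    """RSD Omega(d) for d = 1..k as a list; index 0 is unused (zero)."""
    S = c * math.log(k / delta) * math.sqrt(k)
    spike = round(k / S)                       # assumed rounding of k/S
    rho = [0.0, 1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
    tau = [0.0] * (k + 1)
    for d in range(1, min(spike, k + 1)):
        tau[d] = S / (k * d)
    if 1 <= spike <= k:                        # spike absent when k/S > k
        tau[spike] = S * math.log(S / delta) / k
    beta = sum(rho) + sum(tau)                 # normalization constant
    return [(r + t) / beta for r, t in zip(rho, tau)]

def lddsp_parity(u, omega, left_deg, rng):
    """One parity bit under LDDSP; left_deg is the encoder's degree stack."""
    k = len(u)
    d = rng.choices(range(1, k + 1), weights=omega[1:], k=1)[0]    # step 1
    if d == 2:
        # Step 2, d = 2: the two lowest-degree sources, ties broken at random.
        order = sorted(range(k), key=lambda i: (left_deg[i], rng.random()))
        idx = order[:2]
    else:
        idx = rng.sample(range(k), d)          # step 2, d != 2: uniform choice
    bit = 0
    for i in idx:
        left_deg[i] += 1                       # update the degree stack
        bit ^= u[i]                            # step 3: XOR, equation (9)
    return bit, idx

rng = random.Random(7)
k, n = 100, 150
u = [rng.randint(0, 1) for _ in range(k)]
omega = robust_soliton(k, c=0.01, delta=0.5)
left_deg = [1] * k                             # systematic part: each u_i sent once
parity = [lddsp_parity(u, omega, left_deg, rng) for _ in range(n - k)]
```

Because only the degree-2 draws consult `left_deg`, the right degree distribution is untouched and the average left degree is the same as without shaping; only the spread of the left degrees is affected.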
Steps 1) to 3) are repeated until a sufficient number of output symbols is obtained at the receiver for the decoding to succeed. An extra positive effect of this new protocol is that it diminishes the chance of having unused source symbols in nonsystematic codes.
IV. PERFORMANCE ANALYSIS
Simulations have been performed in order to compare the proposed scheme (LDDSP) with the LT code without LDD shaping. For each simulation set, the RSD has been set to that with c = 0.01 and δ = 0.5. For systematic LT codes with 1000 packets, Figure 4 displays BER versus the inverse of the rate (normalized overhead) for different SNRs at k = 100 with a maximum number of iterations of it = 20. The figure shows that the proposed scheme outperforms the LT code with the original RSD in all regions of SNR. Also, we can observe that as the SNR increases, the proposed scheme becomes much better in terms of BER. Figure 5 shows the systematic RSD with and without shaping at k = 100 and R = 1/2 for 1000 packets. We can see that the proposed scheme outperforms the traditional RSD code, especially at high SNRs. Figure 6 shows the left degree distribution for the traditional RSD and the proposed scheme. From the figure, we can see that the proposed scheme has reduced the degree variance while the average is seemingly intact. These simulations have been performed at k = 100, n = 150 and for 1000 packets.
[Figure 4: BER (log scale) versus R^-1 = n/k from 1 to 3; curves for LT and LDDSP at SNR = 0 dB, 2 dB, and 4 dB.]
Fig. 4: BER versus inverse of the rate at k = 100.
[Figure 5: BER (log scale) versus SNR (dB) from -2 to 6; curves for LT and LDDSP at it = 1, 5, 10, and 20.]
Fig. 5: BER versus SNR at k = 100 and n = 200.
[Figure 6: percentage versus degree from 0 to 20; curves for LT and LDDSP.]
Fig. 6: Left Degree Distributions of LT code and LDDSP.
V. CONCLUSION
In this work, we improve the performance of Fountain codes by shaping the left degree distribution away from Poisson. The proposed Fountain encoders achieve lower left degree variance at the same average degree. This is due to the fact that the edge complexity is controlled from the right by the RSD or any other Soliton-like RDD. This in turn improves the growth of the AWGN ripple in the BP decoding. The new encoders outperform the traditional ones with the use of memory at the very small cost of a degree stack. Simulation results have been obtained to show the performance improvements in terms of SNR, overhead, and BER. Also, the proposed scheme is easy to implement and it can be used with any right degree distribution.
ACKNOWLEDGMENT
The work described in this paper was supported by Yarmouk University, Jordan and NSERC, Canada. The authors thank Mehrdad Valipour and Hossein Khonsari for constructive comments that improved the quality of this paper.
REFERENCES
[1] J. W. Byers, M. Luby, M. Mitzenmacher, and A. Rege, “A digital Fountain approach to reliable distribution of bulk data,” in Proceedings of the ACM SIGCOMM, Vancouver, BC, Canada, pp. 56-67, Aug. 1998.
[2] M. Luby, “LT codes,” in Proceedings of the 43rd Ann. IEEE Symposium on Foundations of Computer Science, pp. 271-280, 2002.
[3] R. Palanki and J. S. Yedidia, “Rateless codes on noisy channels,” in Proceedings of the International Symposium on Information Theory (ISIT) 2004, pp. 37, Chicago, IL, Jun. 2004.
[4] J. Castura and Y. Mao, “Rateless coding over fading channels,” IEEE Communications Letters, vol. 10, no. 1, pp. 46-48, Jan. 2006.
[5] T. D. Nguyen, L. L. Yang, and L. Hanzo, “Systematic Luby Transform Codes and Their Soft Decoding,” IEEE Workshop on Signal Processing Systems (SiPS), pp. 67-72, Oct. 2007.
[6] T. Stockhammer, H. Jenkac, T. Mayer, and W. Xu, “Soft decoding of LT-codes for wireless broadcast,” in Proc. IST Mobile, 2005.
[7] A. Liau, S. Yousefi, and I. Kim, “Binary soliton-like rateless coding for the y-network,” IEEE Trans. Commun., vol. 59, no. 12, pp. 3217-3222, Dec. 2011.
[8] X. Yuan and L. Ping, “On systematic LT codes,” IEEE Communications Letters, vol. 12, no. 9, pp. 681-683, Sep. 2008.
[9] T. L. Grobler, E. R. Ackermann, J. C. Olivier, and A. J. van Zyl, “Systematic Luby Transform codes as incremental redundancy scheme,” AFRICON, 2011, pp. 1-5, 13-15 Sept. 2011.
[10] Van Hoan Tran, Kuen-Tsair Lay, and Lun-Chung Peng, “Modified LT coding with systematic connections,” in International Conference on Anti-Counterfeiting, Security and Identification (ASID), 2012, pp. 1-4, 24-26 Aug. 2012.
[11] I. Hussain, M. Xiao, and L. K. Rasmussen, “Error Floor Analysis of LT Codes over the Additive White Gaussian Noise Channel,” in IEEE Global Telecommunications Conference (GLOBECOM 2011), pp. 1-5, 5-9 Dec. 2011.
[12] M. MirRezaei and S. Yousefi, “Absorbing sets of fountain codes over noisy channels,” in Proc. QBSC2012, May 2012, Kingston, ON, Canada.
[13] O. Etesami and A. Shokrollahi, “Raptor codes on binary memoryless symmetric channels,” IEEE Transactions on Information Theory, vol. 52, no. 5, pp. 2033-2051, May 2006.