Subblock-Constrained Codes for Real-Time ...

0 downloads 0 Views 539KB Size Report
Apr 18, 2016 - Note that if xL. 1 denotes a given subblock of length L, then the composition of xL. 1 is the empirical distribution PxL. 1 on X defined by PxL. 1.
1

Subblock-Constrained Codes for Real-Time Simultaneous Energy and Information Transfer Anshoo Tandon, Student Member, IEEE, Mehul Motani, Senior Member, IEEE, and Lav R. Varshney, Senior Member, IEEE

Abstract Consider an energy-harvesting receiver that uses the same received signal both for decoding information and for harvesting energy, which is employed to power its circuitry. In the scenario where the receiver has limited battery size, a signal with bursty energy content may cause power outage at the receiver since the battery will drain during intervals with low signal energy. In this paper, we analyze subblock energy-constrained codes (SECCs) which ensure that sufficient energy is carried within every subblock duration. We consider discrete memoryless channels and characterize the SECC capacity and the SECC error exponent, and provide useful bounds for these values. We also study constant subblockcomposition codes (CSCCs), which can be viewed as a subclass of SECCs where all subblocks in every codeword have the same fixed composition, and this subblock-composition is chosen to maximize the rate of information transfer while meeting the energy requirement. Compared to constant composition codes (CCCs), we show that CSCCs incur a rate loss and that the error exponent for CSCCs is also related to the error exponent for CCCs by the same rate loss term. We exploit the CSCC code structure to obtain a necessary and sufficient condition on the subblock length for avoiding receiver outage. Further, for CSCC sequences, we present a tight lower bound on the average energy per symbol within a sliding time window. We provide numerical examples highlighting the tradeoff between delivery of sufficient energy to the receiver and achieving high information transfer rates. It is observed that the ability to use energy in real-time imposes less of penalty than the ability to use information in real-time.

A. Tandon and M. Motani are with the Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583 (email: [email protected], [email protected]). L. R. Varshney is with the Department of Electrical and Computer Engineering and the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (email: [email protected]). This work was supported in part by the National Research Foundation Singapore under Grant No. NRF-CRP-8-2011-01, and by Systems on Nanoscale Information fabriCs (SONIC), one of the six SRC STARnet Centers, sponsored by MARCO and DARPA. Some results in this paper were presented in part at the IEEE SECON 2014 Workshop on Energy Harvesting Communications, June, 2014 [1], and at the 2015 International Symposium on Information Theory (ISIT), June, 2015 [2].

April 18, 2016

DRAFT

2

I. I NTRODUCTION The study of simultaneous information and energy transfer is relevant for communication from a powered transmitter to a receiver which uses the same received signal both for decoding information and for extracting energy to power its circuitry. This has applications ranging from wireline [3], [4] to wireless [5], [6] communications. The use of the electric power distribution grid for low-speed information transfer has been successfully employed for decades [7]. With wireless power transfer, most practical applications [8], [9] have focused on near-field, short-distance energy transmission due to low efficiency and health concerns associated with long-distance and high-power transmissions [10], [11]. Biomedical applications of wireless energy and information transfer have been proposed through the use of human implants that receive data and energy through inductive coupling [12], [13]. The fundamental tradeoff between reliable communication and delivery of energy at the receiver, in an information-theoretic setting, was first characterized in [14] using a general capacity-power function, where transmitted codewords were constrained to have average received energy exceeding a threshold. This tradeoff between capacity and energy delivery was extended for frequency-selective channels in [15]. Since then, there have been numerous extensions of the capacity-power function in various settings, e.g. [5], [6], [16]–[18]. For practical applications of simultaneous energy and information transfer from a powered transmitter to an energy harvesting receiver, imposing only an average received power constraint may not be sufficient; we may also need to regularize the transferred energy content. This is because a codeword satisfying the average power constraint may still cause power outage at the receiver if the energy content in the codeword is bursty, since a receiver battery with small capacity may drain during periods of low signal energy. In order to regularize the energy content in the signal, we adopt a subblock-constrained approach where codewords are divided into smaller subblocks, and every subblock must carry energy exceeding a given threshold. The subblock length and the energy threshold may be chosen to meet the energy requirement at the receiver. This subblock approach to real-time energy transfer may also be relevant in scenario where symbols are batch-processed at the receiver. In this paper, we consider a discrete memoryless channel (DMC) and characterize the capacity and the error exponent when each subblock is constrained to carry sufficient energy. We assume that the receiver harvests energy as a function of the transmitted symbol. Since different symbols in the input alphabet X may correspond to different energy levels, the requirement of sufficient energy within a subblock imposes a constraint on the subblock composition. Note that if xL 1 denotes a given subblock of length L, then on X defined by PxL1 (x) , the composition of xL 1 is the empirical distribution PxL 1

April 18, 2016

N (x) L ,

x ∈ X , where

DRAFT

3

N (x) is the number of occurrences of symbol x in xL 1.

An alternative to the subblock-constraint is the sliding-window constraint, where each codeword provides sufficient energy within a sliding time window of certain duration. This approach was adopted in [17], [18], where the use of runlength codes was proposed. In [19], capacity bounds were presented under a sliding window energy constraint. Note that the sliding-window constraint is relatively stricter than the subblock-constraint which corresponds to the case where the windows are non-overlapping. A. Contribution To meet the real-time energy requirement at a receiver which uses the received signal to simultaneously harvest energy and decode information, we introduce subblock energy-constrained codes (SECCs) which ensure that sufficient energy is carried within every subblock duration (Sec. III). For a given energy requirement, we characterize the SECC capacity (Sec. III-A) and the SECC error exponent (Sec. III-B). We analyze a subclass of SECCs, called constant subblock-composition codes (CSCCs), where all subblocks have the same composition (Sec. IV) . Compared to general SECCs, the CSCCs have richer symmetry properties and ensure constant energy within each subblock. We exploit these properties to (i) derive a necessary and sufficient condition to avoid power outage at the receiver (Sec. IV-A), (ii) present a tight lower bound on the average energy per symbol within a sliding time window (Sec. IV-B), (iii) obtain a relatively easily computable capacity expression (Sec. IV-C), and (iv) derive bounds for the rate penalty relative to constant composition codes (Sec. IV-D). We also provide numerical results highlighting the tradeoff between delivery of sufficient energy to the receiver and achieving high information rates (Sec. V). It is observed that the ability to use energy in real-time imposes less of a penalty than the ability to use information in real-time. B. Related Work Codes with different constraints on the codewords have been suggested in the past, depending on the constraints at the transmitter, the properties of the communication channel, or the properties of the storage medium. For magnetic storage [20], codewords are usually designed to meet a runlength constraint [21]. The capacity using runlength-limited codes on binary symmetric channels was analyzed in [22]–[24]. A class of binary block codes called multiply constant-weight codes (MCWC) was explored in [25], [26] for application in low-cost authentication of electronic devices based on physically unclonable functions (PUFs) [27]. PUFs give a unique signature to an electronic device by exploiting the inherent randomness and process variations in manufacturing. In a MCWC, each codeword of length mn is partitioned into m equal parts, and each part has constant weight w. MCWCs are applied as challenge words for device

April 18, 2016

DRAFT

4

TRANSMITTER

CHANNEL

Fig. 1. Simultaneous information and energy transfer from a transmitter to an energy-harvesting receiver

authentication, and codes with large minimum distance are desired to improve the reliability of PUF response [25]. Note that MCWCs form a sub-class of CSCCs with binary input alphabet, and are also useful for optical storage [28] and power line communications (PLC) [29]. Permutation codes [30], [31], in conjunction with multiple frequency shift keying (MFSK), have been proposed for PLC so that information transfer does not interfere with the primary function of power delivery. Here, where each codeword is a permutation of M different frequencies, with each frequency viewed as an input symbol. Higher rates of information transfer, at the cost of local variation in power, are achieved using constant composition codes [32] with the set of M frequencies forming the input alphabet. When the codeword length is a multiple of the alphabet size, the composition may be chosen so each frequency occurs an equal number of times in each codeword [33]. The codewords employed by an energy harvesting transmitter are constrained by the instantaneous energy available for transmission. The capacity of these constrained codes over an additive white Gaussian noise (AWGN) channel has been analyzed when the energy storage capability at the transmitter is zero [34], infinite [35], or a finite quantity [36], [37]. The AWGN capacity with processing cost at an energy harvesting transmitter was characterized in [38]. The DMC capacity using an energy harvesting transmitter equipped with a finite energy buffer was analyzed in [39]. A comprehensive summary of the recent contributions in the broad area of energy harvesting communications was provided in [6]. II. S YSTEM M ODEL Consider communication from a transmitter to a receiver where the receiver uses the received signal both for decoding information and for harvesting energy (see Fig. 1). We model the effective communication channel from the modulator at the transmitter to the information decoder at the receiver as a DMC, a reasonable model for certain practical communication systems. Consider, for instance, a digital modulator at the transmitter producing symbols from a signal constellation X = {x1 , . . . , xr }. At the receiver, the signal power is split between the energy harvesting module and the information processing module in a fixed power ratio. The input to the information decoder comprises one of s quantized values

April 18, 2016

DRAFT

5

n XmL

X(m-1)L+1

X2L

L

XL+2 XL+1 XL

L

X2 X1

L

Fig. 2. Transmitted codeword partitioned into subblocks of length L.

Y = {y1 , . . . , ys }, fed by a quantizer in the information processing path. The effective channel is thus

a DMC with input alphabet X , output alphabet Y , and channel transition probabilities given by the likelihood Pr(y|x) for x ∈ X , y ∈ Y . Note that the effective channels seen by the information decoder and the energy harvester may differ due to their respective pre-processing stages. In [40], practical receiver architectures for simultaneous information and energy transfer were defined: an integrated receiver architecture has shared radio frequency chains between the energy harvester and the information decoder, whereas a separated architecture has different chains. In our work, we assume a generic receiver architecture where the received signal power is split between the energy harvesting path and the information processing path with a static power-splitting ratio. We let b(x) denote the energy harvested when x ∈ X is transmitted, b : X → [0, ∞). The map b is assumed

to be time-invariant, and reflects the scenario where the statistical nature of the effective communication channel is due to the noise in the receiver circuitry, which does not affect the harvested energy. The quantification of b helps to abstract the problem of code design for simultaneous energy and information transfer from specific implementation details based on a chosen receiver architecture. To meet the real-time energy requirement at the receiver, we partition the transmitted codeword into equal-sized subblocks (see Fig. 2) and require that transmitted symbols be chosen so the harvested energy in each subblock exceeds a given threshold. This threshold is a function of the energy consumption by the receiver circuitry including the information decoder. We assume that the subblock length, denoted L, is fixed while the codeword length, denoted n, can be made arbitrarily large by increasing the number of subblocks within each codeword. If a transmitted codeword is denoted (X1 , X2 , . . . , Xn ), then the constraint on sufficient energy within each subblock is: L

 1X b X(j−1)L+i ≥ B, j = 1, 2, . . . , k, L

(1)

i=1

where j is the subblock index, B is the required energy per symbol at the receiver, and k is the number of subblocks in a codeword. Subblock energy constraint (1) becomes trivial if b(x) is same for all x ∈ X (e.g. in phase-shift keying). However, the constraint is non-trivial when b-values are not constant (e.g. in on-off keying) and threshold B satisfies bmin < B < bmax , where bmin = minx∈X b(x), bmax = maxx∈X b(x).

April 18, 2016

DRAFT

6

We impose subblock energy constraint (1) because a codeword satisfying only the codeword energy Pn constraint, i=1 b(Xi ) ≥ LB , may still cause power outage at the receiver if the energy content in the codeword is bursty, since a receiver battery with small capacity may drain during periods of low signal energy. The receiver energy update equation after i channel uses is given by E(i + 1) = min (Emax , [E(i) + b(Xi ) − B]+ ), where E(i) denotes the energy level at the receiver after the comple-

tion of i − 1 channel uses, Emax denotes the receiver energy storage capacity, and [z]+ , max(z, 0). We say an outage occurs during the ith channel use if E(i) + b(Xi ) < B , while an overflow occurs if E(i) + b(Xi ) − B > Emax . Constraint (1) ensures sufficient energy is carried within every subblock duration, where the subblock length is chosen to avoid outage. III. S UBBLOCK E NERGY-C ONSTRAINED C ODES When Nj (x) denotes the number of occurrences of symbol x in subblock j within a codeword, the subblock energy constraint (1) can equivalently be expressed as X x∈X

b(x)

X Nj (x) = b(x)Pj (x) ≥ B, L

(2)

x∈X

where Pj (x) , Nj (x)/L, and Pj denotes the subblock-composition for the j th subblock. A subblock energy-constrained code (SECC) is defined as one in which all codewords are partitioned into length-L subblocks and the composition of each subblock is chosen to satisfy (2). Let PL denote the set of all compositions for input sequences of length L. For a given type P ∈ PL , the set of sequences in X L with composition P is denoted by TPL and is called the type class or composition class of P . Thus, the j th subblock in SECC, having subblock-composition Pj , may be viewed as an element of TPLj . Since all subblocks in SECC satisfy (2), the composition of each subblock belongs to the set ΓL B which is defined as ΓL B , {P ∈ PL : EP [b(X)] ≥ B}.

(3)

We will let J denote the number of distinct compositions in ΓL B , and let these compositions be denoted Pj , 1 ≤ j ≤ J , and so ΓL B = {P1 , . . . , PJ }. These J compositions will be used to provide bounds for

the SECC capacity (Sec. III-A) and the SECC error exponent (Sec. III-B). When SECCs are employed on DMC W : X → Y , we may view the L uses of the channel as a single use of the induced vector channel having input alphabet A=

[ L B

P ∈Γ

April 18, 2016

TPL =

[

TPLj ,

(4)

1≤j≤J

DRAFT

7

and output alphabet Y L . Since the underlying scalar channel W is a DMC, the induced vector channel, W L : A → Y L , is also a DMC with transition probabilities given by W L (y1L |xL 1) =

L Y

L L W (yi |xi ), xL 1 ∈ A, y1 ∈ Y .

(5)

i=1

A. SECC Capacity Let the codeword length, n, be of the form n = kL, where k is an integer denoting the number of subblocks in each codeword. We wish to quantify performance limits when the subblock length L is fixed and k → ∞. For SECC, each kL-length codeword may be viewed as an element of Ak , and the received k k k word belongs to the set Y L . A kL-length SECC block code for a channel W L : Ak → Y L is a pair of mappings (f, φ) where f maps a finite message set M into Ak , and φ maps Y kL into M. The P probability of erroneous transmission of message m ∈ M is em , 1 − y1kL :φ(y1kL )=m W kL (y1kL |f (m)), the maximum probability of error of the code (f, φ) is e , maxm∈M em , while the rate of this code is 1 kL

log |M|. We call a kL-length SECC block code with maximum probability of error upper bounded

by  as a (kL, )-SECC code. Definition 1. For a fixed subblock length L, and for 0 ≤  < 1, a non-negative number R is an -SECC achievable rate for the channel W kL : Ak → Y kL if for every δ > 0 and every sufficiently large k there exist (kL, )-SECC codes with rate exceeding R − δ . A number R is an SECC achievable rate if it is -SECC achievable for all 0 <  < 1, and the supremum of SECC achievable rates is called the SECC

capacity of channel W . The induced vector channel W L (5) is itself a DMC with input alphabet A and output alphabet Y L , and hence its capacity is maxPX L :X1L ∈A I(X1L ; Y1L ), where the maximization is over the distribution 1

of X1L ∈ A. Since kL uses of channel W correspond to k uses of W L , the SECC capacity, denoted L CSECC (B), is 1/L times the capacity of this vector channel, and so L CSECC (B) =

I(X1L ; Y1L ) , L PX L : X1 ∈A 1 maxL

(6)

where the maximization is over the distribution of subblocks over the set A, and this set is related to the required energy per symbol, B , via ΓL B (see (4)). Finding a capacity-achieving input distribution in (6) is not always straightforward, and one may have to resort to the Blahut-Arimoto algorithm [41], [42]. If UA denotes the uniform distribution of X1L over A, then maximum rate achievable with UA , denoted L CULA (B), acts as a lower bound for CSECC (B), and is expressed as 1 X 1 CULA (B) = |TQL | PY1L (y1L ) log − H(Y |X), L PY1L (y1L )

(7)

Q∈QL

April 18, 2016

DRAFT

8

where QL is the set of all compositions for output vectors of length L, only one representative output 1 P vector y1L is chosen from each type class TQL , PY1L (y1L ) = |A| W L (y1L |xL 1 ), and H(Y |X) is xL 1 ∈A P L P | evaluated using the pairwise distribution PXY (x, y) = P ∈ΓLB |T|A| P (x)W (y|x). This follows from the fact that when X1L is uniformly distributed over A and vectors y1L , y˜1L belong to the same type class, then PY1L (y1L ) = PY1L (˜ y1L ), because the columns W L (y1L |·) and W L (˜ y1L |·) of the induced vector channel are permutations of each other. Here, the pairwise probability, PXY (Xi = x, Yi = y), satisfies PXY (Xi = x, Yi = y) =

X |T L | P P (x)W (y|x), 1 ≤ i ≤ L, |A| L

(8)

P ∈ΓB

P

because PXY (Xi = x, Yi = y) =

P ∈ΓL B

Pr(Xi = x, Yi = y | X1L ∈ TPL ) Pr(X1L ∈ TPL ), and if X1L is

uniformly distributed over TPL , we have PXY (Xi = x, Yi = y) = P (x)W (y|x), 1 ≤ i ≤ L.

(9)

An interesting, and somewhat counterintuitive, fact is that even when the underlying scalar DMC is symmetric, it is possible that its induced vector channel using SECC is not symmetric. This is formalized in the following theorem which is proved by providing a counterexample. Theorem 1. There exist symmetric channels W : X → Y where a uniform distribution of X1L over A does not achieve SECC capacity. Proof: See Appendix A. Remark: Although the uniform distribution does not necessarily achieve SECC capacity, for any given channel W , there exists a SECC capacity-achieving input distribution PX∗ L which satisfies PX∗ L (xL 1) = 1

1

L PX∗ L (˜ xL ˜L 1 ) whenever x1 and x 1 belong to the same type class. This claim follows from the fact that if the 1

Blahut-Arimoto algorithm (for the finding the capacity-achieving input distribution for the induced vector channel W L ), is initialized with a uniform distribution over A, then the probabilities corresponding to input xL ˜L 1 and x 1 will remain equal after every iteration of the algorithm. This is because for any output L L L xL ) : y L ∈ T L }, L subblock-composition Q we have the set equality {W L (y1L |xL 1 1 1 ) : y1 ∈ TQ } = {W (y1 |˜ Q L L L y L |xL ) : xL ∈ and for any input subblock-composition Pj we have {W L (y1L |xL 1 ) : x1 ∈ TPj } = {W (˜ 1 1 1

TPLj } when y1L ∈ TQL , y˜1L ∈ TQL .

Note that elements of the random vector X1L , in general, are not independent as X1L belongs to the constrained set A. However, under a capacity-achieving input distribution, the elements of X1L are identically distributed. If PX∗ L is a SECC capacity-achieving input distribution, and 1

cj ,

X L 1

x ∈T

April 18, 2016

L PX∗ 1L (xL 1 ), Pj ∈ ΓB , j ∈ {1, . . . , J},

(10)

L Pj

DRAFT

9

then each element Xi in X1L has identical distribution P˜ with P˜ (x) , Pr(Xi = x) =

J X

Pr(X1L ∈ TPLj ) Pr(Xi = x|X1L ∈ TPLj )

(11)

j=1 (a)

=

J X

Pr(X1L ∈ TPLj )Pj (x) =

j=1

J X

cj Pj (x),

(12)

j=1

where (a) follows because all vectors in TPLj have equal probability, and the fraction of vectors in TPLj with Xi = x is Pj (x). The distribution P˜ for each letter in a SECC codeword will be used to bound the SECC capacity (Theorem 2), and the SECC error exponent (Theorem 3). Theorem 2. The SECC capacity is bounded as L I(P˜ , W ) − r˜ ≤ CSECC (B) ≤ I(P˜ , W )

(13)

where I(P˜ , W ) , HP˜ (X) − HP˜ ×W (X|Y ) = H(P˜ ) − HP˜ ×W (X|Y ), and r˜ , H(P˜ ) −

J X j=1

cj

log |TPLj | L

J



1X 1 cj log . L cj

(14)

j=1

Proof: See Appendix B. Each letter in a SECC codeword has distribution P˜ , and Theorem 2 shows that r˜ is an upper bound on rate penalty due to the constraint that each subblock belongs to A. Fig. 4 in Section V presents a L corresponding numerical example, where CSECC (B) and its bounds are plotted as a function of L. Also

refer to the paragraph following Theorem 7 for a discussion on the asymptotic behavior of SECC capacity bounds.

B. SECC error exponent It is well known [43, Thm. 10.2] that for every R > δ > 0 there exists an n-length constant composition code of rate R such that all codewords have composition P , and the maximum probability of error is upper bounded by exp[−n(Er (R, P, W ) − δ)] for every DMC W , whenever n is sufficiently large. Here Er (R, P, W ), characterizing the exponential rate of decay of the probability of error with the blocklength,

is the random coding exponent function defined as [43]  Er (R, P, W ) , min D(V ||W |P ) + [I(P, V ) − R]+ , V

V ranging over all channels V : X → Y , and D(V ||W |P ) =

P

x,y

(15)

P (x)V (y|x) log (V (y|x)/W (y|x)).

As discussed earlier, the L uses of channel W with SECC may be viewed as a single use of the vector channel W L (5). Thus, each n-length SECC codeword may be viewed as a sequence of n/L super-letters

April 18, 2016

DRAFT

10

to be transmitted on vector channel W L . Since rate R for the scalar channel corresponds to rate LR for the vector channel, for every δ > 0 there exists a SECC code with rate R and codewords comprising n/L super-letters, for which the maximum probability of error on W L can be upper bounded as [43] " !# h n i Er (LR, PX∗ L , W L ) ∗ L 1 Pe ≤ exp − Er (LR, PX1L , W ) − Lδ = exp −n −δ , (16) L L

whenever n is sufficiently large, and PX∗ L denotes a SECC capacity-achieving distribution. The following 1

theorem shows that the exponent Er (LR, PX∗ L , W L )/L is lower bounded by Er (R + r˜, P˜ , W ), where P˜ 1

and r˜ are given by (12) and (14), respectively. Theorem 3.

Er (LR, PX∗ L , W L ) 1

L

≥ Er (R + r˜, P˜ , W ).

Proof: See Appendix C. IV. C ONSTANT S UBBLOCK -C OMPOSITION C ODES Constant composition codes (CCCs), where all codewords are required to have the same composition, were first used by Fano [44] to derive error exponents, and shown to be sufficient to achieve the capacity for any DMC, without incurring a rate penalty. For the case where each codeword is partitioned into equal-sized subblocks and input constraints imposed per subblock (as in (1)), it is natural to ask if a rate penalty is incurred when all the subblocks in every codeword are required to have the same composition, where this fixed subblock-composition is chosen to satisfy the respective subblock constraints. In this section, we investigate constant subblock-composition codes (CSCCs), where all subblocks in every codeword have the same fixed composition, and compare the capacities for SECCs, CSCCs, and CCCs. We also show that the structural properties of CSCCs can be exploited to obtain more insightful characterizations compared to general SECCs, e.g. a necessary and sufficient condition on the subblock length to avoid outage (Sec. IV-A), a tight lower bound on the average energy per symbol within a sliding time window (Sec. IV-B), capacity-achieving input distribution (Sec. IV-C), and rate penalty bounds relative to constant composition codes (Sec. IV-D). Most of the results in this section are demonstrated numerically in Section V. A. Condition on L to avoid outage Because all subblocks in every codeword have the same composition in a CSCC, the energy content carried in every subblock is constant. This property is used in the next theorem to give a necessary and sufficient condition on the subblock length to avoid power outage at the receiver for all possible CSCC sequences, where we employ the notation: X/ = {x ∈ X | b(x) < B}, X. = {x ∈ X | b(x) ≥ B}. Recall

April 18, 2016

DRAFT

11

that an outage occurs during the ith channel use if E(i) + b(Xi ) < B , where E(i) denotes the energy level at the receiver after the completion of i − 1 channel uses. Theorem 4. A necessary and sufficient condition to avoid outage for all possible CSCC codeword P sequences, with subblock-composition P satisfying x∈X b(x)P (x) ≥ B , is Emax , x∈X/ 2P (x) (B − b(x)) P and that the initial energy level satisfies E(1) ≥ x∈X/ LP (x) (B − b(x)). L≤ P

(17)

Proof: See Appendix D. Note that L is also related to X. via P satisfying the constraint

P

x∈X

b(x)P (x) ≥ B . Also note that

if I , {1, 2, . . . , L} and I/ , {i ∈ I|Xi ∈ X/ }, then the sum of energy decrements over the reception P P of the first CSCC subblock is i∈I/ (B − b(Xi )) = x∈X/ LP (x)(B − b(x)), and the condition on the initial energy level E(1) prevents outage when symbols with low energy are stacked at the beginning of the subblock. This initial energy level may be ensured by transmitting a preamble, consisting of symbols with high energy content, before the transmission of codewords. This preamble has bounded length and hence does not affect the channel capacity. B. Sliding window energy constraint An alternative to the subblock-constraint is the sliding-window constraint, where each codeword provides sufficient energy within a sliding time window of certain duration. A codeword which satisfies the subblock energy constraint (1) may not carry sufficient energy within a sliding time window of length T = L. We present some notation towards bounding the energy per symbol within a sliding time window.

Let X1n = {X1 , . . . , Xn } denote a sequence of length n, and define βt (X1n , T )

t+T −1 1 X , b(Xi ), T

(18)

i=t

with t ∈ {1, 2, . . . , n − T + 1}. Then βt (X1n , T ) denotes the average energy per symbol in X1n within a window of length T starting at time index t. Let S denote a set of sequences. The average energy per symbol within a sliding window over all sequences in S is lower bounded by ξ(S, T ) , min min βt (X1n , T ). n X1 ∈S

t

(19)

Let CPL denote the set of all CSCC sequences having subblock length L and composition P . We will characterize the exact value of ξ(S, T ) for the set S = CPL . Note that ξ(CPL , T ) gives a measure of least average energy per symbol within a sliding time window of length T over all CSCC sequences with subblock length L and composition P .

April 18, 2016

DRAFT

12

Let P be a given subblock composition, X˜ , {x ∈ X | P (x) > 0}, and let x1 , . . . , xm denote the distinct elements in X˜ with 0 ≤ b(x1 ) ≤ b(x2 ) ≤ · · · ≤ b(xm ). Let x0 be a dummy symbol with P (x0 ) = b(x0 ) = 0. Then ξ(CPL , T ) is characterized via the following theorem.

Theorem 5. ˜ − δT , ξ(CPL , T ) = B

(20)

˜ , Pm P (xi )b(xi ) is the energy per symbol within each subblock, and δT is given by (22) or where B i=1

(24) below, depending on whether T ≤ L or T > L, respectively. i) T ≤ L: Let k1 be the largest positive integer for which u1 = T − 2L

kX 1 −1

P (xi ) > 0.

(21)

i=0

Then

k1 −1 2L X u1 ˜ δT = B − b(xi )P (xi ) − b(xk1 ). T T

(22)

i=0

ii) T > L: Let qT and rT denote the quotient and remainder, respectively, when T is divided by L, i.e., T = qT L + rT . Let k2 and k3 be largest positive integers, respectively, for which u2 = rT − 2L

kX 2 −1

P (xi ) > 0 , u3 = L + rT − 2L

i=0

kX 3 −1

P (xi ) > 0.

(23)

i=0

Then δT = max{ω2 , ω3 }, where

(24)

k2 −1 k3 −1 rT ˜ 2L X L + rT ˜ 2L X u2 u3 ω2 = B− b(xi )P (xi ) − b(xk2 ), ω3 = B− b(xi )P (xi ) − b(xk3 ). T T T T T T i=0

i=0

Proof: See Appendix E For CSCCs, the subblock composition P may be chosen to ensure sufficient energy with every subblock duration. The δT term in Theorem 5 denotes the penalty in the average energy per symbol when the subblock constraint is replaced by a sliding window constraint, for a window size equal to T . ˜ , and hence δT ≥ 0, because ξ(C L , T ) is the least value for the average energy Remark: ξ(CPL , T ) ≤ B P ˜ is the expected average energy per per symbol within a sliding window over all sequences in CPL , while B

symbol within any given time window, where the expectation is taken over all sequences in CPL . Further, for CSCCs with unbounded blocklength, it follows from (24) that δT → 0 as T → ∞. The following corollary for binary CSCC sequences indicates that ξ(CPL , T ), as a function of T , varies ˜ as T becomes large. in a zig-zag manner while approaching B

April 18, 2016

DRAFT

13

Corollary 1. Consider binary CSCC sequences where every subblock of L bits contains exactly l1 ones, ˜ = l1 /L and ξ(C L , T ) satisfies the following: with 1 ≤ l1 < L, and let b(0) = 0, b(1) = 1. Then B P •

ξ(CPL , T ) = 0 for 1 ≤ T ≤ 2(L − l1 ).



ξ(CPL , T ) is strictly increasing in T within time intervals [(j + 2)L − 2l1 , (j + 2)L − l1 ], j ∈ Z+ .

ξ(CPL , T ) is strictly decreasing in T within time intervals [(j + 2)L − l1 , (j + 3)L − 2l1 ], j ∈ Z+ . (j + 1)l1 jl1 , and ξ(CPL , (j + 2)L − l1 ) = . Further, ξ(CPL , (j + 2)L − 2l1 ) = jL + 2(L − l1 ) (j + 1)L + (L − l1 ) •

Proof: See Appendix F. Thus, for a given T , the number of ones in each subblock, l1 , may be chosen appropriately to ensure that the average energy per symbol in a sliding window exceeds a threshold. Remark: Corollary 1 shows that subblock constraints may be employed to ensure sufficient energy within a sliding window. An alternate approach to meeting the sliding window constraint is to use type-1 (d, k) run-length limited (RLL) sequences [17]–[19], which require that the number of ones between

successive zeros are at least d and at most k . In this regard, two different constraints are said to be equivalent if they induce the same set of constrained sequences. In general, it can be shown that the constraint of having at least d ones in a sliding window of length T = d + 1 is equivalent to the type-1 (d, ∞) RLL constraint [19]. On the other hand, it can also be shown that the constraint of having at least d ones in a sliding window of length T ≥ d + 2 is not equivalent to any type-1 RLL constraint.

Theorem 5 can easily be extended for constant composition codes (CCCs). In this case, the average energy per symbol in a sliding window of length T over all CCC sequences with composition P and length n is lower bounded by k4 −1 u4 n X b(xi )P (xi ) + b(xk4 ), T T

(25)

i=0

where T ≤ n, and k4 is the largest positive integer for which u4 = T − n

Pk4 −1 i=0

P (xi ) > 0. Further,

there exist CCC sequences with composition P which meet the lower bound in (25) with equality. In the following subsections, we characterize the CSCC capacity and the CSCC error exponent, and compare them with the CCC capacity and CCC error exponent, respectively.

C. CSCC Capacity In a CSCC with subblock-composition P , and subblock length L, every subblock may be viewed as an element of TPL . Thus for CSCCs, the L uses of channel W induces a vector channel W L with input alphabet TPL and output alphabet Y L . Now, similar to SECCs, we assume that the CSCC codeword length is of the form n = kL, where k is an integer denoting the number of subblocks in each codeword, and

April 18, 2016

DRAFT

14

k k k define a kL-length CSCC block code for a channel W L : TPL → Y L as a pair of mappings k (f, φ) where f maps a finite message set M into TPL , and φ maps Y kL into M. Analogous to SECCs, L the CSCC capacity using subblock length L and composition P , CCSCC (P ), can be expressed as ! PL L) L; Y L) H(Y |X ) H(Y I(X i i L 1 1 1 , (26) = max − i=1 CCSCC (P ) = max L L L PX L :X1L ∈TPL PX L :X1L ∈TPL 1 1

where the maximization is over the distribution of X1L ∈ TPL . The following theorem shows that the maximum is achieved when X1L is uniformly distributed over TPL . Theorem 6. The CSCC capacity using subblock length L and composition P is obtained via a uniform distribution of the input vectors in TPL . Proof: See Appendix G. Remark: In contrast with SECC where the induced vector channel is not necessarily symmetric even when the underlying scalar channel is symmetric (see Theorem 1), the proof of Theorem 6 shows that induced vector channel using CSCC is always symmetric (even when the underlying scalar channel is not symmetric). This is because all the input subblock vectors in CSCC have the same composition, while general SECCs allow different subblocks to have different compositions. The CSCC capacity using subblock length L and composition P can be computed as 1 X 1 L CCSCC (P ) = |TQL | PY1L (y1L ) log − H(Y |X), L PY1L (y1L )

(27)

Q∈QL

where QL denotes the set of all compositions for length L output sequences, only one representative output P vector y1L is chosen from every type class TQL , PY1L (y1L ) = |T1L | xL1 ∈TPL W L (y1L |xL 1 ), and H(Y |X) is P

evaluated using the pairwise distribution PXY (x, y) = P (x)W (y|x). In a general SECC, different subblocks may have different compositions, while in a CSCC, all subblocks have the same fixed subblock-composition. When a CSCC is required to satisfy the subblock energy constraint (1), this fixed subblock-composition for all the subblocks is chosen from ΓL B to maximize the L information rate. Thus the CSCC capacity with subblock energy constraint (1), denoted CCSCC (B), is L L CCSCC (B) = maxL CCSCC (P ). P ∈ΓB

(28)

|X |−1 [43], and hence the maximization in (28) is over at L As ΓL B ⊂ PL , we have |ΓB | ≤ |PL | ≤ (L + 1)

most (L + 1)|X |−1 distinct subblock-compositions. D. Rate Comparison When a CSCC is designed to satisfy the subblock energy constraint (1), the constant subblockL composition is chosen as P ∗ = arg maxP ∈ΓLB CCSCC (P ) (see (28)) and each subblock belongs to the set

April 18, 2016

DRAFT

15

TPL∗ . In contrast, a general SECC has the flexibility of choosing different subblocks with different compoS L L TP . Because CSECC sitions: each subblock in a general SECC belongs to a richer set A = (B) (6) P ∈ΓL B

is obtained via optimizing the subblock distribution over this richer set, it follows that L L CCSCC (B) ≤ CSECC (B).

(29)

The capacity for DMC W using input distribution P is IP ×W (X; Y ). Since constant composition codes (CCCs) achieve the capacity for any given DMC, the CCC capacity using codeword-composition P , denoted CCCC (P ), is [43], [44] CCCC (P ) = IP ×W (X; Y ) = HP (X) − HP ×W (X|Y ).

If we impose the energy constraint per codeword,

1 n

Pn

i=1 b(Xi )

(30)

≥ B , with blocklength n, then the

capacity with this constraint can be achieved by CCCs [43]. Thus, if CCCC (B) denotes the capacity using CCC when the average energy per symbol is at least B , then [14], [43] CCCC (B) =

max P : EP [b(X)]≥B

CCCC (P ) =

max

I(X; Y ).

PX : EPX [b(X)]≥B

(31)

Since the subblock energy constraint (1) is stricter than the energy constraint per codeword, we have L CSECC (B) ≤ CCCC (B), and hence combining this with (29), we obtain L L (B) ≤ CCCC (B). CCSCC (B) ≤ CSECC

(32)

Although a CCC with codeword-composition P ensures that the average energy per symbol is at least B , it may violate the subblock energy constraint subblock (1). We seek to quantify the rate penalty L CCCC (P ) − CCSCC (P ), which is the price we pay for meeting the real-time energy requirement within

every subblock duration, compared to the less constrained energy requirement per codeword. In CSCC, since each subblock X1L is uniformly distributed over TPL , we have [43, p. 26] H(X1L ) = log |TPL | = LH(P ) − L r(L, P ),

where r(L, P ) denotes a function of L and P given as [43, p. 26] X s(P ) − 1 1 ϑ(L, P ) r(L, P ) = log(2πL) + log P (a) + s(P ), 2L 2L 12L ln 2

(33)

(34)

a:P (a)>0

with s(P ) denoting the number of elements x ∈ X with P (x) > 0, and 0 ≤ ϑ(L, P ) ≤ 1 and the exact value of ϑ(L, P ) in this range is chosen so that (33) is satisfied. The following theorem shows that the rate penalty by using CSCC, relative to CCC, is upper bounded by r(L, P ). Theorem 7. The rate penalty is bounded as L 0 ≤ CCCC (P ) − CCSCC (P ) ≤ r(L, P ).

April 18, 2016

(35)

DRAFT

16

Proof: When X1L is uniformly distributed over TPL ,  1 H(X1L ) − H(X1L |Y1L ) L L  1X (a) = H(P ) − r(L, P ) − H Xi |Y1L , X1i−1 L

L CCSCC (P ) =

(36) (37)

i=1 L

(b)

≥ H(P ) − r(L, P ) −

1X H(Xi |Yi ) L

(38)

i=1

(c)

= H(P ) − r(L, P ) − H(X|Y )

(39)

(d)

= CCCC (P ) − r(L, P ),

(40)

where (a) follows from (33) and chain rule for entropy, (b) follows since conditioning only reduces entropy, (c) follows from (9), and (d) follows from (30). Now, (35) follows from (40). The upper bound in (35) is tight for noiseless channels, while the lower bound is asymptotically tight as L → ∞ for arbitrary channels, since limL→∞ r(L, P ) = 0. Now if PX∗ L is a SECC capacity-achieving 1

input distribution, and P˜ is given by (12), then P˜ ensures that the average energy per symbol in a codeword is at least B . Hence, in general, CCCC (B) ≥ I(P˜ , W ) because the expression for CCCC (B) chooses that input distribution which maximizes rate while meeting the average energy constraint. Combining L this observation with (13), (32), and the fact that limL→∞ CCSCC (B) = CCCC (B), we get L L lim CCSCC (B) = lim CSECC (B) = lim I(P˜ , W ) = CCCC (B).

L→∞

L→∞

L→∞

(41)

The above equation indicates that if the CCC capacity-achieving input distribution in (31) is unique, and denoted as Pˆ , then limL→∞ P˜ = Pˆ . This suggests that for large L, the cj (10) become concentrated, P˜ tends towards the Pj which corresponds to cj with relatively high value, and r˜ (14) tends to r(L, Pj ) which diminishes with increasing L. Note that the r(L, P ) term is independent of the underlying channel, and the upper bound on the rate penalty (35) may be improved for certain channels by using knowledge of the channel statistics. The following theorem gives improved bounds for the binary symmetric channel (BSC) and the binary erasure channel (BEC), respectively. Theorem 8. The rate penalty bound given by (35) can be improved for BSC and BEC, respectively: (a) For a BSC with crossover probability 0 < p0 < 0.5, P (0) = Pr(X = 0), P (1) = Pr(X = 1), and 0 < γ = min(P (0), P (1)) ≤ 0.5, we have L 0 < CCCC (P ) − CCSCC (P ) ≤ h(p0 ? γ) − h(p0 ? α) < r(L, P ),

April 18, 2016

(42)

DRAFT

17

where the binary operator ? is defined as a ? b , a(1 − b) + (1 − a)b, h(·) is the binary entropy function, and α is chosen such that h(α) = h(γ) − r(L, P ), 0 ≤ α < 0.5. (b) For a BEC with erasure probability 0 <  < 1, we have L 0 < CCCC (P ) − CCSCC (P ) ≤ (1 − )r(L, P ) < r(L, P ).

(43)

Proof: See Appendix H. In the above theorem, the proof of part (a) uses Mrs. Gerber’s Lemma (MGL) [45], while an extension [46] of MGL is used for proving part (b).

E. CSCC Error Exponent The following theorem shows that the CSCC error exponent using subblock-composition P is related to the CCC error exponent by the same term, r(L, P ), used in rate loss bound (35). Theorem 9. For every DMC W , every blocklength n which is a multiple of subblock length L, and every R > 0, there exists a CSCC with subblock-composition P , and rate R, for which the average probability

of error is upper bounded as P¯e ≤ exp[−n Er (R + r(L, P ), P, W )].

(44)

Thus, the CSCC error exponent using subblock-composition P , with rate R on DMC W is lower bounded by Er (R + r(L, P ), P, W ). Proof: See Appendix I. V. N UMERICAL R ESULTS AND D ISCUSSION In this section, we present examples highlighting the tradeoff between delivery of sufficient energy to the receiver and achieving high information transfer rates using constrained codes such as SECC, CSCC, and CCC. We remark that the multiply constant-weight codes (MCWC) [25] form a sub-class of CSCCs corresponding to X = {0, 1}, where the constant-weight within each subblock is L · Pr(X = 1), and hence the numerical results presented in this section for CSCCs can also be employed as performance benchmark for practical MCWC codes [26]. Fig. 3 compares SECC, CSCC, and CCC capacities for a BSC with crossover probability 0.01 and b(0) = 0, b(1) = 1. These b-values reflect the case of on-off keying where bit-1 (bit-0) is represented

by the presence (absence) of a carrier signal. We observe that relative to CSCC, the SECC capacity L is generally higher because of greater flexibility in the choice of subblocks. Note that CSECC (B) and

April 18, 2016

DRAFT

18

1 0.9 0.8

Capacity

0.7 0.6 0.5 0.4 0.3

L CCSCC (B) with L = 8 L CSECC (B) with L = 8 CCCC (B)

0.2 0.1 0

0

0.2

0.4

0.6

0.8

1

Required average energy per symbol, B

Fig. 3. Capacity comparison for a binary symmetric channel with crossover probability 0.01 and b(0) = 0, b(1) = 1.

0.85

0.8

Capacity

0.75

0.7

CCCC (B) I(P˜ , W )

0.65

L CSECC (B) I(P˜ , W ) − r˜

0.6

L CCSCC (B)

0.55

8

16

24

32

40

48

56

64

72

80

Subblock Length, L

Fig. 4. Capacity comparison for a binary noiseless channel with B = 0.75 and b(0) = 0, b(1) = 1.

L CCSCC (B), given by (6) and (28), respectively, are non-increasing in B because the set ΓL B only becomes

smaller on increasing B . The CCCC (B) curve, given by (31), is non-increasing concave in B [14]. L Fig. 4 plots CSECC (B) and its bounds as a function of subblock length L for a binary noiseless L channel with B = 0.75. Here, we have CSECC (B) = I(P˜ , W ) − r˜, with P˜ and r˜ given by (11) and (14),

respectively, and it is observed that (13) provides tighter bounds on SECC capacity relative to (32). The plot shows that the capacity of subblock constrained codes increases with L because it provides greater

April 18, 2016

DRAFT

19

1 L CCSCC (B) CULA (B) L CSECC (B) CCCC (B)

0.9 0.8

Capacity

0.7 0.6 0.5 0.4

L = 8, B = 0.6

0.3 0.2 0.1 0

0

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

0.5

BSC crossover probability, p 0

Fig. 5. Comparison of different capacities as a function of BSC crossover probability p0 and b(0) = 0, b(1) = 1.

L=2 L=4 L=8 L = 16 L=∞

0.6

CSCC Capacity

0.5

0.4

0.3

0.2

0.1

0

0

0.2

0.4

0.6

0.8

1

Required average energy per symbol, B

L Fig. 6. Plot of CCSCC (B) versus B for BSC with crossover probability p0 = 0.1, b(0) = 0, b(1) = 1.

flexibility in the choice of symbols within a subblock. Fig. 5 compares capacity of different schemes for L = 8 and B = 0.6, as a function of BSC crossover probability p0 . Recall that the CULA (B) curve (in cyan) denotes achievable rates when the subblocks are S L uniformly distributed over A = TP , and may be computed using (7). The figure shows that although P ∈ΓL B

L L CULA (B) > CCSCC (B) for p0 < 0.05, we have CULA (B) < CCSCC (B) for 0.06 < p0 < 0.5. L Fig. 6 plots CCSCC (B) as a function of B for different values of L for a BSC with crossover probability

April 18, 2016

DRAFT

20

1

0.9

Capacity

0.8 L CCSCC (B) with p0 = 0.01 CCC C (B) with p0 = 0.01 L CCSCC (B) with p0 = 0.1 CCC C (B) with p0 = 0.1

0.7

0.6

0.5

0.4

2

6

10

14

18

22

26

30

34

38

42

46

50

Energy buffer size, Emax

Fig. 7. Plot of CSCC capacity versus receiver energy buffer size, Emax , with B = 0.5, b(0) = 0, b(1) = 1 for BSC with crossover probability p0 = {0.01, 0.1}.

p0 = 0.1. Note that the smaller the value of L, the greater the uniformity in energy distribution within

a codeword. The reduction in capacity due to choice of smaller L is the price we pay for providing smoother energy content. The plot for L = ∞ is evaluated using (31); this follows from (35), and the fact that limL→∞ r(L, P ) = 0. The CSCC capacity is plotted in Fig. 7 for a BSC as a function of the receiver energy buffer size, Emax , with B = 0.5. The subblock length L is chosen as a function of Emax to satisfy (17). It is seen L that CCSCC (B) increases with Emax because L is an increasing function of Emax . For p0 = 0.1, the

CSCC capacity is limited by the relatively high value of the crossover probability. On the other hand, for p0 = 0.01, the CSCC capacity is limited by the energy buffer size (since ‘noise’ is weak). From (17) we observe that L → ∞ as Emax tends to infinity, and hence the CSCC capacity corresponding to Emax → ∞ is equal to CCCC (B).

Fig. 8 plots the rate penalty incurred by using CSCC instead of CCC, for a BSC with crossover probability p0 , L = 16, and Pr(0) = Pr(1) = 0.5. As discussed in Sec. IV-D, the upper bound on the rate penalty given by r(L, P ) is shown to be close to the exact value when p0 ≈ 0. A tighter bound on the rate penalty given by h(p0 ? γ) − h(p0 ? α) is also plotted (see Theorem 8). For information transfer, although joint decoding of subblocks within a codeword is preferred for reducing the probability of error, it also causes delay because the receiver waits for the arrival of the entire codeword. For enabling real-time information transfer, the receiver may decode each subblock

April 18, 2016

DRAFT

21

L C CCC (P ) − C CSCC (P ), with L = 16

0.16 0.14 0.12

Exact h(p0 ⋆ γ ) − h(p0 ⋆ α) r(L, P )

0.1 0.08 0.06 0.04 0.02 0 −0.02

0

0.1

0.2

0.3

0.4

0.5

BSC crossover probability, p 0

L (P ) as a function of BSC crossover probability p0 for L = 16 and Pr(0) = Pr(1) = 0.5. Fig. 8. Plot of CCCC (P ) − CCSCC

independently, which we will refer to as local subblock decoding (LSD). With LSD, a CSCC code with subblock length L is viewed by the receiver as a CCC with codeword length L. Let M ∗ (L, ) denote the maximum size of length-L CCC with average error probability no larger than . When the channel satisfies some regularity conditions, then [47]–[49] r 1 V −1 1 O(1) ∗ log M (L, ) = C − Q () + log L + , L L 2L L

(45)

where C is the channel capacity, V is the information variance, and Q is the Gaussian Q-function. Fig. 9 compares rates using LSD (obtained by ignoring the O(1)/L term in (45)) with rates using joint subblock decoding for a BSC with crossover probability p0 = 0.11 when each subblock has equal number of zeros and ones. This normal approximation was shown to be accurate by Polyanskiy-Poor-Verd´u [50] L for the BSC in the given range of error probabilities. The red curve plots a lower bound on CCSCC (P )

using (42) where the probability of error can be brought arbitrarily close to zero by increasing the number of subblocks in a codeword and then jointly decoding the subblocks. The rate loss for CSCC decreases p roughly as 1/L with L using LSD, whereas the rate loss with joint decoding decreases as log(L)/L. Thus, demonstrating the ability to use energy in real-time imposes less of a penalty than the ability to use information in real-time. Fig. 10 plots ξ(CPL , T ), where ξ is given by (19), for CSCC sequences over a binary alphabet X = {0, 1}, where the subblock length is L = 10, composition P is given by P (0) = 0.2, P (1) = 0.8, and

the b-function is b(0) = 0, b(1) = 1. For these parameters, the energy per symbol within each subblock

April 18, 2016

DRAFT

22

0.5

Rate (bits per channel use)

0.45 0.4 0.35 0.3 0.25 0.2

Capacity L Lower bound for CCSCC (P ) CSCC with LSD (ǫ = 10−1 ) CSCC with LSD (ǫ = 10−3 ) CSCC with LSD (ǫ = 10−6 )

0.15 0.1 0.05 0

0

100

200

300

400

500

600

700

800

900

1000

Sub-block length, L

Energy per symbol in a sliding window

Fig. 9. Rates for a BSC with crossover probability p = 0.11.

0.8 0.7

˜ B ξ(CPL , T )

0.6 0.5 0.4 0.3 0.2 0.1 0

0

20

40

60

80

100

Sliding window length, T

Fig. 10. Lower bound for average energy per symbol in a sliding window of length T for any CSCC sequence with subblock length L = 10 and composition P given by P (0) = 0.2, P (1) = 0.8.

˜ = 0.8. As suggested in Corollary 1, the figure shows that ξ(C L , T ) alternates between increasing is B P ˜ as T becomes large. and decreasing cycles, while approaching B

VI. R EFLECTIONS We proposed the use of subblock energy-constrained codes (SECCs) for real-time simultaneous energy and information transfer. We characterized the SECC capacity and the SECC error exponent, and provided

April 18, 2016

DRAFT

23

useful bounds for these values. These bounds for SECCs continue to hold for general type-constrained subblock codes where each subblock is constrained to belong to an arbitrary but fixed set of type classes. Constant subblock-composition codes (CSCCs), a subclass of SECCs where all the subblocks have the same fixed composition, were shown to possess certain useful symmetry properties. By exploiting the property that the energy content in every subblock is constant, we obtained a necessary and sufficient condition for avoiding receiver energy outage for all possible CSCC sequences, and provided a tight lower bound on the average energy per symbol in a sliding window. We showed that relative to the classical constant composition codes (CCCs), the use of CSCCs incurs a rate loss, and the CSCC error exponent was shown to be related to the CCC error exponent by the same rate loss term. We also provided several examples highlighting the tradeoff between delivery of sufficient energy to the receiver and achieving high information rates. Other than the application of simultaneous energy and information transfer, CSCCs are also suitable candidates for power line communications due to their ability to carry constant energy within every subblock duration, thus avoiding energy variations which interfere with the primary function of power delivery. The CSCC codes may also find application in other diverse fields. For instance, codes proposed in [25] for use in low-cost authentication methods are a special case of CSCC with binary input alphabet. ACKNOWLEDGMENT The authors thank the editor and the reviewers for their constructive feedback and helpful suggestions. A PPENDIX A P ROOF OF T HEOREM 1 For the proof, we will construct a simple example of a symmetric DMC W : X → Y for which the uniform distribution over A does not achieve SECC capacity. Consider the following parameters for a BSC with crossover probability p0 : b(0) = 0, b(1) = 1, B = 0.5, L = 2, 0 < p0 < 0.5 .

(46)

With the above parameters, the input alphabet for the induced vector channel is given by A = {01, 10, 11}. L A uniform distribution over A will achieve SECC capacity if and only if I(X1L = xL 1 ; Y1 ) is same for

all xL 1 ∈ A [51, Thm. 4.5.1], where L I(X1L = xL 1 ; Y1 ) =

|A| W L (y1L |xL 1) W L (y1L |xL . 1 ) log X L L L W (y1 |˜ x1 ) y L ∈Y L

X

(47)

1

x ˜L 1 ∈A

April 18, 2016

DRAFT

24

1.6

x21 = 01 x21 = 11

1.4

I (X 12 = x21 ; Y 12 )

1.2 1 0.8 0.6 0.4 0.2 0

0

0.1

0.2

0.3

0.4

0.5

BSC crossover probability

L Fig. 11. I(X1L = xL 1 ; Y1 ) versus BSC crossover probability p0 for L = 2.

The proof is completed by numerically verifying that for the above parameters (46), we have I(X1L = 01; Y1L ) 6= I(X1L = 11; Y1L ). Fig. 11 shows that I(X1L = 01; Y1L ) and I(X1L = 11; Y1L ) are different



when 0 < p0 < 0.5. A PPENDIX B P ROOF OF T HEOREM 2

From the remark following Theorem 1, we know that for any given channel W , there exists a SECC ∗ xL ) whenever xL and x capacity-achieving input distribution PX∗ L which satisfies PX∗ L (xL ˜L 1 ) = PX L (˜ 1 1 1 1

1

1

belong to the same type class. Then, from (10) and the grouping axiom [52, p. 8], we have HPX∗ L (X1L ) = 1

J X

cj log |TPLj | +

j=1

J X

(a)

−cj log cj = LH(P˜ ) − L˜ r,

(48)

j=1

where (a) follows from (14). Now,   L L L L CSECC (B) = HPX∗ L (X1 ) − HP ∗ L ×W L (X1 |Y1 ) /L 1

(b)



LH(P˜ ) − L˜ r−

X 1

L X

! HP˜ ×W (Xi |Yi ) /L = I(P˜ , W ) − r˜,

(49)

i=1

where (b) follows from (11), (48) and the fact that conditioning only reduces entropy. For a noiseless channel, the inequality (b) turns to an equality as HP ∗ L ×W L (X1L |Y1L ) = 0 = HP˜ ×W (Xi |Yi ). X

1

Towards proving the SECC capacity upper bound, note that since constant composition codes achieve capacity on a DMC [43], the SECC capacity is achieved by codewords having empirical distribution

April 18, 2016

DRAFT

25

PX∗ L with respect to alphabet A. These codewords, when viewed as a sequence of symbols from X have 1

L empirical distribution P˜ . Thus, CSECC (B) ≤ I(P˜ , W ) since achievable rates using codewords having

constant composition P˜ is upper bounded by I(P˜ , W ).



A PPENDIX C P ROOF OF T HEOREM 3 The exponent Er (LR, PX∗ L , W L ) can be expressed as [43] 1   ∗ L L L ∗ ∗ L + Er (LR, PX1L , W ) , min D(V ||W |PX1L ) + [I(PX1L , V ) − LR] , L

(50)

V

V L ranging over all channels V L : A → Y L . Now, the D(V L ||W L |PX∗ L ) term in (50), is given by 1

X

D(V L ||W L |PX∗ 1L ) =

PX∗ 1L (xL 1)

xL 1 ∈A

X

V L (y1L |xL 1 ) log

y1L ∈Y L

V L (y1L |xL 1) L L W (y1 |xL 1)

 1 = X W L (Y1L |X1L ) 1 1  X  L L  (a) X 1 = D(Vi ||W |P˜ ), ≥ −HP˜ ×Vi (Y |X) + EP˜ ×Vi log W (Yi |Xi ) −HP ∗ L ×V L (Y1L |X1L ) X

 + EP ∗ L ×V L log

(51)

i=1

i=1

where Vi is the marginal distribution of V

L

corresponding to the ith symbol, and (a) follows from (5),

(11), and the fact conditioning only reduces the entropy. The term I(PX∗ L , V L ) in (50) satisfies 1

I(PX∗ 1L , V L ) = HPX∗ L (X1L ) − HP ∗ L ×V L (X1L |Y1L ) X

1

(b)

≥ LHP˜ (X) − L˜ r−

1

L X

HP˜ ×Vi (Xi |Yi ) =

L X

I(P˜ , Vi ) − L˜ r,

(52)

i=1

i=1

where (b) follows using (48) and the fact conditioning only reduces entropy. Let Vb L denote that V L which achieves the minimum in (50). Then Vb L is of the form [53], [54] Vb L (y1L |xL 1) = P

1−s Q L (y L )s W L (y1L |xL Y1 1) 1

y˜1L ∈Y L

L s 1−s Q L (˜ W L (˜ y1L |xL Y1 y1 ) 1)

,

(53)

where QY1L satisfies the set of simultaneous equations QY1L (y1L ) =

L L L 1−s Q L (y L )s X PX∗ L (xL Y1 1 )W (y1 |x1 ) 1 P1 , L |xL )1−s Q L (˜ L s L y W (˜ y Y1 1 1 1) y˜1L ∈Y L L

(54)

x1 ∈A

and $s \in [0, 0.5]$ is chosen as a function of the rate $R$. Now, if $\pi$ is an arbitrary permutation on $L$ letters, then $W^L(\pi(y_1^L) \mid \pi(x_1^L)) = W^L(y_1^L \mid x_1^L)$ and $P^*_{X_1^L}(\pi(x_1^L)) = P^*_{X_1^L}(x_1^L)$. Thus, from (54) it follows that $Q_{Y_1^L}(\pi(y_1^L)) = Q_{Y_1^L}(y_1^L)$, and hence $\hat{V}^L(\pi(y_1^L) \mid \pi(x_1^L)) = \hat{V}^L(y_1^L \mid x_1^L)$. In particular, $\hat{V}^L(\pi(y_1^L) \mid \pi(x_1^L)) = \hat{V}^L(y_1^L \mid x_1^L)$ when $\pi$ corresponds to a transposition which interchanges the symbols at the first and the $i$th index, and hence $\hat{V}_i$, the marginal distribution of $\hat{V}^L$ corresponding to the $i$th symbol, is identical to $\hat{V}_1$ for $1 < i \leq L$. Because $\hat{V}^L$ achieves the minimum in (50),
$$E_r(LR, P^*_{X_1^L}, W^L) = D(\hat{V}^L \| W^L \mid P^*_{X_1^L}) + \left[ I(P^*_{X_1^L}, \hat{V}^L) - LR \right]^+ \qquad (55)$$
$$\overset{(c)}{\geq} L\, D(\hat{V}_1 \| W \mid \tilde{P}) + \left[ L\, I(\tilde{P}, \hat{V}_1) - L\tilde{r} - LR \right]^+ \geq L \min_{V} \left\{ D(V \| W \mid \tilde{P}) + \left[ I(\tilde{P}, V) - (\tilde{r} + R) \right]^+ \right\} = L\, E_r(R + \tilde{r}, \tilde{P}, W), \qquad (56)$$
where (c) follows from (51), (52), and the fact that $\hat{V}_i$, $1 \leq i \leq L$, are identically distributed. ∎
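The scalar exponent on the right side of (56) can be evaluated numerically. The sketch below (illustrative, not from the paper; the channel, rate-loss value, and function names are assumptions) approximates $E_r(R, P, W) = \min_V \{ D(V\|W|P) + [I(P,V) - R]^+ \}$ for a binary-input, binary-output channel by a grid search over the auxiliary channel $V$; shifting the rate argument by $\tilde{r}$ then gives the lower bound of Theorem 3.

```python
import numpy as np

def binary_exponent(R, P, W, grid=200):
    """E_r(R, P, W) = min over V of D(V||W|P) + [I(P,V) - R]^+  (bits), for |X| = |Y| = 2."""
    P = np.asarray(P, dtype=float)
    best = np.inf
    ts = np.linspace(1e-6, 1 - 1e-6, grid)
    for a in ts:                      # V(1|0) = a
        for b in ts:                  # V(0|1) = b
            V = np.array([[1 - a, a], [b, 1 - b]])
            D = float(np.sum(P[:, None] * V * np.log2(V / W)))   # D(V||W|P)
            Q = P @ V                                            # output law under V
            I = float(np.sum(P[:, None] * V * np.log2(V / Q)))   # I(P, V)
            best = min(best, D + max(I - R, 0.0))
    return best

W = np.array([[0.9, 0.1],     # BSC-like channel (rows: inputs, columns: outputs)
              [0.1, 0.9]])
P = np.array([0.5, 0.5])
r_tilde = 0.05                # illustrative rate-loss value
print(binary_exponent(0.2 + r_tilde, P, W))
```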

APPENDIX D
PROOF OF THEOREM 4

The following lemma, with $G = \sum_{x \in \mathcal{X}_{/}} L P(x) (B - b(x))$, will be used to prove sufficiency.

Lemma 1. A CSCC with subblock-composition $P \in \Gamma^L_B$ satisfies the following properties:
(a) If there is no outage or overflow during the reception of the first subblock, then $E(L+1) \geq E(1)$.
(b) If $E(1) \geq G$, then there is no energy outage during the reception of the first subblock.
(c) If $E(1) \geq G$ and $E_{\max} \geq 2G$, then $E(L+1) \geq G$.

Proof: If there is no energy outage or overflow, then the total energy harvested during the reception of the first subblock is $\sum_{x \in \mathcal{X}} L P(x) b(x)$, while the total energy consumed is $LB$, and claim (a) follows since $P$ belongs to the set $\Gamma^L_B$ (3).

Let $I = \{1, 2, \ldots, L\}$, and $I_{/} = \{i \in I \mid X_i \in \mathcal{X}_{/}\}$. For $i \in I$, the level in the energy buffer decreases during the $i$th channel use if and only if $i \in I_{/}$, and the corresponding decrease in energy level is $B - b(X_i)$. Since the subblock has composition $P$, the sum of energy decrements over the reception of the first subblock is $\sum_{i \in I_{/}} (B - b(X_i)) = G$, and claim (b) follows.

For proving claim (c), we note that the condition $E(1) \geq G$ implies that there is no energy outage during the reception of the first subblock (using claim (b)). Further, if there is no overflow, then $E(L+1) \geq E(1) \geq G$ (using claim (a)). In case there is energy overflow in the $i$th channel use for any $i \in I$, we have $E(i+1) = E_{\max} \geq 2G$, and thus $E(L+1) \geq E(i+1) - G \geq G$.

When $L$ satisfies (17), then $E_{\max} \geq 2G$. Since the initial energy level satisfies $E(1) \geq G$, the energy level at the start of every subblock is at least $G$ (by recursive application of Lemma 1(c)), and sufficiency follows from Lemma 1(b).

To prove necessity, we will show that when
$$L > \frac{E_{\max}}{\sum_{x \in \mathcal{X}_{/}} 2 P(x) (B - b(x))}, \qquad (57)$$

then CSCC codewords exist which will result in energy outage at the receiver. In this case,
$$G = \sum_{x \in \mathcal{X}_{/}} L P(x) (B - b(x)) > \frac{E_{\max}}{2}. \qquad (58)$$
Now let $L_1 = \sum_{x \in \mathcal{X}_{/}} L P(x)$, and define
$$P_1(x) = \begin{cases} \dfrac{P(x)}{\sum_{\tilde{x} \in \mathcal{X}_{/}} P(\tilde{x})}, & \text{if } x \in \mathcal{X}_{/}, \\ 0, & \text{otherwise,} \end{cases} \qquad P_2(x) = \begin{cases} 0, & \text{if } x \in \mathcal{X}_{/}, \\ \dfrac{P(x)}{\sum_{\tilde{x} \notin \mathcal{X}_{/}} P(\tilde{x})}, & \text{otherwise,} \end{cases}$$
$$S_1 = \left\{ x_1^L \,\middle|\, x_1^{L_1} \in T^{L_1}_{P_1},\; x_{L_1+1}^{L} \in T^{L-L_1}_{P_2} \right\}, \qquad S_2 = \left\{ x_1^L \,\middle|\, x_1^{L-L_1} \in T^{L-L_1}_{P_2},\; x_{L-L_1+1}^{L} \in T^{L_1}_{P_1} \right\}.$$
Clearly $S_1 \subset T^L_P$ and $S_2 \subset T^L_P$, where $S_1$ (resp. $S_2$) denotes the set of subblocks of length $L$ whose first (resp. last) $L_1$ input symbols belong to $\mathcal{X}_{/}$. Note that $E(1) \geq G$ is necessary to avoid outage, because if $E(1) < G$, then outage results when the first subblock in a codeword belongs to $S_1$. Let the first subblock in a given codeword belong to $S_2$. Since the last $L_1$ symbols (within the first subblock) belong to $\mathcal{X}_{/}$, we have $E(L+1) = [E(L - L_1 + 1) - G]^+$. If there is no outage during the reception of the first subblock,
$$E(L+1) = E(L - L_1 + 1) - G \leq E_{\max} - G < E_{\max}/2, \qquad (59)$$
where the last inequality follows from (58). Now let the second subblock belong to $S_1$. There is no energy outage during the reception of the first $L_1$ symbols within the second subblock if and only if $E(L+1) \geq G$. However, from (59) and (58) it follows that $E(L+1) < E_{\max}/2 < G$, and hence outage cannot be avoided in the second subblock. In general, outage results if $L$ satisfies (57) and any two adjacent subblocks in a codeword belong to $S_2$ and $S_1$, respectively. ∎
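The outage mechanism in this proof is easy to reproduce numerically. The sketch below is illustrative and not from the paper: it assumes the buffer update $E(i+1) = \min\{E(i) + b(X_i) - B,\, E_{\max}\}$ with outage declared whenever the buffer cannot supply $B$ during a channel use, consistent with the description above, and checks that an $S_2$-type subblock followed by an $S_1$-type subblock causes outage once $E_{\max} < 2G$ (i.e., once $L$ exceeds the threshold in (57)).

```python
def simulate_buffer(symbols, b, B, E0, Emax):
    """Track the receiver energy buffer over a symbol sequence; return (outage_occurred, final_level)."""
    E = E0
    for x in symbols:
        if E + b[x] < B:          # not enough energy to power this channel use
            return True, E
        E = min(E + b[x] - B, Emax)
    return False, E

# Binary on-off example: b(0) = 0, b(1) = 2, per-use consumption B = 1.
b = {0: 0.0, 1: 2.0}
B, L = 1.0, 8
ones, zeros = L // 2, L - L // 2          # composition P = (1/2, 1/2)
G = zeros * (B - b[0])                    # energy shortfall accumulated over the zero run
Emax = 1.5 * G                            # Emax < 2G, so L exceeds the threshold in (57)

s2_block = [1] * ones + [0] * zeros       # S2-type subblock: low-energy symbols at the end
s1_block = [0] * zeros + [1] * ones       # S1-type subblock: low-energy symbols at the start
outage, _ = simulate_buffer(s2_block + s1_block, b, B, E0=Emax, Emax=Emax)
print("outage:", outage)                  # expected: True when Emax < 2G
```

Repeating the run with Emax set to 2*G or larger (and E(1) at least G) produces no outage, matching Lemma 1.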



APPENDIX E
PROOF OF THEOREM 5

Let $X_1^n$ be a CSCC sequence with subblock length $L$ and composition $P$. The energy contained in a sliding window of size $T$ is smallest when the window is arranged such that the partially overlapping portions of subblocks contain symbols with low energy. We consider the two cases separately.

i) $T \leq L$: In this case, the total energy contained in a sliding window of size $T$ in $X_1^n$ is bounded as
$$\sum_{i=t}^{t+T-1} b(X_i) \;\geq\; \sum_{i=0}^{k_1 - 1} b(x_i)\, 2 L P(x_i) + u_1 b(x_{k_1}). \qquad (60)$$
Note that $L P(x_i)$ is the number of occurrences of symbol $x_i$ in each subblock, and the above expression considers the case where the sliding window partially overlaps with two subblocks such that the overlapping portion contains symbols with minimum energy. Further, one can construct a CSCC sequence $X_1^n$ and choose a starting index $t$ for which the inequality in (60) turns into an equality. Dividing (60) by $T$, we get (20) with $\delta_T$ given by (22).

ii) $T > L$: In this case, the number of subblocks which overlap completely with a sliding window of length $T$ is either $q_T$ or $q_T - 1$. When $q_T$ subblocks overlap completely with the window, the total energy contained in the window is lower bounded by
$$\nu_2 \triangleq q_T \sum_{i=0}^{m} b(x_i) L P(x_i) + \sum_{i=0}^{k_2 - 1} b(x_i)\, 2 L P(x_i) + u_2 b(x_{k_2}), \qquad (61)$$
while the energy within the window is lower bounded by
$$\nu_3 \triangleq (q_T - 1) \sum_{i=0}^{m} b(x_i) L P(x_i) + \sum_{i=0}^{k_3 - 1} b(x_i)\, 2 L P(x_i) + u_3 b(x_{k_3}), \qquad (62)$$
when $q_T - 1$ subblocks overlap completely with the window. From (61) and (62), it follows that the total energy in a sliding window is bounded as
$$\sum_{i=t}^{t+T-1} b(X_i) \geq \min\{\nu_2, \nu_3\}. \qquad (63)$$
Further, one can construct a CSCC sequence $X_1^n$ and choose a starting index $t$ for which the inequality in (63) turns into an equality. Dividing (63) by $T$, we get (20) with $\delta_T$ given by (24). ∎

APPENDIX F
PROOF OF COROLLARY 1

The number of zeros in each subblock is $(L - l_1)$. Consider a CSCC sequence where the first two subblocks are such that $(L - l_1)$ zeros are stacked at the end of the first subblock, and also stacked at the start of the second subblock. For this sequence, the energy per symbol within a sliding window of size $1 \leq T \leq 2(L - l_1)$ is zero when the window overlaps with the zeros in the first two subblocks. The average energy per symbol in a window of size $T = (j+2)L - 2l_1$ is minimized when the window overlaps completely with $j$ subblocks, while partially overlapping with two subblocks, located at the start and end, respectively, of the window, where these partially overlapping positions contain only zeros. Hence
$$\xi(C_P^L, (j+2)L - 2l_1) = \frac{j\, l_1}{jL + 2(L - l_1)}.$$
For $0 \leq \eta \leq l_1$, we have
$$\xi(C_P^L, (j+2)L - 2l_1 + \eta) = \frac{j\, l_1 + \eta}{jL + 2(L - l_1) + \eta},$$
and hence $\xi(C_P^L, T)$ is strictly increasing in $T$ within the time interval $[(j+2)L - 2l_1,\, (j+2)L - l_1]$. For $0 \leq \eta \leq (L - l_1)$, we have
$$\xi(C_P^L, (j+2)L - l_1 + \eta) = \frac{(j+1)\, l_1}{(j+1)L + (L - l_1) + \eta},$$
and hence $\xi(C_P^L, T)$ is strictly decreasing in $T$ within the time interval $[(j+2)L - l_1,\, (j+3)L - 2l_1]$. ∎
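The oscillation of $\xi(C_P^L, T)$ described above can be checked by direct computation. The following sketch is illustrative and not from the paper: it assumes on-off energies $b(0)=0$ and $b(1)=1$, builds the worst-case binary arrangement with the zero runs of adjacent subblocks stacked back to back, and evaluates the minimum average energy per symbol over all windows of each size $T$.

```python
def min_window_average(seq, b, T):
    """Minimum average energy over all length-T windows of seq."""
    energies = [b[x] for x in seq]
    best = min(sum(energies[t:t + T]) for t in range(len(seq) - T + 1))
    return best / T

L, l1, num_subblocks = 8, 3, 6            # l1 ones and (L - l1) zeros per subblock
b = {0: 0.0, 1: 1.0}

# Worst-case arrangement: alternate subblocks ending / starting with the zero run,
# so that the zeros of neighbouring subblocks sit next to each other.
block_a = [1] * l1 + [0] * (L - l1)
block_b = [0] * (L - l1) + [1] * l1
seq = (block_a + block_b) * (num_subblocks // 2)

for T in range(1, 3 * L):
    print(T, round(min_window_average(seq, b, T), 3))
```

The printed profile stays at zero up to $T = 2(L - l_1)$, then rises and falls in the alternating pattern predicted by the corollary.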


APPENDIX G
PROOF OF THEOREM 6

We will employ Gallager's definition of a symmetric channel [51] to show that the induced vector channel using CSCC is always symmetric.

Definition 2 ([51]). A DMC is symmetric if the set of outputs can be partitioned into subsets in such a way that for each subset the matrix of transition probabilities (using inputs as rows and outputs of the subset as columns) has the property that each row is a permutation of each other row and each column (if more than 1) is a permutation of each other column.

Theorem 10 ([51]). For a symmetric discrete memoryless channel, capacity is achieved by using the inputs with equal probability.

We now show that when CSCC is employed on a DMC, the induced vector channel is symmetric, even when the underlying (scalar) DMC is not symmetric. This claim will be proved if we can partition the outputs into subsets such that for each subset the matrix of transition probabilities has the property that each row (column) is a permutation of each other row (column).

If $y_1^L \in T_Q^L$ and $\tilde{y}_1^L \in T_Q^L$ for a given composition $Q$, then $\tilde{y}_1^L = \pi(y_1^L)$ for some permutation $\pi$. Let $T_P^L$ be the input alphabet for the induced vector channel using CSCC with subblock-composition $P$. Then it can be verified that $\{W^L(\pi(y_1^L) \mid x_1^L) : x_1^L \in T_P^L\} = \{W^L(y_1^L \mid x_1^L) : x_1^L \in T_P^L\}$, and so the columns of the vector channel $W^L$ corresponding to the output subset $T_Q^L$ are permutations of each other. Similarly, from symmetry it follows that $\{W^L(y_1^L \mid \pi(x_1^L)) : y_1^L \in T_Q^L\} = \{W^L(y_1^L \mid x_1^L) : y_1^L \in T_Q^L\}$, and hence the rows of the vector channel transition matrix are also permutations of each other. Thus $W^L$ is symmetric and the proof is complete using Theorem 10. ∎
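This permutation structure can be verified mechanically for small parameters. The sketch below is illustrative and not from the paper: it assumes $L = 2$, an asymmetric binary DMC $W$, and composition $P = (1/2, 1/2)$, builds the induced vector channel, partitions $\mathcal{Y}^2$ by composition, and checks Gallager's row/column permutation property on each output subset.

```python
import itertools
import numpy as np

W = np.array([[0.8, 0.2],      # an asymmetric binary DMC: rows are inputs, columns outputs
              [0.4, 0.6]])
L = 2
inputs = [(0, 1), (1, 0)]                      # T_P^L for P = (1/2, 1/2)
outputs = list(itertools.product([0, 1], repeat=L))

def WL(y, x):                                  # memoryless product channel W^L
    return float(np.prod([W[xi, yi] for xi, yi in zip(x, y)]))

def is_permutation(u, v):
    return bool(np.allclose(np.sort(u), np.sort(v)))

symmetric = True
for comp in {tuple(sorted(y)) for y in outputs}:            # partition Y^L by composition
    subset = [y for y in outputs if tuple(sorted(y)) == comp]
    M = np.array([[WL(y, x) for y in subset] for x in inputs])
    rows_ok = all(is_permutation(M[0], M[r]) for r in range(len(inputs)))
    cols_ok = all(is_permutation(M[:, 0], M[:, c]) for c in range(len(subset)))
    symmetric = symmetric and rows_ok and cols_ok
print("induced vector channel symmetric:", symmetric)       # expected: True
```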



APPENDIX H
PROOF OF THEOREM 8

(a) For the BSC, we have the strict inequality $0 < C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P)$ for $0 < p_0 < 0.5$, since
$$C^L_{\mathrm{CSCC}}(P) = \frac{1}{L}\left[ H(Y_1^L) - H(Y_1^L \mid X_1^L) \right] = \frac{1}{L}\left[ \sum_{i=1}^{L} H(Y_i \mid Y_1^{i-1}) - \sum_{i=1}^{L} H(Y_i \mid X_i) \right] \overset{(i)}{<} \frac{1}{L}\left[ \sum_{i=1}^{L} H_{P \times W}(Y_i) - \sum_{i=1}^{L} H_{P \times W}(Y_i \mid X_i) \right] \overset{(ii)}{=} C_{\mathrm{CCC}}(P),$$
where $Y_1^{i-1} = Y_1 \ldots Y_{i-1}$, (i) follows since $Y_i$ is related to $Y_1^{i-1}$ via $X_1^{i-1}$ and $X_i$, and the fact that the joint probability $P_{XY}(X_i, Y_i)$ is given by (9), while (ii) follows from (30). For composition $P$ with $0 < \gamma = \min(P(0), P(1)) \leq 0.5$, the output entropy on a BSC is $H(Y) = h(p_0 \star \gamma)$ and hence
$$C_{\mathrm{CCC}}(P) = h(p_0 \star \gamma) - h(p_0). \qquad (64)$$
For CSCC, from (33) and the definition of $\alpha$, it follows that
$$\frac{1}{L} H(X_1^L) = H(P) - r(L, P) = h(\gamma) - r(L, P) = h(\alpha). \qquad (65)$$
Now using (65) and applying Mrs. Gerber's Lemma [45], $\frac{1}{L} H(Y_1^L) \geq h(p_0 \star \alpha)$, and hence
$$C^L_{\mathrm{CSCC}}(P) = \frac{1}{L}\left[ H(Y_1^L) - \sum_{i=1}^{L} H(Y_i \mid X_i) \right] \geq h(p_0 \star \alpha) - h(p_0). \qquad (66)$$
From (64) and (66) we have $C_{\mathrm{CCC}}(P) - C^L_{\mathrm{CSCC}}(P) \leq h(p_0 \star \gamma) - h(p_0 \star \alpha)$. We only have to show that $h(p_0 \star \gamma) - h(p_0 \star \alpha) < r(L, P)$ to complete the proof. Towards this, we first observe that when $0 < x \leq 0.5$ and $0 < p_0 < 0.5$, then $p_0 \star x \geq x$. Next we note that the derivative of $h(x)$ satisfies $h'(x) = \log \frac{1-x}{x}$, and hence is a monotonically decreasing function of $x$ for $0 < x \leq 0.5$. Since $h(\alpha) = h(\gamma) - r(L, P)$, we have
$$h(p_0 \star \gamma) - h(p_0 \star \alpha) < r(L, P) \iff h(p_0 \star \gamma) - h(\gamma) < h(p_0 \star \alpha) - h(\alpha). \qquad (67)$$
If we define $f(x) = h(p_0 \star x) - h(x)$ for $0 \leq x \leq 0.5$, then $f'(x) = (1 - 2p_0)\, h'(p_0 \star x) - h'(x)$. Hence $f'(x) < 0$ for $0 < x \leq 0.5$, since $h'(x)$ is monotonically decreasing in $x$ and $p_0 \star x \geq x$. This in turn implies that $f(x)$ is a strictly monotonically decreasing function of $x$. It follows that $f(\gamma) < f(\alpha)$ (since $\alpha < \gamma$) and (67) is satisfied.

(b) When $0 < \epsilon < 1$, we have
$$C^L_{\mathrm{CSCC}}(P) \overset{(i)}{<} \frac{1}{L}\left[ \sum_{i=1}^{L} H_{P \times W}(Y_i) - \sum_{i=1}^{L} H_{P \times W}(Y_i \mid X_i) \right] \overset{(ii)}{=} C_{\mathrm{CCC}}(P),$$
where the strict inequality (i) follows since $Y_i$ is related to $Y_1^{i-1}$ via $X_1^{i-1}$ and $X_i$, and (ii) follows from (30). When $\gamma = P(0) = \Pr(X = 0)$, then for the BEC we have $C_{\mathrm{CCC}}(P) = (1 - \epsilon) h(\gamma)$. If $\alpha$ is chosen such that $h(\alpha) = h(\gamma) - r(L, P)$, then from (33) it follows that $H(X_1^L)/L = h(\alpha)$. Now applying an extension of MGL for binary-input symmetric channels [46], we get $H(Y_1^L)/L \geq (1 - \epsilon) h(\alpha) + h(\epsilon)$. Thus, $C^L_{\mathrm{CSCC}}(P) = \frac{1}{L}\left[ H(Y_1^L) - \sum_{i=1}^{L} H(Y_i \mid X_i) \right] \geq (1 - \epsilon) h(\alpha)$, and (43) follows from the expressions for $C_{\mathrm{CCC}}(P)$ and $C^L_{\mathrm{CSCC}}(P)$. ∎
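The key inequality (67) is easy to sanity-check numerically. The sketch below is illustrative and not from the paper: it assumes the binary rate loss $r(L,P) = h(\gamma) - \frac{1}{L}\log_2\binom{L}{l_1}$ (consistent with (33) and (65)), solves $h(\alpha) = h(\gamma) - r(L,P)$ by bisection, and compares $h(p_0 \star \gamma) - h(p_0 \star \alpha)$ with $r(L,P)$ for a few subblock lengths.

```python
from math import comb, log2

def h(x):
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def conv(p, q):                      # binary convolution: p(1-q) + q(1-p)
    return p * (1 - q) + q * (1 - p)

def inv_h(target, lo=1e-12, hi=0.5):
    """Solve h(a) = target for a in (0, 0.5] by bisection (h is increasing there)."""
    for _ in range(80):
        mid = (lo + hi) / 2
        if h(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def check(L, l1, p0):
    gamma = min(l1, L - l1) / L
    r = h(gamma) - log2(comb(L, l1)) / L        # rate loss r(L, P) for the binary composition
    alpha = inv_h(h(gamma) - r)                 # h(alpha) = h(gamma) - r(L, P)
    lhs = h(conv(p0, gamma)) - h(conv(p0, alpha))
    print(f"L={L:3d}  lhs={lhs:.4f}  r(L,P)={r:.4f}  inequality holds: {lhs < r}")

for L in (4, 8, 16, 64):
    check(L, L // 2, p0=0.1)
```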

APPENDIX I
PROOF OF THEOREM 9

As discussed in Sec. IV-C, $L$ uses of the channel for CSCC using subblock-composition $P$, subblock length $L$, and codeword length $n$ on DMC $W$ may be viewed as a single use of a vector channel having


input alphabet $T_P^L$, output alphabet $\mathcal{Y}^L$, length-$L$ product channel $W^L$, and codeword length equal to $n/L$ super-letters. Since rate $R$ for the scalar channel corresponds to rate $LR$ for the vector channel, the random coding bound on the average probability of error for the induced vector channel $W^L$ is [51]
$$\tilde{P}_e \leq \exp\!\left[ -\frac{n}{L} E_r(LR) \right], \qquad (68)$$
where Gallager's error exponent $E_r(LR)$ is given by [51]
$$E_r(LR) = \max_{0 \leq \rho \leq 1} \max_{P_{X_1^L}} \left[ E_0(\rho, P_{X_1^L}) - \rho L R \right], \qquad (69)$$
$$E_0(\rho, P_{X_1^L}) = -\log \sum_{y_1^L \in \mathcal{Y}^L} \left[ \sum_{x_1^L \in T_P^L} P_{X_1^L}(x_1^L)\, W^L(y_1^L \mid x_1^L)^{1/(1+\rho)} \right]^{1+\rho}. \qquad (70)$$
For CSCCs, the induced vector channel is symmetric, and hence the maximum in (69) is achieved when the input distribution $P_{X_1^L}$ is uniform [51], given by
$$P(x_1^L) = \begin{cases} \dfrac{1}{|T_P^L|}, & \text{if } x_1^L \in T_P^L, \\ 0, & \text{otherwise.} \end{cases} \qquad (71)$$

(72)

V L ranging over all channels V L : TPL → Y L . Thus, the error probability bound (68) can equivalently

be written as n P˜e ≤ exp[− Er (LR, P L , W L )]. L

(73)

The first term on the right side of (72) is V L (y1L |xL 1) L L W (y1 |xL 1) L xL 1 ,y1   1 L L = −HP L ×V L (Y1 |X1 ) + EP L ×V L log L L L W (Y1 |X1 )    L L X (a) X 1 ≥ −HP ×Vi (Y |X) + EP ×Vi log = D(Vi ||W |P ), W (Yi |Xi )

D(V L ||W L |P L ) =

X

L L L P L (xL 1 )V (y1 |x1 ) log

i=1

(74)

(75) (76)

i=1

where Vi is the marginal distribution of V L corresponding to the ith symbol, and (a) follows from the memoryless property of W and the fact conditioning only reduces entropy.

April 18, 2016

DRAFT

32

The term I(P L , V L ) in (72) can be bounded as follows, I(P L , V L ) = HP L (X1L ) − HP L ×V L (X1L |Y1L ) (b)

≥ LHP (X) − Lr(L, P ) −

L X

(77)

HP ×Vi (Xi |Yi )

(78)

i=1

=

L X

I(P, Vi ) − Lr(L, P ),

(79)

i=1

where (b) follows using (33), and the fact conditioning only reduces entropy. Let Vb L denote that V L which achieves the minimum in (72). Using an argument similar to the one above (55) in Appendix C, it can be shown that for 1 < i ≤ L, the marginal distributions, Vbi , are identically distributed. Then Er (LR, P L , W L ) = D(Vb L ||W L |P L ) + [I(P L , Vb L ) − LR]+ (c)

≥ L D(Vb1 ||W |P ) + [L I(P, Vb1 ) − Lr(L, P ) − LR]+

≥ L min D(V ||W |P ) + [I(P, V ) − (R + r(L, P ))]+



V

= L Er (R + r(L, P ), P, W ),

(80)

where (c) follows from (76), (79), and the fact Vbi , 1 ≤ i ≤ L are identically distributed. The theorem is 

proved by applying (80) in (73). R EFERENCES

[1] A. Tandon, M. Motani, and L. R. Varshney, “Constant subblock composition codes for simultaneous energy and information transfer,” in Proc. IEEE SECON 2014 Workshop on Energy Harvesting Communications, Jun. 2014, pp. 45–50. [2] ——, “Real-time simultaneous energy and information transfer,” in Proc. 2015 IEEE Int. Symp. Inf. Theory, Jun. 2015, pp. 1124–1128. [3] N. Pavlidou, A. J. H. Vinck, J. Yazdani, and B. Honary, “Power line communications: state of the art and future trends,” IEEE Commun. Mag., vol. 41, no. 4, pp. 34–40, Apr. 2003. [4] L. R. Varshney, “On energy/information cross-layer architectures,” in Proc. 2012 IEEE Int. Symp. Inf. Theory, Jul. 2012, pp. 1361–1365. [5] Z. Ding, C. Zhong, D. W. K. Ng, M. Peng, H. A. Suraweera, R. Schober, and H. V. Poor, “Application of smart antenna technologies in simultaneous wireless information and power transfer,” IEEE Commun. Mag., vol. 53, no. 4, pp. 86–93, Apr. 2015. [6] S. Ulukus, A. Yener, E. Erkip, O. Simeone, M. Zorzi, P. Grover, and K. Huang, “Energy harvesting wireless communications: A review of recent advances,” IEEE J. Sel. Areas Commun., vol. 33, no. 3, pp. 360–381, Mar. 2015. [7] J. T. Tengdin, “Distribution line carrier communications - an historical perspective,” IEEE Trans. Power Del., vol. 2, no. 2, pp. 321–329, Apr. 1987.

April 18, 2016

DRAFT

33

[8] G. A. Covic and J. T. Boys, “Modern trends in inductive power transfer for transportation applications,” IEEE J. Emerg. Sel. Topics Power Electron., vol. 1, no. 1, pp. 28–41, March 2013. [9] J. S. Ho, A. J. Yeh, E. Neofytou, S. Kim, Y. Tanabe, B. Patlolla, R. E. Beygui, and A. S. Y. Poon, “Wireless power transfer to deep-tissue microimplants,” Proc. Natl. Acad. Sci. U.S.A., vol. 111, no. 22, pp. 7974–7979, Jun. 2014. [10] I. Krikidis, S. Timotheou, S. Nikolaou, G. Zheng, D. W. K. Ng, and R. Schober, “Simultaneous wireless information and power transfer in modern communication systems,” IEEE Commun. Mag., vol. 52, no. 11, pp. 104–110, Nov. 2014. [11] J. C. Lin, “Wireless power transfer for mobile applications, and health effects,” IEEE Antennas Propag. Mag., vol. 55, no. 2, pp. 250–253, Apr. 2013. [12] A. Yakovlev, S. Kim, and A. Poon, “Implantable biomedical devices: Wireless powering and communication,” IEEE Commun. Mag., vol. 50, no. 4, pp. 152–159, Apr. 2012. [13] G. Yilmaz, O. Atasoy, and C. Dehollain, “Wireless energy and data transfer for in-vivo epileptic focus localization,” IEEE Sensors J., vol. 13, no. 11, pp. 4172–4179, Nov. 2013. [14] L. R. Varshney, “Transporting information and energy simultaneously,” in Proc. 2008 IEEE Int. Symp. Inf. Theory, Jul. 2008, pp. 1612–1616. [15] P. Grover and A. Sahai, “Shannon meets Tesla: Wireless information and power transfer,” in Proc. 2010 IEEE Int. Symp. Inf. Theory, Jun. 2010, pp. 2363–2367. [16] P. Popovski, A. M. Fouladgar, and O. Simeone, “Interactive joint transfer of energy and information,” IEEE Trans. Commun., vol. 61, no. 5, pp. 2086–2097, May 2013. ´ I. Barbero, E. Rosnes, G. Yang, and Ø. Ytrehus, “Constrained codes for passive RFID communication,” in Proc. 2011 [17] A. Inf. Theory Appl. Workshop, Feb. 2011. [18] A. M. Fouladgar, O. Simeone, and E. Erkip, “Constrained codes for joint energy and information transfer,” IEEE Trans. Commun., vol. 62, no. 6, pp. 2121–2131, Jun. 2014. [19] A. Tandon, M. Motani, and L. R. Varshney, “On code design for simultaneous energy and information transfer,” in Proc. 2014 Inf. Theory Appl. Workshop, Feb. 2014. [20] K. A. S. Immink, P. H. Siegel, and J. K. Wolf, “Codes for digital recorders,” IEEE Trans. Inf. Theory, vol. 44, no. 6, pp. 2260–2299, Oct. 1998. [21] K. A. S. Immink, “Runlength-limited sequences,” Proc. IEEE, vol. 78, no. 11, pp. 1745–1759, Nov. 1990. [22] E. Zehavi and J. K. Wolf, “On runlength codes,” IEEE Trans. Inf. Theory, vol. 34, no. 1, pp. 45–54, 1988. [23] S. Shamai and Y. Kofman, “On the capacity of binary and Gaussian channels with run-length-limited inputs,” IEEE Trans. Commun., vol. 38, no. 5, pp. 584–594, 1990. [24] P. Jacquet and W. Szpankowski, “Noisy constrained capacity for BSC channels,” IEEE Trans. Inf. Theory, vol. 56, no. 11, pp. 5412–5423, Nov. 2010. [25] Z. Cherif, J.-L. Danger, S. Guilley, J.-L. Kim, and P. Sole, “Multiply constant weight codes,” in Proc. 2013 IEEE Int. Symp. Inf. Theory, Jul. 2013, pp. 306–310. [26] Y. M. Chee, Z. Cherif, J.-L. Danger, S. Guilley, H. M. Kiah, J.-L. Kim, P. Sole, and X. Zhang, “Multiply constant-weight codes and the reliability of loop physically unclonable functions,” IEEE Trans. Inf. Theory, vol. 60, no. 11, pp. 7026–7034, Nov. 2014. [27] Z. Cherif, J.-L. Danger, S. Guilley, and L. Bossuet, “An easy-to-design PUF based on a single oscillator: The Loop PUF,” in Proc. 2012 Euromicro Conf. Digital Syst. Design, Sept. 2012, pp. 156–162. [28] E. Ordentlich and R. M. 
Roth, “Two-dimensional weight-constrained codes through enumeration bounds,” IEEE Trans. Inf. Theory, vol. 46, no. 4, pp. 1292–1301, Jul. 2000.

April 18, 2016

DRAFT

34

[29] Y. M. Chee, H. M. Kiah, and P. Purkayastha, “Matrix codes and multitone frequency shift keying for power line communications,” in Proc. 2013 IEEE Int. Symp. Inf. Theory, Jul. 2013, pp. 2870–2874. [30] W. Chu, C. J. Colbourn, and P. Dukes, “Constructions for permutation codes in powerline communications,” Des. Codes Cryptog., vol. 32, pp. 51–64, 2004. [31] C. J. Colbourn, T. Klove, and A. C. H. Ling, “Permutation arrays for powerline communication and mutually orthogonal latin squares,” IEEE Trans. Inf. Theory, vol. 50, no. 6, pp. 1289–1291, Jun. 2004. [32] W. Chu, C. J. Colbourn, and P. Dukes, “On constant composition codes,” Discrete Appl. Math., vol. 154, no. 6, pp. 912–929, Apr. 2006. [33] Y. M. Chee, H. M. Kiah, P. Purkayastha, and C. Wang, “Importance of symbol equity in coded modulation for power line communications,” in Proc. 2012 IEEE Int. Symp. Inf. Theory, Jul. 2012, pp. 661–665. [34] O. Ozel and S. Ulukus, “AWGN channel under time-varying amplitude constraints with causal information at the transmitter,” in Conf. Rec. 45th Asilomar Conf. Signals, Syst. Comput., Nov. 2011, pp. 373–377. [35] ——, “Achieving AWGN capacity under stochastic energy harvesting,” IEEE Trans. Inf. Theory, vol. 58, no. 10, pp. 6471–6483, Oct. 2012. [36] Y. Dong and A. Ozgur, “Approximate capacity of energy harvesting communication with finite battery,” in Proc. 2014 IEEE Int. Symp. Inf. Theory, June 2014, pp. 801–805. [37] V. Jog and V. Anantharam, “An energy harvesting AWGN channel with a finite battery,” in Proc. 2014 IEEE Int. Symp. Inf. Theory, Jun. 2014, pp. 806–810. [38] R. Rajesh, V. Sharma, and P. Viswanath, “Capacity of Gaussian channels with energy harvesting and processing cost,” IEEE Trans. Inf. Theory, vol. 60, no. 5, pp. 2563–2575, May 2014. [39] W. Mao and B. Hassibi, “On the capacity of a communication system with energy harvesting and a limited battery,” in Proc. 2013 IEEE Int. Symp. Inf. Theory, Jul. 2013, pp. 1789–1793. [40] X. Zhou, R. Zhang, and C. K. Ho, “Wireless information and power transfer: Architecture design and rate-energy tradeoff,” IEEE Trans. Commun., vol. 61, no. 11, pp. 4754–4767, Nov. 2013. [41] S. Arimoto, “An algorithm for computing the capacity of arbitrary discrete memoryless channels,” IEEE Trans. Inf. Theory, vol. 18, no. 1, pp. 14–20, Jan. 1972. [42] R. E. Blahut, “Computation of channel capacity and rate-distortion functions,” IEEE Trans. Inf. Theory, vol. 18, no. 4, pp. 460–473, Jul. 1972. [43] I. Csisz´ar and J. K¨orner, Information Theory: Coding Theorems for Discrete Memoryless Systems (2nd ed).

Cambridge

Univ. Press, 2011. [44] R. M. Fano, Transmission of Information: A Statistical Theory of Communications.

Cambridge, MA: MIT Press, 1961.

[45] A. D. Wyner and J. Ziv, “A theorem on the entropy of certain binary sequences and applications–I,” IEEE Trans. Inf. Theory, vol. IT-19, no. 6, pp. 769–772, Nov. 1973. [46] N. Chayat and S. Shamai, “Extension of an entropy property for binary input memoryless symmetric channels,” IEEE Trans. Inf. Theory, vol. 35, no. 5, pp. 1077–1079, Sept. 1989. [47] Y. Polyanskiy, “Channel coding: Non-asymptotic fundamental limits,” Ph.D. thesis, Princeton University, Nov. 2010. [48] P. Moulin, “The log-volume of optimal constant-composition codes for memoryless channels, within O(1) bits,” in Proc. 2012 IEEE Int. Symp. Inf. Theory, Jul. 2012, pp. 826–830. [49] M. Tomamichel and V. Tan, “A tight upper bound for the third-order asymptotics for most discrete memoryless channels,” IEEE Trans. Inf. Theory, vol. 59, no. 11, pp. 7041–7051, Nov. 2013.

April 18, 2016

DRAFT

35

[50] Y. Polyanskiy, H. V. Poor, and S. Verdu, “Channel coding rate in the finite blocklength regime,” IEEE Trans. Inf. Theory, vol. 56, no. 5, pp. 2307–2359, May 2010. [51] R. G. Gallager, Information Theory and Reliable Communication. [52] R. B. Ash, Information Theory.

New York: John Wiley and Sons, Inc., 1968.

New York: Dover Publications, 1965.

[53] A. Tandon, M. Motani, and L. R. Varshney, “Subblock-constrained codes for real-time simultaneous energy and information transfer,” May 2015, arXiv:1506.00213v1 [cs.IT]. [54] R. E. Blahut, “Hypothesis testing and information theory,” IEEE Trans. Inf. Theory, vol. 20, no. 4, pp. 405–417, Jul. 1974. [55] I. Csisz´ar and J. K¨orner, “Graph decomposition: A new key to coding theorems,” IEEE Trans. Inf. Theory, vol. 27, no. 1, pp. 5–12, Jan. 1981.

April 18, 2016

DRAFT