
Secure Broadcasting over Fading Channels Ashish Khisti, Student Member, IEEE, Aslan Tchamkerten, and Gregory W. Wornell, Fellow, IEEE

Abstract—Motivated by the key-distribution application, we study the information theoretic problem of broadcasting confidential messages to multiple receivers. Two scenarios are considered: all receivers want a common message, and each receiver wants an independent message. For the case of reversely degraded parallel channels with one sender, one eavesdropper, and an arbitrary number of legitimate receivers, we determine the common-message-secrecy-capacity and the secrecy-sum-capacity for independent messages. For the case of fading channels, we assume that the channel state information of the legitimate receivers is known to all the terminals, while the channel state information of the eavesdropper is known only to the eavesdropper. For the case of a common message, our proposed scheme achieves a rate independent of the number of receivers, in contrast to naive schemes whose rate vanishes with the number of receivers. For the case of independent messages, an opportunistic transmission scheme is proposed and shown to be optimal in the limit of a large number of receivers.

Index Terms—Key distribution, wiretap channel, information theoretic secrecy, confidential messages, parallel channels, fading channels, multiuser diversity, multicasting

I. INTRODUCTION

A number of emerging applications require a key-distribution mechanism to selectively broadcast confidential messages to legitimate receivers. For example, in pay-TV systems, a content provider wishes to selectively broadcast a certain program to the subset of customers who have subscribed to it. An online key-distribution mechanism enables the service provider to distribute a decryption key to these legitimate receivers while securing it from potential eavesdroppers. The program can then be encrypted via standard cryptographic protocols, so that only users who have access to the decryption key can view it. In the absence of such a mechanism, current solutions rely on variants of traditional public key cryptography (see, e.g., [6]) and are vulnerable to attacks such as piracy [8].

The problem of broadcasting confidential messages in an information theoretic setting was formulated by Wyner [24]. Wyner's wiretap channel has three terminals: one sender, one legitimate receiver, and one eavesdropper. The tradeoff between the rate to the legitimate receiver and the equivocation at the eavesdropper is characterized when the eavesdropper's channel is degraded with respect to the legitimate receiver's. This formulation has been generalized to non-degraded broadcast channels in [4] and applied to Gaussian channels in [12]. Recently, the wiretap channel has received renewed interest for secure communication in wireless environments [1], [10], [11], [14], [18]. The common theme in these works is that the channel variations experienced by the receivers can be exploited to enable secure communication even when the eavesdropper's channel is on average stronger than the legitimate receiver's. Some of these works [10], [11], [18] observe that for ergodic fading channels only statistical knowledge of the eavesdropper's channel suffices for secure communication, and the proposed strategies carefully adapt to the channel variations of the legitimate receiver.
In the present paper we further investigate physical layer security by considering the setting of multiple receivers — a natural scenario for key distribution applications.

This work was supported in part by NSF under Grant No. CCF-0515109. The authors are with the Massachusetts Institute of Technology. Email: {khisti,tcham,gww}@mit.edu. This work was presented in part at the 44th Annual Allerton Conference on Communication, Control and Computing, Monticello, IL, September 26-29, 2006.


We first extend Wyner's wiretap channel to parallel broadcast channels with one sender, multiple legitimate receivers, and one eavesdropper. We consider two situations: all legitimate receivers get a common message, or each gets an independent message. We first derive upper and lower bounds on the common-message-secrecy-capacity; these bounds coincide when the users are reversely degraded. Next, we consider the case where the legitimate receivers get independent messages, and establish the secrecy-sum-capacity for the reversely degraded case. The achievable scheme is simple: transmit to the strongest user on each parallel channel and use independent codebooks across the channels. Our results for parallel broadcast channels can be viewed as generalizations of the results in [7], which considers a similar setup without an eavesdropper. Nevertheless, our achievable scheme for the common-message-secrecy-capacity, when specialized to the case of no eavesdropper, yields a different scheme than the one proposed in [7].

We then extend our achievable schemes for the parallel channels to fading channels. In this setup, we assume that the legitimate receivers' channel state information (CSI) is revealed to all communicating parties (including the eavesdropper), while the eavesdropper's channel gains are revealed only to her. We first examine the case when a common message needs to be delivered to all legitimate receivers in the presence of potential eavesdroppers, and present a scheme that achieves a rate that does not decay to zero with an increasing number of receivers. Note that, without a secrecy constraint, transmitter CSI appears to be of little value for multicasting over ergodic channels: the capacity appears to be not far from the maximum rate achievable with a flat power allocation. The secrecy constraint adds a new dimension to the multicasting problem, as it requires protocols that exploit transmitter CSI efficiently.
For the case of independent messages, we propose an opportunistic scheme that selects the user with the strongest channel at each time. With Gaussian wiretap codebooks for each legitimate receiver, we show that this scheme achieves the sum capacity in the limit of a large number of receivers. Our results can be interpreted as the wiretap analog of the multiuser diversity results in settings without a secrecy constraint (see, e.g., [22]).

The remainder of the paper is organized as follows. In Section II we formally describe the channel models considered in this work. The main results are summarized in Section III. The common message case is considered in Sections IV and V for the parallel and fading channels, respectively, while Sections VI and VII deal with the independent message case for parallel and fading channels.

We use the following notation. Upper case letters are used for random variables and lower case letters for their realizations. The notation s^n denotes a vector of length n. Vector quantities related to the eavesdropper carry a subscript e, e.g., y_e^n, while those of the legitimate receivers are subscripted by the user number, e.g., y_i^n. We use the subscript i to index the receivers and the subscript j to index the channels. The letter t denotes the discrete time index. If there is an ordering of users on a given channel, the strongest user on channel j is denoted by π_j, and the ordered users on channel j are denoted π_j(1), π_j(2), .... The entropy of a discrete random variable X is denoted by H(X), and the mutual information between two random variables X and Y is denoted by I(X; Y). Following this convention, p(X) denotes the probability mass function of the random variable X, as opposed to p_X(·).

II. CHANNEL MODEL

In this section, we present formal definitions of the parallel channel and fading channel models considered in the rest of this paper.

A. Parallel Channels

In our setup, there are M parallel channels for communication, one sender, K legitimate receivers, and one eavesdropper.

Definition 1 (Product Broadcast Channel): An (M, K) product broadcast channel consists of one sender, K receivers, one eavesdropper, and M channels. The channels have finite input and output alphabets, are


Fig. 1. An example of a reversely degraded parallel channel per Definition 2, with one sender, K = 2 users, one eavesdropper, and M = 3 channels. The input symbols to the three channels are (X1, X2, X3). The outputs at receiver i are (Yi1, Yi2, Yi3), while the outputs at the eavesdropper are (Ye1, Ye2, Ye3). Note that on each channel the receivers are degraded, but the order of degradation differs across the channels.

memoryless and independent of each other, and are characterized by their transition probabilities

$$\Pr\left(\{y_{1j}^n, y_{2j}^n, \ldots, y_{Kj}^n, y_{ej}^n\}_{j=1,\ldots,M} \,\middle|\, \{x_j^n\}_{j=1,\ldots,M}\right) = \prod_{j=1}^{M} \prod_{t=1}^{n} \Pr\left(y_{1j}(t), y_{2j}(t), \ldots, y_{Kj}(t), y_{ej}(t) \mid x_j(t)\right), \qquad (1)$$

where $x_j^n = x_j(1), x_j(2), \ldots, x_j(n)$ denotes the sequence of symbols transmitted on channel j, and where $y_{ij}^n = y_{ij}(1), y_{ij}(2), \ldots, y_{ij}(n)$ denotes the sequence of symbols received by user i on channel j from time 1 up to n. The alphabets of the $X_j$'s and $Y_{ij}$'s are denoted by $\mathcal{X}$ and $\mathcal{Y}$, respectively.

Of particular interest is a special class of reversely degraded broadcast channels.

Definition 2 (Reversely Degraded Broadcast Channel): An (M, K) reversely degraded broadcast channel is an (M, K) product broadcast channel in which each of the M parallel channels is degraded in some order. That is, for each channel j there is a permutation $\pi_j(1), \pi_j(2), \ldots, \pi_j(K+1)$ of the set $\{1, 2, \ldots, K, e\}$ of the K+1 receivers such that the Markov chain $X_j \rightarrow Y_{\pi_j(1)} \rightarrow Y_{\pi_j(2)} \rightarrow \cdots \rightarrow Y_{\pi_j(K+1)}$ holds.

Note that in Definition 2 the order of degradation may differ across the channels, so the overall channel need not be degraded. An example of a reversely degraded parallel channel is shown in Fig. 1. Also, on each parallel channel the K users and the eavesdropper are physically degraded. The capacity results, however, depend only on the marginal distributions of the receivers on each channel. Accordingly, these results also hold for the larger class in which the receivers on each channel are stochastically degraded.
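The memoryless factorization in (1) is easy to mirror in simulation: each of the M channels can be sampled independently at every time step. The following sketch is our own illustration, not from the paper; the binary alphabets and crossover probabilities are hypothetical.

```python
import random

def sample_product_channel(x, channels, rng):
    """Sample one time step of an (M, K) product broadcast channel.

    x        : list of M input symbols, one per parallel channel
    channels : list of M dicts mapping input symbol -> list of
               (probability, (y_1, ..., y_K, y_e)) output entries
    Outputs are drawn independently across channels, mirroring the
    factorization over j in (1).
    """
    outputs = []
    for xj, ch in zip(x, channels):
        probs, symbols = zip(*ch[xj])
        outputs.append(rng.choices(symbols, weights=probs, k=1)[0])
    return outputs

# Toy example: M = 2 channels, K = 1 receiver plus an eavesdropper, each
# channel a pair of binary symmetric channels with independent flips
# (hypothetical crossover probabilities).
def bsc_pair(eps1, eps_e):
    table = {}
    for x in (0, 1):
        entries = []
        for f1 in (0, 1):
            for fe in (0, 1):
                p = (eps1 if f1 else 1 - eps1) * (eps_e if fe else 1 - eps_e)
                entries.append((p, (x ^ f1, x ^ fe)))
        table[x] = entries
    return table

rng = random.Random(0)
channels = [bsc_pair(0.1, 0.4), bsc_pair(0.2, 0.3)]
print(sample_product_channel([0, 1], channels, rng))
```

Each call returns one joint output tuple per channel; repeated calls generate the length-n sequences appearing in (1).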

B. Fading Channels

We consider the case of one sender, K receivers, and one eavesdropper, and focus on the fast fading channel model

$$y_i(t) = h_i(t) x(t) + z_i(t), \qquad i \in \{1, 2, \ldots, K, e\}, \qquad t \in \{1, 2, \ldots, n\}. \qquad (2)$$

We assume that the channel gains of all the users are sampled independently: the gains {h_i(t)} are i.i.d. CN(0, µ_i) across t = 1, 2, ..., n, and the eavesdropper's gains h_e(t) are i.i.d. CN(0, µ_e), independent of the channel gains of all the users. All the noise variables z_i(t) are i.i.d. CN(0, 1), and the input must satisfy an average power constraint E[|X(t)|²] ≤ P.

In practice, the fast fading model (2) is unrealistic. A typical model for wireless systems is a block fading channel, where the channel gains are constant over a coherence period and change across blocks. In conventional systems, the ergodic capacity does not depend on the coherence period (see, e.g., [2]). In contrast, in the wiretap setting the capacity appears to be sensitive to the length of the coherence period [10]. Note that the fast fading model (2) can be realized via interleaving of codebooks.

Throughout, we assume the h_i(t)'s are revealed to the transmitter, the K legitimate receivers, and the eavesdropper in a causal manner. Implicitly, we assume that there is an authenticated public feedback link from the receivers to the transmitter. The channel coefficients of the eavesdropper {h_e(t)}_{1≤t≤n} are only


known to the eavesdropper. The transmitter and the legitimate receivers have only statistical knowledge of the eavesdropper's channel gains. Also recall that there is no prior key shared between the sender and the legitimate receivers, and both the encoding/decoding functions and the codebook are public. Secure communication is enabled by adapting the transmission to match the legitimate receivers' channels.

As a final remark, we note that our setup assumes only one eavesdropper. However, the secrecy capacity depends only on the statistics of the eavesdropper's channel H_e(t) and not on the actual realized sequence. Since the transmitter and the legitimate receivers do not have the CSI of the eavesdropper's channel, the encoding and decoding functions do not depend on h_e(t). In this sense, if the message is secure against any given eavesdropper, it is also secure against any statistically equivalent eavesdropper.

III. MAIN RESULTS

In this section, we summarize the main results of the paper. All our results concern the secrecy capacity of the channel models stated in Section II. Formal definitions of achievable rates and the secrecy capacity are provided in the subsequent sections.

Remark 1: We focus only on the secrecy capacity, as opposed to the entire rate-equivocation region. In the key-distribution application of interest, the key length is limited by the equivocation rate Re — the minimum number of bits the eavesdropper needs to guess in order to decode the message. Accordingly, the secrecy capacity is of primary interest.¹

A. Parallel Channels - Common Message

Our main result is the common-message-secrecy-capacity for the reversely degraded parallel channel.

Theorem 1: The common-message-secrecy-capacity for the reversely degraded channel model in Definition 2 is given by

$$C_{K,M}^{\mathrm{common}} = \max_{\prod_{j=1}^{M} p(X_j)} \; \min_{i \in \{1,2,\ldots,K\}} \; \sum_{j=1}^{M} I(X_j; Y_{ij} \mid Y_{ej}) \qquad (3)$$

where the maximization is over product distributions, i.e., p(X1, ..., XM) = p(X1) × p(X2) × ... × p(XM), and the minimization is over the set of users.

Towards this end, we establish upper and lower bounds on the common-message-secrecy-capacity for the product channel model (1) and observe that the bounds coincide for the reversely degraded model. To state our upper bound we introduce the following additional notation. For each j = 1, 2, ..., M, let P_j denote the collection of all joint distributions p′(Y1j, Y2j, ..., YKj, Yej | Xj) with the same marginal distributions as p(Y1j | Xj), p(Y2j | Xj), ..., p(YKj | Xj), p(Yej | Xj). Let P = P1 × P2 × ... × PM denote the Cartesian product of these sets across the channels.

Lemma 1 (Upper Bound): For the product broadcast channel model in Definition 1, an upper bound on the secrecy capacity is given by

$$R_{K,M}^{+,\mathrm{common}} \triangleq \min_{\mathcal{P}} \; \max_{\prod_{j=1}^{M} p(X_j)} \; \min_{i \in \{1,2,\ldots,K\}} \; \sum_{j=1}^{M} I(X_j; Y_{ij} \mid Y_{ej}) \qquad (4)$$

where the first minimum is over all joint distributions {p′(Y11, Y21, ..., YK1, Ye1 | X1), ..., p′(Y1M, Y2M, ..., YKM, YeM | XM)} ∈ P.

¹Throughout this paper, we refer to "perfect secrecy" as the scenario in which the normalized mutual information at the eavesdropper is vanishingly small in the block length. Note that this is significantly weaker than the notion considered by Shannon [19], which requires that the mutual information be zero regardless of the block length, and the notion of Maurer and Wolf [17], which requires that the mutual information approach zero with the block length (without normalization by the block length). See the conclusion section of this paper for further discussion.


and the maximum is over all marginal distributions p(X1), ..., p(XM) of mutually independent random variables X1, X2, ..., XM.

Lemma 2 (Lower Bound): An achievable common-message-secrecy-rate for the product broadcast channel model in Definition 1 is²

$$R_{K,M}^{-,\mathrm{common}} \triangleq \max_{\substack{\prod_{j=1}^{M} p(U_j) \\ \{X_j = f_j(U_j)\}_{j=1,\ldots,M}}} \; \min_{i \in \{1,2,\ldots,K\}} \; \sum_{j=1}^{M} \left\{ I(U_j; Y_{ij}) - I(U_j; Y_{ej}) \right\}^{+}. \qquad (5)$$

The random variables U1, U2, ..., UM are independent over some alphabet U, and each f_j : U → X, j = 1, ..., M, is a mapping from U to X.

Remark 2: In [7] the problem of broadcasting common and independent messages over reversely degraded channels has been considered. We remark that our achievable scheme in Lemma 2, when specialized to the case of no eavesdropper, yields a different capacity-achieving scheme than the one in [7]. An obvious random-binning extension of the scheme presented in [7] does not achieve the secrecy capacity in Theorem 1.

B. Fading Channels - Common Message

A number of recent works [1], [10], [11], [14] have observed that secure communication is possible over fading channels even when the eavesdropper's channel is on average stronger than the legitimate receiver's channel. The key idea is to adapt the rate and transmit power to "match" the channel of the legitimate receiver. In the present work, we provide some insight into the robustness of these schemes by considering the case when a common message has to be delivered to a set of users while keeping it secret from potential eavesdroppers. The common message constraint requires us to simultaneously adapt rate and power to the channel gains of several legitimate users. Somewhat surprisingly, we observe that it is possible to broadcast at a rate independent of the number of legitimate users.

Theorem 2: The common-message-secrecy-rate for the fast-fading model (2) is bounded by

$$C^{\mathrm{common}}(P) \geq R^{\mathrm{common}}(P) = \min_{1 \leq i \leq K} E_{H_i}\!\left[\left\{\log(1 + |H_i|^2 P) - E_{H_e}[\log(1 + |H_e|^2 P)]\right\}^{+}\right]$$

$$C^{\mathrm{common}}(P) \leq R_{+}^{\mathrm{common}}(P) = \min_{1 \leq i \leq K} \; \max_{P(H_i): E[P(H_i)] \leq P} E\!\left[\left\{\log(1 + |H_i|^2 P(H_i)) - \log(1 + |H_e|^2 P(H_i))\right\}^{+}\right]. \qquad (6)$$

One can contrast the achievable rate in (6) with the rate of the following naive scheme: the transmitter transmits only when all the users have channel gains above a threshold. Such a scheme achieves a rate that decays to zero exponentially in the number of users and is highly suboptimal. We numerically evaluate the performance of our proposed scheme in the high-SNR regime.

Corollary 1: If the channel gains of all the users and the eavesdropper are i.i.d. Rayleigh faded with E[|H_i|²] = 1, then

$$\lim_{P \to \infty} R^{\mathrm{common}}(P) = 0.709 \text{ bits/symbol}, \qquad \lim_{P \to \infty} R_{+}^{\mathrm{common}}(P) = 1 \text{ bit/symbol}. \qquad (7)$$
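The high-SNR constant in (7) can be checked numerically: in the limit P → ∞ the lower bound reduces to E[{log₂|H|² − E[log₂|He|²]}⁺] with |H|², |He|² ~ Exp(1) for unit-variance Rayleigh fading, and E[log₂|He|²] = −γ/ln 2 in closed form. The following Monte Carlo sketch is our own check, not from the paper.

```python
import math
import random

def high_snr_common_rate(n_samples=1_000_000, seed=1):
    """Monte Carlo estimate of lim_{P->inf} R^common(P) for i.i.d.
    Rayleigh fading: E[{log2 |H|^2 - E[log2 |He|^2]}^+], |H|^2 ~ Exp(1).
    E[log2 |He|^2] = -gamma / ln 2 (Euler-Mascheroni constant)."""
    rng = random.Random(seed)
    e_log_he = -0.5772156649015329 / math.log(2)
    total = 0.0
    for _ in range(n_samples):
        h2 = rng.expovariate(1.0)  # |H|^2 for a unit-variance Rayleigh gain
        total += max(0.0, math.log2(h2) - e_log_he)
    return total / n_samples

print(high_snr_common_rate())  # close to 0.709 bits/symbol, matching (7)
```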

While our proposed scheme achieves a rate independent of the number of users (and hence the best possible scaling with the number of users), the optimality of the scheme remains open. Note that even the special case of K = 1 has not been resolved for the fast fading model [11], [15].

²{v}⁺ stands for max{0, v}.


C. Parallel Channels - Independent Message

In the absence of the secrecy constraint, the sum capacity of the reversely degraded broadcast channel is maximized by serving only the strongest user on each parallel channel [21]. We show that the same scheme is also optimal under the secrecy constraint.

Theorem 3: Let π_j denote the strongest user on channel j. The secrecy-sum-capacity for the reversely degraded broadcast channel is given by

$$C_{K,M}^{\mathrm{sum}} = \max_{p(X_1) p(X_2) \cdots p(X_M)} \; \sum_{j=1}^{M} I(X_j; Y_{\pi_j j} \mid Y_{ej}). \qquad (8)$$

Furthermore, the expression in (8) is an upper bound on the secrecy-sum-capacity when only the legitimate users are reversely degraded — but the set of receivers together with the eavesdropper is not degraded.

D. Fading Channels - Independent Message

The problem of broadcasting independent messages to legitimate receivers over ergodic fading channels has been well studied [13], [21]. An opportunistic transmission scheme is shown to attain the largest sum capacity. Here, we derive analogous results for secure transmission. Let H_max denote the largest instantaneous channel gain among the K users.

Lemma 3: For the channel model (2), the secrecy-sum-capacity is upper and lower bounded as

$$R_K^{+}(P) = \max_{P(H_{\max}): E[P(H_{\max})] \leq P} E\!\left[\left\{\log(1 + |H_{\max}|^2 P(H_{\max})) - \log(1 + |H_e|^2 P(H_{\max}))\right\}^{+}\right] \qquad (9)$$

and

$$R_K^{-}(P) = \max_{P(H_{\max}): E[P(H_{\max})] \leq P} E\!\left[\log(1 + |H_{\max}|^2 P(H_{\max})) - \log(1 + |H_e|^2 P(H_{\max}))\right], \qquad (10)$$

respectively, where {v}⁺ denotes max(0, v). The difference between the bounds (9) and (10) is that the {·}⁺ operator is inside the expectation in the upper bound but not in the lower bound. Thus the "loss" with respect to the upper bound occurs whenever |H_max|² ≤ |H_e|². As the number of legitimate receivers grows, this event happens rarely, and the gap between the bounds vanishes. Formally, we have:

Theorem 4: Suppose that all the channel gains are sampled from CN(0, 1). Then the secrecy capacity satisfies

$$C_K^{\mathrm{sum}}(P) = \max_{P(H_{\max}): E[P(H_{\max})] \leq P} E\!\left[\log(1 + |H_{\max}|^2 P(H_{\max})) - \log(1 + |H_e|^2 P(H_{\max}))\right] + o(1), \qquad (11)$$

where o(1) → 0 as K → ∞. More generally, the gap between the upper bound R_K⁺(P) and the lower bound R_K⁻(P) in Lemma 3 satisfies

$$R_K^{+}(P) - R_K^{-}(P) \leq \Pr\left(|H_e|^2 \geq |H_{\max}|^2\right) \, E\!\left[\log \frac{|H_e|^2}{|H_{\max}|^2} \;\middle|\; |H_e|^2 \geq |H_{\max}|^2\right]. \qquad (12)$$

The result of Theorem 4 shows that opportunistic transmission in conjunction with single-user Gaussian codebooks achieves the optimal secrecy-sum-rate in the limit of a large number of receivers.

Remark 3: It is possible to improve the achievable rate by transmitting some fictitious noise, as reported in [11], [15], but there is still a gap between the upper and lower bounds. Our bounds in the high-SNR limit as a function of the number of receivers are shown in Fig. 2 for the i.i.d. Rayleigh fading case. Note that the bounds are quite close even for a moderate number of users.
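The prefactor in (12) has a simple closed form in the i.i.d. Rayleigh case: |H_max|² is the maximum of K independent Exp(1) variables, so by symmetry among the K+1 i.i.d. variables Pr(|H_e|² ≥ |H_max|²) = 1/(K+1), which decays with K as claimed. The following sketch is our own illustration.

```python
import random

def prob_eve_strongest(K, n_samples=200_000, seed=2):
    """Empirical Pr(|He|^2 >= |Hmax|^2) for K users with i.i.d. Exp(1)
    gains (unit-variance Rayleigh); the exact value is 1/(K+1) by
    symmetry among the K+1 i.i.d. variables."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        hmax = max(rng.expovariate(1.0) for _ in range(K))
        if rng.expovariate(1.0) >= hmax:
            hits += 1
    return hits / n_samples

for K in (1, 5, 20):
    print(K, prob_eve_strongest(K), 1 / (K + 1))
```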


Fig. 2. Upper and lower bounds in the high-SNR limit for the i.i.d. Rayleigh fading case, plotted in bits/symbol as a function of the number of users.

IV. PARALLEL CHANNELS - COMMON MESSAGE

In this section we consider the case where all the receivers are interested only in a common message. This common message must be protected from the eavesdropper in the sense described below.

Definition 3: An (n, 2^{nR}) code consists of a message set W = {1, 2, ..., 2^{nR}}, a (possibly stochastic) mapping ω_n : W → X^n × X^n × ... × X^n (M times) from the message set to the codewords for the M channels, and a decoder Φ_{i,n} : Y^n × Y^n × ... × Y^n (M times) → W for i = 1, 2, ..., K at each receiver. We denote the message estimate at decoder i by Ŵ_i. A common-message-secrecy-rate R is achievable if, for any ε > 0, there exists a length-n code such that Pr(W ≠ Ŵ_i) ≤ ε for i = 1, 2, ..., K, while

$$\frac{1}{n} H\!\left(W \mid Y_{e1}^n, Y_{e2}^n, \ldots, Y_{eM}^n\right) \geq R - \varepsilon. \qquad (13)$$

The common-message-secrecy-capacity is the supremum over all achievable rates. The rest of this section is devoted to the proof of the results stated in Section III-A.

A. Upper Bound: Lemma 1

The complete derivation of the upper bound is provided in Appendix II. It relies on the following facts.

Fact 1 ([4]): The common-message-secrecy-capacity for the wiretap channel depends only on the marginal distributions p(Y1j | Xj), p(Y2j | Xj), ..., p(YKj | Xj) in (1) and not on the joint distribution p(Y1j, ..., YKj | Xj), for each j = 1, 2, ..., M.

Fact 2: For any random variables X, Y, and Z, the quantity I(X; Y | Z) is concave in p(X).

Proof: See Appendix I.

B. Achievable Rate: Lemma 2

We informally present the main ideas in our achievability scheme. The structure of the codebooks is shown in Fig. 3. We construct M independent codebooks, one for each channel, denoted C1, C2, ..., CM.


Fig. 3. Structure of the codebooks in our coding scheme for the case of two parallel channels. Each codebook has 2^{nR} message bins and Q_j ≈ 2^{n I(U_j; Y_{ej})} codewords per message bin; thus the bin size depends on the mutual information of the eavesdropper on the corresponding channel. This flexible binning makes it possible to confuse the eavesdropper on each channel. Note that C1 and C2 above have the same number of rows but different numbers of columns. The codewords for message w_k in C_j are labeled u_{j1}^n(w_k), ..., u_{jQ_j}^n(w_k).

For R as in Lemma 2, each C_j has 2^{n(R + I(U_j; Y_{ej}))} codewords, randomly partitioned into 2^{nR} message bins — there are 2^{n I(U_j; Y_{ej})} codewords per bin. Given a message W, the encoder selects M codewords as follows: on channel j, it looks into the bin corresponding to message W in C_j and randomly selects a codeword from this bin. Each legitimate receiver attempts to find a message that is jointly typical with its received sequences. An appropriate choice of R guarantees successful decoding with high probability for each legitimate receiver, and near-perfect equivocation at the eavesdropper. A formal description of the coding scheme is given in Appendix III.

C. Note on the Achievability Scheme

The achievability scheme in the proof of Theorem 1, when specialized to the case of no eavesdropper, provides a different scheme than the one presented by El Gamal in [7]. The distinction between the two schemes is shown in Fig. 4. The scheme in [7] is as follows: each message maps to an M × n dimensional codeword, and the ith component of the codeword is transmitted on channel i. We refer to this as a single-codebook scheme. A random-binning extension of this scheme to account for the secrecy constraint yields the rate

$$R_{\mathrm{single}} = \max_{p(X_1, X_2, \ldots, X_M)} \; \min_{i \in \{1,2,\ldots,K\}} \left\{ I(X_1, \ldots, X_M; Y_{i1}, \ldots, Y_{iM}) - I(X_1, \ldots, X_M; Y_{e1}, \ldots, Y_{eM}) \right\}. \qquad (14)$$

In our proposed scheme there are M separate codebooks, each of rate R, one for each channel. The message is mapped to a codeword in each codebook and transmitted on the corresponding channel. Note that each codebook operates above the capacity of the corresponding channel. Nevertheless, if each receiver performs joint decoding of all its received sequences, it will (with high probability) recover the message. The secrecy rate corresponding to this scheme for the reversely degraded case is

$$R = \min_{i \in \{1,2,\ldots,K\}} \; \sum_{j=1}^{M} I(X_j; Y_{ij} \mid Y_{ej}),$$

which is in general larger than (14). The intuition behind the gains is that having separate codebooks on each channel enables us to separately tune the bin-size of each codebook.
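A toy reversely degraded example (our own, with hypothetical erasure probabilities) makes the gain concrete. Take M = 2 erasure channels and K = 2 users with uniform binary inputs: on channel 1 the degradation order is user 1, eavesdropper, user 2 (erasure probabilities 0.1, 0.5, 0.9), and on channel 2 the order is reversed. For degraded erasure channels I(X_j; Y_{ij} | Y_{ej}) = {ε_{ej} − ε_{ij}}⁺, so the separate-codebook rate and the single-codebook rate (14) can be compared directly:

```python
# Erasure probabilities (hypothetical): one dict per channel.
eps = [
    {"u1": 0.1, "u2": 0.9, "e": 0.5},  # channel 1: u1 strongest, then eve, then u2
    {"u1": 0.9, "u2": 0.1, "e": 0.5},  # channel 2: reversed order
]

def mi(eps_rx):  # I(X; Y) for a uniform bit over an erasure channel
    return 1.0 - eps_rx

# Separate codebooks: min over users of sum_j I(Xj; Yij | Yej), which for
# degraded erasure channels is {eps_e - eps_i}^+ on each channel.
r_separate = min(
    sum(max(0.0, ch["e"] - ch[u]) for ch in eps) for u in ("u1", "u2")
)

# Single codebook with binning, rate (14): the eavesdropper's mutual
# information is subtracted across ALL channels at once.
r_single = min(
    max(0.0, sum(mi(ch[u]) for ch in eps) - sum(mi(ch["e"]) for ch in eps))
    for u in ("u1", "u2")
)

print(r_separate, r_single)  # ~0.4 vs 0.0: per-channel binning strictly better
```

Here the single-codebook rate collapses to zero because the eavesdropper's total mutual information matches each user's, while per-channel binning keeps the 0.4-bit advantage each user enjoys on its strong channel.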


Fig. 4. Two coding schemes for common-message transmission over parallel channels. The top figure shows the scheme proposed in Theorem 1, which achieves the common-message capacity; it uses separate codebooks on each parallel channel, allowing us to bin separately on each channel. The lower figure shows the scheme that uses a single codebook. While this scheme is optimal when there is no eavesdropper [7], it is suboptimal in our setup: because of the single codebook, one cannot bin separately for each channel.

D. Gaussian Channels

We consider the Gaussian channel model

$$Y_{ij} = X_j + Z_{ij}, \qquad Y_{ej} = X_j + Z_{ej}, \qquad (15)$$

with Z_{ij} ∼ N(0, σ_{ij}²) and Z_{ej} ∼ N(0, σ_{ej}²). All these noise variables are assumed independent. We also impose an average power constraint E[Σ_{j=1}^M X_j²] ≤ P. A straightforward extension of Theorem 1 gives the following:

Corollary 2: The common-message-secrecy-capacity for the Gaussian parallel broadcast channel in (15) is

$$C_{K,M}^{\mathrm{common,Gaussian}} = \max_{(P_1, P_2, \ldots, P_M) \in \mathcal{F}} \; \min_{1 \leq i \leq K} \; \sum_{j=1}^{M} \left\{ \frac{1}{2} \log\left(1 + \frac{P_j}{\sigma_{ij}^2}\right) - \frac{1}{2} \log\left(1 + \frac{P_j}{\sigma_{ej}^2}\right) \right\}^{+}, \qquad (16)$$

where F is the set of all feasible power allocations satisfying Σ_{j=1}^M P_j ≤ P. A sketch of the proof is provided in Appendix IV.
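For a concrete two-channel instance of (16), the maximization over F can be approximated by a one-dimensional grid search over P1 (with P2 = P − P1). The noise variances below are hypothetical, chosen so the two users are reversely ordered; this is our own numerical sketch, not from the paper.

```python
import math

def secrecy_rate(p, sig2, sig2e):
    """Per-user sum over channels of {1/2 log2(1 + Pj/sig_ij^2)
    - 1/2 log2(1 + Pj/sig_ej^2)}^+, as in (16)."""
    return sum(
        max(0.0, 0.5 * math.log2(1 + pj / s) - 0.5 * math.log2(1 + pj / se))
        for pj, s, se in zip(p, sig2, sig2e)
    )

# Hypothetical noise variances: sig2[user][channel]; eavesdropper sig2e[channel].
sig2 = [[1.0, 4.0], [4.0, 1.0]]   # two reversely ordered users
sig2e = [2.0, 2.0]
P = 10.0

best = max(
    (min(secrecy_rate([p1, P - p1], sig2[u], sig2e) for u in range(2)), p1)
    for p1 in [P * k / 1000 for k in range(1001)]
)
print(best)  # (common secrecy rate, optimal P1); by symmetry P1 = P/2 here
```

By symmetry each user is strong on exactly one channel, so the min over users is maximized by splitting the power evenly.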

V. FADING CHANNELS - COMMON MESSAGE

In this section, we examine the case when each receiver is interested only in a common message that has to be protected from potential eavesdroppers. The problem of broadcasting a common message over ergodic channels is interesting especially with a secrecy constraint: one cannot ignore the CSI of the legitimate receivers at the transmitter, since such a scheme would reveal the message to any eavesdropper whose channel is statistically equivalent to those of the receivers. The knowledge of the channels of the legitimate receivers must be taken into account to selectively transmit the message to these receivers. However, such a scheme must be efficient: a naive scheme that transmits only when all the users simultaneously have channel gains above a threshold is highly suboptimal, as its rate decays to zero exponentially in the number of legitimate receivers. In contrast, the scheme we present in this section achieves a constant rate, independent of the number of receivers; this is the best possible scaling with the number of receivers.

Definition 4: An (n, 2^{nR}) code for the channel consists of an encoding function mapping the message w to transmitted symbols x(t) = f_t(w; h_1^t, h_2^t, ..., h_K^t) for t = 1, 2, ..., n, and a decoding


Fig. 5. Decomposition of the two-user system into four states. In state S1 both users have channel gains above the threshold; in S2 only user 1 is above the threshold; in S3 only user 2 is above the threshold; in S4 both users are below the threshold. In each state, a user is colored dark if its channel gain is below the threshold and shaded if it is above.

function at each receiver, Ŵ_i = φ_i(y_i^n; h_1^n, h_2^n, ..., h_K^n). A rate R is achievable if, for every ε > 0, there exists a sequence of length-n codes such that Pr(Ŵ_i ≠ W) ≤ ε for every i = 1, 2, ..., K and the following equivocation condition is satisfied:

$$\frac{1}{n} H\!\left(W \mid Y_e^n, H_e^n, H_1^n, \ldots, H_K^n\right) \geq R - \varepsilon. \qquad (17)$$

The entropy term in (17) is conditioned on H_1^n, ..., H_K^n because the channel gains of the K receivers are assumed to be known to the eavesdropper. The encoding and decoding functions in Definition 4, however, do not depend on h_e^n.

A. Lower Bound for Theorem 2

First we consider the following probabilistic extension of the parallel broadcast channel [13]: at each time, only one of the parallel channels operates, and channel j is selected with probability p_j, independently of all other times. Also suppose that there is a total power constraint P on the input. A straightforward extension of Lemma 2 provides the achievable rate

$$R_{K,M}^{\mathrm{common}}(P) \triangleq \max \; \min_{i \in \{1,2,\ldots,K\}} \; \sum_{j=1}^{M} p_j \left\{ I(U_j; Y_{ij}) - I(U_j; Y_{ej}) \right\}^{+}, \qquad (18)$$

where U_1, U_2, ..., U_M are auxiliary random variables, and the maximum is over the product distribution p(U_1)p(U_2)...p(U_M) and the stochastic mappings X_j = f_j(U_j) that satisfy Σ_{j=1}^M p_j E[X_j²] ≤ P.

For simplicity we focus on the case of two receivers; the case of more than two receivers is analogous. Fix a threshold T > 0 and decompose the system into four states as shown in Fig. 5. The transmission happens over a block of length n, and we classify the times t = 1, 2, ..., n as

$$S_1 = \left\{ t \in \{1, \ldots, n\} \mid |h_1(t)|^2 \geq T, \; |h_2(t)|^2 \geq T \right\}$$
$$S_2 = \left\{ t \in \{1, \ldots, n\} \mid |h_1(t)|^2 \geq T, \; |h_2(t)|^2 < T \right\}$$
$$S_3 = \left\{ t \in \{1, \ldots, n\} \mid |h_1(t)|^2 < T, \; |h_2(t)|^2 \geq T \right\} \qquad (19)$$
$$S_4 = \left\{ t \in \{1, \ldots, n\} \mid |h_1(t)|^2 < T, \; |h_2(t)|^2 < T \right\}.$$

The resulting channel is a probabilistic parallel channel with channel probabilities p(S1) = Pr(|H1|² ≥ T, |H2|² ≥ T), p(S2) = Pr(|H1|² ≥ T, |H2|² < T), p(S3) = Pr(|H1|² < T, |H2|² ≥ T), and p(S4) = Pr(|H1|² < T, |H2|² < T). Also note that with X_j = U_j ∼ CN(0, P) in (18), the achievable rate expression is

$$R^{\mathrm{common}}(P) = \min_{1 \leq i \leq 2} \Pr\left(|H_i|^2 \geq T\right) E\!\left[\log(1 + |H_i|^2 P) - \log(1 + |H_e|^2 P) \;\middle|\; |H_i|^2 \geq T\right].$$


Optimizing over the threshold, we have
\[
\begin{aligned}
R^{\mathrm{common}}(P) &= \max_{T > 0} \min_{1 \le i \le 2} \Pr(|H_i|^2 \ge T)\, E\Big[ \log(1 + |H_i|^2 P) - \log(1 + |H_e|^2 P) \,\Big|\, |H_i|^2 \ge T \Big] \\
&= \max_{T > 0} \min_{1 \le i \le 2} \int_T^{\infty} \big( \log(1 + xP) - E_{H_e}[\log(1 + |H_e|^2 P)] \big)\, p_i(x)\, dx \\
&\ge \min_{1 \le i \le 2} \int_{T^*}^{\infty} \big( \log(1 + xP) - E_{H_e}[\log(1 + |H_e|^2 P)] \big)\, p_i(x)\, dx \quad (20) \\
&= \min_{1 \le i \le 2} \int_0^{\infty} \big\{ \log(1 + xP) - E_{H_e}[\log(1 + |H_e|^2 P)] \big\}^{+} p_i(x)\, dx \\
&= \min_{1 \le i \le 2} E_{H_i}\Big[ \big\{ \log(1 + |H_i|^2 P) - E_{H_e}[\log(1 + |H_e|^2 P)] \big\}^{+} \Big], \quad (21)
\end{aligned}
\]
where T* in (20) is the solution to log(1 + xP) − E_{H_e}[log(1 + |H_e|²P)] = 0. (The optimality of T* follows from the fact that p_i(x) ≥ 0, so the integral is maximized by keeping exactly the nonnegative terms of the integrand; this observation is not strictly needed, since any threshold yields an achievable scheme.) Note that (21) coincides with the achievable rate in Theorem 2 for the case of K = 2 users. As remarked earlier, this scheme generalizes straightforwardly to more than two receivers: with K receivers there are a total of 2^K states, each specifying the subset of users whose channel gains are above the threshold T*.

Remark 4: Our proposed scheme suggests a concatenated coding approach with an outer erasure code and an inner wiretap code. Incoming information bits are mapped into a codeword of a (2^K − 1, 2^{K−1}) erasure code over a sufficiently large alphabet. Each resulting symbol then forms the message for its corresponding state. Each receiver obtains 2^{K−1} symbols in states where its channel gain is above the threshold and can recover the information symbols. Details of this architecture are provided in [11]. An alternative scheme that discretizes the fading coefficients along the lines of [9] is provided in Appendix V.

1) Upper Bound: Suppose that we only need to transmit the message to user i. An upper bound on the secrecy capacity for this single-user channel is obtained by specializing Lemma 3 to the case of K = 1 user. Accordingly, we have
\[
R_{+}^{\mathrm{common}}(P) \le \max_{P(H_i)\,:\, E[P(H_i)] \le P} E\Big[ \big\{ \log(1 + |H_i|^2 P(H_i)) - \log(1 + |H_e|^2 P(H_i)) \big\}^{+} \Big]. \quad (22)
\]
Since i is arbitrary, we can tighten this upper bound by minimizing over i, which yields the expression for R_{+}^{common}(P) in (6).
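The threshold T* introduced in (20) can also be computed numerically. A minimal sketch, assuming Rayleigh fading (|H_e|² ∼ Exp(1)) and an illustrative power P: the eavesdropper term is estimated by Monte Carlo, and T* is found by bisection on the increasing function log₂(1 + xP):

```python
import math
import random

random.seed(1)
P = 10.0          # transmit power (illustrative)

# Monte Carlo estimate of E[log2(1 + |H_e|^2 P)] under Rayleigh fading.
N = 100_000
c = sum(math.log2(1.0 + random.expovariate(1.0) * P) for _ in range(N)) / N

# Bisection for T* solving log2(1 + x P) = c; the left side is increasing in x.
lo, hi = 0.0, 100.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if math.log2(1.0 + mid * P) < c:
        lo = mid
    else:
        hi = mid
T_star = 0.5 * (lo + hi)
print(f"T* ~ {T_star:.3f}; the integrand in (20) is nonnegative exactly for x >= T*")
```

Below T* the instantaneous rate to the user would fall short of the eavesdropper's average rate, so those slots contribute nothing to the secrecy rate.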

B. Proof of Corollary 1

Since the channel gains of all the users are sampled i.i.d. from CN(0, 1), we use a generic variable H to denote the channel gain of any given user. For the upper bound, we note that
\[
E\left[ \left\{ \log \frac{1 + |H|^2 P(H)}{1 + |H_e|^2 P(H)} \right\}^{+} \right] \le E\left[ \left\{ \log \frac{|H|^2}{|H_e|^2} \right\}^{+} \right].
\]
A direct calculation for the Rayleigh fading case shows that the right-hand side equals 1 bit/symbol, as stated in (7). For the lower bound, recall that
\[
R^{\mathrm{common}}(P) = E\Big[ \big\{ \log(1 + |H|^2 P) - E[\log(1 + |H_e|^2 P)] \big\}^{+} \Big].
\]
Let us define f_P(x) = {log(1 + xP) − E[log(1 + |H_e|²P)]}⁺. We show in Appendix VI that f_P(x) satisfies the conditions of the dominated convergence theorem, and hence
\[
\lim_{P \to \infty} R^{\mathrm{common}}(P) = E\Big[ \lim_{P \to \infty} \big\{ \log(1 + |H|^2 P) - E[\log(1 + |H_e|^2 P)] \big\}^{+} \Big] \quad (23)
\]
\[
= E\left[ \left\{ \log \lim_{P \to \infty} \frac{1 + |H|^2 P}{\exp\{E[\log(1 + |H_e|^2 P)]\}} \right\}^{+} \right]
= E\left[ \left\{ \log |H|^2 + \frac{\gamma}{\log_e 2} \right\}^{+} \right] = 0.709 \text{ bits/symbol}. \quad (24)
\]
Here (24) can be verified using l'Hôpital's rule, and γ ≈ 0.5772 is the Euler–Mascheroni constant.

VI. PARALLEL CHANNELS - INDEPENDENT MESSAGE

We consider the case of M parallel channels, one eavesdropper, and K receivers, each interested in an independent message. Each such message must be protected from the eavesdropper.

Definition 5 (Length n Code): A (2^{nR_1}, 2^{nR_2}, …, 2^{nR_K}, n) code for the product broadcast wiretap channel in Definition 1 consists of a mapping ω_n : W_1 × W_2 × … × W_K → X^n × X^n × … × X^n (M times) from the messages of the K users to the M channel inputs, and K decoding functions φ_{i,n} : Y^n × Y^n × … × Y^n (M times) → W_i, one at each legitimate receiver. We denote the message estimate at decoder i by Ŵ_i. A perfect-secrecy rate tuple (R_1, R_2, …, R_K) is achievable if, for every ε > 0, there is a length-n code such that Pr(W_i ≠ Ŵ_i) ≤ ε for all i = 1, 2, …, K, and such that the following condition is satisfied:
\[
\frac{1}{n} H\big(W_i \mid W_1, \ldots, W_{i-1}, W_{i+1}, \ldots, W_K, Y_{e1}^n, \ldots, Y_{eM}^n\big) \ge \frac{1}{n} H(W_i) - \varepsilon, \quad i = 1, 2, \ldots, K. \quad (25)
\]
The secrecy-sum-capacity C_{K,M}^{sum} is the supremum of R_1 + R_2 + … + R_K over the achievable rate tuples (R_1, R_2, …, R_K).

Remark 5: Our constraint (25) requires perfect equivocation for each message even if all the other messages are revealed to the eavesdropper. It may be possible to increase the secrecy rate by exploiting the fact that the eavesdropper does not have access to the other messages; this is not considered in the present paper.

A. Proof of Upper Bound in Theorem 3

We establish the upper bound in Theorem 3. Suppose a genie provides the output of the strongest receiver, π_j, to all other receivers on each channel, i.e., on channel j the output Y_{π_j}^n is made available to all the receivers. Because of degradation, we may assume, without loss of generality, that each receiver observes only (Y_{π_1}^n, …, Y_{π_M}^n). Clearly, such a genie-aided channel can only have a sum capacity larger than that of the original channel. Since all receivers are then identical, to compute the sum capacity it suffices to consider the situation with one sender, one receiver, and one eavesdropper.

Lemma 4: The secrecy-sum-capacity in Theorem 3 is upper bounded by the secrecy capacity of the genie-aided channel, i.e., C_{K,M}^{sum} ≤ C^{GenieAided}.

Proof: Suppose that a secrecy rate point (R_1, R_2, …, R_K) is achievable for the K-user channel in Theorem 3, and let the messages be denoted (W_1, W_2, …, W_K). This implies that, for any ε > 0 and n large enough, there is a length-n code such that Pr(Ŵ_i ≠ W_i) ≤ ε for i = 1, 2, …, K, and such that
\[
\frac{1}{n} H\big(W_i \mid W_1, \ldots, W_{i-1}, W_{i+1}, \ldots, W_K, Y_{e1}^n, Y_{e2}^n, \ldots, Y_{eM}^n\big) \ge R_i - \varepsilon. \quad (26)
\]


We now show that the rate tuple (Σ_{i=1}^K R_i, 0, …, 0) is achievable on the genie-aided channel. First, note that any message that is correctly decoded on the original channel is also correctly decoded by user 1 on the genie-aided channel. It remains to bound the equivocation on the genie-aided channel when the message to receiver 1 is W = (W_1, W_2, …, W_K). We have
\[
\begin{aligned}
\frac{1}{n} H\big(W \mid Y_{e1}^n, Y_{e2}^n, \ldots, Y_{eM}^n\big) &= \frac{1}{n} H\big(W_1, W_2, \ldots, W_K \mid Y_{e1}^n, Y_{e2}^n, \ldots, Y_{eM}^n\big) \\
&\ge \sum_{i=1}^{K} \frac{1}{n} H\big(W_i \mid W_1, \ldots, W_{i-1}, W_{i+1}, \ldots, W_K, Y_{e1}^n, Y_{e2}^n, \ldots, Y_{eM}^n\big) \\
&\ge \sum_{i=1}^{K} R_i - K\varepsilon,
\end{aligned}
\]
where the last step follows from (26). Since ε is arbitrary, this establishes the claim.

Lemma 5: The secrecy capacity of the genie-aided channel is
\[
C^{\mathrm{GenieAided}} = \max_{p(X_1) p(X_2) \cdots p(X_M)} \sum_{j=1}^{M} I(X_j; Y_{\pi_j} \mid Y_{ej}). \quad (27)
\]

Proof: Since all receivers are identical on the genie-aided channel, this lemma is a direct consequence of Theorem 1 when specialized to K = 1 receiver.

Remark 6: The upper bound continues to hold even if the eavesdropper's channel is not ordered with respect to the legitimate receivers. In general, following Lemma 1, the upper bound can be tightened by considering, for all 1 ≤ j ≤ M, the worst joint distribution p′(Y_{π_j}, Y_{ej} | X_j) among all joint distributions with the same marginal distributions as p(Y_{π_j} | X_j) and p(Y_{ej} | X_j), yielding
\[
C_{K,M}^{\mathrm{sum}} \le \min_{\prod_{j=1}^{M} p'(Y_{\pi_j}, Y_{ej} \mid X_j)} \; \max_{\prod_{j=1}^{M} p(X_j)} \; \sum_{j=1}^{M} I(X_j; Y_{\pi_j} \mid Y_{ej}). \quad (28)
\]

B. Achievability Scheme for Theorem 3

The achievability scheme for Theorem 3 is as follows: we only send information intended for the strongest user, i.e., only user π_j on channel j can decode. It follows from the wiretap channel result [24] that a rate of R_j = max_{p(X_j)} I(X_j; Y_{π_j} | Y_{ej}) is achievable on channel j. Accordingly, the total sum rate Σ_j R_j is achievable, which matches the capacity expression.

C. Gaussian Channels

Theorem 3 can be extended to the case of Gaussian parallel channels. Let σ²_{π_j} denote the noise variance of the strongest user on channel j. Then the secrecy-sum-capacity is given by
\[
C_{K,M}^{\mathrm{sum,Gaussian}}(P) = \max_{(P_1, P_2, \ldots, P_M)} \sum_{j=1}^{M} \left\{ \frac{1}{2} \log\left(1 + \frac{P_j}{\sigma_{\pi_j}^2}\right) - \frac{1}{2} \log\left(1 + \frac{P_j}{\sigma_{ej}^2}\right) \right\}^{+}, \quad (29)
\]
where the maximization is over all power allocations satisfying Σ_{j=1}^M P_j ≤ P. Achievability follows by using independent Gaussian wiretap codebooks on each channel and serving only the strongest user on each channel. For the upper bound we have to show that Gaussian inputs are optimal in the capacity expression in Theorem 3; the justification is the same as in the common-message case in Section IV-D.
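To illustrate (29), the optimal power split can be found by brute force for M = 2. This is only a sketch; the noise variances below are assumed values, and a simple grid search stands in for a proper optimization:

```python
import math

def secrecy_sum_rate(powers, sigma2_strong, sigma2_eve):
    """Evaluate the objective of (29) for a given power allocation."""
    total = 0.0
    for P_j, s2, e2 in zip(powers, sigma2_strong, sigma2_eve):
        r = 0.5 * math.log2(1 + P_j / s2) - 0.5 * math.log2(1 + P_j / e2)
        total += max(r, 0.0)   # the {.}^+ operation in (29)
    return total

P = 10.0
sigma2_strong = [1.0, 1.0]   # strongest legitimate user's noise per channel (assumed)
sigma2_eve = [4.0, 1.5]      # eavesdropper's noise per channel (assumed)

# Brute-force grid over the split P_1 + P_2 = P (M = 2 channels).
best = max((secrecy_sum_rate([p1, P - p1], sigma2_strong, sigma2_eve), p1)
           for p1 in (P * k / 1000 for k in range(1001)))
print(f"best sum rate {best[0]:.4f} bits at P1 = {best[1]:.2f}")
```

With these assumed values, channel 1 has a much noisier eavesdropper, so the optimal allocation puts most (but not all) of the power there.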


VII. FADING CHANNELS - INDEPENDENT MESSAGE

We consider the case where each receiver wants an independent message, and we focus on the sum rate of the system. This scenario has been widely studied in conventional systems (i.e., without a secrecy constraint), where transmitter CSI provides dramatic gains (see, e.g., [13], [21]): an "opportunistic" scheme that selects the user with the largest instantaneous gain maximizes the sum rate of the system. The results in this section can be interpreted as an extension of opportunistic transmission to the presence of eavesdroppers.

Definition 6: A (n, 2^{nR_1}, …, 2^{nR_K}) code consists of an encoding function from the messages w_1, …, w_K, with w_i ∈ {1, 2, …, 2^{nR_i}}, to transmitted symbols x(t) = ω_t(w_1, w_2, …, w_K; h_1^t, h_2^t, …, h_K^t) for t = 1, 2, …, n, and a decoding function ŵ_i = φ_i(y_i^n; h_1^n, h_2^n, …, h_K^n) at each receiver. A rate tuple (R_1, R_2, …, R_K) is achievable with perfect secrecy if, for any ε > 0, there exists a length-n code such that, for each i = 1, 2, …, K, with W_i uniformly distributed over {1, 2, …, 2^{nR_i}}, we have Pr(Ŵ_i ≠ W_i) ≤ ε and
\[
\frac{1}{n} H\big(W_i \mid W_1, \ldots, W_{i-1}, W_{i+1}, \ldots, W_K, Y_e^n, H_e^n, H_1^n, \ldots, H_K^n\big) \ge R_i - \varepsilon. \quad (30)
\]

The secrecy-sum-capacity is the supremum of R_1 + R_2 + … + R_K among all achievable rate tuples.

A. Upper Bound in Lemma 3

Our proof technique is to introduce a single-user genie-aided channel as in Section VI-A and then to upper bound this single-user channel. This upper bound on the genie-aided channel is closely related to an upper bound provided in [10] for the ergodic fading channel with large coherence periods. We nevertheless provide a complete derivation in Appendix VII.

B. Achievability in Lemma 3

The achievability scheme combines opportunistic transmission with a Gaussian wiretap code. At each time, only the message of the user with the best instantaneous channel gain is selected for transmission. We quantize each receiver's channel gain into q levels A_1 = 0 < A_2 < … < A_q ≤ A_{q+1} = J (if any user's channel gain exceeds J, that slot is ignored for transmission). Since the channel gains of the K users are independent, there are a total of M = q^K different super-states, denoted S_1, S_2, …, S_M. Each super-state constitutes one parallel channel. Note that on each parallel channel the legitimate users see a Gaussian channel, while the eavesdropper sees a fading channel. Our scheme transmits an independent message on each of the M parallel channels. Let G_j ∈ {A_1, A_2, …, A_q} denote the gain of the strongest user on channel j. We use a Gaussian codebook with power P(G_j) on channel j. The achievable rate on channel j is
\[
R_j = I(U_j; Y_j) - I(U_j; Y_{ej}, H_{ej}) = \log(1 + G_j P(G_j)) - E[\log(1 + |H_e|^2 P(G_j))],
\]
where the second equality follows from our choice of X_j = U_j ∼ N(0, P(G_j)). The overall achievable


sum rate is given by
\[
\begin{aligned}
R_K^{-}(P) &= \sum_{j=1}^{M} \Pr(S_j) R_j \\
&= \sum_{j=1}^{M} \Pr(S_j) \big( \log(1 + G_j P(G_j)) - E[\log(1 + |H_e|^2 P(G_j))] \big) \\
&= \sum_{l=1}^{q} \Pr(A_l) \big( \log(1 + A_l P(A_l)) - E[\log(1 + |H_e|^2 P(A_l))] \big) \\
&= \sum_{l=1}^{q} \Pr(A_l) \big\{ \log(1 + A_l P(A_l)) - E[\log(1 + |H_e|^2 P(A_l))] \big\}^{+}, \quad (31)
\end{aligned}
\]
where the second-to-last equality follows by using the fact that G_j ∈ {A_1, A_2, …, A_q} and rewriting the summation over these indices, and the last equality follows from the fact that if log(1 + A_l P(A_l)) − E[log(1 + |H_e|²P(A_l))] < 0 for some P(A_l), we can simply replace P(A_l) by zero to increase the value. As we fix J and take q → ∞, we show in Appendix VIII that the summation converges to
\[
\begin{aligned}
R_K^{-}(P) &= \int_0^{J} \big\{ \log(1 + aP(a)) - E[\log(1 + |H_e|^2 P(a))] \big\}^{+} p(a)\, da \\
&= \int_0^{\infty} \big\{ \log(1 + aP(a)) - E[\log(1 + |H_e|^2 P(a))] \big\}^{+} p(a)\, da \\
&\quad - \int_J^{\infty} \big\{ \log(1 + aP(a)) - E[\log(1 + |H_e|^2 P(a))] \big\}^{+} p(a)\, da.
\end{aligned}
\]
Finally, for any ε₀ > 0 we can find J sufficiently large such that the second term above is less than ε₀. Thus we have
\[
R_K^{-}(P) \ge E\Big[ \big\{ \log(1 + |H_{\max}|^2 P(H_{\max})) - E[\log(1 + |H_e|^2 P(H_{\max}))] \big\}^{+} \Big] - \varepsilon_0.
\]

Since for the optimal P(H_max) the {·}⁺ inside the expectation is redundant, we have established (10), i.e., the achievability part of Lemma 3.

C. Proof of Theorem 4

Let P*(H_max) be the power allocation that maximizes R_K^{+}(P) in (9). We have
\[
\begin{aligned}
R_K^{+}(P) - R_K^{-}(P) &\le E\Big[ \big\{ \log(1 + |H_{\max}|^2 P^*(H_{\max})) - \log(1 + |H_e|^2 P^*(H_{\max})) \big\}^{+} \Big] \\
&\quad - E\Big[ \log(1 + |H_{\max}|^2 P^*(H_{\max})) - \log(1 + |H_e|^2 P^*(H_{\max})) \Big] \quad (32),(33) \\
&= \Pr(|H_e|^2 \ge |H_{\max}|^2)\, E\left[ \log \frac{1 + |H_e|^2 P^*(H_{\max})}{1 + |H_{\max}|^2 P^*(H_{\max})} \,\middle|\, |H_e|^2 \ge |H_{\max}|^2 \right] \quad (34) \\
&\le \Pr(|H_e|^2 \ge |H_{\max}|^2)\, E\left[ \log \frac{|H_e|^2}{|H_{\max}|^2} \,\middle|\, |H_e|^2 \ge |H_{\max}|^2 \right] \\
&\le \frac{2 \log 2}{K + 1}, \quad (35)
\end{aligned}
\]
where (33) follows from substituting the bounds in Lemma 3, (34) follows from the fact that log((1 + |H_e|²a)/(1 + |H_max|²a)) is increasing in a for |H_e|² ≥ |H_max|², and the last step (35) follows from Lemma 8 (see Appendix IX) and the fact that Pr(|H_e|² ≥ |H_max|²) = 1/(1 + K), since we assumed the channel coefficients to be i.i.d.
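The key quantity in (35), Pr(|H_e|² ≥ |H_max|²) = 1/(1 + K) for i.i.d. gains, is easy to confirm by simulation. A small sketch under Rayleigh fading assumptions (K and the sample size are arbitrary):

```python
import random

random.seed(3)
K = 9            # number of legitimate receivers (illustrative)
N = 200_000      # Monte Carlo samples

hits = 0
for _ in range(N):
    g_e = random.expovariate(1.0)                        # eavesdropper gain
    g_max = max(random.expovariate(1.0) for _ in range(K))  # best legitimate gain
    hits += g_e >= g_max
p_hat = hits / N
print(f"empirical {p_hat:.4f} vs 1/(K+1) = {1 / (K + 1):.4f}")
```

Because all K + 1 gains are exchangeable, each is equally likely to be the largest, which is why the gap between the bounds shrinks as 1/(K + 1).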


D. Discussion

Theorem 4 guarantees an arbitrarily small gap between the upper and lower bounds on the secrecy-sum-capacity, valid for any fixed coherence period, provided the number of users is large enough. In [10] two schemes, a variable-rate scheme and a constant-rate scheme, are presented for the case of a single receiver in a slow fading environment. Straightforward extensions of these schemes to multiple receivers reveal the following: the variable-rate scheme achieves our upper bound in (9), whereas the constant-rate scheme achieves our lower bound in (10). Since these two expressions coincide as the number of receivers tends to infinity, one deduces that the gains of variable-rate schemes become negligible in this limit.

Colluding Attacks: We noted earlier that any number of statistically equivalent eavesdroppers does not affect our capacity as long as they do not collude. If the eavesdroppers collude, they can combine their received signals and attempt to decode the message. The upper and lower bounds in Lemma 3 can be extended by replacing the term |H_e|² with ‖H_e‖², where H_e is the vector of channel gains of the colluding eavesdroppers. One conclusion from these bounds is that the secrecy capacity remains positive unless the colluding eavesdropper population grows as log K.

VIII. CONCLUSION

A generalization of the wiretap channel to the case of parallel and fading channels with multiple receivers has been considered. We established the common-message-secrecy-capacity for the case of reversely degraded parallel channels and provided upper and lower bounds for the general case. For independent messages over parallel channels, the secrecy-sum-capacity has been determined. For fading channels, we examined a fast fading scenario in which the transmitter knows the instantaneous channels of all the legitimate receivers but not that of the eavesdropper. Interestingly, the common-message-secrecy-capacity does not decay to zero as the number of legitimate receivers grows.
For the case of independent messages, it was shown that an opportunistic scheme achieves the secrecy-sum-capacity in the limit of a large number of users. In terms of future work, there are a number of interesting directions to pursue. Our setup for the fading channel assumes that the fading coefficients of the legitimate receivers are revealed to the sender in a causal fashion. Implicitly we are assuming the availability of an insecure but authenticated feedback link between the receiver(s) and the sender, used specifically to provide CSI to the transmitter. The availability of this (digital) feedback link is reminiscent of the secret-key generation protocols pioneered by Maurer [16]. Indeed, this feedback link can be used in a variety of ways beyond providing CSI as assumed here, and exploring connections to the key-generation approach of Maurer may be fruitful. Throughout this paper we have considered Wyner's notion of perfect secrecy: the ratio of the mutual information between the message and the output of the eavesdropper's channel to the block length must approach zero with increasing block length. Note that this is significantly weaker than Shannon's notion, which requires that the mutual information be zero regardless of the block length. Maurer and Wolf [17] have observed that for the discrete memoryless wiretap channel, the secrecy notion of Wyner can be strengthened (without any loss in rate) in the following sense: the mutual information between the message and the output of the eavesdropper's channel goes to zero with the block length. It remains to be seen whether analogous results can be obtained for the Gaussian wiretap channel and the fading channels considered in this work. Our achievability results are based on random coding arguments; the design of practical codes to attain these fundamental limits is a fruitful area of research [20].
The protocols investigated in this paper relied on time diversity (for the common message) and multiuser diversity (for independent messages) to enable secure communication. In situations where such forms of diversity are not available, it is of interest to develop a formulation for secure transmission analogous to the outage formulation for slow fading channels. Secondly, the impact of multiple antennas on secure transmission is far from clear at this stage. While multiple antennas can theoretically provide


significant gains in throughput in conventional systems, a theoretical analysis for the case of confidential messages is naturally of great interest.

APPENDIX I
PROOF OF FACT 2

Let T be a binary-valued random variable such that if T = 0 the induced distribution on X is p_0(X), i.e., p(Y, Z, X | T = 0) = p(Y, Z | X) p_0(X), and if T = 1 the induced distribution on X is p_1(X), i.e., p(Y, Z, X | T = 1) = p(Y, Z | X) p_1(X). Note the Markov chain T → X → (Y, Z). To establish the concavity of I(X; Y | Z) in p(X) it suffices to show that
\[
I(X; Y \mid Z, T) \le I(X; Y \mid Z). \quad (36)
\]
The following chain of inequalities can be verified:
\[
\begin{aligned}
I(X; Y \mid Z, T) - I(X; Y \mid Z) &= \{ I(X; Y, Z \mid T) - I(X; Z \mid T) \} - \{ I(X; Y, Z) - I(X; Z) \} \quad (37) \\
&= \{ I(X; Y, Z \mid T) - I(X; Z \mid T) \} - \{ I(T, X; Y, Z) - I(T, X; Z) \} \quad (38) \\
&= \{ I(X; Y, Z \mid T) - I(T, X; Y, Z) \} - \{ I(X; Z \mid T) - I(T, X; Z) \} \\
&= I(T; Z) - I(T; Y, Z) = -I(T; Y \mid Z) \le 0.
\end{aligned}
\]

Equation (37) is a consequence of the chain rule for mutual information. Equation (38) follows from the fact that T → X → (Y, Z) forms a Markov chain, so that I(T; Z | X) = I(T; Y, Z | X) = 0.

APPENDIX II
PROOF OF LEMMA 1

Suppose there exists a sequence of (n, 2^{nR}) codes such that, for every ε > 0, as n → ∞,
\[
\Pr(W \ne \hat{W}_i) \le \varepsilon, \quad i = 1, 2, \ldots, K \quad (39)
\]
\[
\frac{1}{n} I(W; Y_{e1}^n, \ldots, Y_{eM}^n) \le \varepsilon. \quad (40)
\]
We first note that from Fano's inequality we have
\[
\frac{1}{n} H(W \mid Y_{i1}^n, Y_{i2}^n, \ldots, Y_{iM}^n) \le \frac{1}{n} + \varepsilon R, \quad i = 1, 2, \ldots, K.
\]
Combining (39) and (40) we have, for all i = 1, 2, …, K and ε′ = ε + 1/n + εR,
\[
\begin{aligned}
nR &\le I(W; Y_{i1}^n, \ldots, Y_{iM}^n) - I(W; Y_{e1}^n, \ldots, Y_{eM}^n) + n\varepsilon' \\
&\le I(W; Y_{i1}^n, \ldots, Y_{iM}^n \mid Y_{e1}^n, \ldots, Y_{eM}^n) + n\varepsilon' \\
&= h(Y_{i1}^n, \ldots, Y_{iM}^n \mid Y_{e1}^n, \ldots, Y_{eM}^n) - h(Y_{i1}^n, \ldots, Y_{iM}^n \mid Y_{e1}^n, \ldots, Y_{eM}^n, W) + n\varepsilon' \\
&\le h(Y_{i1}^n, \ldots, Y_{iM}^n \mid Y_{e1}^n, \ldots, Y_{eM}^n) - h(Y_{i1}^n, \ldots, Y_{iM}^n \mid Y_{e1}^n, \ldots, Y_{eM}^n, X_1^n, \ldots, X_M^n, W) + n\varepsilon' \\
&= h(Y_{i1}^n, \ldots, Y_{iM}^n \mid Y_{e1}^n, \ldots, Y_{eM}^n) - h(Y_{i1}^n, \ldots, Y_{iM}^n \mid Y_{e1}^n, \ldots, Y_{eM}^n, X_1^n, \ldots, X_M^n) + n\varepsilon' \quad (41) \\
&= h(Y_{i1}^n, \ldots, Y_{iM}^n \mid Y_{e1}^n, \ldots, Y_{eM}^n) - \sum_{j=1}^{M} h(Y_{ij}^n \mid X_j^n, Y_{ej}^n) + n\varepsilon' \quad (42) \\
&\le \sum_{j=1}^{M} h(Y_{ij}^n \mid Y_{ej}^n) - \sum_{j=1}^{M} h(Y_{ij}^n \mid X_j^n, Y_{ej}^n) + n\varepsilon' \\
&= \sum_{j=1}^{M} I(X_j^n; Y_{ij}^n \mid Y_{ej}^n) + n\varepsilon', \quad (43)
\end{aligned}
\]


where (41) follows from the fact that W → (X_1^n, …, X_M^n, Y_{e1}^n, …, Y_{eM}^n) → (Y_{i1}^n, …, Y_{iM}^n) forms a Markov chain, and (42) holds because the parallel channels are mutually independent in (1), so that
\[
h(Y_{i1}^n, \ldots, Y_{iM}^n \mid Y_{e1}^n, \ldots, Y_{eM}^n, X_1^n, \ldots, X_M^n) = \sum_{j=1}^{M} h(Y_{ij}^n \mid X_j^n, Y_{ej}^n).
\]
We now upper bound each term in the summation (43). We have
\[
\begin{aligned}
I(X_j^n; Y_{ij}^n \mid Y_{ej}^n) &\le \sum_{k=1}^{n} I(X_j(k); Y_{ij}(k) \mid Y_{ej}(k)) \quad (44) \\
&= \sum_{k=1}^{n} I(X_j(k); Y_{ij}(k), Y_{ej}(k)) - I(X_j(k); Y_{ej}(k)) \quad (45) \\
&= n I(X_j; Y_{ij}, Y_{ej} \mid Q) - n I(X_j; Y_{ej} \mid Q) = n I(X_j; Y_{ij} \mid Y_{ej}, Q) \quad (46) \\
&\le n I(X_j; Y_{ij} \mid Y_{ej}), \quad (47)
\end{aligned}
\]

where (44) follows from the fact that the channel is memoryless, and (46) is obtained by defining Q to be a (time-sharing) random variable uniformly distributed over {1, 2, …, n}, independent of everything else. The random variables (X_j, Y_{ij}, Y_{ej}) are such that, conditioned on Q = k, they have the same joint distribution as (X_j(k), Y_{ij}(k), Y_{ej}(k)). Finally, (47) follows from the fact that the mutual information is concave with respect to the input distribution p(X_j), as stated in Fact 2. Combining (47) and (43) we have
\[
\begin{aligned}
R &\le \sum_{j=1}^{M} I(X_j; Y_{ij} \mid Y_{ej}) + \varepsilon', \quad i = 1, 2, \ldots, K \\
&= \min_{1 \le i \le K} \sum_{j=1}^{M} I(X_j; Y_{ij} \mid Y_{ej}) + \varepsilon' \quad (48) \\
&\le \max_{\prod_{j=1}^{M} p(X_j)} \; \min_{1 \le i \le K} \sum_{j=1}^{M} I(X_j; Y_{ij} \mid Y_{ej}) + \varepsilon'. \quad (49)
\end{aligned}
\]
The last step follows from the fact that, for any input distribution p(X_1, X_2, …, X_M), the objective function min_{1≤i≤K} Σ_{j=1}^M I(X_j; Y_{ij} | Y_{ej}) depends only on the marginal distributions p(X_1), …, p(X_M); accordingly, it suffices to take X_1, X_2, …, X_M to be mutually independent random variables. Finally, note that (49) depends on the joint distribution across the channels. Accordingly, we tighten the upper bound by considering the worst distribution in P = P_1 × P_2 × … × P_M, which gives
\[
R \le \min_{\mathcal{P}} \; \max_{\prod_{j=1}^{M} p(X_j)} \; \min_{1 \le i \le K} \sum_{j=1}^{M} I(X_j; Y_{ij} \mid Y_{ej}) + \varepsilon'. \quad (50)
\]
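Fact 2, the concavity of I(X; Y | Z) in p(X) used in step (47), can be checked numerically on a toy discrete channel. Everything below (the channel p(y, z | x), the two input distributions, and the mixing weight) is an arbitrary example, not taken from the paper:

```python
import math

def cond_mi(px, pyz_x):
    """I(X;Y|Z) in bits for the joint p(x,y,z) = px[x] * pyz_x[x][y][z]."""
    X, Y, Z = len(px), len(pyz_x[0]), len(pyz_x[0][0])
    p = [[[px[x] * pyz_x[x][y][z] for z in range(Z)] for y in range(Y)]
         for x in range(X)]
    pz = [sum(p[x][y][z] for x in range(X) for y in range(Y)) for z in range(Z)]
    pxz = [[sum(p[x][y][z] for y in range(Y)) for z in range(Z)] for x in range(X)]
    pyz = [[sum(p[x][y][z] for x in range(X)) for z in range(Z)] for y in range(Y)]
    mi = 0.0
    for x in range(X):
        for y in range(Y):
            for z in range(Z):
                if p[x][y][z] > 0:
                    mi += p[x][y][z] * math.log2(
                        p[x][y][z] * pz[z] / (pxz[x][z] * pyz[y][z]))
    return mi

# A fixed binary-input channel with joint outputs (Y, Z); each row sums to one.
pyz_x = [
    [[0.30, 0.10], [0.20, 0.40]],   # p(y,z | x=0)
    [[0.05, 0.45], [0.35, 0.15]],   # p(y,z | x=1)
]
p0, p1, lam = [0.9, 0.1], [0.2, 0.8], 0.4
mix = [lam * a + (1 - lam) * b for a, b in zip(p0, p1)]
lhs = cond_mi(mix, pyz_x)
rhs = lam * cond_mi(p0, pyz_x) + (1 - lam) * cond_mi(p1, pyz_x)
print(f"I(mixture) = {lhs:.4f} >= {rhs:.4f} = mixture of I's")
```

Concavity guarantees the inequality for every channel and every pair of input distributions; this sketch only exhibits one instance.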

APPENDIX III
PROOF OF LEMMA 2

Fix the distributions p(U_1), p(U_2), …, p(U_M) and the (possibly stochastic) functions f_1(·), …, f_M(·). Let ε_F and ε_R be positive constants, to be quantified later. With respect to these quantities, define
\[
R = \min_{1 \le i \le K} \sum_{j=1}^{M} \big\{ I(U_j; Y_{ij}) - I(U_j; Y_{ej}) \big\}^{+} - \varepsilon_R
\]
\[
R_{ej} = I(U_j; Y_{ej}) - \varepsilon_F, \quad j = 1, 2, \ldots, M. \quad (51)
\]
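The rate split in (51) underlies the binning construction described next: 2^{nR} message bins, each containing Q_j = 2^{nR_{ej}} codewords. A toy sketch with small illustrative integers (not a real wiretap code; it only exhibits the bookkeeping):

```python
import random

random.seed(4)
# Toy parameters chosen so both exponents are integers.
n, R, Re = 8, 0.5, 0.25
num_bins = 2 ** int(n * R)     # 2^{nR} = 16 message bins
per_bin = 2 ** int(n * Re)     # 2^{n R_e} = 4 codewords per bin

# Each "codeword" is a random binary sequence of length n.
codebook = [[tuple(random.randint(0, 1) for _ in range(n)) for _ in range(per_bin)]
            for _ in range(num_bins)]

# Encoding of message w: pick a uniformly random codeword inside bin w,
# mirroring the randomized encoder of the achievability scheme.
w = 7
u = random.choice(codebook[w])
print(f"{num_bins} bins x {per_bin} codewords each; message {w} -> codeword {u}")
```

The random index within the bin is what "burns" rate R_{ej} to confuse the eavesdropper.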


In what follows, whenever typicality is mentioned it is intended as ε-weak typicality (see, e.g., [3]). The set T(U_j) denotes the set of all sequences typical with respect to the distribution p(U_j), and the set T(X_j, U_j) denotes the set of all jointly typical sequences (x_j^n, u_j^n) with respect to the distribution p(X_j, U_j). T_{u_j^n}(X_j | U_j) denotes the set of all sequences x_j^n conditionally typical with respect to a given sequence u_j^n according to p(X_j | U_j).

1) Codebook Generation:
• Codebook C_j, for j = 1, 2, …, M, has a total of M_j = 2^{n(R + R_{ej})} length-n codeword sequences. Each sequence is selected uniformly and independently from the set T(U_j).
• We randomly partition the M_j sequences into 2^{nR} message bins, so that there are Q_j = 2^{nR_{ej}} codewords per bin.
• The set of codewords associated with bin w in codebook C_j is denoted
\[
C_j(w) = \{ u_{j1}^n(w), u_{j2}^n(w), \ldots, u_{jQ_j}^n(w) \}, \quad w = 1, 2, \ldots, 2^{nR}, \quad j = 1, 2, \ldots, M. \quad (52)
\]
Note that C_j = ∪_{w=1}^{2^{nR}} C_j(w) is the codebook on channel j.

2) Encoding: To encode message w, the encoder randomly and uniformly selects a codeword in the set C_j(w) for each 1 ≤ j ≤ M. Specifically,
• Select M integers k_1, k_2, …, k_M, where k_j is selected independently and uniformly from the set {1, 2, …, Q_j}.
• Given a message w, select the codeword u_{jk_j}^n(w) from codebook C_j(w) for j = 1, 2, …, M.
• The transmitted sequence on channel j is denoted x_j^n = x_j(1), x_j(2), …, x_j(n). The symbol x_j(t) is obtained by applying the (possibly stochastic) function f_j(·) to the t-th element of the codeword u_{jk_j}^n(w).

3) Decoding: Receiver i, based on its observations (y_{i1}^n, y_{i2}^n, …, y_{iM}^n) from the M parallel channels, declares message w according to the following rule:
• Let S_i = {j | 1 ≤ j ≤ M, I(U_j; Y_{ij}) > I(U_j; Y_{ej})} denote the set of channels where receiver i has larger mutual information than the eavesdropper. The receiver considers only the outputs y_{ij}^n from these channels.
• Receiver i searches for a message w such that, for each j ∈ S_i, there is an index l_j such that (u_{jl_j}^n(w), y_{ij}^n) ∈ T(U_j, Y_{ij}). If a unique w has this property, the receiver declares it as the transmitted message. Otherwise, the receiver declares an arbitrary message.

4) Error Probability: We show that, averaged over the ensemble of codebooks, the error probability is smaller than a constant ε′ (to be specified) which approaches zero as n → ∞. This demonstrates the existence of a codebook with error probability less than ε′. We do the analysis for user i and, without loss of generality, assume that message w_1 is transmitted.
• False Reject Event: Let E_{1j}^c be the event {(U_{jk_j}^n(w_1), Y_{ij}^n) ∉ T(U_j, Y_{ij})}. Since U_j^n ∈ T(U_j) by construction and Y_{ij} is obtained by passing U_j through a DMC, it follows that Pr(E_{1j}^c) ≤ δ, where δ → 0 as ε → 0. Accordingly, if E_1^c denotes the event that message w_1 does not appear typical, then we have
\[
\Pr(E_1^c) = \Pr\left( \bigcup_{j=1}^{M} E_{1j}^c \right) \le M\delta. \quad (53)
\]



• False Accept Event: As before, let S_i ⊆ {1, 2, …, M} denote the subset of channels for which I(U_j; Y_{ij}) > I(U_j; Y_{ej}); in what follows the index j refers only to channels in S_i. Let E_{rj} denote the event that there is a codeword in the set C_j(w_r) (r > 1) typical with Y_{ij}^n, and let E_r be the event that message w_r has a codeword typical on every channel. Then
\[
\begin{aligned}
\Pr(E_{rj}) &= \Pr\big( \exists\, l \in \{1, 2, \ldots, Q_j\} : (U_{jl}^n(w_r), Y_{ij}^n) \in T(U_j, Y_{ij}) \big) \\
&\le \sum_{l=1}^{Q_j} \Pr\big( (U_{jl}^n(w_r), Y_{ij}^n) \in T(U_j, Y_{ij}) \big) \\
&\le \sum_{l=1}^{Q_j} 2^{-n(I(U_j; Y_{ij}) - 3\delta)} \\
&\le 2^{-n(I(U_j; Y_{ij}) - I(U_j; Y_{ej}) - 3\delta + \varepsilon_F)},
\end{aligned}
\]
where the last inequality follows since Q_j = 2^{n(I(U_j; Y_{ej}) − ε_F)}. Finally, the probability of E_r can be computed as
\[
\begin{aligned}
\Pr(E_r) &= \Pr\left( \bigcap_{j \in S_i} E_{rj} \right) = \prod_{j \in S_i} \Pr(E_{rj}) \quad (54) \\
&\le 2^{-n \sum_{j \in S_i} ( I(U_j; Y_{ij}) - I(U_j; Y_{ej}) - 3\delta + \varepsilon_F )} \\
&= 2^{-n \sum_{j=1}^{M} ( \{ I(U_j; Y_{ij}) - I(U_j; Y_{ej}) \}^{+} - 3\delta + \varepsilon_F )},
\end{aligned}
\]

where (54) follows by the independence of the codebooks and channels. The probability of the false accept event E_F is then given by
\[
\Pr(E_F) = \Pr\left( \bigcup_{r=2}^{2^{nR}} E_r \right) \le 2^{nR}\, 2^{-n \sum_{j=1}^{M} ( \{ I(U_j; Y_{ij}) - I(U_j; Y_{ej}) \}^{+} - 3\delta + \varepsilon_F )} \le 2^{-n(3M\delta - M\varepsilon_F + \varepsilon_R)},
\]
which vanishes with increasing n for any ε_R and ε_F that satisfy the relation ε_R > Mε_F − 3Mδ > 0. The probability of error averaged over the ensemble of codebooks is thus less than ε′ = max(Mδ, 2^{−n(3Mδ − Mε_F + ε_R)}). This demonstrates the existence of a codebook with error probability less than ε′.

5) Secrecy Analysis: We now bound the equivocation at the eavesdropper for a typical code in the ensemble. Informally, since the codebook C_j has 2^{n(I(U_j; Y_{ej}) − ε_F)} codewords per bin, the eavesdropper's equivocation is near perfect when it observes the output of channel j alone, i.e., (1/n) I(W; Y_{ej}^n) ≤ ε′_F for some ε′_F (to be specified) such that ε′_F → 0 as ε_F → 0. Since we are sending the same message on each of the M channels, the eavesdropper can potentially reduce the equivocation by combining the channel outputs. However, in doing so its equivocation reduces by at most Mε′_F, since the codewords on each channel are independently selected.³ The following lemma is proved at the end of this section.

Lemma 6: A typical code from the ensemble in our achievability scheme satisfies the following: for any j = 1, 2, …, M, we have (1/n) I(W; Y_{ej}^n) ≤ ε′_F, where ε′_F = ε′_F(δ, ε_F) tends to zero as δ → 0 and ε_F → 0.

³It is important that the codewords be independently selected. If they are not (say, the same codeword is repeated on each channel), the eavesdropper's equivocation can be significantly reduced by combining the channel outputs.


Using the above lemma, we now upper bound the mutual information at the eavesdropper:
\[
\begin{aligned}
\frac{1}{n} I(W; Y_{e1}^n, \ldots, Y_{eM}^n) &= \frac{1}{n}\Big( h(Y_{e1}^n, \ldots, Y_{eM}^n) - h(Y_{e1}^n, \ldots, Y_{eM}^n \mid W) \Big) \quad (55) \\
&= \frac{1}{n}\Big( h(Y_{e1}^n, \ldots, Y_{eM}^n) - \sum_{j=1}^{M} h(Y_{ej}^n \mid W) \Big) \quad (56) \\
&\le \frac{1}{n} \sum_{j=1}^{M} I(W; Y_{ej}^n) \le M\varepsilon_F', \quad (57)
\end{aligned}
\]
where (56) holds since the codewords in the sets C_1(W), C_2(W), …, C_M(W) are independently selected, so that h(Y_{e1}^n, …, Y_{eM}^n | W) = Σ_{j=1}^M h(Y_{ej}^n | W), and (57) uses h(Y_{e1}^n, …, Y_{eM}^n) ≤ Σ_{j=1}^M h(Y_{ej}^n) together with Lemma 6. Hence the normalized mutual information increases only by a fixed amount due to observations on multiple channels. By choosing ε in (13) to equal Mε′_F, we satisfy the secrecy constraint. It remains to prove Lemma 6.

Proof: Since there are Q_j = 2^{nR_{ej}} codewords per message bin C_j(W) and each codeword is equally likely to be selected,
\[
\frac{1}{n} H(U_j^n \mid W) = R_{ej} = I(U_j; Y_{ej}) - \varepsilon_F, \quad (58)
\]

where the last equality follows from the definition of R_{ej} in (51). Since the number of codewords in each bin is less than 2^{n(I(U_j; Y_{ej}) − ε_F)}, we can select a code that satisfies Fano's inequality
\[
\frac{1}{n} H(U_j^n \mid W, Y_{ej}^n) \le \gamma \triangleq \frac{1}{n} + \varepsilon_F R_{ej}. \quad (59)
\]
The equivocation at the eavesdropper can be lower bounded as
\[
\begin{aligned}
H(W \mid Y_{ej}^n) &= H(W, U_j^n \mid Y_{ej}^n) - H(U_j^n \mid W, Y_{ej}^n) \\
&\ge H(U_j^n \mid Y_{ej}^n) - n\gamma \quad (60) \\
&= H(U_j^n) - I(U_j^n; Y_{ej}^n) - n\gamma \\
&= H(U_j^n, W) - I(U_j^n; Y_{ej}^n) - n\gamma \quad (61) \\
&= H(W) + H(U_j^n \mid W) - I(U_j^n; Y_{ej}^n) - n\gamma \\
&\ge H(W) + n I(U_j; Y_{ej}) - I(U_j^n; Y_{ej}^n) - n\gamma - n\varepsilon_F. \quad (62)
\end{aligned}
\]

Here (60) follows from substituting (59), (61) from the fact that W is deterministic given U_j^n, and (62) by substituting (58). We now show that, for a suitably chosen ε′ > 0,
\[
I(U_j^n; Y_{ej}^n) \le n I(U_j; Y_{ej}) + n\varepsilon'. \quad (63)
\]
First note the following:
\[
\left| -\frac{1}{n} \log p(y_j^n) - H(Y_j) \right| \le \delta, \quad \forall\, y_j^n \in T(Y_j)
\]
\[
\left| -\frac{1}{n} \log p(y_j^n \mid u_j^n) - H(Y_j \mid U_j) \right| \le \delta, \quad \forall\, (y_j^n, u_j^n) \in T(Y_j, U_j). \quad (64)
\]

Let J be an indicator variable which equals 1 if (y_j^n, u_j^n) ∈ T(Y_j, U_j). From (64) we note that
\[
I(U_j^n; Y_j^n \mid J = 1) \le n I(U_j; Y_j) + 2n\delta. \quad (65)
\]


Now we can upper bound I(U_j^n; Y_j^n) as
\[
\begin{aligned}
I(U_j^n; Y_j^n) &\le I(U_j^n; Y_j^n, J) = I(U_j^n; Y_j^n \mid J) + I(U_j^n; J) \\
&\le I(U_j^n; Y_j^n \mid J = 1) + I(U_j^n; Y_j^n \mid J = 0) \Pr(J = 0) + H_b(J) \quad (66) \\
&\le n I(U_j; Y_j) + 2n\delta + n\varepsilon \log |\mathcal{Y}| + 1, \quad (67)
\end{aligned}
\]
where (66) follows from the fact that I(U_j^n; J) ≤ H_b(J), the binary entropy of J, and the inequality (67) follows from the facts that H_b(J) ≤ 1, Pr(J = 0) ≤ ε, I(U_j^n; Y_j^n | J = 0) ≤ n log |Y|, and (65). We now select
\[
\varepsilon' = 2\delta + \varepsilon \log |\mathcal{Y}| + \frac{1}{n}.
\]
Combining (62) and (67), we have
\[
\frac{1}{n} I(W; Y_{ej}^n) \le \varepsilon' + \gamma + \varepsilon_F = 2\delta + \varepsilon \log |\mathcal{Y}| + \frac{2}{n} + \varepsilon_F R_{ej} + \varepsilon_F \triangleq \varepsilon_F'. \quad (68)
\]

APPENDIX IV
PROOF OF COROLLARY 2

First observe that the channel in (15) has the same capacity as the corresponding reversely degraded broadcast channel (see Fact 1), given by the following model: on channel j, let π_j(1), …, π_j(K + 1) denote the legitimate receivers and the eavesdropper ordered from strongest to weakest. For each 0 ≤ k ≤ K, the channel of user π_j(k + 1) is Ŷ_{π_j(k+1),j} = Ŷ_{π_j(k),j} + Ẑ_{kj}, with Ŷ_{π_j(0),j} ≜ X_j and σ²_{π_j(0),j} ≜ 0. The noise random variables Ẑ_{kj} ∼ N(0, σ²_{π_j(k+1),j} − σ²_{π_j(k),j}) are independent. The converse in Theorem 1 (with the appropriate Fano inequality) immediately extends to continuous alphabets. The achievability argument relies on weak typicality and also extends to the Gaussian case. Furthermore, since I(X_j; Ŷ_{ij} | Ŷ_{ej}) is a continuous and concave function of p(X) (see Fact 2), the power constraint can be incorporated:
\[
C_{K,M}^{\mathrm{common}}(P) = \max_{\substack{\prod_{j=1}^{M} p(X_j),\\ E[\sum_{j=1}^{M} X_j^2] \le P}} \; \min_{i \in \{1,2,\ldots,K\}} \sum_{j=1}^{M} I(X_j; \hat{Y}_{ij} \mid \hat{Y}_{ej}). \quad (69)
\]
Now observe that max_{p(X_j), E[X_j²] ≤ P_j} I(X_j; Ŷ_{ij} | Ŷ_{ej}) is the capacity of a Gaussian wiretap channel [12]. Accordingly, we have
\[
\max_{p(X_j),\, E[X_j^2] \le P_j} I(X_j; \hat{Y}_{ij} \mid \hat{Y}_{ej}) = \left\{ \frac{1}{2} \log\left(1 + \frac{P_j}{\sigma_{ij}^2}\right) - \frac{1}{2} \log\left(1 + \frac{P_j}{\sigma_{ej}^2}\right) \right\}^{+}. \quad (70)
\]
One then deduces (16).


APPENDIX V
ALTERNATIVE ACHIEVABLE SCHEME IN THEOREM 2

Following [9], our approach is to discretize the continuous-valued coefficients and thus create parallel channels, one for each quantized state; the number of parallel channels increases as the quantization becomes finer. In what follows we only quantize the magnitude of the fading coefficients: the receiver can always rotate the phase, so it plays no role. We quantize the channel gains into one of the q values A_1 = 0 < A_2 < … < A_q < A_{q+1} = J (any slot in which the channel gain of some user exceeds J is simply skipped). Receiver i is in state l ∈ {1, 2, …, q} at time t if A_l ≤ |H_i(t)|² < A_{l+1}; when in state l, the receiver's channel gain is pessimistically discretized to √A_l. Since there are K independent users, there are a total of M = q^K possible super-states, which we number S_1, S_2, …, S_M. Denote the quantized gain of user i in S_j by the double subscript S_{ij}, let p(S_j) denote the probability of state S_j, and let p_i(A_l) be the probability that user i is in state l, i.e., p_i(A_l) = Σ_{k: S_{ik} = A_l} p(S_k). In super-state S_j, the channels of user i and the eavesdropper are
\[
y_{ij}(t) = \sqrt{S_{ij}}\, x(t) + z_i(t), \qquad y_{ej}(t) = H_e(t)\, x(t) + z_e(t).
\]
By selecting U_j ∼ CN(0, P) and X_j = U_j, the argument in the summation in (18) (with the eavesdropper output (Y_{ej}, H_e)) is
\[
\begin{aligned}
\big\{ I(U_j; Y_{ij}) - I(U_j; Y_{ej}, H_e) \big\}^{+} &= \big\{ I(X_j; Y_{ij}) - I(X_j; Y_{ej}, H_e) \big\}^{+} \\
&= \big\{ I(X_j; \sqrt{S_{ij}} X_j + Z_i) - I(X_j; H_e X_j + Z_e, H_e) \big\}^{+} \\
&= \big\{ \log(1 + S_{ij} P) - E[\log(1 + |H_e|^2 P)] \big\}^{+}.
\end{aligned}
\]
Substituting in (18), we have that the following rate is achievable:
\[
\begin{aligned}
R_Q^{\mathrm{common}}(P) &= \min_{1 \le i \le K} \sum_{j=1}^{M} p(S_j) \big\{ \log(1 + S_{ij} P) - E[\log(1 + |H_e|^2 P)] \big\}^{+} \quad (71) \\
&= \min_{1 \le i \le K} \sum_{l=1}^{q} p_i(A_l) \big\{ \log(1 + A_l P) - E[\log(1 + |H_e|^2 P)] \big\}^{+}, \quad (72)
\end{aligned}
\]
where the second equality follows from rewriting the summation over the states of each individual user. By taking q → ∞ (with J fixed), one can invoke the dominated convergence theorem to show that the above sum converges to
\[
\min_{1 \le i \le K} \int_0^{J} \big\{ \log(1 + xP) - E[\log(1 + |H_e|^2 P)] \big\}^{+} p_i(x)\, dx
= \min_{1 \le i \le K} \left( \int_0^{\infty} \big\{ \log(1 + xP) - E[\log(1 + |H_e|^2 P)] \big\}^{+} p_i(x)\, dx - \int_J^{\infty} \big\{ \log(1 + xP) - E[\log(1 + |H_e|^2 P)] \big\}^{+} p_i(x)\, dx \right). \quad (73)
\]

It can be easily seen that min

1≤i≤K

Z

0



{log(1 + xP ) − E[log(1 + |He |2 P )]}+ pi (x)dx < ∞.

Hence we have that for any ε > 0 and for J sufficiently large Z ∞ {log(1 + xP ) − E[log(1 + |He |2 P )]}+ pi (x)dx < ε J

Since ε > 0 is arbitrary, combining (74) with (73) we obtain the expression for Rcommon (P ) in (6).

(74)
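The pessimistic-quantization argument above lends itself to a quick numerical experiment. The sketch below is our own illustration, not from the paper: it takes a single user (K = 1) with a unit-mean exponential squared gain, a uniform grid A_l = Jl/q on [0, J] (one convenient choice, not prescribed by the proof), and P = 10, and estimates the quantized rate (72) by Monte Carlo. Refining the grid drives the rate up toward the limiting integral, while a very coarse grid maps almost every slot to the zero state and yields almost no rate.

```python
import math, random

random.seed(0)
P, J = 10.0, 50.0

# E[log(1 + |H_e|^2 P)] for a unit-mean exponential |H_e|^2, by Monte Carlo.
he2 = [random.expovariate(1.0) for _ in range(200_000)]
c_e = sum(math.log2(1 + h * P) for h in he2) / len(he2)

def f(x):
    """Integrand of the limiting rate: {log(1 + xP) - E[log(1 + |H_e|^2 P)]}^+."""
    return max(math.log2(1 + x * P) - c_e, 0.0)

def quantized_rate(q, gains):
    """Rate (72) for a single user: a squared gain in [A_l, A_{l+1}) is
    pessimistically mapped down to A_l; slots with gain >= J are skipped."""
    total = 0.0
    for h in gains:
        if h >= J:
            continue
        A_l = (J / q) * int(h / J * q)  # uniform grid, chosen for illustration
        total += f(A_l)
    return total / len(gains)

gains = [random.expovariate(1.0) for _ in range(100_000)]
limit = sum(f(h) for h in gains if h < J) / len(gains)  # Monte Carlo of the integral
for q in (2, 8, 64):
    print(f"q = {q:3d}: rate = {quantized_rate(q, gains):.3f} bits/use")
print(f"limiting integral (same samples): {limit:.3f} bits/use")
```

Because the discretization is pessimistic and f is nondecreasing, every quantized rate lower-bounds the integral, and nested grids improve monotonically, mirroring the q → ∞ limit taken in (73).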


APPENDIX VI
JUSTIFICATION FOR (23)

We show that f_P(x) ≜ {log(1 + xP) − E[log(1 + |H_e|²P)]}⁺, defined in Section V-B, satisfies the conditions of the dominated convergence theorem, so that the expectation and the limit can be interchanged as in (23). Recall that the dominated convergence theorem states that if a_n(x) is a sequence of real-valued functions such that lim_n a_n(x) = a(x), and there is a b(x) such that |a_n(x)| ≤ b(x) for each x and ∫_x b(x) dx < ∞, then lim_n ∫_x a_n(x) dx = ∫_x a(x) dx.

Accordingly, let us define g(x) = log(1 + x) + γ/log_e 2, where γ is the Euler-Mascheroni constant (recall that |H_e|² is a unit-mean exponential, so E[log |H_e|²] = −γ/log_e 2). We will show that for each P > 0, f_P(x) ≤ g(x). For the case when P < 1 the claim is immediate. Observe that for P ≥ 1,

    f_P(x) = {log(1 + xP) − E[log(1 + |H_e|²P)]}⁺
           ≤ {log(1 + xP) − E[{log(|H_e|²P)}⁺]}⁺
           ≤ {log(1 + xP) − {E[log(|H_e|²P)]}⁺}⁺                    (75)
           = {log(1 + xP) − {−γ/log_e 2 + log P}⁺}⁺
           ≤ {log(1 + xP) + γ/log_e 2 − log P}⁺
           ≤ log(1 + x) + γ/log_e 2 = g(x).

Here (75) follows from the fact that the function {v}⁺ is convex in v, so by Jensen's inequality E[{v}⁺] ≥ {E[v]}⁺. Since f_P(x) ≥ 0 and it can be seen that E[g(|H|²)] < ∞, the conditions for the dominated convergence theorem are satisfied.

APPENDIX VII
PROOF OF THE UPPER BOUND IN LEMMA 3

Consider the channel with one receiver and one eavesdropper:

    Y(t) = H_max(t) X(t) + Z(t),
    Y_e(t) = H_e(t) X(t) + Z_e(t).    (76)
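Before bounding the genie-aided channel (76), it is worth seeing the kind of rate the bound (77) below describes. The Monte Carlo sketch here is our own illustration, not from the paper: it assumes unit-mean exponential squared gains (Rayleigh fading) for K legitimate users and the eavesdropper, and evaluates the objective of (77) for the particular flat allocation P(H_max) = P, which is one feasible choice and hence lower-bounds the maximum.

```python
import math, random

random.seed(1)
K, P, N = 4, 10.0, 200_000

def secrecy_rate_sample():
    """One realization of {log(1 + |H_max|^2 P) - log(1 + |H_e|^2 P)}^+
    with |H_1|^2, ..., |H_K|^2, |H_e|^2 i.i.d. unit-mean exponentials."""
    h_max = max(random.expovariate(1.0) for _ in range(K))
    h_e = random.expovariate(1.0)
    return max(math.log2(1 + h_max * P) - math.log2(1 + h_e * P), 0.0)

r_flat = sum(secrecy_rate_sample() for _ in range(N)) / N
print(f"flat-power objective of (77), K = {K}: {r_flat:.3f} bits/use")
```

Even with a flat allocation the rate is strictly positive: selection over K users makes |H_max|² stochastically larger than |H_e|², and the positive part discards the outage events where the eavesdropper still wins.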

Along the lines of Lemma 4 in Section VI-A, one deduces that the sum-secrecy-capacity of the channel (2) is upper bounded by the secrecy capacity of the genie-aided channel (76). It remains to show that an upper bound on the secrecy capacity of this channel is

    R⁺(P) = max_{P(H_max): E[P(H_max)] ≤ P}  E[ {log(1 + |H_max|² P(H_max)) − log(1 + |H_e|² P(H_max))}⁺ ].    (77)

In what follows we will denote the eavesdropper's channel output by Ŷ_e(t) = (Y_e(t), H_e(t)). The joint distribution of the noise variables (Z(t), Z_e(t)) is selected to be such that if |h_e(t)| ≤ |h_max(t)| we have X(t) → Y(t) → Y_e(t), and otherwise X(t) → Y_e(t) → Y(t). Suppose that there is a sequence of (n, 2^{nR}) codes that achieves perfect secrecy in Definition 6. Following


the derivation of the upper bound in Theorem 3, we have

    nR ≤ I(W; Y^n | H^n_max) − I(W; Ŷ^n_e | H^n_max) + nε
       ≤ I(W; Y^n, Ŷ^n_e | H^n_max) − I(W; Ŷ^n_e | H^n_max) + nε
       = I(W; Y^n | H^n_max, Ŷ^n_e) + nε
       ≤ I(X^n; Y^n | H^n_max, Ŷ^n_e) + nε                                   (78)
       ≤ Σ_{t=1}^n I(X(t); Y(t) | H_max(t), Ŷ_e(t)) + nε                     (79)
       = Σ_{t=1}^n I(X(t); Y(t) | H_max(t), Y_e(t), H_e(t)) + nε,            (80)

where (78) follows from the fact that W → (X^n, Ŷ^n_e, H^n_max) → Y^n forms a Markov chain, and (79) from the fact that the channel is memoryless.

Now we invoke the following result for the Gaussian wiretap channel [12]:

Lemma 7: Let (X, Y, Y_e) be random variables such that Y = √γ X + Z_r and Y_e = √µ X + Z_e. Suppose that Z_r ∼ CN(0, 1) and Z_e ∼ CN(0, 1), and that the joint distribution of (Z_r, Z_e) satisfies X → Y → Y_e if |µ| ≤ |γ| and X → Y_e → Y otherwise. Then we have

    max_{p(X): E[|X|²] ≤ P̄} I(X; Y | Y_e) = log(1 + γP̄) − log(1 + min(γ, µ)P̄).    (81)

Accordingly, note that from (80),

    I(X(t); Y(t) | H_max(t), Y_e(t), H_e(t)) ≤ E_{H_max(t), H_e(t)}[ {log( (1 + |H_max(t)|² E[|X(t)|²]) / (1 + |H_e(t)|² E[|X(t)|²]) )}⁺ ].    (82)

Furthermore, equality is obtained if the sequence X^n is a sequence of Gaussian random variables, conditionally independent given H^n_max. Accordingly, we can write

    X(t) = √(g_t(H^t_max)) T(t),

where T(t) is a sequence of i.i.d. zero-mean, unit-variance complex Gaussian random variables. We will now show that it suffices to have g_t(H^t_max) = γ(H_max(t)), i.e., that the function g_t(·) is time-invariant and depends only on the current value H_max(t). Note that

    E_{H^t_max, H_e(t)}[ {log( (1 + |H_max(t)|² g_t(H^t_max)) / (1 + |H_e(t)|² g_t(H^t_max)) )}⁺ ]
    = E_{H_max(t), H_e(t)}[ E_{H^{t−1}_max}[ {log( (1 + |H_max(t)|² g_t(H^t_max)) / (1 + |H_e(t)|² g_t(H^t_max)) )}⁺ | H_max(t) ] ]
    ≤ E_{H_max(t), H_e(t)}[ {log( (1 + |H_max(t)|² E_{H^{t−1}_max}[g_t(H^t_max) | H_max(t)]) / (1 + |H_e(t)|² E_{H^{t−1}_max}[g_t(H^t_max) | H_max(t)]) )}⁺ ]    (83)
    = E_{H_max(t), H_e(t)}[ {log( (1 + |H_max(t)|² γ_t(H_max(t))) / (1 + |H_e(t)|² γ_t(H_max(t))) )}⁺ ]    (84)
    = E_{H_max, H_e}[ {log( (1 + |H_max|² γ_t(H_max)) / (1 + |H_e|² γ_t(H_max)) )}⁺ ],    (85)


where (83) follows from the fact that the function {log((1 + ax)/(1 + bx))}⁺ is concave in x > 0 for each a and b, so Jensen's inequality can be applied; (84) follows by defining γ_t(H_max(t)) = E_{H^{t−1}_max}[g_t(H^t_max) | H_max(t)]; and (85) follows since the random variables (H_max(t), H_e(t)) have the same distribution for each t, so we can drop the index t.

Combining (85), (82) and (80), we have

    nR − nε ≤ Σ_{t=1}^n E_{H_max, H_e}[ {log( (1 + |H_max|² γ_t(H_max)) / (1 + |H_e|² γ_t(H_max)) )}⁺ ]
            ≤ n E_{H_max, H_e}[ {log( (1 + |H_max|² (1/n)Σ_{t=1}^n γ_t(H_max)) / (1 + |H_e|² (1/n)Σ_{t=1}^n γ_t(H_max)) )}⁺ ]    (86)
            = n E_{H_max, H_e}[ {log( (1 + |H_max|² γ(H_max)) / (1 + |H_e|² γ(H_max)) )}⁺ ],    (87)

where (86) follows from Jensen's inequality and (87) follows by defining γ(H_max) = (1/n) Σ_{t=1}^n γ_t(H_max). Finally, note that from the power constraint we must have E[γ(H_max)] ≤ P, and this gives the upper bound stated in (9) in Lemma 3.

APPENDIX VIII
CONVERGENCE CLAIM IN SUBSECTION VII-B

Recall that we only transmit in those periods where the channel gains of all users are less than J. We denote the quantized gains by (A^q_1 = 0, A^q_2, ..., A^q_q); for convenience, let A^q_{q+1} = J. Note that we have added the superscript q to be explicit regarding the number of partitions. While keeping J fixed, as we take q → ∞ we need to show that the expression in (31) converges as follows:

    R^−_K(P) = lim_{q→∞} Σ_{l=1}^q Pr(A^q_l) {log(1 + A^q_l P(A^q_l)) − E[log(1 + |H_e|² P(A^q_l))]}⁺
             = ∫_0^J {log(1 + aP(a)) − E[log(1 + |H_e|² P(a))]}⁺ p(a) da.    (88)

Let us define the function

    f_q(a) = {log(1 + A^q_l P(A^q_l)) − E[log(1 + |H_e|² P(A^q_l))]}⁺,    a ∈ [A^q_l, A^q_{l+1}).    (89)

The left-hand side of (88) now becomes

    lim_{q→∞} Σ_{l=1}^q Pr(A^q_l) f_q(A^q_l) = lim_{q→∞} Σ_{l=1}^q ∫_{A^q_l}^{A^q_{l+1}} p(a) f_q(a) da = lim_{q→∞} ∫_0^J p(a) f_q(a) da.    (90)

The first equality follows from the fact that Pr(A^q_l) = ∫_{A^q_l}^{A^q_{l+1}} p(a) da and f_q(a) is a constant for a ∈ [A^q_l, A^q_{l+1}). Now observe that

    lim_{q→∞} f_q(a) = f(a) ≜ {log(1 + aP(a)) − E[log(1 + |H_e|² P(a))]}⁺,    (91)

and since 0 ≤ f_q(a) ≤ f(a), from the result in Appendix VI we have that f_q(a) satisfies the conditions for the dominated convergence theorem. Hence the limit and the integral in (90) can be interchanged, yielding (88).
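Both this appendix and Appendix V lean on the Appendix VI domination f_P(x) ≤ g(x) = log(1 + x) + γ/log_e 2. As a numerical spot check (our own sketch, assuming a unit-mean exponential |H_e|², for which E[log |H_e|²] = −γ/log_e 2 in bits), the bound can be verified on a small grid of (P, x) pairs with P ≥ 1:

```python
import math, random

random.seed(2)
GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

# |H_e|^2 ~ unit-mean exponential (Rayleigh fading, magnitude squared).
he2 = [random.expovariate(1.0) for _ in range(200_000)]
P_GRID = (1.0, 4.0, 100.0)
# Monte Carlo estimates of E[log(1 + |H_e|^2 P)] (base-2 logs) for each P.
c_e = {P: sum(math.log2(1 + h * P) for h in he2) / len(he2) for P in P_GRID}

def f(x, P):
    """f_P(x) = {log(1 + xP) - E[log(1 + |H_e|^2 P)]}^+ from Appendix VI."""
    return max(math.log2(1 + x * P) - c_e[P], 0.0)

def g(x):
    """Dominating function g(x) = log(1 + x) + gamma / log_e(2)."""
    return math.log2(1 + x) + GAMMA / math.log(2.0)

ok = all(f(x, P) <= g(x) + 1e-6 for P in P_GRID for x in (0.0, 0.5, 2.0, 10.0))
print("f_P(x) <= g(x) on the grid:", ok)
```

Since g does not depend on P and is integrable against the fading density, it serves as the single dominating function needed for the interchange of limit and integral in (90).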


APPENDIX IX
HELPER LEMMA IN THE PROOF OF THEOREM 4

Lemma 8: Let |H_1|², |H_2|², ..., |H_K|², |H_e|² be i.i.d. unit-mean exponentials. For K ≥ 2, we have

    E[ log(|H_e|²/|H_max|²) | |H_e|² ≥ |H_max|² ] ≤ 2 log 2.

First note the following.

Fact 3 ([5]): Let V_1, V_2, ..., V_K, V_{K+1} be i.i.d. exponential random variables with mean λ, and let V_max(K+1) denote the largest of these exponentials and V_max(K) the second largest. The joint distribution of (V_max(K), V_max(K+1)) satisfies

    V_max(K+1) = V_max(K) + Y,    (92)

where Y is an exponential random variable with mean λ that is independent of V_max(K).

Proof: We have

    E[ log(|H_e|²/|H_max|²) | |H_e|² ≥ |H_max|² ] = E[ log( (|H_max|² + Y)/|H_max|² ) ]    (93)
        ≤ E[ Y/|H_max|² ]    (94)
        = E[Y] E[ 1/|H_max|² ]    (95)
        = E[ 1/|H_max|² ],    (96)

where (93) follows from Fact 3, since conditioned on |H_e|² ≥ |H_max|² the gain |H_e|² is the largest and |H_max|² the second largest of the K+1 i.i.d. gains; (94) follows from the identity log(1 + x) ≤ x for x > 0; (95) follows from the independence of Y and |H_max|²; and (96) from the fact that E[Y] = 1. Since |H_max|² ≥ max(|H_1|², |H_2|²), we obtain

    E[ 1/|H_max|² ] ≤ E[ 1/max(|H_1|², |H_2|²) ] ≤ 2 log 2.

REFERENCES

[1] J. Barros and M. R. D. Rodrigues, "Secrecy capacity of wireless channels," in Proc. Int. Symp. Inform. Theory, Seattle, July 2006.
[2] G. Caire and S. Shamai, "On the capacity of some channels with channel state information," IEEE Trans. Inform. Theory, vol. 45, pp. 2007–2019, 1999.
[3] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley and Sons, 1991.
[4] I. Csiszár and J. Körner, "Broadcast channels with confidential messages," IEEE Trans. Inform. Theory, vol. 24, pp. 339–348, 1978.
[5] H. A. David, Order Statistics. New York: Wiley, 1981.
[6] W. Diffie and M. E. Hellman, "New directions in cryptography," IEEE Trans. Inform. Theory, vol. IT-22, no. 6, pp. 644–654, 1976.
[7] A. A. El Gamal, "Capacity of the product and sum of two un-matched broadcast channels," Probl. Information Transmission, pp. 3–23, 1980.
[8] A. Fiat and M. Naor, "Broadcast encryption," in Proc. 13th Annual International Cryptology Conference (CRYPTO), Santa Barbara, CA, 1994, pp. 480–491.
[9] A. Goldsmith and P. Varaiya, "Capacity of fading channels with channel side information," IEEE Trans. Inform. Theory, vol. 43, pp. 1986–1992, Nov. 1997.
[10] P. Gopala, L. Lai, and H. El Gamal, "On the secrecy capacity of fading channels," IEEE Trans. Inform. Theory, submitted, Oct. 2006.
[11] A. Khisti, A. Tchamkerten, and G. W. Wornell, "Secure broadcasting with multiuser diversity," in Proc. 44th Allerton Conf. on Communication, Control and Computing, 2006.
[12] S. K. Leung-Yan-Cheong and M. E. Hellman, "The Gaussian wiretap channel," IEEE Trans. Inform. Theory, vol. 24, pp. 451–456, 1978.
[13] L. Li and A. J. Goldsmith, "Optimal resource allocation for fading broadcast channels - Part I: Ergodic capacity," IEEE Trans. Inform. Theory, vol. 47, pp. 1083–1102, Mar. 2001.


[14] Z. Li, R. Yates, and W. Trappe, "Secrecy capacity of independent parallel channels," in Proc. 44th Allerton Conf. on Communication, Control and Computing, 2006.
[15] ——, "Secret communication with a fading eavesdropper channel," in Proc. Int. Symp. Inform. Theory, 2007.
[16] U. M. Maurer, "Secret key agreement by public discussion from common information," IEEE Trans. Inform. Theory, vol. 39, pp. 733–742, Mar. 1993.
[17] U. M. Maurer and S. Wolf, "Information-theoretic key agreement: from weak to strong secrecy for free," in EUROCRYPT, 2000.
[18] R. Negi and S. Goel, "Secret communication using artificial noise," in Proc. IEEE Vehicular Tech. Conf., 2005.
[19] C. E. Shannon, "Communication theory of secrecy systems," Bell System Technical Journal, vol. 28, pp. 656–715, 1949.
[20] A. Thangaraj, S. Dihidar, A. R. Calderbank, S. W. McLaughlin, and J. M. Merolla, "Applications of LDPC codes to the wiretap channel," IEEE Trans. Inform. Theory, submitted, 2007.
[21] D. Tse, "Optimal power allocation over parallel Gaussian broadcast channels," unpublished, 1999.
[22] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge University Press, 2005.
[23] M. van Dijk, "On a special class of broadcast channels with confidential messages," IEEE Trans. Inform. Theory, vol. IT-43, no. 2, pp. 712–714, 1997.
[24] A. D. Wyner, "The wiretap channel," Bell Syst. Tech. J., vol. 54, pp. 1355–1387, 1975.