Capacity of Fading Broadcast Channels with Limited-Rate Feedback
Rajiv Agarwal and John Cioffi
Department of Electrical Engineering, Stanford University, CA-94305
Abstract— In this paper, we study a fading broadcast channel (BC) with perfect channel state information at the receiver (CSIR) and only a quantized version of it at the transmitter, due to limited-rate links for channel feedback at each user. We find an achievable region for the fading BC under this condition using super-position coding and show that it is sum-rate optimal. We also derive a closed-form expression for finding the channel partitioning, which turns out to have the same form as that for water-filling of power over time in fading channels. Using the derived closed-form expression with temporal water-filling of power at the transmitter in an iterative manner, we show numerically that a single iteration is adequate to achieve most of the capacity. Thus the complexity of finding the optimal (global maximum) or close-to-optimal (local maximum) channel partitioning is greatly reduced compared with a search-based k-means clustering algorithm like Lloyd's algorithm, which requires multiple iterations.
I. INTRODUCTION

One important scenario for multiuser wireless communications is the broadcast channel, where a single transmitter sends independent information to many receivers, for example in the downlink of a cellular system. The capacity region of a fading BC with perfect channel state information at the transmitter (CSIT) and the receivers (CSIR), along with the capacity-achieving transmission strategy, was found in [1]. However, knowing the channel perfectly at the transmitter requires 1) perfect channel measurements at the receiver and 2) perfect feedback of the estimates to the transmitter. The second requirement is usually harder to meet than the first in a practical system, especially when the cardinality of the channel state space is large and the receivers have limited-rate feedback links to the transmitter. For discrete fading distributions, this means that if the cardinality of the fading distribution is $N < \infty$, i.e. at any transmission instant the channel can be in one of $N$ possible states with a certain probability distribution, the receiver is allowed to feed back only $\log_2(M) < \log_2(N)$ bits to the transmitter for CSIT. In this paper, we study the fading BC with perfect CSIR and limited CSIT arising from limited-rate channel feedback links at the users.

The fading BC with limited-rate feedback is non-degraded, non-less-noisy and non-more-capable in general, and hence its capacity region is unknown [2]. Common multiuser transmission techniques for the BC like super-position coding (SPC), which requires one of the users to always decode the other user's message, or dirty paper coding (DPC), which requires perfect CSIT for one of the users, cannot be used in general. We derive an achievable region for the fading BC with limited-rate feedback for the symmetric case, i.e. when the two users have i.i.d. fading, the same $\log_2(M)$
number of bits for channel feedback and are restricted to have identical channel partitioning¹, using super-position coding. For each point on the boundary of the achievable region, we find the optimal channel partitioning at the users. We show that the derived rate region is sum-rate optimal. Thus, we find the sum capacity of the fading BC with limited-rate feedback for the symmetric case.

Notice that finding the optimal channel partitioning at the users is complex when $N$ and $M$ are large. In [3], the authors derived the capacity of a single-user fading channel with limited-rate feedback and showed that Lloyd's algorithm for k-means clustering can be applied to find the optimal (capacity-achieving) channel partitioning. The algorithm works in an iterative manner.

Lloyd's Algorithm:
1) Step 1: Assume an arbitrary initial channel partitioning into $M$ groups.
2) Step 2: For a given channel partitioning, find the optimal power allocation for the $M$ groups that minimizes the distortion/cost, defined to be the negative of 'ergodic capacity minus the power penalty'.
3) Step 3: For a given power allocation for the $M$ groups, find the optimal channel partitioning that minimizes the cost.
4) Step 4: Repeat Steps 2 and 3 until convergence.

In Lloyd's algorithm, the complexity of Step 2 is $O(N)$ using the expression for optimal power allocation (water-filling). The complexity of Step 3 is $O(NM)$, since for each channel gain value we find which of the $M$ groups it should be assigned to for the least distortion. So the overall complexity of a single iteration (assuming $M \ll N$) is $O(N)$. It takes about $2\sqrt{N}$ iterations [4] for the algorithm to converge, and even then the convergence can be to a local optimum. To avoid convergence to a local optimum, the authors in [3] suggest running Lloyd's algorithm from multiple different starting guesses (Step 1). The same technique can be used for the fading BC with limited-rate feedback as well, with a modified cost (distortion) function.
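The four steps of Lloyd's algorithm can be sketched in Python for the single-user case. This is a minimal illustration under stated assumptions, not the implementation from [3]: the Lagrange multiplier `lam` is held fixed rather than tuned to meet the power constraint, the per-group power in Step 2 is found by bisection on the derivative of the group objective rather than by a closed-form water-filling expression, and Step 1 uses equal-size groups of the sorted gains. All function and parameter names are hypothetical.

```python
import math


def rate_minus_penalty(gain, power, lam, n0=1.0):
    """Per-state utility: log2(1 + gain*power/n0) - lam*power."""
    return math.log2(1.0 + gain * power / n0) - lam * power


def group_power(gains, probs, lam, n0=1.0, iters=60):
    """Step 2 for one group: power maximising the probability-weighted utility.

    The objective is concave in the power, so bisection on its derivative
    is enough. Assumes lam > 0 and n0 > 0.
    """
    def slope(p):
        return sum(q * (g / ((n0 + g * p) * math.log(2)) - lam)
                   for g, q in zip(gains, probs))

    if not gains or slope(0.0) <= 0.0:
        return 0.0  # empty group, or a group too weak to be given any power
    lo, hi = 0.0, 1.0 / (lam * math.log(2))  # the slope is negative at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def powers_for(labels, gains, probs, m, lam, n0=1.0):
    """Step 2: one power level per group for a fixed partition."""
    members = [[] for _ in range(m)]
    for j, lab in enumerate(labels):
        members[lab].append(j)
    return [group_power([gains[j] for j in idx], [probs[j] for j in idx], lam, n0)
            for idx in members]


def objective(labels, powers, gains, probs, lam, n0=1.0):
    """Ergodic rate minus power penalty, i.e. the negative of the cost."""
    return sum(q * rate_minus_penalty(g, powers[lab], lam, n0)
               for g, q, lab in zip(gains, probs, labels))


def lloyd_partition(gains, probs, m, lam, n0=1.0, max_iter=100):
    n = len(gains)
    # Step 1: starting partition (assumption: equal-size groups of sorted gains).
    order = sorted(range(n), key=lambda j: gains[j])
    labels = [0] * n
    for rank, j in enumerate(order):
        labels[j] = rank * m // n
    for _ in range(max_iter):
        powers = powers_for(labels, gains, probs, m, lam, n0)  # Step 2
        # Step 3: move every gain to the group whose power suits it best, O(N*M).
        new_labels = [
            max(range(m),
                key=lambda k: rate_minus_penalty(gains[j], powers[k], lam, n0))
            for j in range(n)
        ]
        if new_labels == labels:  # Step 4: stop once the partition is stable
            break
        labels = new_labels
    return labels, powers_for(labels, gains, probs, m, lam, n0)


# Illustrative numbers only (not taken from the paper).
gains = [0.1, 0.5, 1.0, 2.0, 4.0, 8.0]
probs = [1.0 / 6.0] * 6
labels, powers = lloyd_partition(gains, probs, m=2, lam=0.3)
print(labels, powers)
```

Because Step 2 and Step 3 each maximize the same objective with the other quantity held fixed, the objective never decreases from one iteration to the next, which is also why the procedure can stall at a local optimum that depends on the starting partition.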
Alternatively, we derive a closed-form expression for finding a close-to-optimal² channel partitioning given a power allocation (Step 2). For the single-user case, the derived closed-form expression turns out to have the same form as that for water-filling of power over time in fading channels. For the fading BC, the expression is a modified water-filling because the users now cause interference to each other. Using the proposed closed-form expression for finding the channel partitioning, we can thus start from an intelligent starting point rather than an arbitrary one (Step 1), which may reduce the number of iterations. Indeed, we show numerically that a single iteration is now adequate to achieve capacity (the starting point was itself optimal) or most of the capacity (the starting point was very close to optimal). Thus the complexity of finding the optimal (global maximum) or close-to-optimal (local maximum) channel partitioning is greatly reduced compared with a search-based k-means clustering algorithm like Lloyd's algorithm, for which it was $O(N \cdot 2\sqrt{N})$ assuming $M \ll N$. Further, when $M$ is large, we propose an alternative iterative water-filling algorithm for finding the channel partitioning using the derived closed-form expression, which reduces the complexity of Step 3 in the iteration from $O(NM)$ to $O(N)$, now independent of $M$.

¹ We will use the phrase 'channel partitioning' throughout the paper to refer to the partitioning of the channel gain values (assumed to belong to a discrete set) into groups, or effectively a quantization of the channel gain values.
² As seen in numerical examples, the channel partitioning arrived at by using the proposed algorithm is either optimal or very close to it.

The rest of the paper is organized as follows. In Section II we describe the fading broadcast channel with limited-rate feedback (the system under study). In Section III, we find an achievable rate region for the fading BC with limited-rate feedback using SPC and show that it is sum-rate optimal. In Section IV, we derive closed-form expressions for channel partitioning for both the single-user fading channel and the two-user fading BC. In Section V, numerical results are provided for a discrete fading channel, where we compare the proposed achievable rate region with a single iteration to the achievable region with the globally optimum channel partitioning.
We also compare the proposed achievable region for the fading BC with limited-rate feedback to the capacity region with perfect CSIT (upper bound). Concluding remarks are given in Section VI.

II. BROADCAST CHANNEL MODEL

Notation: When referring to random variables, lowercase letters denote a realization of the random variable and capital letters denote the random variable itself. The notation $|X|$ denotes the cardinality of the discrete random variable $X$.

We consider a discrete-time fading broadcast channel with a single transmitter communicating independent information to 2 users³. The transmitted symbol $x[i]$ is composed of 2 independent information sources for the two users, where $i$ represents the time index. The time-varying channel gain of the path to user $k$ is denoted by $h_k[i]$, which remains constant during the $i$th channel use and is known to the respective user at all times (perfect CSIR). Each receiver has additive Gaussian noise. The received signal of user $k$ is thus

$$y_k[i] = h_k[i]\,x[i] + w_k[i], \qquad k = 1, 2 \tag{1}$$
³ All the results can be easily extended to more than two users, as discussed in [5].
where $w_k[i]$ is white Gaussian noise with power $N_0 B$, and $B$ is the transmission bandwidth. For simplicity, we assume $B = 1$ Hz throughout this paper. We also define the channel power gain $\gamma_k[i] = |h_k[i]|^2$, where the distribution of $h_k[i]$ induces a distribution on $\gamma_k[i]$. Further, we assume that $\Gamma_1[i]$ and $\Gamma_2[i]$ are discrete random variables for all $i$, having some joint probability mass function. We are interested in the case when $|\Gamma_1|$ and $|\Gamma_2|$ are large and the feedback link has a limited rate of $\log_2(M)$ bits for each user, so that each receiver can feed back only $\log_2(M)$ bits for CSIT. We further assume that the fading process is stationary and ergodic. Thus $f(\gamma_1[i], \gamma_2[i]) = f(\gamma_1, \gamma_2)\ \forall i$, and this joint distribution $f(\gamma_1, \gamma_2)$ is known to the transmitter and the users. Also, the input $x$ has an average power constraint $P$, i.e. $E[x^2] \le P$. Under these conditions, we derive an achievable rate region for the fading broadcast channel with limited-rate feedback.

III. ACHIEVABLE RATE REGION

When only $\log_2(M)$ bits of channel knowledge are available at the transmitter per user, from the transmitter's point of view the channel can be in one of $M^2$ states (called component channels) at each transmission instant, assuming $\Gamma_1$ and $\Gamma_2$ are independent of each other. For any general $\Gamma_1$ and $\Gamma_2$ and an arbitrary channel partitioning at the two users, the resulting component channels may not be degraded. One simple example is when $\Gamma_1 = \{10, 5, 4, 2\}$, all with equal probability, is partitioned as $\{(10, 5), (4, 2)\}$, and $\Gamma_2 = \{12, 8, 6, 3\}$, all with equal probability, is partitioned as $\{(12, 8), (6, 3)\}$, with $\Gamma_1$ and $\Gamma_2$ independent of each other⁴. Clearly, from the point of view of the transmitter, the channel can be in one of 4 states $\{[(10, 5), (12, 8)], [(10, 5), (6, 3)], [(4, 2), (6, 3)], [(4, 2), (12, 8)]\}$, all with equal probability. Notice that in all component channels both users experience fading and the fading gain value is not known to the transmitter.
The last component channel is degraded while the rest are non-degraded, and hence super-position coding cannot be used in general [6]. Furthermore, since in none of the component channels is the channel gain perfectly known to the transmitter for either of the users, dirty paper coding (DPC) also cannot be used as a possible transmission strategy. As found in [6], time-division multiplexing and more-capable-like transmission are possible transmission strategies in any component channel, but both are sub-optimum in general.

Although not much can be said about the capacity of the fading BC with limited-rate feedback for the general case, if we restrict to identical fading for the two users, i.e. $\Gamma_1 = \Gamma_2$ in distribution and independent of each other, and identical channel partitioning as well, all component channels are degraded. This is easily seen, as illustrated in Figure 1.

⁴ Notice that we do not need to consider groups like $\{(10, 4), (5, 2)\}$ because they do not tell the transmitter whether the channel is in a good state or a bad one. The channel partitioning is thus a quantization, where each representative value corresponds to a closed region and the regions corresponding to different representative values do not overlap.
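The four-state example and the identical-partitioning case can be checked by enumerating the component channels. The sketch below is illustrative only: it uses a simple ordering test as a stand-in for degradedness (a component channel counts as ordered when one user's group of gains lies entirely at or above the other's, or when both users see the same group), which is an assumption made here for the example rather than a formal definition from the paper. The function names are hypothetical.

```python
from itertools import product


def classify(group_1, group_2):
    """Compare the gain groups seen by user 1 and user 2 in one component channel."""
    if sorted(group_1) == sorted(group_2):
        return "iid"
    if min(group_1) >= max(group_2):
        return "user 1 better"
    if min(group_2) >= max(group_1):
        return "user 2 better"
    return "unordered"


def component_channels(partition_1, partition_2):
    """All pairs of groups, one pair per state as seen by the transmitter."""
    return {(g1, g2): classify(g1, g2)
            for g1, g2 in product(partition_1, partition_2)}


# Different fading sets and partitions, as in the example in the text.
mixed = component_channels([(10, 5), (4, 2)], [(12, 8), (6, 3)])

# Identical fading set and identical partitioning at both users.
same = component_channels([(10, 5), (4, 2)], [(10, 5), (4, 2)])

for pair, kind in mixed.items():
    print(pair, kind)
for pair, kind in same.items():
    print(pair, kind)
```

Under this test, only $[(4, 2), (12, 8)]$ is ordered in the mixed example, whereas with identical partitioning all four component channels are ordered: two are i.i.d., one favours user 1 and one favours user 2.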
Fig. 1. $M^2$ degraded component channels with identical fading and identical channel partitioning at the two users. In component channels drawn with a solid line, the users have i.i.d. fading; with a dash-dot line, User 1 is better; and with a dashed line, User 2 is better.
Henceforth in this section we will assume $\Gamma_1 = \Gamma_2 = \Gamma$ and $|\Gamma| = N$. Out of these $M^2$ component channels, in $M$ component channels the two users have i.i.d. fading, in $\frac{M(M-1)}{2}$ component channels user 1 is the better user, and in the remaining $\frac{M(M-1)}{2}$ component channels user 2 is the better user. In any case, all component channels are degraded, and super-position coding can be used and is capacity achieving, as shown in [7]. This can be considered a generalization of the probabilistic broadcast channel defined in [1]. In [1], each component channel was a degraded AWGN BC, whereas in our model each component channel, although degraded, still experiences fading. The transmitter generates $M^2$ codebooks, one for each component channel, with the optimum input distribution (satisfying the power constraint), chooses one of them based on the available CSIT at any time instant, and relies on the ergodicity of the channel to achieve the long-term average rate.

Although the component channels are degraded, the capacity is known only in terms of a mutual information expression with a maximization over the input distribution, because the component channels experience fading and the optimal input distribution depends on the underlying fading process. The optimal input distribution is not known in closed form even for simple fading processes like i.i.d. Rayleigh fading for both users [6]. Hence, in the discussion that follows, we restrict ourselves to Gaussian inputs. An achievable region can then be easily derived in closed form by optimal allocation
of the total power $P$ to the component channels and among the users, based on their priorities specified by $\mu_1, \mu_2 \in [0, 1]$ s.t. $\mu_1 + \mu_2 = 1$. The Lagrangian objective for optimal power allocation is to maximize $\mu_1 R_1 + \mu_2 R_2 - \lambda P$; it can be solved in closed form using utility functions as described in detail in [8], [9], where the resulting optimal $(R_1, R_2)$ is the rate pair on the boundary of the capacity region. The solution essentially involves a two-dimensional water-filling over time and over the users.

The above discussion assumed that the channel partitioning at the users is pre-decided (to be identical) and is given (Figure 1). If, on the other hand, the channel partitioning can be optimized as well, the capacity region is again unknown in general. This is because the optimal channel partitioning at the users for different priorities of the users ($\mu_1 \neq \mu_2$) need not be identical, and so the resulting component channels need not be degraded for a general $\Gamma$. This is easy to observe, and an example is given in Section V-B. However, the sum-rate optimal point (characterized by $\mu_1 = \mu_2$) will have identical partitioning for the two users⁵, and hence the sum-rate capacity is known and is achieved by super-position coding and $M^2$ codebooks with optimal power allocation for any general $\Gamma$.

Although the capacity region is unknown when the channel partitioning can also be optimized, if we restrict to identical partitioning for the two users for any $(\mu_1, \mu_2)$, the BC can be in one of $M^2$ component channels as already shown in Figure 1, all of which are degraded. Hence an achievable region can be proposed using super-position coding, where the optimal (identical) channel partitioning and power allocation among the component channels and the users need to be found for any given priorities of the users characterized by $\mu_1, \mu_2$. For a given priority $\mu_1, \mu_2 \in [0, 1]$ s.t. $\mu_1 + \mu_2 = 1$ of the two users, the objective is to maximize " Ã !
M (j ) X γ1 P1 1 Q , Iµ ≥µ(j1 ) µ1 Ej1 log 1 + 1 th N0 j1 =1 Ã !# (j ) γ2 P 2 1 p(j1 ) + Iµ