Tomlinson-Harashima Precoding with Adaptive Modulation for Fixed Relay Networks

Taiwen Tang∗, Chan-Byoung Chae∗, Robert W. Heath, Jr.∗ and Sunghyun Cho†

∗ Department of Electrical & Computer Engineering, Wireless Networking & Communications Group, The University of Texas at Austin, Austin, Texas. Email: {ttang, cbchae, rheath}@ece.utexas.edu

† Samsung Advanced Institute of Technology, Suwon, Korea. Email: [email protected]

Abstract— In this paper, a relaying strategy that uses multiple input multiple output (MIMO) fixed relays with linear processing for multiuser transmission in cellular networks is proposed. This approach applies to the two hop relaying scenario for coverage enhancement, where the base station transmits data to multiple users through one fixed relay (multiuser transmission). The fixed relay processes the received signal by applying a linear transformation and forwards the processed signal to multiple users. We propose an implementable multiuser precoding strategy that combines Tomlinson-Harashima precoding at the base station and linear signal processing at the relay with adaptive stream selection and QAM modulation. We propose a reduced complexity algorithm to select a subset of users to avoid an exhaustive search over all user permutations. Simulations illustrate the sum rate using Tomlinson-Harashima precoding without coding.

I. INTRODUCTION

Using fixed relays in cellular systems to boost coverage has received significant attention [1], [2]. Fixed relays are low-cost, low-transmit-power elements that receive and forward data from the base station to the users via wireless channels, and vice versa. Fixed relays enhance coverage in cellular networks when carefully placed at the cell edge or in regions with significant shadowing. Because they implement a subset of base station functions, fixed relays are a low-cost and low-complexity solution to meet the requirement of high data rate communication.

The main challenge in a fixed relay network is providing a high capacity link between the base station and the relay, while at the same time providing multiple data links to multiple users. A natural solution to this problem is to exploit the advantages of multiple-input multiple-output (MIMO) communication. It is well known that MIMO communication uses multiple antennas to enhance the system capacity and improve resilience against fading [3]. Initial work on MIMO relay channels [4], [5], however, only deals with the point-to-point MIMO relay channel. The point-to-multipoint case for the relay channel has received less attention. Research on the MIMO broadcast channel shows that dirty paper coding and multiuser transmission for point-to-multipoint transmission significantly outperforms time division multiple access (TDMA) that schedules one user in each time slot [6].

In this paper we assume that both the base station and the fixed relay have multiple antennas but that each mobile user has only a single receive antenna for simplicity. In this configuration, a high throughput MIMO link can be created between the base station and the fixed relay, and the MIMO broadcast channel can then be used to deliver data to multiple users. We consider the special case of MIMO fixed relays with linear processing to enhance multiuser transmission in the downlink of a cellular system. We assume a two-hop communication in which the relay stores the message received in the first time slot, applies a linear filter, and forwards it to the users in the second time slot. We also assume the practical case where the relay node does not decode the signal from the source, but simply processes the received signal with a matrix multiplication. Further, we neglect the direct connection between the source and the users. This is realistic for the case where relays are used to improve coverage and is helpful because it simplifies the analysis.

We propose to jointly design the precoder at the base station and the signal processing unit at the relay to improve throughput. The base station makes centralized decisions under the assumption of perfect channel state information about both the MIMO link and the links between the relay station and the users. This can be achieved by channel sounding in time division duplex (TDD) systems or using limited feedback approaches. Note that in our paper we assume perfect channel state information. Applying limited feedback techniques to this system, e.g., [7], [8], is an interesting future extension.

This material is based in part upon work supported by Samsung Advanced Institute of Technology.
We propose an implementable multiuser precoding strategy for this system that combines Tomlinson-Harashima precoding at the base station [9] with linear signal processing at the relay, adaptive stream selection, and QAM modulation [10], [11]. In comparison to decode-and-forward relaying strategies [12]–[14], a major benefit of our proposed approach is that it does not require decoding and re-encoding of the real-time data at the relay. Compared to prior work on MIMO relay design that uses linear signal processing at the relay [15]–[17], we consider simultaneous multiuser transmission, whereas prior work considers only single user transmission. The proposed implementable relaying strategy using Tomlinson-Harashima precoding is similar to the application of Tomlinson-Harashima precoding in the MIMO broadcast channel [9], [18]. Different design and optimization criteria for the Tomlinson-Harashima precoder, however, are proposed in our framework.

[Figure 1: block diagram. Labeled elements include the Base Station (BS), the MIMO link, the Relay Station (RS), the MIMO BC, the mobile stations MS 1, MS 2, ..., MS Nu, DL signalling/RS scheduling, optimization/scheduling, and channel state information feedback.]

Fig. 1. The multiuser MIMO relay system with linear processing. The relay uses linear processing. The base station performs error correction coding and dirty paper coding along with adaptive modulation coding.

II. SYSTEM MODEL

In this section, we elaborate on the system model of the multiuser fixed relay system. First we describe the system block diagram and main assumptions of the system, then we present the downlink signal model.

The system block diagram is illustrated in Fig. 1. We only consider two-hop relaying, i.e., there is at most one relay between the base station and the mobile users in our system. This choice reflects a tradeoff between deployment cost and system performance. Using more than two hops adds complexity to the system operation, especially routing, increases transmission latency, and may also lead to an increase of hardware complexity. In this paper, we consider a narrowband system. Our analysis, however, can be generalized to broadband systems with OFDM (orthogonal frequency division multiplexing), where each subcarrier in the frequency domain can be viewed as a narrowband channel.

The base station (BS) is deployed with $M_b$ transmit antennas and communicates with a fixed relay node that has $M_r$ antennas. A MIMO channel is thus created between the base station and the relay. We denote it by $H_1$. The BS uses a precoding strategy at the transmitter. This includes an encoding operation and a subsequent linear operation using a precoding matrix $F$ [19]. The base station encodes the data that is targeted to multiple users and sends it to the relay. The relay then broadcasts the data to multiple mobile users. In this narrowband system, the relaying operation is performed in a half-duplex time division duplex (TDD) mode. Each downlink frame consists of two phases of operations. In the first phase, the relay receives the signal from the base station.
In the second phase, this signal is processed and then forwarded to the users. Both phases span equal time durations.

Fig. 2. The multiuser MIMO relay system using Tomlinson-Harashima precoding.

In this paper, we focus on non-decode-and-forward relays because they do not require decoding and re-encoding of real-time data at the relay. We use a linear signal processing unit, denoted by a relaying matrix $W$ of size $M_r \times M_r$, to process the received signal at the relay. The transmitted signal vector intended for multiple users is denoted by $s$. The noise added to the received signal at the relay is represented by $n_1$. The number of users in the system is denoted by $N_u$. The relay forwards the signal from the BS to multiple users, each with one receive antenna. The channel between the relay and the $i$th mobile user is represented by a vector $h_{2,i}$ of size $M_r \times 1$. User $i$ observes the following linear combination of the transmitted signals

$$x_i = h_{2,i}^T W (H_1 F s + n_1) + n_{2,i} = h_{2,i}^T W H_1 F s + h_{2,i}^T W n_1 + n_{2,i} \tag{1}$$

where $x_i$ is the scalar received signal at user $i$. The noise term $n_{2,i}$ is a scalar at the $i$th user and $n_1$ is the noise vector at the relay. The elements of the noise vector $n_1$ and the noise term $n_{2,i}$ follow i.i.d. complex Gaussian distributions with zero mean and variances $\sigma_1^2$ and $\sigma_2^2$, respectively.

In this system, the total powers at the base station and the relay are $P_t$ and $P_r$, respectively. The entries of the input signal vector $s$ are independent, with zero mean and unit variance. The signal vector is coded by a linear precoder $F$ [19]. The signal after the linear precoding satisfies the total power constraint $E\|Fs\|^2 \le P_t$, where $E$ denotes expectation. Hence the transmit power constraint at the base station can be written as

$$\mathrm{Tr}\{F F^H\} \le P_t. \tag{2}$$

The received signal at the relay is processed by the relay matrix $W$. Denote $\tilde{s} = W H_1 F s + W n_1$. The relay power constraint is expressed as $E\|\tilde{s}\|^2 \le P_r$. This power constraint can be written explicitly with respect to $W$ and $F$ as follows

$$\mathrm{Tr}\{W (H_1 F F^H H_1^H + \sigma_1^2 I) W^H\} \le P_r \tag{3}$$

where $\mathrm{Tr}\{\cdot\}$ denotes the trace of a matrix. The noise power at the $i$th user is denoted by

$$v_i = h_{2,i}^T W W^H h_{2,i}^{*}\, \sigma_1^2 + \sigma_2^2 \tag{4}$$

where $(\cdot)^{*}$ denotes the conjugate operation.
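As a minimal numerical sketch of the constraints above, the relay power in (3) and the per-user noise power in (4) can be evaluated for given matrices. The sketch relies on the identities $\mathrm{Tr}\{AA^H\} = \|A\|_F^2$ and $h^T W W^H h^{*} = \|h^T W\|^2$; the helper names (`matmul`, `frob2`, `relay_power`, `user_noise_power`) and the example matrices are illustrative assumptions, not part of the original design.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows (entries may be complex)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def frob2(A):
    """Squared Frobenius norm, which equals Tr{A A^H}."""
    return sum(abs(x) ** 2 for row in A for x in row)

def relay_power(W, H1, F, sigma1_sq):
    """Left-hand side of (3): Tr{W (H1 F F^H H1^H + sigma1^2 I) W^H}."""
    return frob2(matmul(matmul(W, H1), F)) + sigma1_sq * frob2(W)

def user_noise_power(h2_i, W, sigma1_sq, sigma2_sq):
    """Effective noise power v_i in (4) for a user with channel vector h2_i."""
    row = matmul([h2_i], W)[0]  # the row vector h_{2,i}^T W
    return sigma1_sq * sum(abs(x) ** 2 for x in row) + sigma2_sq

# Hypothetical 2x2 example: the relay scales each received stream by 0.5.
W = [[0.5, 0.0], [0.0, 0.5]]
H1 = [[1.0, 0.0], [0.0, 2.0]]
F = [[1.0, 0.0], [0.0, 1.0]]
print(relay_power(W, H1, F, 0.1))
print(user_noise_power([1.0, 1j], W, 0.1, 0.2))
```

A candidate pair $(W, F)$ is feasible when `relay_power` does not exceed $P_r$ and `frob2(F)` does not exceed $P_t$.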

Overall, we model the multiuser MIMO relaying system as a broadcast channel in (1) with power constraints at the base station and the relay in (2) and (3), respectively.

III. PRACTICAL IMPLEMENTATION OF THE MIMO FIXED RELAY SYSTEM

In this section, we propose a practical relay design using Tomlinson-Harashima precoding (THP) [9] as a specific transceiver solution for the system. Using THP follows the idea of zero-forcing dirty paper coding [9] and is implementable in practical communication systems. The proposed THP architecture also combines adaptive modulation with square M-QAM modulation and adaptive stream selection based on instantaneous channel conditions.

A. Multiuser Tomlinson-Harashima Precoding for Fixed Relay Systems

In this subsection, we discuss using Tomlinson-Harashima precoding (THP) in the fixed relay system. We also utilize adaptive modulation with square M-QAM modulation, which works with THP, to maximize the sum rate. We call this scheme Tomlinson-Harashima precoding with adaptive modulation and coding (THP-AMC). For simplicity, we do not consider error correction coding for this scheme.

We stack the channel vectors between the relay and the users into a matrix $H_2 = [h_{2,1}, ..., h_{2,N_u}]^T$. Define $\bar{H}_2 = \Pi H_2$, which is a permuted version of the channel matrix. The matrix $\Pi$, which consists of zeros and ones, performs a permutation of the rows of the matrix $H_2$. Applying the QR decomposition to the channel matrix $\bar{H}_2$, we have the following

$$\bar{H}_2 = G_2 Q_2 \tag{5}$$

where $G_2$ is a lower triangular matrix and $Q_2$ is unitary. Performing the singular value decomposition,

$$H_1 = U_1 \Sigma_1 V_1^H \tag{6}$$

where both $U_1$ and $V_1$ are unitary, and the nonzero diagonal elements of the matrix $\Sigma_1$ are placed in descending order ($\sqrt{\nu_1} \ge ... \ge \sqrt{\nu_{\min\{M_b, M_r\}}}$). We construct a diagonal matrix $K$ such that $K K^H = \mathrm{diag}\{k_1, ..., k_{M_r}\}$. In our design, the linear processing matrix $W$ is expressed with respect to $K$ as

$$W = Q_2^H K U_1^H. \tag{7}$$

The precoder is structured such that $F F^H = V_1 \Theta V_1^H$, where $\Theta = \mathrm{diag}\{p_1, ..., p_{M_u}, 0, ..., 0\}$ denotes the powers allocated to the transmit streams. We stack the powers in a vector $p = [p_1, ..., p_{M_u}]^T$, which includes all nonzero diagonal elements of $\Theta$. We jointly design $p$ and $k$ to adjust the power allocation at the base station and at the relay.

The diagram of the THP is illustrated in Fig. 2. We focus on the implementation of the THP along with the SVD based designs at the relay. A feedback loop is employed at the base station to perform the Tomlinson-Harashima precoding. It is constructed based on the following factorization

$$B = D_2 G_2 \tag{8}$$

where $D_2$ is a diagonal matrix $D_2 = \mathrm{diag}\{G_2(1,1)^{-1}, ..., G_2(M_u, M_u)^{-1}\}$. This guarantees that the diagonal elements of the matrix $B$ are ones.

We select the users using the algorithm described in Section III-B. This algorithm significantly lowers the complexity of user selection. In this section, we focus on the algorithms that adaptively determine the QAM modulation of this system. Given $M_u$ selected users, we assume that equal power is allocated to the input streams of the THP, thus only the relay power allocation is required. This lowers the implementation complexity. We show in [20] that the achievable rates of using equal power allocation and unequal power allocation at the base station are close to each other. The signal to noise ratio at the $i$th receiver can be derived as

$$\eta_i = \frac{\bar{d}_i P_t}{M_u v_i}, \tag{9}$$

where $\bar{d}_i = \nu_i k_i |G_2(i,i)|^2$. The noise term can be written as

$$v_i = \sum_{j=1}^{i} |G_2(i,j)|^2 k_j \sigma_1^2 + \sigma_2^2. \tag{10}$$
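The feedback structure built from (8) can be illustrated with a small sketch. It assumes the standard THP recursion described in the precoding literature (for example [9]): each symbol has the interference of the previously precoded symbols subtracted and is then folded back by a modulo operation. For brevity the sketch uses real-valued M-ary PAM symbols; the function names and the example matrix are hypothetical, and a square QAM system would apply the same steps to the real and imaginary parts.

```python
def mod_sym(x, M):
    """Fold x into [-M, M), the modulo interval for PAM points +-1, +-3, ..., +-(M-1)."""
    return (x + M) % (2 * M) - M

def unit_diag(G):
    """B = D2 G2 with D2 = diag(1 / G2(i,i)), so that B has ones on its diagonal."""
    return [[g / row[i] for g in row] for i, row in enumerate(G)]

def thp_precode(a, B, M):
    """Successively pre-subtract the interference described by B - I, then fold."""
    x = []
    for i, a_i in enumerate(a):
        interference = sum(B[i][j] * x[j] for j in range(i))
        x.append(mod_sym(a_i - interference, M))
    return x

# Hypothetical lower triangular factor and 4-PAM data symbols.
G2 = [[2.0, 0.0, 0.0], [1.0, 4.0, 0.0], [-3.0, 2.0, 5.0]]
B = unit_diag(G2)
a = [1.0, -3.0, 3.0]
x = thp_precode(a, B, 4)

# Noise-free check: after the triangular channel B, a modulo at each receiver
# recovers the data symbols.
y = [sum(B[i][j] * x[j] for j in range(i + 1)) for i in range(len(a))]
print([mod_sym(v, 4) for v in y])
```

The modulo step keeps every precoded symbol inside a bounded interval, which is the source of the power penalty that THP incurs relative to plain QAM.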

Using the TH precoder enables the base station to send separate data streams to different users simultaneously. The symbol error rate (SER) with M-QAM modulation for THP can be approximated as [9]

$$\mathrm{SER}_i \approx K_i\, Q\!\left(\sqrt{\frac{3\eta_i}{2^{R_i}}}\right) \tag{11}$$

where $R_i$ denotes the data rate when QAM of size $M_i = 2^{R_i}$ is used. Notice that a factor $\frac{2^{R_i}}{2^{R_i}-1}$ is contained in the SER expression to account for the power penalty of using THP [9]. Here $K_i$ denotes the number of nearest neighbors associated with the periodically extended QAM constellation, and $K_i = 4$ for $M_i \ge 4$. For user $i$, a target SER is specified as $\mathrm{SER}_i^t$. The condition that $\mathrm{SER}_i \le \mathrm{SER}_i^t$ is equivalent to $\mathrm{SNR}_i = \frac{3\eta_i}{2^{R_i}} \ge \mathrm{SNR}_i^t$, where $\mathrm{SNR}_i^t$ is the target SNR for the $i$th user. This implies that

$$R_i \le \log_2\!\left(\frac{3\eta_i}{\mathrm{SNR}_i^t}\right) = \log_2(\eta_i) + \log_2(3) - \log_2(\mathrm{SNR}_i^t). \tag{12}$$

Hence the sum rate is bounded by

$$\frac{1}{2}\sum_{i=1}^{M_u} R_i \le \frac{1}{2}\sum_{i=1}^{M_u} \log_2(\eta_i) + \frac{M_u}{2}\log_2(3) - \frac{1}{2}\sum_{i=1}^{M_u} \log_2(\mathrm{SNR}_i^t), \tag{13}$$

where the QAM constellation set is restricted to be $\mathcal{M} = \{1, 2, 4, 16, 64, 256, 1024\}$, hence the rate set is $R_T = \{0, 1, 2, 4, 6, 8, 10\}$.

Clearly, maximizing the quantity $\sum_{i=1}^{M_u} \log_2(\eta_i)$ raises this bound on the sum rate. For simplicity of computation, we design the relay coefficients by maximizing a lower bound $\sum_{i=1}^{M_u} \log_2(\eta_i^{(l)})$, where $\eta_i^{(l)}$ is a lower bound of the SNR at the $i$th user

$$\eta_i^{(l)} = c_i k_i, \tag{14}$$

and

$$c_i = \frac{\nu_i |G_2(i,i)|^2 P_t/M_u}{\dfrac{P_r \sigma_1^2 \|h_{2,i}\|^2}{\nu_i P_t/M_u + \sigma_1^2} + \sigma_2^2}. \tag{15}$$
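The SER approximation in (11) and the rate restriction to the feasible set can be checked numerically with a short sketch. It takes $R_i = \log_2 M_i$ for the listed constellation sizes and, as an assumption made here for simplicity, uses $K_i = 4$ for every nonzero rate (the text specifies this value only for $M_i \ge 4$). Instead of the closed-form bound in (12), the sketch searches the rate set directly for the largest rate whose approximate SER meets the target; the function names are illustrative.

```python
import math

# Bits per symbol for the QAM sizes {1, 2, 4, 16, 64, 256, 1024}.
RATE_SET = (0, 1, 2, 4, 6, 8, 10)

def qfunc(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ser_thp(eta, rate, neighbors=4):
    """Approximate SER of (11) for post-processing SNR eta and rate R_i = rate."""
    return neighbors * qfunc(math.sqrt(3.0 * eta / 2 ** rate))

def max_rate(eta, ser_target):
    """Largest rate in RATE_SET whose approximate SER does not exceed the target."""
    for rate in sorted(RATE_SET, reverse=True):
        if rate > 0 and ser_thp(eta, rate) <= ser_target:
            return rate
    return 0  # no constellation meets the target; the stream carries no data

# Example: a stream at 20 dB SNR with a target SER of 1e-2.
print(max_rate(100.0, 1e-2))
```

A return value of 0 corresponds to a stream for which no constellation in the set satisfies the SER target.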

The proof of the bound in (14) and (15) is available in [20]. The optimization problem is formulated as

$$\min_{k} \; -\frac{1}{2}\sum_{i=1}^{M_u} \log_2(c_i k_i) \quad \text{s.t.} \quad \sum_{i=1}^{M_u} k_i\,(\nu_i P_t/M_u + \sigma_1^2) \le P_r, \quad k_i \ge 0 \;\; \forall i. \tag{16}$$

The closed-form solution to this problem is given as $k_i = \frac{P_r}{M_u m_i}$, where $m_i = \nu_i P_t/M_u + \sigma_1^2$. After finding the optimum $k$, the rate for the $i$th user is obtained by truncating $\log_2(\eta_i) + \log_2(3) - \log_2(\mathrm{SNR}_i^t)$ to the nearest point in the feasible rate set $R_T$ that is no greater than $\log_2(\eta_i) + \log_2(3) - \log_2(\mathrm{SNR}_i^t)$. We may obtain $R_i = 0$ for some streams, which implies that the channel supports no QAM modulation that meets the target symbol error rate. In that case, we drop the intended user, re-compute the number of transmit streams, and update the user set. This requires a re-allocation of the transmit power at the relay based on (16) with the updated user set.

B. A Greedy Reduced Complexity User Selection Algorithm

The optimum selection of users requires an exhaustive search over all user permutations. When the number of users $N_u$ is large or $M_u$ is large, it is costly to search over the permutations of all users to maximize the sum rate. Thus it is of interest to study a user selection algorithm to determine a permutation of the $M_u$ users out of all the users. In this subsection, we propose a greedy reduced complexity user selection algorithm to determine a user permutation with desirable channel conditions.

We assume that all the users are arranged in a permutation order $\Pi$. Then the base station selects the first $T$ users out of all users, where $T \le M_u$ and $M_u = \min\{M_r, M_b, N_u\}$. This corresponds to operating on the first $T$ rows of the aggregated channel matrix $\bar{H}_2$ in (5). Equal power is allocated to the transmitted streams, thus each stream is loaded with power $P_t/T$. The algorithm proposed here is a modification of the algorithm in [21], [22]. The metric for user selection is based on the lower bound to the SNR in (14).

The set of users is denoted by $U = \{1, 2, ..., N_u\}$. We elaborate the selection algorithm here (the description follows [22]).

Algorithm 1: Reduced Complexity User Selection
1) Initialization
• Set $n = 1$.
• Let

$$r_{1,u} = \frac{\|h_{2,u}\|^2}{\dfrac{P_r \sigma_1^2 \|h_{2,u}\|^2}{\nu_1 P_t/M_u + \sigma_1^2} + \sigma_2^2}.$$

Find a user $s_1$ such that $s_1 = \arg\max_{u \in U} r_{1,u}$. Let $S_1 = \{s_1\}$.
2) While $n < M_u$:
• Increase $n$ by 1.
• Project each remaining channel vector onto the orthogonal complement of the subspace spanned by the channels of the selected users. The projection matrix is

$$P_n^{\perp} = I_{M_r} - H_2(S_{n-1})^H \left(H_2(S_{n-1})\, H_2(S_{n-1})^H\right)^{-1} H_2(S_{n-1}) \tag{17}$$

where $I_{M_r}$ is the $M_r \times M_r$ identity matrix, and $H_2(S_{n-1})$ denotes the row-reduced channel matrix

$$H_2(S_{n-1}) = [h_{2,s_1}, ..., h_{2,s_{n-1}}]^T \tag{18}$$

consisting of the channel vectors of the users selected in the first $n-1$ steps.
• Let $\tau_{n,u} = \|h_{2,u}^T P_n^{\perp}\|^2$. Find a user $s_n$ such that

$$s_n = \arg\max_{u \in U \setminus S_{n-1}} \frac{\tau_{n,u}}{\dfrac{\|h_{2,u}\|^2 P_r \sigma_1^2}{\nu_n P_t/M_u + \sigma_1^2} + \sigma_2^2}. \tag{19}$$

• Set $S_n = S_{n-1} \cup \{s_n\}$.

This algorithm aims to select a group of $M_u$ users who have good channels. We illustrate the performance gain of this algorithm in the simulations.
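The selection steps of Algorithm 1 can be sketched in a few lines. Rather than forming the projection matrix in (17) explicitly, this sketch keeps an orthonormal basis of the selected channel vectors and computes $\tau_{n,u}$ as the squared norm of the residual after removing the components along that basis, which yields the same quantity. The function name, argument names, and example values are assumptions for illustration; `nu` is taken to hold the squared singular values of $H_1$ in descending order, and user indices start at 0.

```python
import math

def greedy_user_selection(H2, nu, Pt, Pr, sigma1_sq, sigma2_sq, Mu):
    """Return the indices of up to Mu users in the order they are selected.

    H2 : list of user channel vectors (each a list of Mr real or complex entries)
    nu : squared singular values of H1 in descending order (at least Mu entries)
    """
    norms = [sum(abs(x) ** 2 for x in h) for h in H2]
    selected, basis = [], []  # basis: orthonormal vectors spanning the selected channels
    for n in range(Mu):
        scale = Pr * sigma1_sq / (nu[n] * Pt / Mu + sigma1_sq)
        best = None
        for u, h in enumerate(H2):
            if u in selected:
                continue
            # Residual of h after removing its components along the selected
            # channels; its squared norm equals tau_{n,u}.
            res = list(h)
            for q in basis:
                coeff = sum(a * b.conjugate() for a, b in zip(h, q))
                res = [r - coeff * b for r, b in zip(res, q)]
            tau = sum(abs(x) ** 2 for x in res)
            metric = tau / (norms[u] * scale + sigma2_sq)
            if best is None or metric > best[0]:
                best = (metric, u, res, tau)
        if best is None or best[3] <= 1e-12:
            break  # no users left, or none contributes a new spatial direction
        _, u, res, tau = best
        selected.append(u)
        basis.append([x / math.sqrt(tau) for x in res])
    return selected

# Hypothetical example: three users, a two-antenna relay, two streams.
H2 = [[1.0, 0.0], [0.9, 0.1], [0.0, 0.8]]
print(greedy_user_selection(H2, [2.0, 1.0], 1.0, 1.0, 0.1, 0.1, 2))
```

In this example the second user has a strong channel but is nearly collinear with the first, so the projection step favors the third user, whose channel is orthogonal to the first selection.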

C. THP Adaptive Modulation with Reduced Complexity User Selection

The combination of the THP adaptive modulation and the reduced complexity user selection algorithm results in a low complexity system operation algorithm. We describe the entire algorithm below.

Algorithm 2: THP-AMC
1) Select $S$ (the set of user indices with $M_u$ users) based on the algorithm in Section III-B, and set $T = M_u$.
2) Perform (16) with $S$ and $T$, and determine $R_i$ by the truncation.
3) While not (($R_i > 0$, $\forall i \in S$) or ($S$ is empty)):
• Update $T \leftarrow \sum_{i=1}^{T} I\{R_i > 0\}$;
• Update $S \leftarrow \{i \in S : R_i > 0\}$ while preserving the user ordering in $S$;
• Perform (16) with the updated $S$ and $T$;
• Determine $R_i$ by the truncation.
4) End

Notice that using this algorithm, the encoding and decoding are carried out only at the ends, i.e., the base station and the users. We significantly lower the complexity and processing latency compared to the conventional decode-and-forward strategy. The greedy THP-AMC algorithm adapts the modulation order and the user selection to the channel state information. The adaptation functionality is deployed at the base station.

IV. SIMULATION RESULTS

We assume that the channel between the BS and the relay $H_1$ and the channel between the relay and the mobile users $H_2$ follow i.i.d. complex Gaussian distributions with zero mean and variances $\alpha$ and 1, respectively. The parameter $\alpha$ represents the path loss for the first MIMO link. We set it to be 0.05 in the simulations. The total transmit power at the base station is $P_t$ and the total power at the relay is $P_r$. We set $P_t = P_r = 1$ in the simulations. Here the signal to noise ratios are defined as $\mathrm{SNR}_1 = \frac{\alpha P_t}{\sigma_1^2}$ and $\mathrm{SNR}_2 = \frac{P_r}{\sigma_2^2}$, respectively, representing the average receive SNRs at the receive antennas. The parameters $\sigma_1^2$ and $\sigma_2^2$ are the noise variances. In the simulations, we make the quantities $\mathrm{SNR}_1$ and $\mathrm{SNR}_2$ equal to each other. This implies that the path losses are the same for the first and the second link. We simulate the performance of the relaying THP with adaptive modulation.
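As a rough illustration of what one channel realization in such a simulation involves, the rate adaptation loop of Algorithm 2 can be sketched as follows. The sketch assumes that the users have already been selected and ordered, that their channel vectors are linearly independent, and that the squared singular values `nu` of $H_1$ are supplied by the caller. The lower triangular factor of (5) is obtained by Gram-Schmidt on the channel rows, the relay gains use the closed-form solution of (16) with $M_u$ replaced by the current stream count $T$, and rates follow the truncation rule with $R_i = \log_2 M_i$. All function names and example values are hypothetical.

```python
import math

RATE_SET = (0, 1, 2, 4, 6, 8, 10)  # log2 of the QAM sizes {1, 2, 4, 16, 64, 256, 1024}

def lower_triangular_factor(H):
    """Gram-Schmidt on the rows of H, returning the lower triangular G in H = G Q."""
    Q, G = [], []
    for h in H:
        coeffs = [sum(a * b.conjugate() for a, b in zip(h, q)) for q in Q]
        res = list(h)
        for c, q in zip(coeffs, Q):
            res = [r - c * b for r, b in zip(res, q)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in res))
        Q.append([x / norm for x in res])
        G.append(coeffs + [norm])  # row i holds G(i,0), ..., G(i,i)
    return G

def stream_snrs(G, nu, Pt, Pr, sigma1_sq, sigma2_sq):
    """Per-stream SNRs from (9) and (10) with the closed-form relay gains k_i."""
    T = len(G)
    k = [Pr / (T * (nu[i] * Pt / T + sigma1_sq)) for i in range(T)]
    eta = []
    for i in range(T):
        v = sum(abs(G[i][j]) ** 2 * k[j] for j in range(i + 1)) * sigma1_sq + sigma2_sq
        eta.append(nu[i] * k[i] * abs(G[i][i]) ** 2 * Pt / (T * v))
    return eta

def truncate_rate(eta, snr_target):
    """Largest feasible rate not exceeding log2(3 * eta / snr_target), else 0."""
    bound = math.log2(3.0 * eta / snr_target)
    feasible = [r for r in RATE_SET if r <= bound]
    return max(feasible) if feasible else 0

def thp_amc(H2_sel, nu, Pt, Pr, sigma1_sq, sigma2_sq, snr_target):
    """Drop zero-rate users and re-allocate until every remaining user has R_i > 0."""
    users = list(range(len(H2_sel)))
    while users:
        G = lower_triangular_factor([H2_sel[u] for u in users])
        eta = stream_snrs(G, nu, Pt, Pr, sigma1_sq, sigma2_sq)
        rates = [truncate_rate(e, snr_target) for e in eta]
        if all(r > 0 for r in rates):
            return dict(zip(users, rates))
        users = [u for u, r in zip(users, rates) if r > 0]
    return {}

# Hypothetical example: the second user's channel is too weak to carry any rate.
print(thp_amc([[1.0, 0.0], [0.0, 0.1]], [4.0, 1.0], 1.0, 1.0, 0.01, 0.01, 8.0))
```

Dropping a user concentrates the base station power on fewer streams, so the remaining users can end up with a higher rate than they would have had in the larger set.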
The sum rate of this scheme is compared with a capacity upper bound of this system. This is the ergodic capacity of the MIMO channel normalized by the time-sharing factor $\frac{1}{2}$, i.e., $\frac{1}{2} C_1(H_1)$, which serves as a reference curve. We illustrate the sum rates for the scenarios where the target symbol error rates for all users are $10^{-1}$ and $10^{-2}$, respectively. We also perform Monte Carlo simulations to study the SER performance. From Fig. 4, we observe that the adaptation meets the SER requirements in most of the SNR region.

[Figure 3: average sum rate (bps/Hz) versus SNR (dB), from 5 dB to 30 dB. Curves: capacity upper bound, THP AMC (target SER = $10^{-1}$), THP AMC (target SER = $10^{-2}$).]

Fig. 3. The performance of the Tomlinson-Harashima precoding with adaptive modulation and stream selection in the fixed relay system for the number of antennas at the BS $M_b = 2$, the number of antennas at the RS $M_r = 3$ and the number of users $N_u = 5$.

Fig. 4. The symbol error rate performance of the Tomlinson-Harashima precoding with adaptive modulation and stream selection in the fixed relay system for the number of antennas at the BS $M_b = 2$, the number of antennas at the RS $M_r = 3$ and the number of users $N_u = 5$.

V. CONCLUSIONS

In this paper, we investigated the design problem of using two-hop transmission through one non-decode-and-forward MIMO fixed relay for coverage enhancement in cellular networks. The relay uses linear signal processing to process the received signal and forward it to multiple users. This relaying strategy leverages MIMO technology to achieve high spectral efficiency and it does not require decoding and re-encoding of the real-time data at the relay. We proposed a system architecture for relay-aided broadcasting. This architecture combines Tomlinson-Harashima precoding with adaptive modulation to adaptively determine the number of transmit streams and the QAM modulation.

REFERENCES

[1] R. Pabst, B. H. Walke, D. C. Schultz, D. C. Herhold, H. Yanikomeroglu, S. Mukherjee, H. Viswanathan, M. Lott, W. Zirwas, M. Dohler, H. Aghvami, D. D. Falconer, and G. P. Fettweis, “Relay-based deployment concepts for wireless and mobile broadband radio,” IEEE Commun. Mag., vol. 42, no. 9, pp. 80–89, Sept. 2004.
[2] H. Hu, H. Yanikomeroglu, D. D. Falconer, and S. Periyalwar, “Range extension without capacity penalty in cellular networks with digital fixed relays,” in Proc. Glob. Telecom. Conf., Nov. 29 - Dec. 3 2004, vol. 5, pp. 3053–3057.
[3] D. Gesbert, M. Shafi, D.-S. Shiu, P. J. Smith, and A. Naguib, “From theory to practice: An overview of MIMO space-time coded wireless systems,” IEEE Jour. Select. Areas in Commun., vol. 21, no. 3, pp. 281–302, April 2003.
[4] B. Wang, J. Zhang, and A. Host-Madsen, “On the capacity of MIMO relay channels,” IEEE Trans. Inform. Theory, vol. 51, no. 1, pp. 29–43, Jan. 2005.
[5] C. K. Lo, S. Vishwanath, and R. W. Heath, Jr., “Sum-rate bounds for MIMO relay channels using precoding,” in Proc. Glob. Telecom. Conf., 2005.
[6] N. Jindal and A. Goldsmith, “Dirty-paper coding versus TDMA for MIMO broadcast channels,” IEEE Trans. Inform. Theory, vol. 51, no. 5, pp. 1783–1794, May 2005.
[7] D. J. Love, R. W. Heath, Jr., and T. Strohmer, “Grassmannian beamforming for multiple-input multiple-output wireless systems,” IEEE Trans. Inform. Theory, vol. 49, pp. 2735–2747, Oct. 2003.
[8] B. Mondal and R. W. Heath, Jr., “An upper bound on SNR for limited feedback MIMO beamforming systems,” in Proc. of IEEE Information Theory Workshop, San Antonio, Texas, October 24-29 2004.
[9] R. F. H. Fischer, Precoding and Signal Shaping for Digital Transmission, Wiley-IEEE Press, July 2002.
[10] S. Catreux, V. Erceg, D. Gesbert, and R. W. Heath Jr., “Adaptive modulation and MIMO coding for broadband wireless data networks,” IEEE Comm. Mag., vol. 40, no. 6, pp. 108–115, June 2002.
[11] A. J. Goldsmith and S.-G. Chua, “Variable-rate variable-power MQAM for fading channels,” IEEE Trans. Commun., vol. 45, no. 10, pp. 1218–1230, Oct. 1997.
[12] J. N. Laneman and G. W. Wornell, “Exploiting distributed spatial diversity in wireless networks,” in Proc. of Allerton Conf. on Comm. Cont. and Comp., Monticello, IL, Oct. 2000.
[13] A. Sendonaris, E. Erkip, and B. Aazhang, “User cooperation diversity. Part I. System description,” IEEE Trans. Commun., vol. 51, no. 11, pp. 1927–1938, Nov. 2003.
[14] R. U. Nabar, H. Bolcskei, and F. W. Kneubuhler, “Fading relay channels: Performance limits and space-time signal design,” IEEE J. Select. Areas Commun., vol. 22, no. 6, pp. 1099–1109, August 2004.
[15] O. Munoz, J. Vidal, and A. Agustin, “Non-regenerative MIMO relaying with channel state information,” in Proc. Int. Conf. Acoust., Speech and Sig. Proc., March 2005, vol. 3, pp. 361–364.
[16] B. Rankov and A. Wittneben, “On the capacity of relay-assisted wireless MIMO channels,” in Proc. of IEEE Workshop on Signal Processing Advances in Wireless Comm., July 2004, pp. 323–327.
[17] X. Tang and Y. Hua, “Optimal waveform design for MIMO relaying,” in Proc. of IEEE Workshop on Signal Processing Advances in Wireless Comm., 2005.
[18] C. Windpassinger, R. F. H. Fischer, T. Vencel, and J. B. Huber, “Precoding in multiantenna and multiuser communications,” IEEE Trans. Wireless Commun., vol. 3, no. 4, pp. 1305–1316, July 2004.
[19] G. Caire and S. Shamai, “On the achievable throughput of a multiantenna Gaussian broadcast channel,” IEEE Trans. Inform. Theory, vol. 49, no. 7, pp. 1691–1706, July 2003.
[20] T. Tang, C. B. Chae, R. W. Heath, Jr., and S. Cho, “MIMO relaying with linear processing for multiuser transmission in fixed relay networks,” submitted to IEEE Trans. Signal Processing, Feb. 2006.
[21] Z. Tu and R. S. Blum, “Multiuser diversity for a dirty paper approach,” IEEE Commun. Lett., vol. 7, no. 8, pp. 370–372, Aug. 2003.
[22] G. Dimic and N. D. Sidiropoulos, “On downlink beamforming with greedy user selection: performance analysis and a simple new algorithm,” IEEE Trans. Signal Processing, vol. 53, no. 10, pp. 3857–3868, Oct. 2005.