IEEE SIGNAL PROCESSING LETTERS, VOL. 24, NO. 5, MAY 2017
Overlapped Subarray Based Hybrid Beamforming for Millimeter Wave Multiuser Massive MIMO Nuan Song, Member, IEEE, Tao Yang, and Huan Sun, Member, IEEE
Abstract—For massive multiple input multiple output systems at millimeter wave (mmWave) bands, we consider an efficient hybrid array architecture, namely, overlapped subarray (OSA), and develop a Unified Low Rank Sparse (ULoRaS) recovery algorithm for hybrid beamforming in downlink multiuser scenarios. The ULoRaS scheme takes advantage of the transmit–receive coordinated beamforming procedure to achieve large array gains. It has no dimensionality constraint and can be applied to the generalized OSA architecture, including both the fully connected and the widely discussed non-OSA cases. It is shown that the proposed ULoRaS algorithm for the novel OSA design is a good compromise of the performance and the required hardware complexity. Index Terms—Coordinated beamforming, hybrid beamforming, millimeter wave (mmWave), multiuser (MU) massive multiple input multiple output (MIMO), overlapped subarray (OSA), 5G.
I. INTRODUCTION
MASSIVE multiple input multiple output (MIMO) at millimeter wave (mmWave) bands has been considered as one of the most promising candidates for future 5G communications, since high degrees of freedom, such as array gain, multiplexing gain, the capability of interference reduction, etc., can be greatly exploited [1]–[3]. A key factor involved with achieving these advantages is the design of efficient and robust beamforming techniques. As the fully digital solutions are far too complicated for massive MIMO, hybrid beamforming becomes quite promising to handle the complexity [2], [4]–[6]. The essential concept of applying hybrid architectures is to implement cost-effective variable phase shifters in the radio frequency (RF) domain before sampling [7]–[9] and thus the reduced dimensional signal processing schemes can be carried out digitally in the baseband. Hybrid beamforming techniques for single-user MIMO have been extensively studied in the literature. Two hybrid array architectures are identified, namely fully connected [10]–[14] and subarray based [15] (only analog beamforming), [16], [17], where each RF chain is connected to all the antenna elements or to only a subset of antennas, respectively. The advantage of the subarray architecture lies in reducing the number of required
Manuscript received November 26, 2016; revised February 23, 2017; accepted February 27, 2017. Date of publication March 13, 2017; date of current version March 27, 2017. This work was supported by the National Science and Technology Major Projects under Grant 2015ZX03002002. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Yongwei Huang. (Corresponding author: Nuan Song.) The authors are with the Bell Labs China, Nokia Shanghai Bell, Shanghai 201206, China (e-mail:
[email protected];
[email protected];
[email protected]). Color versions of one or more of the figures in this letter are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/LSP.2017.2681689
phase shifters but at the cost of a degradation in array gain, as only a subarray with a smaller number of antennas is utilized for analog beamforming. Therefore, in order to achieve a higher array gain while maintaining a relatively low complexity, we propose an overlapped subarray (OSA) based architecture to implement hybrid beamforming, where a large antenna array is partitioned into multiple subarrays that are allowed to overlap. The OSA architecture was originally proposed in the application of phased radars [18], [19] to exploit a higher array gain and to overcome the problem of high grating lobes in the non-overlapped subarray (NOSA) case, i.e., the one considered as the existing subarray architecture. Moreover, it has a lower complexity than the fully connected case, because of a reduced number of required phase shifters for analog beamforming. In particular, the OSA architecture can be generalized to include both the fully connected and the NOSA cases.
Some works have studied multiuser (MU) scenarios for the fully connected architecture [9], [13], [14], [20]–[22] and for the NOSA case [17], [23]. Most works assume single-antenna users or analog-only combining for multiple antennas at users. Ni and Dong [21] proposed an equal gain transmission (EGT) based hybrid precoder in a more general MU-MIMO scenario but only for the fully connected array. To the best of our knowledge, until now there are no unified hybrid beamforming solutions proposed for MU massive MIMO systems that are applicable to any array architecture.
In this letter, for the proposed generalized array architecture, i.e., OSA, we develop a hybrid beamforming algorithm named Unified Low Rank Sparse recovery (ULoRaS) for downlink MU massive MIMO systems at mmWave frequencies, supporting multistream transmission for each user.
The ULoRaS algorithm is designed based on the transmit–receive coordinated procedure and a practical solution, namely, greedy truncated power (GTP) method, is also derived to maximize the total effective array gain. It has no dimensionality constraint and supports various digital baseband precoding schemes. The performance of the ULoRaS algorithm is examined for the generalized OSA architecture with different configurations and compared with existing solutions, in terms of the achievable sum rate and the number of required phase shifters.
II. SYSTEM MODEL AND PROPOSED ARRAY ARCHITECTURE
The block diagram of a downlink MU-MIMO system with the hybrid architecture is shown in Fig. 1. The base station (BS) is mounted with $M_T$ transmit antennas that connect to $N_T$ RF chains via different array architectures, where $N_T < M_T$ holds for the hybrid design. Similarly, for the $k$th user ($k = 1, \ldots, K$), only the fully connected hybrid design is considered. The number of receive antennas is denoted by $M_{R_k}$ and the corresponding number of RF chains satisfies
1070-9908 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications standards/publications/rights/index.html for more information.
$N_{R_k} < M_{R_k}$. Accordingly, we denote the total number of antennas for all users by $M_R = \sum_{k=1}^{K} M_{R_k}$ and that of the RF chains by $N_R = \sum_{k=1}^{K} N_{R_k}$. At user $k$, the received signal after combining is given by
$$\mathbf{y}_k = \mathbf{W}_{\mathrm{BB}}^{(k)H} \mathbf{W}_{\mathrm{RF}}^{(k)H} \mathbf{H}_k \mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}}^{(k)} \mathbf{s}_k + \mathbf{W}_{\mathrm{BB}}^{(k)H} \mathbf{W}_{\mathrm{RF}}^{(k)H} \sum_{i=1, i \neq k}^{K} \mathbf{H}_k \mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}}^{(i)} \mathbf{s}_i + \mathbf{W}_{\mathrm{BB}}^{(k)H} \mathbf{W}_{\mathrm{RF}}^{(k)H} \mathbf{n}_k, \tag{1}$$
where $\mathbf{s}_k$ is the symbol vector of $r_k$ data streams and the channel matrix is $\mathbf{H}_k \in \mathbb{C}^{M_{R_k} \times M_T}$. The matrices $\mathbf{F}_{\mathrm{RF}} \in \mathbb{C}^{M_T \times N_T}$ and $\mathbf{F}_{\mathrm{BB}} = [\mathbf{F}_{\mathrm{BB}}^{(1)}, \mathbf{F}_{\mathrm{BB}}^{(2)}, \ldots, \mathbf{F}_{\mathrm{BB}}^{(K)}] \in \mathbb{C}^{N_T \times r}$, with the total number of data streams $r = \sum_{k=1}^{K} r_k$, represent the analog RF precoder and the digital baseband precoder. The RF combiner and the baseband decoder for the $k$th user are denoted by $\mathbf{W}_{\mathrm{RF}}^{(k)} \in \mathbb{C}^{M_{R_k} \times N_{R_k}}$ and $\mathbf{W}_{\mathrm{BB}}^{(k)} \in \mathbb{C}^{N_{R_k} \times r_k}$, respectively. The vector $\mathbf{n}_k$ is the complex additive white Gaussian noise with variance $\sigma_n^2$.

Fig. 1. Hybrid architecture for MU massive MIMO systems.

Fig. 2 shows three array architectures for analog RF beamforming. In the fully connected case, in Fig. 2(a), each RF chain connects to all the antenna elements and the analog beamforming requires a total of $N_T M_T$ phase shifters. In the NOSA case, in Fig. 2(b), each RF chain only connects to a subset of antennas (one subarray) and these subarrays are contiguous. Accordingly, only $M_T$ phase shifters are required. The proposed OSA-based architecture is shown in Fig. 2(c), where each RF chain also connects to a subarray but these subarrays are allowed to overlap. As some antennas in the overlapping area may connect to more RF chains, we need $N_T M_S$ phase shifters, where $M_S$ denotes the number of phase shifters connected to one RF chain or the number of antennas per subarray for beamforming and $\Delta M_S$ is the inter-subarray distance/shift with respect to the number of antennas. A larger number of antennas connected to one RF chain results in a higher array gain but with an increased hardware complexity in terms of phase shifters. Thus, the OSA architecture is a good tradeoff between the performance and the costs.

Fig. 2. Examples of array architectures (a) fully connected with $M_S = M_T = 6$, (b) NOSA with $M_S = 3$, and (c) OSA with $M_S = 4$ and $\Delta M_S = 2$ for hybrid beamforming, where $M_T = 6$, $N_T = 2$. (a) Fully-Connected, (b) NOSA, (c) OSA.

For different array architectures, the structure of the analog precoder matrix $\mathbf{F}_{\mathrm{RF}}$ varies, which is defined as
$$\mathbf{F}_{\mathrm{RF}} = [\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_{N_T}] = \begin{bmatrix} \check{f}_1(1) & & & \\ \vdots & \check{f}_2(1) & & \\ \check{f}_1(M_S) & \vdots & \ddots & \\ & \check{f}_2(M_S) & & \check{f}_{N_T}(1) \\ & & & \vdots \\ & & & \check{f}_{N_T}(M_S) \end{bmatrix}, \tag{2}$$
where $\check{f}_n(i)$, $i = 1, \ldots, M_S$, $n = 1, \ldots, N_T$ are nonzero weights in the beamforming matrix $\mathbf{F}_{\mathrm{RF}}$. For the OSA design, $\mathbf{F}_{\mathrm{RF}}$ has a structure containing vectors with shifted zeros. Generally speaking, the OSA also includes the fully connected case if $\Delta M_S = 0$, $M_S = M_T$ and the NOSA case if $\Delta M_S = M_S = M_T / N_T$. As a result, $\mathbf{F}_{\mathrm{RF}}$ has no zeros or becomes block diagonal, respectively.
III. UNIFIED HYBRID BEAMFORMING DESIGN
In MU-MIMO with the hybrid architecture at both the BS and users, it is very challenging to solve the sum-rate optimization problem under the constraints on a total transmit power and constant modulus for an RF beamformer that employs variable analog phase shifters. A more efficient way is to decouple the beamforming design into two stages, i.e., the RF part and the baseband part, separately [20], [21], [23]. The analog precoder/combiner technique aims at maximizing the effective array gain, whereas the digital precoder/decoder processing tries to suppress the interference and exploit the spatial multiplexing gain.
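The subarray structure of F_RF in (2) and the phase-shifter counts quoted for Fig. 2 can be checked with a small sketch. This is only an illustration under stated assumptions: the helper names (`osa_index_sets`, `osa_mask`, `num_phase_shifters`) are hypothetical and not from the letter, antenna indices are 0-based, and the n-th subarray is taken to cover M_S consecutive antennas starting at (n − 1)ΔM_S.

```python
import numpy as np

def osa_index_sets(M_T, N_T, M_S, dM_S):
    """0-based antenna indices of each subarray (hypothetical helper).

    Subarray n (n = 0..N_T-1) covers antennas n*dM_S .. n*dM_S + M_S - 1.
    """
    if M_S + (N_T - 1) * dM_S > M_T:
        raise ValueError("subarrays do not fit on an array of M_T antennas")
    return [np.arange(n * dM_S, n * dM_S + M_S) for n in range(N_T)]

def osa_mask(M_T, N_T, M_S, dM_S):
    """Binary M_T x N_T matrix: 1 where F_RF may hold a nonzero weight."""
    mask = np.zeros((M_T, N_T), dtype=int)
    for n, idx in enumerate(osa_index_sets(M_T, N_T, M_S, dM_S)):
        mask[idx, n] = 1
    return mask

def num_phase_shifters(mask):
    """One phase shifter per nonzero entry of F_RF, i.e., N_T * M_S."""
    return int(mask.sum())

if __name__ == "__main__":
    # The three configurations of Fig. 2 (M_T = 6, N_T = 2).
    configs = {"fully connected": (6, 0), "NOSA": (3, 3), "OSA": (4, 2)}
    for name, (M_S, dM_S) in configs.items():
        print(name, num_phase_shifters(osa_mask(6, 2, M_S, dM_S)))
```

If one further assumes that the subarrays span the whole array, i.e., M_S = M_T − (N_T − 1)ΔM_S (consistent with the three cases of Fig. 2, but not stated explicitly in the text), then M_T = 128, N_T = 8, and ΔM_S = 4 give M_S = 100 and 800 phase shifters, about 78% of the 1024 needed by the fully connected array.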
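A. Transmit–Receive Coordinated RF Beamforming
For the user $k$, the effective array gain can be interpreted by $\check{\mathbf{H}}_k = \mathbf{W}_{\mathrm{RF}}^{(k)H} \mathbf{H}_k \mathbf{F}_{\mathrm{RF}}$, which is the gain of the effective channel proportional to the receive signal-to-noise ratio at the RF chain's level [24]. Therefore, to jointly design $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{W}_{\mathrm{RF}}^{(k)}$, $\forall k$, we first remove the constant modulus constraint and formulate the following problem to maximize the effective array gain as
$$\max_{\mathbf{F}_{\mathrm{RF}},\, \mathbf{W}_{\mathrm{RF}}^{(k)},\, \forall k} \; \sum_{k=1}^{K} \left\| \mathbf{W}_{\mathrm{RF}}^{(k)H} \mathbf{H}_k \mathbf{F}_{\mathrm{RF}} \right\|_F \quad \text{s.t.} \quad \mathbf{F}_{\mathrm{RF}}^{H} \mathbf{F}_{\mathrm{RF}} = \mathbf{I}_{N_T}, \;\; \mathbf{W}_{\mathrm{RF}}^{(k)H} \mathbf{W}_{\mathrm{RF}}^{(k)} = \mathbf{I}_{N_{R_k}}, \;\; \forall k, \tag{3}$$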
where $\|\cdot\|_F$ denotes the Frobenius norm. The orthogonality constraint helps to minimize the interference among RF chains. The existing solution to design the joint RF beamformers in [21] applies an EGT method to obtain $\mathbf{F}_{\mathrm{RF}}$ by feeding back the phases of the conjugate transpose of the composite channel matrices from all users. However, the EGT scheme has a dimensionality constraint of $N_T = N_R$, which may not hold at all times due to the varying number of scheduled multiple-antenna users. In the following, to solve the optimization problem in (3), we propose a coordinated RF beamforming algorithm that relaxes the constraint and can be generally applied to any array architecture even with $N_T < N_R$.
1) Generalized Low Rank Approximation of Matrices (GLRAM) Based Procedure: The RF precoder matrix $\mathbf{F}_{\mathrm{RF}}$ should be common to all the users. Inspired by [25], where an
iterative GLRAM algorithm is derived to solve a similar mathematical problem in image compression, we apply GLRAM to the RF beamforming design. According to linear algebra, (3) can be solved iteratively and for each iteration it is decoupled into
$$\max_{\mathbf{F}_{\mathrm{RF}}} \; \mathrm{trace}\Bigg( \mathbf{F}_{\mathrm{RF}}^{H} \underbrace{\Bigg( \sum_{k=1}^{K} \mathbf{H}_k^{H} \mathbf{W}_{\mathrm{RF}}^{(k)} \mathbf{W}_{\mathrm{RF}}^{(k)H} \mathbf{H}_k \Bigg)}_{\mathbf{P}} \mathbf{F}_{\mathrm{RF}} \Bigg) \quad \text{s.t.} \quad \mathbf{F}_{\mathrm{RF}}^{H} \mathbf{F}_{\mathrm{RF}} = \mathbf{I}_{N_T} \tag{4}$$
for given $\mathbf{W}_{\mathrm{RF}}^{(k)}$, $\forall k$, and
$$\max_{\mathbf{W}_{\mathrm{RF}}^{(k)},\, \forall k} \; \mathrm{trace}\Bigg( \sum_{k=1}^{K} \mathbf{W}_{\mathrm{RF}}^{(k)H} \underbrace{\mathbf{H}_k \mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{RF}}^{H} \mathbf{H}_k^{H}}_{\mathbf{Q}_k} \mathbf{W}_{\mathrm{RF}}^{(k)} \Bigg) \quad \text{s.t.} \quad \mathbf{W}_{\mathrm{RF}}^{(k)H} \mathbf{W}_{\mathrm{RF}}^{(k)} = \mathbf{I}_{N_{R_k}}, \;\; \forall k \tag{5}$$
for a given $\mathbf{F}_{\mathrm{RF}}$. In our case, GLRAM is used to iteratively update $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{W}_{\mathrm{RF}}^{(k)}$ so that the effective array gain is maximized. The GLRAM-based RF beamforming procedure is summarized as follows.
a) Initialize: $\mathbf{W}_{\mathrm{RF}}^{(k)} = [\mathbf{I}_{N_{R_k}}, \mathbf{0}_{N_{R_k}, M_{R_k} - N_{R_k}}]^T$, $\forall k = 1, \ldots, K$, the iteration index $i$, and the threshold $\epsilon \in \mathbb{R}$ (an arbitrarily small number).
b) Set $i = i + 1$ and compute $\mathbf{P}(i)$ by (4) and solve its optimization problem via eigenvalue decomposition (EVD) of $\mathbf{P}(i)$ and obtain $\mathbf{F}_{\mathrm{RF}}(i) = \mathbf{U}_P(:, 1 : N_T)$, which is a MATLAB notation, meaning the $N_T$ eigenvectors of $\mathbf{P}(i)$ corresponding to the $N_T$ largest eigenvalues.
c) For each user $k = 1, \ldots, K$, compute $\mathbf{Q}_k(i)$ by (5) and solve its optimization problem via EVD of $\mathbf{Q}_k(i)$ and obtain $\mathbf{W}_{\mathrm{RF}}^{(k)}(i) = \mathbf{U}_{Q_k}(:, 1 : N_{R_k})$, which means the $N_{R_k}$ eigenvectors of $\mathbf{Q}_k(i)$ corresponding to the $N_{R_k}$ largest eigenvalues.
d) Compute $\mathrm{abs}\big( \sum_{k=1}^{K} \|\check{\mathbf{H}}_k(i)\|_F - \sum_{k=1}^{K} \|\check{\mathbf{H}}_k(i-1)\|_F \big)$; if it is greater than $\epsilon$, go to Step b); otherwise, convergence is achieved.
During iterations, the RF precoder in Step b) at the BS and the RF combiners in Step c) at individual users are calculated based on their given counterparts until convergence. Since such an iterative procedure requires the coordination between the BS and users, we name it the coordinated RF beamforming. It has been shown in [25] that the GLRAM algorithm converges very fast, i.e., within four iterations.
2) Practical Solution for GLRAM-Based Procedure: In the above scheme, additional constraints, i.e., constant modulus of RF beamformers as well as the structured property of $\mathbf{F}_{\mathrm{RF}}$, should be taken into account for the practical design. Therefore, we develop an efficient algorithm, namely the GTP method, to obtain the RF precoder matrix $\mathbf{F}_{\mathrm{RF}}$ in Step b), as well as a simple solution for the RF combiners at users $\mathbf{W}_{\mathrm{RF}}^{(k)}$ in Step c).
RF Precoder in Step b): On the one hand, due to the structure of $\mathbf{F}_{\mathrm{RF}}$ in the OSA design, there are zeros in this matrix, which destroys the traditional eigenvalue problem (4) and accordingly EVD cannot be applied directly. On the other hand, we can consider that $\mathbf{F}_{\mathrm{RF}}$ is "sparse" and accordingly (4) turns out to be a sparse eigenvalue problem. Thus, we propose a novel greedy method, i.e., GTP, for a sequential recovery of $\mathbf{f}_n$, $n = 1, \ldots, N_T$ in $\mathbf{F}_{\mathrm{RF}}$, shown in Table I. At the initial step, $\mathbf{f}_1$ is obtained by solving
$$\max \; \mathbf{f}_1^{H} \mathbf{P} \mathbf{f}_1, \quad \text{s.t.} \quad \|\mathbf{f}_1\| = 1 \;\text{ and }\; f_1[j] = 0, \; j \in \bar{\mathcal{F}}_1, \tag{6}$$
where $\mathbf{f}_1$ contains zeros at the $j$th positions. The $n$th index set $\mathcal{F}_n$, $n = 1, \ldots, N_T$ denotes the range of $1 + (n-1)\Delta M_S : M_S + (n-1)\Delta M_S$ and its complement $\bar{\mathcal{F}}_n$ follows $\bar{\mathcal{F}}_n \cup \mathcal{F}_n = \mathcal{U}$, where $\mathcal{U}$ is a universal index set of $1 : M_T$. A truncated power method has been proposed and proven in [26] to solve such a sparse eigenvalue problem. At the $q$th iteration, we first update the full vector via $\bar{\mathbf{f}}_1^{(q)} = \mathbf{P} \bar{\mathbf{f}}_1^{(q-1)} / \|\mathbf{P} \bar{\mathbf{f}}_1^{(q-1)}\|_2$ and then truncate $\bar{\mathbf{f}}_1^{(q)}$ by restricting the elements indexed by $\bar{\mathcal{F}}_1$ to zeros. The mask vector $\mathbf{x}_n$ is used for truncation, defined as
$$x_n[j] = \begin{cases} 1, & j \in \mathcal{F}_n \\ 0, & j \in \bar{\mathcal{F}}_n. \end{cases} \tag{7}$$
In Table I, $\odot$ denotes the Hadamard product. The iteration on $\bar{\mathbf{f}}_1^{(q)}$ continues until convergence, where $\gamma \in \mathbb{R}$ is an arbitrarily small number. We then remove the contribution of $\bar{\mathbf{f}}_1^{(q)}$ from the matrix $\mathbf{P}$. $\mathbf{F}_{\mathrm{RF}}$ can be constructed by simply taking the phases of $\bar{\mathbf{f}}_n$ with the operation $\mathrm{angle}(\cdot)$ [16].

TABLE I
GTP ALGORITHM
Require: $\mathbf{F}_{\mathrm{RF}}$ = empty matrix, $\mathbf{x}_n$, $\mathbf{P}$
For $n = 1, \ldots, N_T$
  Initialize $q = 1$, $\bar{\mathbf{f}}_n^{(0)} = \mathbf{x}_n$
  Repeat
    $\bar{\mathbf{f}}_n^{(q)} = \mathbf{P} \bar{\mathbf{f}}_n^{(q-1)} / \|\mathbf{P} \bar{\mathbf{f}}_n^{(q-1)}\|_2$
    $\bar{\mathbf{f}}_n^{(q)} = \bar{\mathbf{f}}_n^{(q)} \odot \mathbf{x}_n$
  Until $\|\bar{\mathbf{f}}_n^{(q)} - \bar{\mathbf{f}}_n^{(q-1)}\|_2 < \gamma$, $q = q + 1$
  $\mathbf{f}_n^{(q)} = e^{j \cdot \mathrm{angle}(\bar{\mathbf{f}}_n^{(q)})} \odot \mathbf{x}_n$
  Return $\mathbf{F}_{\mathrm{RF}} = [\mathbf{F}_{\mathrm{RF}} \,|\, \mathbf{f}_n^{(q)}]$
  Update $\mathbf{P} = (\mathbf{I}_{M_T} - \alpha \bar{\mathbf{f}}_n^{(q)} \bar{\mathbf{f}}_n^{(q)H}) \cdot \mathbf{P}$, $\alpha = 1 / \|\bar{\mathbf{f}}_n^{(q)}\|_2^2$
End

RF Combiner in Step c): To avoid the computationally expensive calculation of EVD, a low-complexity implementation is to use the standard power method for obtaining the eigenvectors $\mathbf{U}_{Q_k}(:, 1 : N_{R_k})$ [27]. Under the constant modulus constraint, similarly, we propose a heuristic solution to take phases, i.e.,
$$\mathbf{W}_{\mathrm{RF}}^{(k)} = e^{j \cdot \mathrm{angle}(\mathbf{U}_{Q_k}(:, 1 : N_{R_k}))}. \tag{8}$$
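The coordinated procedure of Steps a) to c), with GTP (Table I) in Step b) and the phase extraction of (8) in Step c), can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names (`subarray_mask`, `gtp`, `coordinated_rf`) are hypothetical, the convergence test of Step d) is replaced by a fixed number of outer iterations, an iteration cap `max_iter` is added to the inner loop as a safeguard, and `numpy.linalg.eigh` is used in Step c) where the text suggests a power method.

```python
import numpy as np

def subarray_mask(M_T, n, M_S, dM_S):
    """Mask x_n of (7) for the n-th subarray (n is 0-based here)."""
    x = np.zeros(M_T)
    x[n * dM_S : n * dM_S + M_S] = 1.0
    return x

def gtp(P, masks, gamma=1e-6, max_iter=500):
    """Greedy truncated power sketch following Table I.

    max_iter is an added safeguard, not part of Table I.
    """
    P = np.array(P, dtype=complex)          # working copy, deflated below
    M_T = P.shape[0]
    columns = []
    for x in masks:
        f_bar = x.astype(complex)           # initialize with the mask x_n
        for _ in range(max_iter):
            g = P @ f_bar
            norm = np.linalg.norm(g)
            if norm == 0.0:
                break
            g = (g / norm) * x              # power step, then truncation
            converged = np.linalg.norm(g - f_bar) < gamma
            f_bar = g
            if converged:
                break
        # Constant-modulus column: keep phases only, zeros outside subarray.
        columns.append(np.exp(1j * np.angle(f_bar)) * x)
        # Remove the contribution of f_bar from P (alpha = 1 / ||f_bar||^2).
        energy = np.vdot(f_bar, f_bar).real
        if energy > 0.0:
            P = (np.eye(M_T) - np.outer(f_bar, f_bar.conj()) / energy) @ P
    return np.stack(columns, axis=1)

def coordinated_rf(H_list, masks, N_Rk, n_iter=4):
    """Alternate Step b) and Step c) for a fixed number of iterations."""
    # Step a): W_RF^(k) = [I, 0]^T, of size M_Rk x N_Rk.
    W = [np.eye(H.shape[0], N_Rk, dtype=complex) for H in H_list]
    F = None
    for _ in range(n_iter):
        # Step b): build P as in (4) and recover F_RF column by column.
        P = sum(H.conj().T @ Wk @ Wk.conj().T @ H for H, Wk in zip(H_list, W))
        F = gtp(P, masks)
        # Step c): dominant eigenvectors of Q_k in (5), then phases as in (8).
        W = []
        for H in H_list:
            Q = H @ F @ F.conj().T @ H.conj().T
            _, U = np.linalg.eigh(Q)        # eigenvalues in ascending order
            W.append(np.exp(1j * np.angle(U[:, -N_Rk:])))
    return F, W
```

With a mask of all ones (the fully connected case) the truncation has no effect and the inner loop reduces to a plain power iteration, in line with the remark on the unified solution in Section III-C.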
B. Digital Baseband Beamforming
After the RF beamforming matrices $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{W}_{\mathrm{RF}}^{(k)}$, $\forall k$, are obtained, a digital block diagonalization (BD) precoding scheme operating in a reduced dimension in the baseband is applied to maximize the achievable sum rate under a transmit power constraint. In our hybrid case, the BD technique is used to design the precoding matrix $\mathbf{F}_{\mathrm{BB}}$ based on the effective channel matrix $\check{\mathbf{H}}_k$. Since our hybrid beamforming design is decoupled into analog and digital stages, other digital baseband precoding schemes can also be applied.
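As an illustration of the baseband stage, the following sketch applies a standard null-space BD construction (in the spirit of [28]) to the effective channels of size N_Rk × N_T. It is a sketch under stated assumptions, not the authors' code: the function name `bd_precoder` is hypothetical, at least two users are assumed, every user carries the same number of streams r, and power allocation (e.g., water-filling) and the normalization of the overall transmit power are omitted.

```python
import numpy as np

def bd_precoder(H_eff, r):
    """Block diagonalization over effective channels (hypothetical helper).

    H_eff : list of K >= 2 arrays, each N_Rk x N_T (effective channel of user k)
    r     : number of data streams per user
    Returns F_BB of size N_T x (K * r) with H_eff[j] @ F_k = 0 for j != k.
    """
    blocks = []
    for k, H_k in enumerate(H_eff):
        # Stack the effective channels of all other users.
        H_others = np.vstack([H for i, H in enumerate(H_eff) if i != k])
        # Right singular vectors beyond the rank span the null space.
        _, s, Vh = np.linalg.svd(H_others)
        rank = int(np.sum(s > 1e-10 * s[0]))
        V0 = Vh[rank:].conj().T                 # N_T x (N_T - rank)
        if V0.shape[1] < r:
            raise ValueError("null space too small for r streams")
        # Within the null space, pick the r strongest directions of user k.
        _, _, Vh1 = np.linalg.svd(H_k @ V0)
        blocks.append(V0 @ Vh1[:r].conj().T)    # N_T x r, orthonormal columns
    return np.hstack(blocks)
```

For the simulation setup used later (N_T = 8, K = 4, N_Rk = 2), the other users occupy six dimensions, which leaves a two-dimensional null space per user, just enough for r_k = 2 streams.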
Fig. 3. Throughput performance of the proposed ULoRaS-BD algorithm for various cases (rk = 2). Fig. 4. Performance versus complexity in terms of percentage normalized by those of the fully connected case.
C. Remarks
1) Implementation: For the GLRAM-based procedure in Section III-A1, Step b) is crucial for the implementation of the coordinated hybrid beamforming. If time division duplex is considered, according to the reciprocity, $\mathbf{P}$ can be estimated in the uplink from reference signals sent by users, who apply $\mathbf{W}_{\mathrm{RF}}^{(k)*}$ as transmit beamformers. If frequency division duplex is assumed, the explicit feedback of the composite beamformed channel state information to represent $\mathbf{W}_{\mathrm{RF}}^{(k)H} \mathbf{H}_k$ or the implicit feedback in terms of the indexes of beamforming matrices is required.
2) Unified Solution: Even though the proposed coordinated RF beamforming is developed for the OSA architecture, it can also be applied to the NOSA and fully connected cases without any modifications. Particularly for the latter, the GTP algorithm in Table I becomes the standard power method to efficiently calculate the EVD of a matrix in Step b). Therefore, our ULoRaS algorithm fits different array architectures and supports various baseband precoding schemes for the hybrid beamforming design.
IV. SIMULATION RESULTS
In this section, we evaluate the performance of the proposed ULoRaS algorithm for MU-MIMO systems with three array architectures using Monte Carlo simulations. The mmWave channel is generated by the widely used clustered model [10] with eight clusters and ten rays per cluster, where the channel gain follows the complex normal distribution and the angles of arrival/departure are assumed to follow the uniform distribution within $[-\pi, \pi]$ and $[-\pi/6, \pi/6]$. Uniform linear arrays with a half wavelength for the interelement distance are considered at both the BS and users. It is assumed that the BS has $M_T = 128$ antennas and $N_T = 8$ RF chains. There are $K = 4$ users in total, where each user is mounted with $M_{R_k} = 8$ antennas connecting to $N_{R_k} = 2$ RF chains.
A. Multiuser Multiple Input Multiple Output
As a reference performance, we consider the "Hybrid EGT-BD" scheme proposed for the fully connected case in [21] as well as the fully digital BD precoding technique [28]. Fig. 3
shows the throughput performance of the ULoRaS algorithm in the cases of fully connected, NOSA, as well as OSA with various overlapping configurations, where two data streams are considered for each user. It can be observed that for the fully connected architecture, the performance of our proposed ULoRaS-BD algorithm approaches that of the fully digital BD technique. Moreover, the ULoRaS-BD scheme in the case of OSA with $\Delta M_S = 4$ performs similarly to the "Hybrid EGT-BD" scheme in the fully connected case, but has a complexity reduction of more than 20%. With a fixed $N_T$ and an increased inter-subarray distance $\Delta M_S$, the performance of ULoRaS degrades due to the reduced number of antennas per subarray and accordingly a decreasing array gain.
B. Performance versus Complexity
Furthermore, the throughput performance versus the complexity in terms of the number of required RF phase shifters is assessed in various configurations. Fig. 4 shows this behavior for the transmission of $r_k = 1$ and $r_k = 2$ data streams per user. The performance and the complexity depicted are calculated in the form of percentage, normalized by those of the fully connected case. We can observe that more than 90% of the performance of the fully connected case can be achieved with a complexity of below 50% for the single-stream (per user) transmission, and around 85% of the performance with approximately 50% complexity for the multistream transmission $r_k = 2$. Therefore, it is preferable to design the OSA architecture with an inter-subarray distance of $M_T / N_T / 4 \le \Delta M_S \le M_T / N_T / 2$, which exhibits a good tradeoff between the performance and the required hardware complexity.
V. CONCLUSION
In this letter, for hybrid beamforming in mmWave MU massive MIMO systems, we propose a generalized OSA architecture and develop a unified solution (i.e., ULoRaS) that can be applied to any array configuration, including the fully connected and the NOSA cases. We show that the ULoRaS algorithm outperforms the existing EGT scheme for the fully connected array. By using OSA, it is able to maintain a relatively good performance with a complexity reduction of around 50%.
REFERENCES
[1] F. Rusek et al., “Scaling up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40–60, Jan. 2013.
[2] W. Roh et al., “Millimeter-wave beamforming as an enabling technology for 5G cellular communications: Theoretical feasibility and prototype results,” IEEE Commun. Mag., vol. 52, no. 2, pp. 106–113, Feb. 2014.
[3] E. Björnson, E. G. Larsson, and T. L. Marzetta, “Massive MIMO: Ten myths and one critical question,” IEEE Commun. Mag., vol. 54, no. 2, pp. 114–123, Feb. 2016.
[4] S. Sun, T. S. Rappaport, R. W. Heath, A. Nix, and S. Rangan, “MIMO for millimeter-wave wireless communications: Beamforming, spatial multiplexing, or both?” IEEE Commun. Mag., vol. 52, no. 12, pp. 110–121, Dec. 2014.
[5] R. W. Heath, N. Gonzalez-Prelcic, S. Rangan, W. Roh, and A. Sayeed, “An overview of signal processing techniques for millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 436–453, Feb. 2016.
[6] S. Han, I. Chih-Lin, Z. Xu, and C. Rowell, “Large-scale antenna systems with hybrid analog and digital beamforming for millimeter wave 5G,” IEEE Commun. Mag., vol. 53, no. 1, pp. 186–194, Jan. 2015.
[7] X. Zhang, A. F. Molisch, and S.-Y. Kung, “Variable-phase-shift-based RF-baseband codesign for MIMO antenna selection,” IEEE Trans. Signal Process., vol. 53, no. 11, pp. 4091–4103, Nov. 2005.
[8] V. Venkateswaran and A.-J. van der Veen, “Analog beamforming in MIMO communications with phase shift networks and online channel estimation,” IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4131–4143, Aug. 2010.
[9] A. Liu and V. K. Lau, “Phase only RF precoding for massive MIMO systems with limited RF chains,” IEEE Trans. Signal Process., vol. 62, no. 17, pp. 4505–4515, Sep. 2014.
[10] O. E. Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Trans. Wirel. Commun., vol. 13, no. 3, pp. 1499–1513, Mar. 2014.
[11] A. Alkhateeb, O. E. Ayach, G. Leus, and R. W. Heath, “Channel estimation and hybrid precoding for millimeter wave cellular systems,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 5, pp. 831–846, Oct. 2014.
[12] A. Alkhateeb, O. E. Ayach, G. Leus, and R. W. Heath, “Hybrid precoding for millimeter wave cellular systems with partial channel knowledge,” in Proc. Inf. Theory Appl. Workshop, 2013, pp. 1–5.
[13] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming design for large-scale antenna arrays,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 501–513, Apr. 2016.
[14] A. Sayeed and J. Brady, “Beamspace MIMO for high-dimensional multiuser communication at millimeter-wave frequencies,” in Proc. IEEE Glob. Commun. Conf., 2013, pp. 3679–3684.
[15] O. E. Ayach, R. W. Heath, S. Rajagopal, and Z. Pi, “Multimode precoding in millimeter wave MIMO transmitters with multiple antenna sub-arrays,” in Proc. IEEE Glob. Commun. Conf., 2013, pp. 3476–3480.
[16] X. Gao, L. Dai, S. Han, I. Chih-Lin, and R. W. Heath, “Energy-efficient hybrid analog and digital precoding for mmWave MIMO systems with large antenna arrays,” IEEE J. Sel. Areas Commun., vol. 34, no. 4, pp. 998–1009, Apr. 2016.
[17] X. Zhu, Z. Wang, L. Dai, and Q. Wang, “Adaptive hybrid precoding for multiuser massive MIMO,” IEEE Commun. Lett., vol. 20, no. 4, pp. 776–779, Apr. 2016.
[18] J. S. Herd, S. M. Duffy, and H. Steyskal, “Design considerations and results for an overlapped subarray radar antenna,” in Proc. IEEE Aerosp. Conf., 2005, pp. 1087–1092.
[19] A. Hassanien and S. A. Vorobyov, “Phased-MIMO radar: A tradeoff between phased-array and MIMO radars,” IEEE Trans. Signal Process., vol. 58, no. 6, pp. 3137–3151, Jun. 2010.
[20] A. Alkhateeb, G. Leus, and R. W. Heath, “Limited feedback hybrid precoding for multi-user millimeter wave systems,” IEEE Trans. Wirel. Commun., vol. 14, no. 11, pp. 6481–6494, Nov. 2015.
[21] W. Ni and X. Dong, “Hybrid block diagonalization for massive multiuser MIMO systems,” IEEE Trans. Commun., vol. 64, no. 1, pp. 201–211, Jan. 2016.
[22] A. Adhikary, J. Nam, J.-Y. Ahn, and G. Caire, “Joint spatial division and multiplexing—The large-scale array regime,” IEEE Trans. Inf. Theory, vol. 59, no. 10, pp. 6441–6463, Oct. 2013.
[23] R. A. Stirling-Gallacher and M. S. Rahman, “Linear MU-MIMO precoding algorithms for a millimeter wave communication system using hybrid beam-forming,” in Proc. IEEE Int. Conf. Commun., 2014, pp. 5449–5454.
[24] D. J. Love and R. W. Heath, “Equal gain transmission in multiple-input multiple-output wireless systems,” IEEE Trans. Commun., vol. 51, no. 7, pp. 1102–1110, Jul. 2003.
[25] J. Ye, “Generalized low rank approximations of matrices,” Mach. Learn., vol. 61, nos. 1–3, pp. 167–191, 2005.
[26] X.-T. Yuan and T. Zhang, “Truncated power method for sparse eigenvalue problems,” J. Mach. Learn. Res., vol. 14, no. 1, pp. 899–925, 2013.
[27] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore, MD, USA: The Johns Hopkins Univ. Press, 1996.
[28] Q. H. Spencer, A. L. Swindlehurst, and M. Haardt, “Zero-forcing methods for downlink spatial multiplexing in multiuser MIMO channels,” IEEE Trans. Signal Process., vol. 52, no. 2, pp. 461–471, Feb. 2004.