Available online at www.sciencedirect.com
ScienceDirect Procedia Engineering 64 (2013) 84 – 93
International Conference On DESIGN AND MANUFACTURING, IConDM 2013
Hardware Implementation of Adaptive SVD Beamforming algorithm for MIMO-OFDM Systems Sellathambi. Da*, Srinivasan. Jb, Rajaram. Sc a
PG student,ME Communication Systems, Thiagarajar College of Engineering, Madurai, India b PG student,ME Wireless Technologies, Thiagarajar College of Engineering, Madurai, India c Associate Professor,Thiagarajar College of Engineering, Madurai, India
Abstract This paper presents an adaptive hardware design for computing Singular Value Decomposition (SVD) of the radio communication channel characteristic matrix. The hardware developed is suitable for computing the SVD of a maximum of 4 ×4 real-value matrices used in MIMO-OFDM standards, such as the IEEE 802.11n. The information of the right singular-vector matrix can be fed back to the transmitter for beamforming to improve the error performance when facing the channel matrix with rank deficiency problem. The time for the SVD of one complex matrix is limited to about 400 ns. When the channels have short coherent time, the information derived by SVD should be sent from the receiver to the transmitter as soon as possible to keep the beamforming performance. The algorithms to decompose the channel matrix were implemented using the Virtex-VI xc6vlx75t-3ff484 FPGA from Xilinx as the target device with minimum path delay. The implementation concentrates on utilizing the features of the FPGA to speed up operations and reduce the area required. © Authors. Published by Elsevier ©2013 2013The The Authors. Published by Ltd. Elsevier Ltd. Open access under CC BY-NC-ND license. Selection and peer-review under responsibility of the organizing and review committee of IConDM 2013.
Selection and peer-review under responsibility of the organizing and review committee of IConDM 2013
Keywords: Beamforming; multiple-input multiple-output (MIMO)-orthogonal frequency division multiplexing (OFDM); singular value decomposition (SVD)
1. Introduction Increasing demand for high-performance 4G broadband wireless is enabled by the use of multiple antennas at
* Corresponding author. Tel.: +91-967 787 0854; E-mail address:
[email protected]
1877-7058 © 2013 The Authors. Published by Elsevier Ltd. Open access under CC BY-NC-ND license. Selection and peer-review under responsibility of the organizing and review committee of IConDM 2013 doi:10.1016/j.proeng.2013.09.079
Comment [S1]: Elsevier to u and page numbers.
D. Sellathambi et al. / Procedia Engineering 64 (2013) 84 – 93
both base station and subscriber ends. Multiple antenna technologies enable high capacities suited for Internet and multimedia services, and also dramatically increase range and reliability. Multiple antennas at the transmitter and receiver provide diversity in a fading environment. By employing multiple antennas, multiple spatial channels are created, and it is unlikely all the channels will fade simultaneously. At the receiver, multiple antennas are used to separate spatial multiplexing streams and for interference mitigation, which makes aggressive frequency reuse a reality.
Fig. 1. Block diagram for SVD in MIMO-OFDM
Transmitter precoding for multiple-input multiple-output orthogonal frequency division multiplexing (MIMOOFDM) is an effective way of leveraging the diversity gains afforded by a multiple transmit-multiple receive antenna system in a frequency selective environment. In point-to-point multiple-input multiple-output (MIMO) systems, a transmitter equipped with multiple antennas communicates with a receiver that has multiple antennas. Most classic precoding results assume narrowband, slowly fading channels, meaning that the channel for a certain period of time can be described by a single channel matrix which does not change faster. In practice, such channels can be achieved, for example, through OFDM. The precoding strategy that maximizes the throughput, called channel capacity, depends on the channel state information available in the system. If the channel matrix is completely known, singular value decomposition (SVD) precoding is known to achieve the MIMO channel capacity. In this approach, the channel matrix is diagonalized by taking an SVD and removing the two unitary matrices through pre- and post-multiplication at the transmitter and receiver, respectively. Then, one data stream per singular value can be transmitted (with appropriate power loading) without creating any interference whatsoever. 1.1. Literature Review In the last few years, numerous algorithms have been developed for precoding. SVD based algorithms are discussed below. In this [6] paper tells a hardware-efficient VLSI architecture for steering matrix computation using a hardware optimized Givens rotation SVD algorithm. It utilizes bidiagonalization, diagonalization, and Givens rotation to achieve high processing throughput. The resulting VLSI implementations with 0.18um technology requires 3.3us to complete the SVD of one complex matrix, which is still more than 8 times the critical requirement (i.e., 400 ns) in IEEE 802.11n systems. This [7] paper based on adaptive SVD beamforming algorithm with perturbation theory. Nevertheless, the computational cost is also high. The algorithm with iterative division will apparently cause performance degradation in practical hardware implementations with severe quantization effect. In this paper [8] first-order perturbation, updates can be computed recursively, resulting in a highly efficient algorithm that has lower complexity than the earlier least-mean square (LMS)-based algorithm. In this [9] paper, a systolic algorithm for the SVD of arbitrary complex matrices based on the cyclic Jacobi method with parallel ordering is presented. A novel two-step, two-sided unitary transformation scheme, tailored to the use of coordinate
85
86
D. Sellathambi et al. / Procedia Engineering 64 (2013) 84 – 93
rotation digital computer algorithms for high-speed arithmetic, is employed to diagonalize a complex 2×2 matrix. An expandable array of O(n2) complex 2×2 matrix processors computes the SVD of an n×n matrix in O(n log n ) time. In this [10] paper, they described a minimum- area matrix decomposition architecture that is programmable to perform QRD and SVD with variable precision. In this [12] paper, an SVD processor system is presented in which each processing element is implemented using a simple CORDIC unit. The internal recursive loop within the CORDIC module is exploited, with pipelining being used to multiplex the two independent microrotations onto a single CORDIC processor. A matrix decomposition architecture was proposed in [21] according to the GolubKahan SVD (GK-SVD) algorithm [22]. It achieves higher processing throughput than [19] with lower hardware cost. Based on the matrix decomposition architecture in [21], a hardwarein [23] by modifying the GK-SVD algorithm and using a high-speed Givens rotation design. In hardware implementations, all the elements will be represented in finite precision. The orthogonal property among singular vectors, column vectors of U and V, may be corrupted and induce the interferences among transmitted substreams. The destruction of orthogonal property among singular vectors caused by quantization error may not be prevented. However, we can use orthogonality reconstruction for fixed point operation to operation to preserve the most orthogonality. The destruction of orthogonal property among singular vectors caused by quantization error may not be prevented. However, we can use orthogonality restoration for fixed point operation to eliminate the destruction caused by deflation stage and improve the performance. For the architecture of orthogonality restoration, Modified Gram-Schmidt is utilized. (1) The storages of left singular vectors before or after Gram Schmidt operation is shown in Fig.1. With dedicate task arrangement, the storages of the right singular vectors can be outputted for Gram Schmidt operation or Hvi computation. The results of right singular vectors can be stored after Gram Schmidt operation. 2. System model 2.1. Block Diagram
Fig. 2. Proposed SVD model.
87
D. Sellathambi et al. / Procedia Engineering 64 (2013) 84 – 93
SVD engine is depicted in Fig.2. It consists of six
2.2. In a MIMO system, assume that the maximum number of transmitter and receiver antennas is MR and MT, respectively. This means that we have possibly MR T different sizes of channel matrices (i.e., 1×1, 1×2,..., MR ×MT for Square and Nonsquare Channel Matrix: The maximum size of channel matrix is M R ×MT in a MIMO system. Hence, it is intuitive to design an SVD engine to support the maximum channel size. For the smaller channel matrix, we can extend it to the maximum-size channel matrix by inserting zeros. If the size of a given matrix is NR × NT, the extended channel matrix is (2) After extending the original channel matrix by inserting zeros, the SVD operation of the original channel is exactly the same as that of the maximum-size channel matrix. The extended channel shown in the referenced algorithms. Note that the value of d in Algorithm depends on the size of the original channel matrix. Therefore, d is still equal to min(NR, NT ).Fig.2shows the block diagram of zero padding unit. A given channel matrix H N R×NT is extended to H MR×MT by inserting zeros, and the multiplexer is used to construct the positive semiR1 based on (4). We also apply the zero padding scheme to singular calculation unit and partial update unit. According to [7], Fig. 2 illustrates the architecture of singular calculation unit. Three multiplexers is used to NT and NR < NT. We employ [9] to realize partial update unit as shown in Fig. 7. consider two cases of NR Schmidt Scheme for Nonsquare Channel Matrix: In [15], we apply the Gram-Schmidt R > NT .Due to the fact that the entries of a channel matrix, as well as the entries of its singular vectors, are always complexkth entry being 1. With this setting, we can rewrite [15] 3. Design of the Algorithm In Algorithm 1, the positive semide nite matrix R1 is estimated by a moving average of the recent received signal vectors. In many MIMO OFDM-based standards, the channel matrix H is already known by channel estimation. Therefore, we can utilize the information to evaluate accurate R1 NT R1 = HHH, NR HHH, NR < NT . (3) sequentially.
88
D. Sellathambi et al. / Procedia Engineering 64 (2013) 84 – 93
Given, H , N R , N T
R1
H H H, NR
NT
H
NT
HH , N R
d
min( N R , N T )
Step1) Update and Deflation for i 1 : (d - 1) Initialize w i (0) adjacent subcarrier ' swi ( ) i
( 0)
w i ( 0) H w i ( 0)
i
( 0)
a/
i
( 0)
update with i th pair w i (n 1)
w i ( n)
i
( n)( Ri
( n) I ) w i ( n)
i
apply deflation Ri
1
w i (n 1) w i (n 1) H
Ri
end Step 2 Derivation of ui , vi , if N R i
i
, i 1,2,..(d 1)
NT i
, vi
wi
, ui
Hvi i
i
else i
i
, ui
wi
, vi
Hui i
i
end Step 3 Partial Update for u d , vd , if N R d
d
NT tr ( Rd ) , vd
Rd (:,1)
Rd (:,1)
, ud
Hvd d
else d
tr ( Rd ) , u d
Rd (:,1)
Rd (:,1)
, vd
Hud d
89
D. Sellathambi et al. / Procedia Engineering 64 (2013) 84 – 93
Step 4 Gram if N R N T fork wd
Schmidt for remaining sin gular vectors
1: ( N R ek
k
NT )
d k 1
ek , u i
ui
i 1
ud
wd
k
else fork wd
k
wd
1: ( N R
NT )
d k 1
ek
k
k
ek , vi
vi
i 1
vd
k
wd
k
wd
k
end - end
4. Result and Discussion 4.1. Synthesis Report Table 1 Device properities Project File:
xil.xise
Module Name:
AdaptiveSVD
Target Device:
xc6vlx75t-3ff484
Product Version:
ISE 14.1
Design Goal:
Balanced
Design Strategy:
Xilinx Default (unlocked)
Environment:
System Settings
Table 2 Device utilization Logic Utilization
Used
Available
Utilization
Number of occupied Slices Distribution
3,149
17,280
Number of bonded IOBs
572
680
84%
Number of DSP48Es
64
64
100%
Number of Utilization
9,300
69,120
13%
18%
90
D. Sellathambi et al. / Procedia Engineering 64 (2013) 84 – 93
4.2. Simulation Results
Fig. 3. Simulation Output of Zero Padding Unit
Fig. 4. Simulation Output of Partial Update Unit;
D. Sellathambi et al. / Procedia Engineering 64 (2013) 84 – 93
Fig. 5. Simulation Output of Gram Schmidt Unit;
Fig. 6. Simulation Output of Adaptive svd;
91
92
D. Sellathambi et al. / Procedia Engineering 64 (2013) 84 – 93
4.3. Comparison Result Table 3 Comparison table JSSC
ACSSC
ISCAS
This Work
D. Markovic et al
C.Studer et al
C. Senning et al
Support Antenna Configuration
4x4
4x4
4x4
1x1, 1x2, 1x3,1x4, 2x1, 2x2, 2x3,2x4, 3x1, 3x2, 3x3,3x4, 4x1, 4x2, 4x3,4x4
SVD
U,S
U,S,V
V,S
U,S,V
Throughput
50K
86.4K
303K
479K
Frequency
100 MHZ
133 MHZ
149 MHZ
600 MHZ
Power Efficiency
1.47
3.5
N\A
3.83
Technology
90 nm
180 nm
180 nm
40 nm
5. Conclusion and Future work Thus the above architecture reduces the overhead of hardware and computational complexity of Beamforming which is the highest computational complexity module of 3GPP MIMO-OFDM standards. The demand of highthroughput wireless transmissions, such as IEEE 802.11n WLAN systems and IEEE 802.16e WiMAX systems, continues to grow. Our design can achieve 9.8 M channel-matrices/s. Eigen values are derived iteratively so the beamforming is performed in real time and method we employed can support all MIMO system. In successive matrix processing, the equivalent processing time required for each matrix can even be reduced to 90ns. By reduction of complexity of the highest computational module the Beamforming and Equalization will become efficient. In the critical case, we have only 46 to derive the precoding matrices of 128 subchannels. In other words, the time for the SVD of one complex matrix is limited to about 400 ns. When the channels have short coherent time, the information derived by SVD should be sent from the receiver to the transmitter as soon as possible to keep the beamforming performance. The decomposing time and accuracy will therefore greatly affect the beamforming performance. These design strategies enable the use of SVD to be effectively applied to the high-throughput wireless communication applications. In future this work can be extended to deal with different transmit and receive antenna sets such as 8x8 and 16x16. References [1] [2] [3] [4] [5]
Alamouti 1451 1458, Oct. 1998. -division-duplex (FDD) Proc. IEEE 43rd Veh. Technol. Conf., May 1993, pp. 508 511. IEEE Commun. Lett.,
Sampath.H and Paulraj.A vol. 6, no. 6, June 2002, pp. 239 41. IEEE Trans. Inf. Theory,vol.51,8,pp. Love.D.J and Heath.R.W 2967 2976, Aug. 2005. Golub.G.H and Van Loan.C.F, Matrix Computations. Baltimore,MD: The Johns Hopkins Univ., 1996.
D. Sellathambi et al. / Procedia Engineering 64 (2013) 84 – 93 [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27]
Foschini.G.J and Gans.M.J Wireless Pers.commun., vol. 6, no. 3, pp. 311 335, Mar. 1998. IEEE Trans. Signal Process., pp. 615 622, Feb. 2008. Willink.T.J fourthgeneration MIMO-OFDM: Broadband wireless system: Design, Sampath.H, Talwar.S, Tellado.T, Erceg.V, and Paulraj.A IEEE Commun. Mag., vol. 40, no. 9,p. 143 149, Sep. 2002. Proc. IEEE Int. Symp. Circuits Syst., May 1992, Hemkumar.N.D and Cavallaro.J.R vol. 3, pp. 1061 1064. Deprettere.F, SVD and Signal Processing: Algorithms, Analysis and Applications. Amsterdam, The Netherlands: Elsevier, 1988. Cheng Zhou Zhan, Yen-Liang Chen Iterative Superlinear-Convergence SVD Beamforming algorithm and VLSI Architecture for MIMO-OFDM Systems IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 6, JUNE 2012 Liu, Z, Dickson, K., McCanny A floatinggurable Adaptive Singular Value Decomposition Engine Design for High-Throughput MIMO-OFDM Yen-Liang Chen IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS ,2012. IEEE Trans. Commun., vol. 46, pp. 357 366, Mar. Raleigh.G.G 1998. David J. Love, -Input MultipleIEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 51, NO. 7, JULY 2003. Goldsmith.A, Jafar.S.A, Jindal.N, and Vishwanath.S, Capacity limits of MIMO channels IEEE J. Sel. Areas Commun., vol. 21, no 5, pp. 684 702, Jun. 2003. Specification of IEEE 802.11n physical layer [Online]. Available: http://www.ieee802.org/11/ Markovic.D, Nikolic.B, and State Circuits, vol. 42, no. 4, pp. 922 934, Apr. 2007V. Design and implementation trade1990. -efficient steering matrix computation architecture for MIMO Senning.C Proc. IEEE Int. Symp. Circuits Syst., May 2008, pp. 304 307. communication syst Erceg et al. IEEE802.16.3c-01/29r4; http://www. ieee802.org/16 /tg3/contrib/ 802163c-01_29r4.pdf -OFDM for wireless communications: Signal detection with enhanced channel 1477, Sep.2002. FPGA Implementation of Precoding using Low Complexity SVD for MIMO-OFDM Systems Chiueh.T and P. Y. Tsai, OFDM Baseband Receiver Design for Wireless Communications. New York: Wiley, 2007. Reconfigurable Adaptive Singular Value Decomposition Engine Design for HighYen-Liang Chen, Cheng-Zhou Zhan, Throughput MIMO-OFDM Systems IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 2012. -Bell Labs Internal Tech. Memo, 1995. Telatar.E
93