Design and FPGA Implementation of Reduced Complexity MIMO-MLD Systems
Yousef Abdullah Al-Zahrani+, Saleh Al-Marshed+, Abdullah Al-Dhotyan*, Ahmed Iyanda Sulyman*, Saeed Al-Dosari*, Mohamed Elnamaky! and Saleh Al-Shebeili!
+King Abdulaziz City for Science and Technology (KACST), Riyadh 11442, Saudi Arabia
*Department of Electrical Engineering, King Saud University (KSU), Riyadh 11421, Saudi Arabia
Emails: {asulyman, dosari}@ksu.edu.sa
!Prince Sultan Advanced Technologies Research Institute (PSATRI), KSU, Riyadh 11421, Saudi Arabia
Email:
[email protected]
Abstract-Multiple-antenna systems, also known as multiple input-multiple output (MIMO) radio, improve the capacity and reliability of radio communication systems. Of considerable concern, however, is the huge complexity involved in the implementation of such systems. Therefore, the design of low-complexity, low-cost MIMO systems that keep most of the advantages and benefits of the full-complexity system has gained significant attention recently. In this paper, we design and implement on a field programmable gate array (FPGA) board a reduced-complexity MIMO-maximum likelihood detection (MLD) system whose performance is as close as possible to that of the optimal MLD (full-complexity) system while making a significant cutback in the overall hardware/software complexity (and therefore the operating cost) of the system.
Index Terms-MIMO-MLD, reduced-complexity MIMO systems, antenna selection, constellation partition, FPGA.
I. INTRODUCTION
Multiple input-multiple output (MIMO) communication systems employing coding techniques appropriate to multiple-antenna transmission have recently been embraced as an effective means to achieve high data rates over wireless channels. Several wireless standards, such as wireless-fidelity (Wi-Fi) (IEEE 802.11) and worldwide interoperability for microwave access (WiMAX) (IEEE 802.16), have thus incorporated MIMO technologies. An example of a practical implementation of MIMO technology in a Wi-Fi system using spatial multiplexing (SM) is shown in Fig. 1, where a 54 megabits per second (Mbps) radio channel is used to deliver 108 Mbps of data using a 2x2 MIMO-multiplexing system. The operation of the system can be summarized as follows. First, the client sends data at 108 Mbps to the wireless network via the card adapter. The encoder then divides the data stream into two (or more) slower sub-streams, and the transmitter sends each sub-stream to a separate antenna for simultaneous transmission on the same radio channel. The signal experiences reflection off objects, creating multiple paths from the transmitting end to the receiving end. Two antennas at the receiver side receive the multiplexed signal, and the receiver uses MIMO detection algorithms to unscramble the signal in order to recover the transmitted data streams at 108 Mbps.
Of considerable concern, however, is the increased complexity incurred in the implementation of MIMO systems. For example, if the maximum likelihood detection (MLD) algorithm is used at the receiver for MIMO detection, the receiver complexity in the resulting MIMO-MLD system grows exponentially with the number of signal points and transmit antennas. Therefore, with a higher number of antennas and larger signal constellations, the complexity of MIMO-MLD becomes prohibitive for real-time implementations. The design of low-complexity, low-cost MIMO systems that keep many of the advantages and benefits of the optimal MLD (full-complexity) MIMO system has thus been the focus of system designers recently [1]-[6].
978-1-4244-9991-5/11/$26.00 ©2011 IEEE
Fig. 1. High-Speed transmission in Wi-Fi system using MIMO technology
(Block diagram labels: client, wireless network interface, MIMO encoder, MIMO transmitter, wireless channel, MIMO receiver, access point.)
The objective of this paper is to design and implement on a field programmable gate array (FPGA) board reduced-complexity MIMO-MLD systems whose performance is as close as possible to that of the full-complexity system while making a significant cutback in the overall hardware/software complexity (and therefore the operating cost) of the system. To this end, we examine the performance and complexity of the full-complexity MIMO-MLD system and various reduced-complexity versions. From this analysis, the MIMO configurations that best preserve performance while reducing complexity significantly were selected and implemented on FPGA using an MLD detector. All results presented on system performance were first tested in MATLAB and then translated into hardware blocks using Xilinx System Generator (SysGen). Once the hardware designs were completed, the bit streams required for the FPGA implementation were generated using Xilinx synthesis tools.
II. DESIGN OF REDUCED-COMPLEXITY MIMO SYSTEMS
A. Antenna Selection
The first approach explored for realizing the above-stated goals is to employ some form of antenna selection. The motivation is that multiple-antenna deployment requires multiple radio frequency (RF) chains (consisting of amplifiers, analog-to-digital converters (ADCs), channel estimation circuits, etc.), which are very expensive and consume high power. When antenna selection is implemented in the signal detection algorithm for recovery of the transmitted data, fewer RF chains are needed, which results in a significant complexity reduction. Here we focus mainly on receiver-side complexity reduction and explore the use of antenna selection with MIMO-MLD detection algorithms at the receiver side. Fig. 2 displays a general block diagram for the reduced-complexity antenna-selection-based NxL choose-K-antennas MIMO-MLD system considered in this paper, where N, L and K are the number of transmit, receive and selected antennas, respectively.
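The paper does not state which criterion is used to pick the K receive antennas. As a minimal sketch, the block below assumes a common norm-based rule: keep the K rows of the L x N channel matrix with the largest received energy. The function name and the example channel values are hypothetical.

```python
def select_receive_antennas(H, k):
    """Pick the k receive antennas (rows of H) with the largest channel energy.

    H: channel matrix as a list of rows, one row per receive antenna.
    Returns the chosen row indices (in ascending order) and the reduced matrix.
    Assumption: norm-based selection; the paper's own criterion is not given.
    """
    energy = [sum(abs(h) ** 2 for h in row) for row in H]
    ranked = sorted(range(len(H)), key=lambda l: energy[l], reverse=True)
    chosen = sorted(ranked[:k])
    return chosen, [H[l] for l in chosen]

# Hypothetical 4x4 channel (N = 4 transmit, L = 4 receive), choose K = 2.
H = [
    [0.1 + 0.2j, 0.3 - 0.1j, -0.2 + 0.1j, 0.1 + 0.0j],
    [1.0 - 0.5j, 0.8 + 0.6j, -0.9 + 0.2j, 0.7 - 0.7j],
    [0.4 + 0.4j, -0.5 + 0.3j, 0.2 - 0.6j, 0.5 + 0.1j],
    [0.9 + 0.9j, -1.1 + 0.2j, 0.6 - 0.8j, 1.0 + 0.3j],
]
chosen, H_selected = select_receive_antennas(H, 2)
print(chosen, len(H_selected))
```

Only the selected rows are passed on to the detector, so the detection stage works with a K x N channel instead of the full L x N one, and only K RF chains plus a switch are needed at the receiver.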
Fig. 2. Antenna selection at the receiver in MIMO-MLD system
(Block diagram labels: M-QAM signals, fading, signal processing and decoding.)
B. Constellation Partition (CP)
Antenna selection is a well-established method for hardware complexity reduction. In this paper, we explore an alternative method that yields processing (software) complexity reduction and can be used alone or in conjunction with antenna selection. The proposed design is called constellation partition (CP) [3]. The main idea is to conduct the MLD search for the constellation point closest to the received signal in a way that reduces the number of search operations, and hence the processing complexity. For example, for the 16-quadrature amplitude modulation (QAM) constellation shown in Fig. 3, we partition the constellation into four quadrants. We then divide the MLD search operation into two stages: quadrant search and symbol search. In the quadrant search, we search only the signal points that are nearest to the origin in each quadrant (1 + i, 1 - i, -1 + i, and -1 - i), the bold red points in Fig. 3. This detects the quadrant closest to the received signal. In the second stage (symbol search), the received signal is compared only to the remaining symbols in the detected quadrant to finally detect the signal. This process reduces the software complexity of the MIMO-MLD system by $(3/4)\,[1 - (M_0/M)] \times 100\%$ for the MLD detection of the rectangular 16-QAM signals shown in the figure, using repetition encoding at the transmitter, known as MIMO maximum ratio transmission (MRT) [8], where $M$ is the number of signal points in the constellation and $M_0$ is the number of signal points nearest to the origin, one per quadrant ($M_0 = 4$). This software complexity reduction amounts to 37.50%, 56.25%, 65.62% and 70.31% for 8-QAM, 16-QAM, 32-QAM and 64-QAM, respectively. For space-time encoded signals at the transmitter, even greater complexity savings are expected.
Fig. 3. 16-QAM signal constellation search in MLD algorithm: bold red points represent quadrant search, small blue points represent symbol search
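The two-stage search described in this subsection can be sketched as follows for a single received sample. The sketch assumes a scalar decision variable (for example, the combined output of a repetition-coded MRT link) and a 16-QAM alphabet with levels -3, -1, 1, 3; the function names are illustrative and are not the authors' implementation.

```python
import random

def qam16():
    levels = (-3, -1, 1, 3)
    return [complex(re, im) for re in levels for im in levels]

def full_search(r, alphabet):
    """Standard search: one distance evaluation per constellation point."""
    return min(alphabet, key=lambda s: abs(r - s)), len(alphabet)

def cp_search(r, alphabet):
    """Two-stage constellation-partition search for a scalar sample r."""
    inner = [complex(1, 1), complex(1, -1), complex(-1, 1), complex(-1, -1)]
    # Stage 1 (quadrant search): test only the innermost point of each quadrant.
    d_inner = {s: abs(r - s) for s in inner}
    anchor = min(d_inner, key=d_inner.get)
    best, best_d, evaluations = anchor, d_inner[anchor], len(inner)
    # Stage 2 (symbol search): test the remaining points of the chosen quadrant.
    for s in alphabet:
        same_quadrant = s.real * anchor.real > 0 and s.imag * anchor.imag > 0
        if s != anchor and same_quadrant:
            d = abs(r - s)
            evaluations += 1
            if d < best_d:
                best, best_d = s, d
    return best, evaluations

def cp_reduction_percent(M, M0=4):
    """Search reduction (3/4)[1 - M0/M] x 100% quoted in the text."""
    return 0.75 * (1 - M0 / M) * 100

rng = random.Random(0)
alphabet = qam16()
mismatches = 0
for _ in range(1000):
    r = rng.choice(alphabet) + complex(rng.gauss(0, 0.7), rng.gauss(0, 0.7))
    if cp_search(r, alphabet)[0] != full_search(r, alphabet)[0]:
        mismatches += 1
_, evaluations = cp_search(0.4 + 2.2j, alphabet)
print(mismatches, evaluations, cp_reduction_percent(16))
```

For 16-QAM the two-stage search evaluates 4 + 3 = 7 distances instead of 16, which is the 56.25% reduction given by the formula. The same 7/16 ratio appears between the CP and full-complexity operation counts in Table I (147/336 multiplications and 112/256 additions).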
III. DESIGN VERIFICATION/VALIDATION
For the two cases of reduced-complexity MIMO systems designed above, we used MATLAB/Simulink and Xilinx tools for the design verification. The block diagrams used for the design implementation and tests are shown in Figs. 4 and 5. We created a MATLAB/Simulink model containing both floating-point and fixed-point models, fed both with the same real data, and compared the outputs. Discrepancies were limited to those caused by the number of bits used in the fixed-point model, and the two sets of results were consistent. First, we wrote new MATLAB code to fit the Xilinx AccelDSP tool, representing the hardware architecture design for each case. Then, we created Xilinx Coregen-based blocks to be used later by the Xilinx System Generator tool in MATLAB/Simulink. The high-level Simulink DSP blocks were translated to register transfer level (RTL) language by the System Generator tool. Implementation targeted an FPGA prototyping platform based on the Xilinx Virtex5 (XC5VLX110T). Finally, bit-stream files were generated and downloaded onto the FPGA using Xilinx Foundation ISE tools. The outputs from the FPGA hardware device were monitored using ChipScope Pro Analyzer. Fig. 6 displays a photograph of the actual set-up used.
Fig. 4. Set-up for Design Test
Fig. 5. FPGA implementation workflow (blocks: MATLAB/Simulink; AccelDSP/System Generator; ISE Foundation Simulator; ChipScope Pro Analyzer)
Fig. 6. Photo of Set-up for Design Test
IV. SUMMARY OF RESULTS FROM DESIGN TEST
This section presents the summary of performance results for the optimal MLD and the following reduced-complexity systems: antenna selection, CP, and the combination of antenna selection and CP. Figs. 7, 8, 9, 10 and 11 present the results for full-complexity and reduced-complexity MIMO-MRT systems [8]. Fig. 7 shows the simulation results comparing the full-complexity 4x4 MIMO-MLD system with the 4x4 system in which two antennas are selected at the receiver side for the detection process (4x4 select 2, antenna selection system). It is observed from this figure that the 4x4 system in which two out of the four available receive antennas are selected for signal detection suffers very little performance degradation compared to the full-complexity 4x4 MIMO-MLD system. The performance of a direct 4x2 MIMO system is also shown in Fig. 8 for comparison, where it is easy to see that the antenna-selection-based 4x4 choose-two-antenna system has superior performance, close to that of the full-complexity system. However, in terms of hardware/software complexity, as shown in Table I, the receiver-side complexity is much reduced in the CP and 4x4 choose-two-antenna (antenna selection) systems compared to the 4x4 full-complexity system. Thus we designed reduced-complexity versions of the 4x4 MIMO-MLD system which retain most of the advantages of the full-complexity system, but at much lower complexity. Figs. 9, 10 and 11 compare the performance of the proposed CP algorithm with existing MIMO-MLD and antenna selection systems for the following cases: MIMO-MLD vs. CP, antenna selection vs. CP, and antenna selection vs. the combination of antenna selection and CP, respectively. An important observation from the results in these figures is that the performance of the system with constellation partition is exactly the same as that of the standard MLD, while the complexity of the system is cut back significantly. This is observed both when constellation partition is used alone in the 4x4 system and when it is combined with antenna selection in the 4x4 choose-two-antenna system. Our hardware implementations and tests thus focus on the 16-QAM reduced-complexity 4x4 MIMO-MLD configurations presented in these figures.
Fig. 7. Symbol Error Rate (SER) performance of MIMO systems, for 4x4 MLD and 4x4 select 2 (antenna selection)
Fig. 8. SER performance of MIMO systems, for 4x2 MLD and 4x4 select 2 (antenna selection)
Fig. 9. SER performance of 4x4 MIMO systems, for MLD and CP
Fig. 10. SER performance of 4x4 MIMO systems, for antenna selection and CP
Fig. 11. SER performance of 4x4 MIMO systems, for antenna selection and combination of antenna selection and CP
(Figs. 7-11 plot SER against Eb/No in dB, with curves for 8-QAM, 16-QAM, 32-QAM and 64-QAM.)
TABLE I. RECEIVER-SIDE COMPLEXITY ANALYSIS FOR 4x4 MIMO-MLD SYSTEMS
System | System complexity for 16-QAM signals
Full-complexity 4x4 MIMO-MLD | 4 RF chains (amplifiers, antennas, and demodulators); 336 multiplications/divisions, 256 additions/subtractions
Reduced-complexity 4x4 MIMO-MLD selecting 2 antennas at the receiver (antenna selection) | 2 RF chains (amplifiers, antennas, and demodulators); 1 RF switch; 176 multiplications/divisions, 128 additions/subtractions
Reduced-complexity 4x4 MIMO-MLD (CP) | 4 RF chains (amplifiers, antennas, and demodulators); 147 multiplications/divisions, 112 additions/subtractions
Reduced-complexity 4x4 MIMO-MLD selecting 2 antennas at the receiver (antenna selection and CP) | 2 RF chains (amplifiers, antennas, and demodulators); 1 RF switch; 77 multiplications/divisions, 56 additions/subtractions
TABLE II. PERFORMANCE SUMMARY OF REDUCED COMPLEXITY 4x4 MIMO-MLD SYSTEMS FOR THE XILINX VIRTEX5, XC5VLX110T DEVICE
Reduced-complexity MIMO-MLD for 16-QAM signals | Maximum frequency | Minimum period
Antenna selection | 48.8 MHz | 20.511 ns
Constellation partition (CP) | 127.4 MHz | 7.848 ns
Combination of CP and antenna selection | 63.7 MHz | 15.706 ns
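As a quick cross-check of the timing figures in Table II, the sketch below recomputes each maximum frequency as the reciprocal of the reported minimum period and derives the clock-rate ratio between CP and antenna selection. The dictionary layout and names are illustrative; the numbers are those reported in Table II.

```python
# (maximum frequency in MHz, minimum period in ns) as reported in Table II
table_ii = {
    "antenna selection": (48.8, 20.511),
    "constellation partition (CP)": (127.4, 7.848),
    "CP + antenna selection": (63.7, 15.706),
}

def frequency_mhz(period_ns):
    """Clock frequency in MHz corresponding to a clock period in ns."""
    return 1000.0 / period_ns

for name, (f_mhz, t_ns) in table_ii.items():
    print(f"{name}: reported {f_mhz} MHz, 1/period = {frequency_mhz(t_ns):.1f} MHz")

# Ratio of minimum periods: how much faster the CP design can be clocked.
speedup = table_ii["antenna selection"][1] / table_ii["constellation partition (CP)"][1]
print(f"CP vs antenna selection clock-rate ratio: {speedup:.2f}")
```

The reported frequencies agree with the reciprocals of the periods to within rounding, and the CP design can be clocked roughly 2.6 times faster than the antenna selection design.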
Next we present the synthesized results from our hardware design validation test-bed. Fig. 12 shows the interface developed for comparing the results obtained from MATLAB (floating point) and Xilinx's FPGA tools (fixed point) for the reduced-complexity MIMO-MLD system (16-QAM signals). The blank rectangular block implements the reduced-complexity MIMO-MLD system in MATLAB (floating-point simulation), while the shaded block with the "X" mark implements the reduced-complexity MIMO-MLD system in Xilinx's AccelDSP. Both blocks were fed with the same input data, and their outputs were monitored and compared. Since the FPGA hardware (shown in Fig. 6) has direct access to the logic in Xilinx's AccelDSP, this block represents the hardware implementations of the reduced-complexity MIMO-MLD systems examined in this paper.
Fig. 12. Xilinx/Simulink interface for reduced-complexity 4x4 MIMO MLD system, 16-QAM signals
Table II presents the performance summary of the reduced-complexity MIMO-MLD systems. From this table, it is observed that the hardware processing is much faster using the proposed CP algorithm. For example, while the antenna selection system operates at a maximum frequency of 48.8 MHz with a minimum period of 20.511 ns, the CP algorithm operates at 127.4 MHz with a minimum period of 7.848 ns. Thus CP significantly improves processing speed while retaining the same error-rate performance as existing systems.
Table III presents the implementation results of the reduced-complexity MIMO systems on a Xilinx Virtex5 (XC5VLX110T) for each of the key resource types of the design: slices, registers, look-up tables (LUTs) and DSP48Es. The DSP48E is a DSP slice introduced by Xilinx and available in Virtex-4 and Virtex-5 FPGAs; here it has been used to implement the adders, subtractors, control unit, and complex multipliers [7]. The FPGA resource utilization is reduced in the 4x4 constellation partition system compared to the antenna selection system, as shown in Table III. Relative to antenna selection, constellation partition saves about 17% of the slices, 5% of the registers, 19% of the look-up tables, and 13% of the DSP48Es. The combination of antenna selection and constellation partition has much lower complexity than either antenna selection or constellation partition alone.
TABLE III. FPGA RESOURCE UTILIZATION OF THE REDUCED COMPLEXITY 4x4 MIMO MLD SYSTEMS FOR THE XILINX VIRTEX5, XC5VLX110T DEVICE (16-QAM signals; count and percentage usage)
Information | Antenna selection | Constellation partition (CP) | Combination of antenna selection and CP
Slices | 5271 of 17280 (30%) | 4359 of 17280 (25%) | 2617 of 17280 (15%)
Registers | 3482 of 69120 (5%) | 3313 of 69120 (4%) | 2114 of 69120 (3%)
LUTs | 16217 of 69120 (23%) | 13126 of 69120 (18%) | 7394 of 69120 (10%)
DSP48Es | 64 of 64 (100%) | 56 of 64 (87%) | 36 of 64 (56%)
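The savings quoted for constellation partition relative to antenna selection follow from the resource counts in Table III. The sketch below recomputes them; the pairing of counts with each system is as read from Table III, and the names are illustrative.

```python
# Resource counts as read from Table III (Xilinx Virtex5 XC5VLX110T, 16-QAM).
antenna_selection = {"slices": 5271, "registers": 3482, "LUTs": 16217, "DSP48Es": 64}
constellation_partition = {"slices": 4359, "registers": 3313, "LUTs": 13126, "DSP48Es": 56}

def saving_percent(baseline, reduced):
    """Relative saving of `reduced` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - reduced) / baseline

savings = {
    name: saving_percent(antenna_selection[name], constellation_partition[name])
    for name in antenna_selection
}
for name, value in savings.items():
    print(f"{name}: {value:.1f}% fewer with CP than with antenna selection")
```

The computed values are approximately 17.3%, 4.9%, 19.1% and 12.5%, in line with the rounded figures of 17%, 5%, 19% and 13% stated in the text.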
V. CONCLUSION
The design and implementation of reduced-complexity MIMO-MLD systems on FPGA has been presented. This paper considers the design of low-complexity, low-cost MIMO systems that keep many of the advantages and benefits of full-complexity MIMO systems. We consider the traditional antenna selection method for the design, introduce a new method referred to as constellation partition (CP), and then combine the two methods. It is observed that hardware processing is much faster using the proposed CP algorithm. For example, while the antenna selection system operates at a maximum frequency of 48.8 MHz with a minimum period of 20.511 ns, the CP algorithm operates at 127.4 MHz with a minimum period of 7.848 ns. Thus CP significantly improves processing speed while retaining the same error-rate performance as existing systems. The designs were tested, and the results indicate significant complexity reductions.
REFERENCES
[1] A. I. Sulyman and M. Ibnkahla, "Performance of MIMO system with antenna selection over nonlinear fading channels," IEEE Journal Sel. Topics in Signal Processing, vol. 2, no. 2, pp. 159-169, Apr. 2008.
[2] M. Rupp, G. Gritsch and H. Weinrichter, "Approximate ML detection for MIMO systems with very low complexity," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, Montreal, Canada, May 2004.
[3] A. I. Sulyman, Y. Al-Zahrani, S. Al-Dosari, A. Al-Sanie, and A. Al-Shebeili, "A Two-Stage Constellation Partition Algorithm for Reduced Complexity MIMO-MLD Systems," Proc. IEEE Int'l Workshop on Wireless Local Networks, WLN 2010, Denver, Colorado, USA, pp. 757-760, Oct. 2010.
[4] A. Ghrayeb and T. M. Duman, "Performance Analysis of MIMO Systems with Antenna Selection over Quasi-Static Fading Channels," IEEE Trans. Vehicular Technol., vol. 52, no. 2, pp. 281-287, Mar. 2003.
[5] D. Wübben, R. Böhnke, V. Kühn, and K.-D. Kammeyer, "Near Maximum-Likelihood Detection of MIMO systems using MMSE Based Lattice-Reduction," Proc. IEEE-ICC2004, vol. 2, pp. 798-802, Jun. 2004.
[6] S. Sanayei and A. Nosratinia, "Adaptive Antenna and MIMO Systems for Wireless Communications: Antenna Selection in MIMO Systems," IEEE Communications Magazine, pp. 68-73, Oct. 2004.
[7] A. Saeed, M. Elbably, G. Abdelfadeel, and M. I. Eladawy, "Efficient FPGA implementation of FFT/IFFT Processor," International Journal of Circuit, Systems and Signal Processing, vol. 3, no. 3, pp. 103-110, Jun. 2009.
[8] T. K. Y. Lo, "Maximum Ratio Transmission," IEEE Trans. on Commun., vol. 47, pp. 1458-1461, Oct. 1999.
ACKNOWLEDGMENT
The authors thank Mr. Nawaf Al-Mutairi and Mr. Yasser Al-Hudhaif, formerly undergraduate students of the Electrical Engineering Department at King Saud University (KSU), for their help. The authors also thank Prince Sultan Advanced Technology Research Institute (PSATRI) at KSU for providing FPGA equipment to aid this work. This work is supported by a grant from the National Plan for Science and Technology (NPST), project number ELE928-02-09, KSU, Saudi Arabia.