ICCCCT-10
VLSI Design of Mixed radix FFT Processor for MIMO OFDM in wireless communications
N. Kirubanandasarathy1, Dr. K. Karthikeyan2 & K. Thirunadanasikamani3
1 Research scholar, St.Peter's University, Avadi, Chennai - 600 054. India
2 Associate Scientist - R&D, ABB Global industries & Services Ltd., Chennai - 600 089. India
3 Professor, Dept of CSE, St.Peter's University, Avadi, Chennai - 600 054. India
E-Mail :
[email protected] 2
[email protected]@yahoo.com
978-1-4244-7770-8/10/$26.00 ©2010 IEEE

Abstract - Orthogonal frequency division multiplexing (OFDM) is a popular method for high data rate wireless transmission. OFDM may be combined with antenna arrays at the transmitter and receiver to increase the diversity gain and/or to enhance the system capacity on time-variant and frequency-selective channels, resulting in a Multiple-Input Multiple-Output (MIMO) configuration. The IEEE 802.11n standard, based on the MIMO OFDM system, raises the data throughput from the original data rate of 54 Mb/s to a data rate in excess of 600 Mb/s, because the MIMO technique can increase the data rate by extending an OFDM-based system. However, the IEEE 802.11n standard also greatly increases the computational and hardware complexities compared with the current Wireless Local Area Network (WLAN) standards. It is a challenge to realize the physical layer of the MIMO OFDM system with minimal hardware complexity and power consumption, especially given the computational complexity in VLSI implementation. The Fast Fourier Transform / Inverse Fast Fourier Transform (FFT/IFFT) processor is one of the most computationally complex modules in the physical layer of the IEEE 802.11n standard. Improving the signal processing capability while reducing the power consumption and the hardware cost of an FFT processor has therefore become a challenging target. In this paper, a pipelined Fast Fourier Transform (FFT) / Inverse Fast Fourier Transform (IFFT) processor for a MIMO OFDM based IEEE 802.11n WLAN baseband processor is presented. High throughput, memory reduction, low power and complex multiplier reduction are achieved by using a higher mixed radix FFT in MIMO-OFDM. The mixed radix 4/2 FFT architecture with bit reversal is proposed to design the prototype FFT/IFFT processor for MIMO-OFDM systems. The proposed processor, with minimal hardware complexity, reduces the power consumption.

Keywords: FFT, IFFT, MIMO OFDM

I. Introduction

OFDM is a special case of multi-carrier transmission, where a single data stream is transmitted over a number of lower rate sub-carriers. The OFDM technique has been widely implemented in high-speed digital communications to increase the robustness against frequency selective fading or narrowband interference. It is also used for wideband data communications over mobile radio FM channels, xDSL, DAB and DVB-T/H. An efficient FFT processor is required for real-time operation in OFDM. In wireless communication, the data rate has increased by a minimum factor of 4 with each migration from one generation to the next. The upcoming 802.11n WLAN standard, however, can achieve more than 600 Mbits/s by virtue of MIMO OFDM technology. This ongoing evolution has accelerated the development of system-on-chip (SoC) platforms to support the physical layer of those technologies.

The SoC platforms must satisfy two demands in order to support this wireless technology. First, the platform must be able to sustain the enormous data rate. A single DSP chip can currently support only up to 54 Mbits/s. The second requirement is flexibility. Wireless communication is inherently less reliable than wired communication. For example, the IEEE 802.11a standard has various communication modes with possible data rates of 6, 9, 12, 18, 24, 36, 48, and 54 Mbits/s. For the SoC to adapt to different operating conditions and standards, it must support not only real-time switching between the modes of a wireless communication protocol, but also switching between different protocols.

The FFT processor is widely used in mobile systems for image and signal processing applications. It is a main module of OFDM-based systems, such as the MC-CDMA receiver, MIMO-OFDM and WLAN chips. The requirement for low-power FFT architectures [1] for portable telecommunication systems is becoming more and more important. Due to the characteristic of non-stop processing at the sample rate, the pipelined FFT is the leading architecture for high-throughput or low-power solutions.
Researchers have proposed a number of low power techniques for FFT processors [2]. In C. Sidney Burrus [4], a cache-memory-based architecture was presented, which uses an algorithm that offers good data locality to increase speed and energy efficiency. An ordering-based pipelined radix-4 FFT was presented in [3]. The coefficient ordering reduces the switching activity between successive coefficients fed to the complex multiplier, which leads to lower power consumption. The power of a pipelined FFT processor is dominated by the size of the storage blocks. Therefore, Lihong Jia, Yonghong Gao, Jouni Isoaho, and Hannu Tenhunen [5] proposed a progressive word length instead of a fixed word length, using a shorter word length for stages in which the word length's impact on size is significant and a longer word length for stages in which the word length's impact on precision is significant. In [8], the authors proposed a low-power FFT architecture based on multirate signal processing and asynchronous circuit technology. Normally the number of arithmetic multiplications and additions is used as a measure of computational complexity. Several methods for computing the FFT/IFFT are discussed in [7]; these are the basic algorithms for implementing FFT and IFFT blocks.

II. MIMO-OFDM

The general transceiver structure of MIMO OFDM is presented in Fig. 1. The system consists of N transmitter antennas and M receiver antennas. In this paper the cyclic prefix is assumed to be longer than the channel delay spread. The OFDM signal for each antenna is obtained by using the IFFT and can be detected by the fast Fourier transform (FFT).

Figure 1: MIMO-OFDM (transceiver block diagram; recoverable block labels: TX Data, IFFT, Add Cyclic Prefix, TX RF, MIMO Channel, RX RF, Synchronization, Remove Cyclic Prefix, FFT, Channel Estimation, Demodulation, RX Data)

III. FAST FOURIER TRANSFORM

The Discrete Fourier Transform (DFT) plays an important role in many applications of digital signal processing, including linear filtering, correlation analysis and spectrum analysis.

The DFT is defined as:

$$ X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk}, \qquad k = 0, 1, \ldots, N-1 $$    Equation (1)

where $W_N^{nk} = e^{-j 2\pi nk / N}$ is the DFT coefficient.

Equation (1) requires N complex multiplications and (N-1) complex additions for each value of the DFT. Computing all N values therefore requires a total of N^2 complex multiplications and N(N-1) complex additions. Since the amount of computation, and thus the computation time, is approximately proportional to N^2, the computation time becomes long for large values of N. For this reason, it is very important to reduce the number of multiplications and additions. An efficient algorithm to compute the DFT is the Fast Fourier Transform (FFT) algorithm, for example the radix-2 FFT algorithm, which reduces the computational complexity from O(N^2) to O(N log2 N) [4]-[5].

IV. Mixed Radix 4-2

The mixed-radix algorithm is based on sub-transform modules with highly optimized small length FFTs which are combined to create larger FFTs. There are efficient modules for factors of 2, 3, 4, and 5. The modules for the composite factors of 4 and 2 are faster than combining the modules for 2*2 and 2*3. In addition, complex multiplication consumes a large share of the power in the FFT processor. To save power, a higher radix FFT algorithm can be used to reduce the number of complex multiplications. A three-step radix-4 FFT algorithm is chosen in our design to save complex multiplications. A mixed radix algorithm is a combination of different radix-r algorithms; that is, different stages in the FFT computation have different radices. For instance, a 64-point FFT can be computed using radix-4 processing elements in some stages and radix-2 processing elements in the others. This adds some complexity to the algorithm compared to radix-r; however, it gives more options in choosing the transform length. The mixed-radix algorithm does not, by itself, offer the simple bit reversing for ordering the output sequences.
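The relationship between Equation (1) and a mixed-radix decomposition can be illustrated with a short software sketch. It is not the hardware architecture of this paper: it assumes a recursive decimation-in-time split chosen for brevity, whereas the proposed butterfly is a DIF structure, and the function names `dft` and `mixed_radix_fft` as well as the radix list `[4, 4, 2, 2]` are illustrative choices.

```python
import cmath
import math


def dft(x):
    """Direct evaluation of Equation (1): X[k] = sum_n x[n] * W_N^(n*k)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


def mixed_radix_fft(x, radices):
    """Cooley-Tukey FFT that uses one radix per stage (illustrative sketch).

    radices lists the radix of each stage; their product must equal len(x).
    """
    n_points = len(x)
    if math.prod(radices) != n_points:
        raise ValueError("product of radices must equal the transform length")
    if n_points == 1:
        return list(x)

    r = radices[0]          # radix of the current stage
    m = n_points // r       # length of each sub-transform
    # Split the input into r interleaved sub-sequences and transform each one.
    subs = [mixed_radix_fft(x[q::r], radices[1:]) for q in range(r)]

    out = []
    for k in range(n_points):
        # Combine the sub-transforms with the twiddle factors W_N^(q*k).
        # Each sub-transform is periodic in k with period m.
        acc = 0j
        for q in range(r):
            acc += cmath.exp(-2j * cmath.pi * q * k / n_points) * subs[q][k % m]
        out.append(acc)
    return out


if __name__ == "__main__":
    samples = [complex(i % 7, -(i % 5)) for i in range(64)]
    reference = dft(samples)
    mixed = mixed_radix_fft(samples, [4, 4, 2, 2])
    print("max deviation:", max(abs(a - b) for a, b in zip(reference, mixed)))
```

In this sketch each stage of radix r costs on the order of N*r multiply-accumulate operations, so the radix list [4, 4, 2, 2] for N = 64 needs far fewer operations than the N^2 of the direct form, which is the motivation given in the text.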
V. Mixed-Radix FFT Algorithms with Bit Reversing

The mixed-radix 4/2 butterfly unit is shown in Figure 2. It uses both the radix-4 and the radix-2 algorithms, which can perform fast FFT computations and can process FFTs whose lengths are not a power of four. The mixed-radix 4/2 unit calculates four butterfly outputs based on X(0)-X(3). The proposed butterfly unit has three complex multipliers and eight complex adders. Four multiplexers, represented by the solid boxes, are used to select either the radix-4 calculation or the radix-2 calculation.

Figure 2: The basic butterfly for mixed-radix 4/2 DIF FFT algorithm.

To verify the proposed scheme, a 64-point FFT based on the proposed Mixed-Radix 4/2 butterfly with simple bit reversing for ordering the output sequences is taken as an example. As shown in Figure 3, the block diagram for the 64-point FFT is composed of a total of sixteen Mixed-Radix 4/2 Butterflies. In the first stage, the 64-point input sequence is divided into 8 groups, which correspond to n3=0, n3=1, n3=2, n3=3, n3=4, n3=5, n3=6 and n3=7 respectively. Each group is the input sequence for one Mixed-Radix 4/2 Butterfly. After the input sequences pass the first Mixed-Radix 4/2 Butterfly stage, the order of the output values is indicated by the small number below each butterfly output line in Figure 3.

Figure 3: Proposed Mixed-Radix 4-2 Butterfly for 64 point FFT

The proposed Mixed-Radix 4/2 is composed of two radix-4 butterflies and four radix-2 butterflies. In the first stage, the input data of the two radix-4 butterflies, expressed as B4(0, n3, k1) and B4(1, n3, k1), are grouped as x(n3), x(N/4+n3), x(N/2+n3), x(3N/4+n3) and x(N/8+n3), x(3N/8+n3), x(5N/8+n3), x(7N/8+n3) respectively. After each input group passes the first radix-4 butterflies, the output data is multiplied by the special twiddle factors. Then, these output sequences are fed as input to the second stage, which is composed of the radix-2 butterflies. After passing the second radix-2 butterflies, the output data are multiplied by the twiddle factors. These twiddle factors, WQ(1+k), require the only multiplier unit in the proposed Mixed-Radix 4/2 Butterfly with simple bit reversing of the output sequences. Finally, the order of the output sequences is also shown in Fig. 3. The order of the output sequence is 0, 4, 2, 6, 1, 5, 3 and 7, which is exactly the same as the simple binary bit reversing of the pure radix butterfly structure. Consequently, the proposed Mixed-Radix 4/2 butterfly with simple bit-reversed output sequence includes two radix-4 butterflies, four radix-2 butterflies, one multiplier unit and an additional shift unit for the special twiddle factors.
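The output order 0, 4, 2, 6, 1, 5, 3, 7 quoted above is the plain binary bit reversal of the indices 0 to 7. A minimal sketch of that index mapping follows; the function names `bit_reverse` and `bit_reversed_order` are illustrative and do not come from the paper.

```python
def bit_reverse(index, num_bits):
    """Return index with its num_bits least significant bits reversed."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result


def bit_reversed_order(n_points):
    """Bit-reversed index order for a power-of-two length n_points."""
    num_bits = n_points.bit_length() - 1
    if n_points < 1 or (1 << num_bits) != n_points:
        raise ValueError("n_points must be a power of two")
    return [bit_reverse(i, num_bits) for i in range(n_points)]


if __name__ == "__main__":
    print(bit_reversed_order(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Because the mapping is its own inverse, the same routine can be used both to predict the output order of a butterfly and to restore natural order afterwards.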
VI. RESULT

Employing the parametric nature of this core, the OFDM block is synthesized on one of Xilinx's Virtex-II Pro FPGAs with different configurations. The results of logic synthesis for the 64-point FFT based MIMO-OFDM using Radix-2, Radix-4, split radix and mixed radix 4/2 are presented in Table 1. The 64-point FFT based OFDM is chosen to compare the number of CLB slices and the power for the different FFT architectures, as shown in Figs. 4, 5 and 6.

Table 1: Comparison of FFT Algorithm based on CLB Slices, Utilization factor, and power

64 point FFT             CLB Slices / 7680    Utilization factor    Power (mW)
Radix-2 FFT              851                  11.1%                 4685.60
Radix-4 FFT              765                  9.96%                 3012.51
Split Radix FFT          835                  10.8%                 4492.64
Mixed Radix (4/2 FFT)    750                  9.77%                 3831.63

Figure 5: Area Comparison (area in CLB slices)

Figure 6: Power analysis (power in mW)

VII. Conclusion

In this paper, FFT processors based on different algorithms were designed and compared for a MIMO-OFDM modem. It was found during the algorithm design that many blocks need complex multipliers and adders; therefore special attention needs to be given to optimizing these circuits and maximizing reusability. In particular, the models have been applied to analyze the performance of mixed-radix FFT architectures used in MIMO-OFDM. Actual hardware resource requirements were also presented and simulation results were given for the synthesized design. The 64-point Mixed Radix FFT based MIMO-OFDM architecture was found to have a good balance between its performance and its hardware requirements and is therefore suitable for use in MIMO-OFDM systems.
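As a cross-check of Table 1, the utilization factor can be recomputed as the CLB slice count divided by the 7680 slices given in the table header. The sketch below assumes the row order Radix-2, Radix-4, Split Radix, Mixed Radix 4/2, which is the order in which the text lists the algorithms; the names `TABLE_1`, `utilization_percent` and `check_table` are illustrative.

```python
TOTAL_SLICES = 7680  # slice count from the Table 1 header

# (CLB slices, utilization in percent as reported), copied from Table 1.
# The row labels are an assumption based on the order used in the text.
TABLE_1 = {
    "Radix-2": (851, 11.1),
    "Radix-4": (765, 9.96),
    "Split Radix": (835, 10.8),
    "Mixed Radix 4/2": (750, 9.77),
}


def utilization_percent(slices, total=TOTAL_SLICES):
    """Utilization factor in percent for a given number of used slices."""
    return 100.0 * slices / total


def check_table(table=TABLE_1, tolerance=0.1):
    """Map each row to True if the reported figure is within tolerance."""
    return {
        name: abs(utilization_percent(slices) - reported) <= tolerance
        for name, (slices, reported) in table.items()
    }


if __name__ == "__main__":
    for name, (slices, reported) in TABLE_1.items():
        print(f"{name}: computed {utilization_percent(slices):.2f}%, reported {reported}%")
```

All four reported percentages agree with the recomputed values to within rounding, which supports the pairing of slice counts and utilization factors in the table.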
Figure 4: FFT Analysis

REFERENCES

[1] Shousheng He and Mats Torkelson, "Design and Implementation of a 1024-point Pipeline FFT Processor", IEEE Custom Integrated Circuits Conference, May 1998, pp. 131-134.
[2] Shousheng He and Mats Torkelson, "Designing Pipeline FFT Processor for OFDM (de)Modulation", IEEE Signals, Systems, and Electronics, Sep. 1998, pp. 257-262.
[3] Shousheng He and Mats Torkelson, "A New Approach to Pipeline FFT Processor", IEEE Parallel Processing Symposium, April 1996, pp. 776-780.
[4] C. Sidney Burrus, "Index Mapping for Multidimensional Formulation of the DFT and Convolution", IEEE Trans. Acoust., Speech, and Signal Processing, Vol. ASSP-25, June 1977, pp. 239-242.
[5] Lihong Jia, Yonghong Gao, Jouni Isoaho, and Hannu Tenhunen, "A New VLSI-Oriented FFT Algorithm and Implementation", IEEE ASIC Conf., Sep. 1998, pp. 337-341.
[6] Martin Vetterli and Pierre Duhamel, "Split-Radix Algorithms for Length-p^m DFT's", IEEE Trans. Acoust., Speech, and Signal Processing, Vol. 37, No. 1, Jan. 1989, pp. 57-64.
[7] Daisuke Takahashi, "An Extended Split-Radix FFT Algorithm", IEEE Signal Processing Letters, Vol. 8, No. 5, May 2001, pp. 145-147.
[8] Y.T. Lin, P.Y. Tsai, and T.D. Chiueh, "Low-power variable length fast Fourier transform processor", IEE Proc. Comput. Digit. Tech, Vol. 152, No. 4, July 2005, pp. 499-506.

BIOGRAPHIES

N. Kirubanandasarathy received the B.Eng. degree in Electrical and Electronics Engineering from Madurai Kamaraj University, Madurai, India, in 2002 and the M.Eng. degree in Applied Electronics from Anna University, Chennai, India, in 2004. He is currently pursuing the Ph.D. in the field of VLSI and communication, Department of Electronics and Communication Engineering, St.Peter's University, Avadi, Chennai, India. His fields of interest include VLSI and Communication system.

Dr. K. Karthikeyan received the B.Eng. degree in Electrical and Electronics Engineering from Madurai Kamaraj University, Madurai, India, in 2002, the M.Eng. degree in power systems from Anna University, Chennai, India, in 2004, and the Ph.D. degree from the Department of Electrical Engineering, Indian Institute of Technology Madras, Chennai, India, in 2008. Currently, he is Associate Scientist (R&D) with ABB Global industries & Services Private Ltd., Chennai. His fields of interest include Power quality and Power electronics applications in Power system.

K. Thirunadanasikamani received the B.Eng. degree in Electronics & Communication Engineering from Bharathithasan University, Trichy, India, in 1989 and the M.Eng. degree in Computer science and Engineering from NIT, Trichy, India, in 1997. He is currently working as a Professor in the department of CSE, St.Peter's University, Avadi, Chennai, India. His fields of interest include Communication system and soft computing.