VLSI implementation of high throughput MIMO ... - Semantic Scholar

11 downloads 0 Views 1MB Size Report
Indian Journal of Engineering & Materials Sciences. Vol. 19, October 2012, pp. 307-319. VLSI implementation of high throughput MIMO OFDM transceiver for. 4.
Indian Journal of Engineering & Materials Sciences Vol. 19, October 2012, pp. 307-319

VLSI implementation of high throughput MIMO OFDM transceiver for 4th generation systems J Raja* & M Kannan Department of Electronics Engineering, MIT, Anna University, Chennai 600 044, India Received 24 February 2012; accepted 13 August 2012 This paper aims to maximize throughput by minimizing power as minimum as possible. Scores of optimization techniques such as FFT, IFFT and memory optimization are available for reducing power of mobile OFDM systems. An approach for achieving reduction in power of MIMO OFDM system by optimizing FFT architecture is addressed in this paper. Memory references in MIMO OFDM transceiver are costly due to their long delay and high power consumption. To implement fast Fourier transform (FFT) algorithms on MIMO OFDM, involves many memory references to access butterfly inputs and twiddle factors. Conventional FFT implementations require unused memory references to load the same twiddle factors for butterflies from different stages in FFT diagrams. To minimize memory references due to twiddle factors for implementing FFT algorithms in MIMO OFDM systems, memory reference reduction method is incorporated here. Twiddle factor is calculated using binary scaling technique. The proposed FFT structure is the combination of memory reference reduction method with binary scaling technique and Radix-4 booth multiplier. Here multipliers in FFT are realized with shifters, adders and subtractors. The proposed structure is evaluated using performance parameters such as BER and SNR. Structural realization and analysis pertaining to timing, power and throughput are implemented in Virtex-4 and analysis is carried out in Altera respectively. Keywords: Multiple input multiple output (MIMO), Orthogonal frequency division multiplexing (OFDM), Fast Fourier transform (FFT), Inverse fast Fourier transform (IFFT), Inter symbol interference (ISI)

Multiple input multiple output (MIMO) system consists of multiple antennas at the transmitter and receiver ends to improve link reliability and data rates of the wireless communication system. Orthogonal frequency division multiplexing is efficient in synchronizing the received signal under fading environment and has been used in past times in applications that require a huge data rate. Fast Fourier transform (FFT)/ inverse FFT (IFFT) processors are proposed for multiple-input multiple-output orthogonal frequency division multiplexing based IEEE 802.11n. Here the processor not only supports the operation of FFT/IFFT but also provides sufficient throughput rates but the drawback is hardware complexity is more, compared with conventional approaches1. FFT architecture which is optimized for a processor with a memory system containing a cache is discussed elsewhere2. The processor uses a cached memory with increased efficiency and performance for MIMO system, but the achieved throughput is very less. The architecture is mainly focused on minimum mean square error detector in decoding _______________________ *Corresponding author (E-mail: [email protected])

section for wireless specification, but the difficulty is hardware complexity3. The reported architecture4 is implemented in matrix inversion circuit in MIMO detection wireless specification, the problem is the increase in hardware complexity and throughput is less compared with conventional approach. Variable length FFT processor design based on Radix 2, Radix 4 and Radix 8 with complex multipliers containing 3 multiplications described earlier5, consumes more power. ASIC based MIMO for OFDM provides data rate of 192 Mbps with a 20 MHz bandwidth for IEEE 802.11a standard. Here the paper mainly focuses on silicon complexity of MIMO OFDM system. Throughput of the system is very less compared with other systems6. MIMO OFDM Base band transceiver implementation is based on ASIC. Verification is based on testbed and GUI monitor. Here the authors have developed separate testbeds for MIMO OFDM system for verification purpose but haven’t concentrated on throughput and BER7. In order to meet IEEE 802.11n requirements8,9 the processor not only supports the operation of FFT/IFFT in 128 points and 64 points but can also provide different

308

INDIAN J. ENG. MATER. SCI., OCTOBER 2012

throughput rates for simultaneous data sequences. The difficulty is number of point’s increases more and more. Hence, the cost and complexity of the system is also increased. IEEE 802.11a established for WLAN standard, provides 54 Mbps throughput using SISO OFDM transceiver. Here the throughput of SISO OFDM is very less; to increase the throughput, MIMO OFDM is preferred10. The IEEE 802-11n provides data rate up to 600 Mbps with transmission speed of 80 MHz. Latency is incurred by pre-processing the channel matrices for MIMO detection11. Different approaches for design and implementation of 4×4 MIMO OFDM transceiver which provides date rate of 1 Gbps is discussed elsewhere12-15. Here MMSE MIMO detector is used to reduce the latency but the disadvantage is that it requires larger circuit for high speed transmission. In any n bit 2’s complement multiplier the maximum number of addition and subtraction operation is n/216. However, the above mentioned results are still insufficient in day to day requirements like transmission speed and power. In this paper, the VLSI implementation of MIMO OFDM transceiver suitable for low power application has been studied. The FFT optimization helps to reduce the delay time, power thereby simultaneously increasing speed. Orthogonal Frequency Division Multiplexing Orthogonal frequency division multiplexing (OFDM) is a technique which is a subset of frequency division multiplexing in which a single channel uses multiple sub-carriers on neighbouring frequencies. The sub-carriers of an OFDM system overlap each other. Normally the neighbouring channels which overlap may obstruct one another. Since the subcarriers in an OFDM system are exactly orthogonal to one another, they are capable of overlapping without interfering. As a result, OFDM systems are capable of maximizing spectral efficiency without causing neighbouring channel interference. Since multiple sub-carriers are used to transmit in an individual channel, an OFDM communication system must perform several steps, which are described in Fig.1. In an OFDM system, each channel is divided into a number of sub-carriers. A serial bit stream is converted into several parallel bit streams. After the bit stream division among the individual sub-carriers, each subcarrier is modulated. The receiver performs the viceversa process, which first divides the incoming signal into appropriate sub-carriers and then demodulate them individually before recovering the original bit stream.

The data is modulated into waveform at inverse fast Fourier transform (IFFT) stage of the transmitter. The IFFT is used to modulate each sub-channel onto the appropriate carrier. The receiver performs the vice-versa process of the transmitter. FFT is used to demodulate the data. One of the main drawback in wireless communication systems is ISI which is caused due to the multi-path reflections in the channel. In order to reduce ISI a cyclic prefix is added. The added cyclic prefix is a part of the first stage of a symbol which is described in Fig. 2. This addition is necessary because it enables multi-path representations of the source signal to fade, so that it won’t interfere with the subsequent symbol. After the addition of cyclic prefix to the sub-carrier channels, it must be transmitted as a single signal. Thus, the process of parallel to serial conversion stage is done by adding all sub-carriers and combining them into a single signal. At the end of this process all sub-carriers are generated perfectly in a simultaneous manner. MIMO OFDM MIMO OFDM (multiple input multiple output, orthogonal frequency division multiplexing) is a technology that combines MIMO and OFDM together to transmit data in wireless communications, in order to deal with frequency selective channel effect. The OFDM signal on each subcarrier can overcome narrowband fading. Therefore, OFDM can transform

Fig. 1 OFDM structure

Fig. 2 Cyclic prefix insertion format

RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER

frequency-selective fading channels into parallel flat ones. Then by combining MIMO and OFDM technology together, MIMO algorithms can be applied in broadband transmission A MIMO OFDM system transmits data modulated by OFDM from multiple antennas simultaneously. At the receiver, after OFDM demodulation, the signals are recovered by decoding each sub-channel from all transmitting antennas. In the basic structure of MIMO OFDM shown in Fig. 3, the signals are modulated by OFDM modulator, then they are transmitted by MIMO system, finally, the signals are recovered by the OFDM demodulator. Therefore, MIMO OFDM achieves spectral efficiency, increased throughput and thus the inter-symbol interference (ISI) can be reduced. Fast Fourier Transform Conventional FFT algorithm

The modulation of OFDM can be done using an IDFT. The fast implementation of IDFT is accomplished by deploying IFFT, which reduces processing time and hardware usage. The demodulation is through an DFT or better by using FFT, which is an efficient way of implementation. In order to reduce numerous calculations involved in calculation of DFT, FFT is used. To achieve the increased efficiency an additional step is involved (to reverse order the data). These additional steps won’t increase the computational complexity of the FFT calculation. As a result, FFT is an extremely efficient algorithm that provides a good implementation in hardware. If N is very large, the computation efficiency is increased by dividing DFT successively in the smaller calculation. The DFT of discrete signal x(n) can be computed directly using Eq. (1). N −1

X ( K ) = ∑ x(n)W nkN , k=0,1,2,3............N-1

Calculation of FFT is done using decimation in time (DIT) or decimation in frequency (DIF) algorithm. If it is DIT, first twiddle factor is multiplied and later it is summed up. It is computed with the help of Eq. (2). The butterfly diagram pertaining to DIT is shown in Fig. 4. N −1

N −1

X ( k ) = ∑ x (n)W Nnk =

∑ x(n)W

n=0

∑ x(2r )W

∑ x(n)W

nk N

n = 0 ( odd )

N −1 2

+ ∑ x( 2r + 1)W N( 2 r +1) k

2 rk N

r =0

… (2)

r =0

N −1 2

=

N −1

+

n = 0 ( even )

N −1 2

=

nk N

∑ x (r )W 1

r =0

rk N 2

+W

N −1 2

∑x

k N

2

( r )W Nrk

r =0

= X 1 ( k ) + W Nk X 2 ( k )

2

(k = 0,1,2,  N − 1)

If it is DIF, first the input is summed up and then the twiddle factor is multiplied. It is computed using the Eq. (3). The butterfly diagram pertaining to DIF is shown in Fig. 5. N −1

X (k ) =



x ( n )W Nnk =

n=0 N −1 2

=



=



n=0 N −1 2

=

N −1 2



x(n +

n=0

n=

x ( n )W Nnk

N 2

N (n+ )k N )W N 2 2

N k  N  nk 2 ) W N  x(n ) + W N x (n + 2  



∑  x ( n ) + ( − 1) n=0

N −1



x ( n )W Nnk +

n=0

x ( n )W Nnk +

n=0 N −1 2

N −1 2



k

x (n +

N  nk ) WN 2 

( k = 0 ,1,  N − 1)

… (3)

The conventional N point FFT, requires (N/2) log2 N multiplication operations and N log2 N addition operations. If it is N bit multiplier, it will generate 2N partial products. The entire FFT butterfly module of

… (1)

n=0

Fig. 3 MIMO OFDM structure

309

Fig. 4 Radix 2 DIT FFT butterfly diagram

310

INDIAN J. ENG. MATER. SCI., OCTOBER 2012

Fig. 6 Modified DIT FFT butterfly diagram Fig. 5– Radix 2 DIF FFT butterfly diagram

For example, in DIT butterfly diagram shown in Fig. 3, the butterflies with twiddle factor W160 appears in Stage 1, Stage 2, Stage 3 and Stage 4. Hence, these twiddle factors are grouped together and computed in any order in the 16-pt radix-2 DIT FFT diagram. This can be achieved using the below identities (5)-(8). m

N/ 4

m−N / 4

N

N

N

W =W .W

m−N / 4

=W .W = −W =W3NN/4.WmN−3N/4=jWmN−3N/4 N/2

m−N / 2

m−N / 2

N

N

N

=W

Fig. 5a Block diagram of conventional FFT butterfly module

conventional FFT is shown in Fig. 5a. Due to the partial products the delay is increased and it consumes more power. Hence, it will affect the throughput of the mobile system. Thus, there is a need for reducing the consumed power with decreased delay thereby dragging down the throughput of the system for which we propose modified FFT algorithms which helps to overcome the above mentioned issues. Modified fast Fourier transform algorithm

During FFT implementation identical twiddle factors occupies enormous memory. In this paper identical twiddle factors are grouped together and computed, before computing the other twiddle factors. The twiddle factors are fed in to the structure only once and the amount of memory due to the identical twiddle factors can be minimized. The twiddle factors are computed using the binary scaling method. In binary scaling the twiddle factors are calculated by simply shifting the bits. The entire multiplication operation is completely replaced by shifting operation.

N

(N/4, N/2)

(N/2, 3N/4)

m

(3N/4, N)

m

… (5) … (6) … (7)

(0, N/4)

m

m

m

= − jWN

… (8) W164

The butterflies with twiddle factor which appears in Stage 4, Stage 2 and Stage 3 are computed. Hence, twiddle factors are grouped together and computed in any order in the 16-pt radix-2 DIT FFT diagram. Similarly, the butterflies with twiddle factors W162 and W166 appears only in Stage3 and Stage4 of the 16-pt radix-2 DIT FFT diagram. These twiddle factors are grouped together and computed, without affecting the computations of other butterflies. The butterflies can be computed together by loading only one twiddle factor

W

m N

m and it is described in Eqs (9)-(11).

m = (n mod N/2i-1)x2i-1

W

W

( n + ( N / 2 i +1 )) mod( N / 2 i −1 )) x 2 i −1 k N

( n mod N / 2 N

i −1

)) x 2

i −1

=−

… (9)

=−

jW

jW

( n mod N / 2 i −1 )) x 2 i −1 N

… (10)

( n mod N / 2 i −1 )) x 2 i −1 N

… (11)

where i is the number of stages. Finally, twiddle factors W161, W163, W165 and W167 appear only in Stage 4. These twiddle factors are grouped together and computed, without affecting the computations of other butterflies. Each twiddle factor is loaded only once and redundant memory references for identical twiddle factors are removed which is shown in Fig. 4. Twiddle factor W165 can be replaced by -jW161 with a simple derivation as described in Eq. (12).

RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER

W16 5

= W16 –j*(2π/16)*5 = (W16 –j*(2π/16)*4 ) * (W16 –j*(2π/16)*1) = -jW16 1 6

… (12) 7

Similarly, twiddle factors W16 and W16 can be replaced by -jW162 and -jW163, respectively. The entire FFT butterfly module of modified FFT is shown in Fig. 6a. Modified FFT is the combination of binary scaling, which will convert floating point in to fixed point and Radix 4 multiplier, which will generates N/3 partial product. Here, binary scaling technique is used to calculate the twiddle factor. Binary scaling is a programming technique used mainly in DSP to perform a pseudo floating point using integer arithmetic. It is both faster and more accurate than directly using floating point. In binary scaling the floating point is multiplied with 256, which will convert floating point in to fixed point and it is given in Table 1.

The above problems are nullified by using modified Radix-4 booth algorithm. It requires only N/3 partial product. The procedure of Radix-4 booth algorithm is described as: (i) if N is even then extend the sign bit, (ii) append a 0 to the LSB of the multiplier and (iii) based on the block value, each partial product will be 0, +X, -X, +2X and -2X. The negative values of X are obtained by taking the 2’s complement. Based on the above result the booth encoded table is developed and tabulated in the Table 2. The recoding is done by encoding three bits of multiplier X. The proposed FFT processor described above has been implemented using Xilinx Virtex-4 FPGA. Based on the analysis, device utilization summary is given in Table 3. From the results, we can observe that the modified FFT algorithm requires only 10% of slices, 11% of flip flops and 7.5% of LUTs compared to the conventional FFT Algorithm.

Radix-4 multiplier

The complexity of conventional FFT is dominated by multiplier. Ordinary multipliers generate 2N partial products. Due to the partial products, the latencies and power consumption of the MIMO OFDM transceiver are increased. The Radix-4 booth multiplier generates N/3 partial product. It consumes less power and short delay because of its high speed parallel multiplier. The proposed FFT structure is the combination of binary scaling technique and Radix-4 booth multiplier. In general Radix-4 booth multiplier is preferred rather than Radix-2 booth multiplier because of the following disadvantages of Radix-2 booth multiplier: (i) the number of addition, subtraction and shift operations in Radix-2 booth algorithm are variable and it is very difficult to incorporate in designing of parallel multipliers, (ii) the Radix-2 multiplier becomes suitable only when there are isolated 1’s and (iii) the number of partial product is N.

Table1 Binary scaled value S.No 1 2 3 4 5

Twiddle factor 1+j0 0+j1 0-j1 0.707+j0.707 -0.707-j0.707

Binary scaled value 256+j0 0+j256 0-j256 181+j181 -181-j181

Table 2 Radix-4 encoded table Block 000 001 010 011 100 101 110 111

Partial product 0*X 1*X 1*X 2*X -2*X -1*X -1*X 0*X

Table 3 Device utilization summary Modified FFT algorithm Logic Utilization

Fig. 6a Block diagram of modified FFT Butterfly Module

311

No of slices No of slices Flip flop No of 4 input LUTS No. of bonded IOBs No of GCLKS No. of DSP 48s

Conventional FFT algorithm

Used

Available

Used

Available

74 92

6144 12288

782 1040

6144 12288

142

12288

1080

12288

187

240

321

240

1

32

1

32

12

32

32

32

INDIAN J. ENG. MATER. SCI., OCTOBER 2012

312

Throughput of the mobile system entirely depends on the power consumption and delay. The proposed system consumes less power and short delay. Hence, the throughput of the mobile system is increased. Based on the results, the performance of the MIMO OFDM is compared and given in Table 4. Results of Simulation Code development and simulation were carried out on a system running on Intel P4 configuration having 4 GB RAM working in windows 7 Platform. The MIMO-OFDM structure for determination of constellation points and BER were coded and simulated in MATLAB. Structural realisation of MIMO OFDM transceiver is implemented in Xilinx. Analysis pertaining to timing, frequency is done in Altera. The experimental results and analysis obtained using Matlab, Xilinx and Altera are presented and discussed. MATLAB implementation output QAM able to carry higher data rates than the other schemes. QAM sends digital information by Table 4– Performance Comparisons S.No

Parameter

MIMO OFDM

Power dissipation 597 mW 32 Number of memory reference Storage space in 32 bytes 698 No. of clock cycle Throughput 600 Mbps Delay 9.43 ns Logic elements 9940

1 2 3 4 5 6. 7.

Mixed Radix Modified FFT FFT architecture architecture based based MIMO MIMO OFDM OFDM

540 mW 24

111 mW 15

30

12

600 700 Mbps 8.89 ns 9813

540 1.4 Gbps 4.04 ns 5844

periodically adjusting the phase and amplitude. FourQAM uses four combinations of phase and amplitude. Moreover, each combination is assigned a 2-bit digital pattern. For example, to generate the bit stream (1,1,0,0,1,1). Because each symbol has a unique 2-bit digital pattern, these bits are grouped in two’s so that they can be mapped to the corresponding symbols. In our example, the original bit stream (1,1,0,0,1,1) grouped into three symbols (11,00,11). The constellation plot in Fig. 7 shows the symbol map of 4-QAM with each possible phase (Θ) and amplitude (A) of a carrier signal in polar coordinate form. Bit error rate (BER) is defined as the ratio of number of error bits and total number of bits. There are many ways of reducing BER. Here, we focused on Alamouti code and modulation techniques. In this study we have taken AWGN is communication channel where noise gets spread over the entire spectrum of frequencies. BER has been calculated by comparing the transmitted signal with the received signal and computing the error count over the total number of bits. For any given modulation, the BER is normally expressed in terms of signal to noise ratio (SNR). The bit error rate is improved from 10-6 with the almost need of 10 dB of SNR as shown in Fig. 8. FPGA implementation output

Table 5 shows the percentage minimization of power, memory references, space in bytes and clock cycles, and Table 6 shows the reduction in number of slices, Flip flops, LUT’s and I/O blocks.

Table 5 Percentage reduction of parameters compared with conventional approaches Sl. No 1

Parameter

A

B

% of power 90 20 dissipation 2 % of 75 45 memory reference 3 % of space 94 38 in bytes 4 % of clock 86 77 cycle A – Mixed Radix FFT + MIMO OFDM Transceiver; B – Modified FFT + MIMO OFDM Transceiver C – Modified FFT + SISO OFDM Transceiver

C 5 11

9 19

Fig. 7 QPSK signal constellation points

RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER

Table 6 Device utilization expressed in percentage

313

Synthesis output SISO OFDM RTL view

Sl. No

Utilization

A

B

1 2 3 4

% of Slices % of Slices flip flop No. of 4 Input LUTS % of bonded IOBs

3 1 1.1 78

13 8 9 99.99

A – Modified FFT; B– Conventional FFT

The OFDM structure was synthesized and simulated in Vertex-4 as shown in Fig. 9. The proposed method can achieve average of 11% reduction in number of memory references, 9% saving of memory space due to the twiddle factor and average of 19% reduction in number of clock cycles. MIMO OFDM RTL view

MIMO OFDM structure was synthesized and simulated in Vertex-4 FPGA as shown in Fig. 10. The proposed structure was constructed using four transmitting and four receiving antennas. Transmitter outputs and receiver inputs are combined using spatial multiplexing technique. The proposed method can achieve average of 20% reduction in number of memory reference, 38% saving of memory space due to twiddle factor and average of 77% reduction in the number of clock cycle as compared with conventional approaches which is evident in Table 5. FFT RTL View

Fig. 8 SNR versus BER

To implement the conventional FFT method, it requires 1040 number of FFs, 321 numbers of I/O blocks and 1080 number of LUTs. Number of FFs and number of LUTs decide the complexity and cost which is shown in Fig. 11.

Fig. 9 OFDM transceiver RTL output

314

INDIAN J. ENG. MATER. SCI., OCTOBER 2012

Fig. 10 MIMO OFDM transceiver RTL output

Fig. 11 FFT RTL output

Proposed FFT RTL View

Analysis output

To implement the proposed FFT method, it requires 92 numbers of FFs, 187 numbers of I/O blocks and 142 numbers of LUTs. When compared with conventional method, the proposed method requires only 1% of FFs, 78% of IOBs and 1.1% of LUTs. This is depicted in Fig. 12.

Power analysis

Power consumed by the MIMO OFDM transceiver system is 111 mW is shown in Fig. 13. Total power is the sum of the static power, dynamic power and I/O power.

RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER

315

Fig. 12 Modified FFT with binary scaling technique and radix-4 Booth multiplier RTL output

Fig. 13 Power analysis Output Timing analysis

Simulation output FFT Simulation output

Fmax of MIMO OFDM system is 235 Mhz. Based on the Fmax Throughput of the system is calculated as 1.4 Gbps. Fmax of MIMO OFDM transceiver is shown in the Fig. 14.

Simulation output of FFT is shown in Fig. 15. The binary inputs in0, in1, in2, in3, in4, in5, in6 and in7 are passed through the butterfly units. Finally, the real part outputs are taken from outr0,outr1,outr2,outr3,

316

INDIAN J. ENG. MATER. SCI., OCTOBER 2012

Fig. 14 Timing analysis Output

Fig. 15 FFT simulation output

outr4,outr5,outr6 and outr7 and imaginary part outputs are taken from outi0,outi1,outi2,outi3, outi4,outi5,outi6 and outi7. Modified FFT simulation output

Simulation output of modified FFT with binary scaling radix-4 booth multiplier technique is shown in the Fig. 16. The binary inputs in0, in1, in2, in3, in4, in5, in6 and in7 are passed through the memory

reference reduction technique and higher radix multiplier. Finally, the real part outputs are taken from outr0,outr1, outr2, outr3, outr4, outr5, outr6 and outr7 and imaginary part outputs are taken from outi0, outi1, outi2,outi3, outi4, outi5, outi6 and outi7. Simulation output

Simulation output 4×4 MIMO OFDM transceiver is shown in Fig. 17. The inputs in, in2, in3 and in4 are

RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER

317

Fig. 16 Modified FFT with binary technique and radix-4 booth multiplier simulation output

Fig. 17 MIMO OFDM transceiver Simulation output

passed through the modified FFT algorithm in the transmitter section. The transmitted outputs are spatially multiplexed and the transmitted data are received by receiver using MRC (maximum ratio combining) techniques. Finally, the outputs are taken from out1, out2, out3 and out4.

proportional to throughput. Maximum throughput is achieved by minimizing power as minimum as possible. The proposed system power is 111 mW which is minimum. Hence, throughput of the system is increased as shown in Fig. 18.

Performance analysis Power analysis

Proposed system generates short delay which is very less compared with other systems. Delay is inversely proportional to the throughput. Maximum throughput is achieved by minimizing delay as

Proposed system requires less power which is very less compared with other systems. Power is inversely

Delay analysis

INDIAN J. ENG. MATER. SCI., OCTOBER 2012

318

minimum as possible. The gate delay of proposed system is 4.04 ns which is very small. Hence, the throughput of the system can be increased as shown in Fig. 19.

hardware. When compared with other system the proposed system requires very minimum number of logic elements, hence the cost and complexity of the system is reduced as shown in Fig. 21.

Throughput analysis

Power dissipation versus throughput

The throughput of the proposed system is 1.4 Gbps, this can be achieved with the help of modified FFT architecture. Modified FFT architecture reduces redundant memory reference, number of partial products and number of non zero digits as shown in Fig. 20. Logic elements analysis

Only 58% of logic elements is required to implement the proposed system. The number of logic elements determines the cost and complexity of the

Maximum throughput is obtained at minimum power that is 111 mW power. When the power increases more and more correspondingly throughput of the system gradually decreases as shown in Fig. 22. From the graph we conclude that the proposed system consumes less power and transmits maximum number bits per second. Memory reference versus performance parameters of different MIMO OFDM systems

The main objective of this paper is to reduce the power and increase the throughput. Memory reference

Fig. 18 Power versus different MIMO OFDM systems

Fig. 20 Throughput versus different MIMO OFDM systems

Fig. 19 Delay versus different MIMO OFDM systems

Fig. 21 Logic element analysis versus different MIMO OFDM systems

RAJA & KANNAN: VLSI IMPLEMENTATION OF HIGH THROUGHPUT MIMO OFDM TRANSCEIVER

319

throughput. The circuit was implemented in a 28 nm (transistor gate size) library with less circuit area and evaluated in lower power dissipation. This can be achieved by optimizing FFT architecture (memory reference reduction and binary scaling methods are used to minimize redundant memory references) and Radix-4 booth multiplier algorithm (Radix-4 multiplier technique is used to reduce the power dissipation) which reduces number of partial products. References

Fig. 22 Power versus throughput

Fig. 23 Memory Reference versus Performance parameters of Different MIMO OFDM Systems

in FFT is costly due to the long delay and high power consumption. Memory reference reduction method is used in FFT to eliminate redundant memory reference. Based on the result the graph is plotted in Fig. 23. From the graph we conclude that the proposed method delivers 1.4 Gbps maximum throughput with minimal number of memory references of 15 with the corresponding delay of 4.04 ns and the power is 111 mW. Conclusions In this work, we have presented a VLSI implementation of high throughput MIMO OFDM transceiver for 4G system which achieves 1.4 Gbps

1 Fu Bo & Ampadu Paul, J Signal Process Syst, 56(1) (2009) 59-68. 2 Chang Y & Park S C, IEICE Trans Fundamentals, E87-A (11) (2004) 3020-3024. 3 Kim Hun Seok, Zhu Weijun, Bhatia Jatin, Mohammed Karim, Shah Anish & Daneshrad Babak, EURASIP J Adv Signal Process, 2008. 4 LaRoche Isabelle & Roy Sebastien, An efficient regular matrix inversion circuit architecture for MIMO processing, IEEE Int Symp on Circuits and Systems (ISCAS), May 2006, pp.4819-4822. 5 Lin Y T, Tsai P Y & Chiueh T D, IEE Proc Comput Digit Technol, 152(4) (2005) 499-506. 6 Perels D, Haene S, Luethi P, Burg A, Felber N, Fichtner W & Bolcskei H, IEEE Trans VLSI Syst, 5 (2005) 215-218. 7 Greisen Pierre, Haene Simon & Burg, EURASIP J Embedded Syst, 2008, Article ID242584. 8 Reisis D & Vlassopoulos N, IEEE Trans Circuits Syst I, 55(11) (2008) 3438-3447. 9 Radhouane R, Liu P & Modlin C, in Proc. IEEE Int Symp Circuits Syst, 1 (May 2000) 116-119. 10 Yoshizawa Shingo & Miyanaga Yoshikazu, VLSI implementation of SISO-OFDM and MIMO-OFDM transceivers, IEEE Int Symp Communtions and Information Technologies (ISCIT), No. T2D-4, Oct 2006. 11 Yoshizawa Shingo, Yamauchi Yasushi & Miyanaga Yoshikazu, A complete pipelined MMSE detection architecture in a 4x4 MIMO-OFDM receiver, IEEE Int Symp on Circuits and Systems (ISCAS), May 2008, pp. 1248-1251. 12 Yoshizawa Shingo, Yamauchi Yasushi & Miyanaga Yoshikazu, VLSI Architecture of a 4x4 MIMO-OFDM with an 80-MHz Channel Bandwidth Transceiver, IEEE Int Symp on Circuits and Systems (ISCAS), May 2009, pp. 4244-4247. 13 Yoshizawa Shingo, Yamauchi Yasushi & Miyanaga Yoshikazu, VLSI Implementation of a 4x4 MIMO-OFDM Transceiver for 1Gbps Data Transmission, IEEE Int Symp on Circuits and Systems (ISCAS), May 2010, pp. 1743-1746. 14 Lamarca Rey F & Vazquez M G, IEEE Trans Signal Process, 53 (3) (2009) 1741-1755. 15 Shin M & Lee H, A high-speed four-parallel radix-2 4 FFT/IFFT processor for UWB applications, Proc. IEEE Int. Symp. Circuits and Systems, May 2008, pp.960-963. 16 Ma G K & Taylor F J, IEEE ASSP Mag, Jan 1990, pp. 6-20.