An Integrated 256-point Complex FFT Processor for Real ... - CiteSeerX

IEEE Proceedings of Instrumentation and Measurement Technology Conference, vol. 1, pp. 96-101, Ottawa, Canada, May 19-21, 1997

An Integrated 256-point Complex FFT Processor for Real-time Spectrum Analysis and Measurement

Ediz Çetin, Richard C. S. Morling and Izzet Kale
Department of Electronic Systems, University of Westminster, London W1M 8JS, United Kingdom
Tel: +44-171-911-5146. Fax: +44-171-580-4319. E-mail: [email protected]

hence enabling the measurement of an induction motor’s speed in real time without the need for a sensor [1]. A number of dedicated FFT processor implementations have been reported in the literature [4-8]. The FFT processor architecture presented in this paper differs from these in that a bit-parallel pipelined butterfly processor is used rather than its bit-serial counterpart. Also, instead of a column of butterfly processors, a single butterfly processor is used. In addition, the processor is programmable in the sense that the basic architecture can be used for FFTs of different sizes and is capable of other mathematical operations such as windowing, filtering and fast convolution. The IC designed in this paper is intended as a demonstrator of the basic architecture and is restricted to 256-point transforms by the size of the on-chip RAM.

Abstract - This paper describes in detail the design of a custom CMOS Fast Fourier Transform (FFT) processor for computing a 256-point complex FFT. The FFT is well suited for real-time spectrum analysis in instrumentation and measurement applications. The FFT butterfly processor consists of one parallel-parallel multiplier and two adders. It computes one butterfly every 100 ns and can therefore compute a 256-point complex FFT in 102.4 µs, excluding data input and output processes.

I. INTRODUCTION

The Discrete Fourier Transform (DFT) is of considerable importance in instrumentation, measurement and Digital Signal Processing (DSP) applications. However, the computational burden of the DFT prevented it from being widely implemented in real-time applications. Research in this area resulted in algorithms that sped up the DFT computation considerably. The most notable of these is the Fast Fourier Transform (FFT). With the development of high-speed processors, the FFT has found many real-time applications in the field of measurement and instrumentation.

II. LIMITATIONS OF ASIC CHIPS

Implementing the FFT algorithm on silicon has improved the computation time of the FFT. However, there are some inherent problems with ASICs. Most dedicated FFT chips are designed and optimised for a fixed transform length, e.g. N = 256 complex points. This narrows the application area of such chips. Another problem is that dedicated chips have fixed-point data wordlengths, which implies a lower dynamic range than that of floating-point chips. Errors and constraints due to finite wordlengths, i.e. overflow and quantization noise, reduce the precision of dedicated ASICs. These errors are a particular nuisance in FFT computation [9]. During the FFT computation, results at a particular stage are rounded and stored in memory. In the following stage these results are read from memory, new butterfly computations are performed, and the new results are rounded again and written back to memory. Since the FFT computation is an iterative process, the successive rounding errors accumulate over the FFT stages. These errors appear at the output as round-off noise (quantization noise). One remedy for this problem is to keep the rounding noise (error) per butterfly

In the early days, FFT algorithms were implemented in software running on general-purpose computers. However, with the advances in VLSI technology, FFT algorithms are now implemented on programmable Digital Signal Processors (DSPs), Field-Programmable Gate Arrays (FPGAs) and dedicated FFT processor ICs. General-purpose DSPs are designed and optimised to serve a range of applications. With users demanding higher processing speeds for real-time measurement applications, dedicated FFT processors are replacing general-purpose DSPs in some application areas [1-3]. In such applications, use of the FFT reduces computational complexity and so increases system efficiency, e.g. the FFT can be used to calculate, in real time, the spectral estimation of a rotor shaft’s harmonic frequencies

The butterfly used is the radix-2 DIT-FFT butterfly, with all wordlength reductions at the output of the butterfly. The results are scaled and quantized back to 24 bits in order to prevent overflow due to the multiply and add/subtract operations. Convergent rounding is used since it is bias free. The radix-2 DIT-FFT butterfly is shown in Figure 1 [11].

computation as small as possible. This is achieved by having wide internal wordlengths and ensuring that the error (noise) introduced by the rounding circuit is small. However, it is desirable to have short wordlengths and less complex circuitry. Short wordlengths lower the cost in terms of chip area and power consumption. On the other hand, short wordlengths lower the precision of the FFT computation. If the precision of the result is low then the signal-to-noise ratio (SNR) at the output will be lower than the SNR at the input. Great care must therefore be taken in selecting wordlengths. It should also be pointed out that the SNR specification changes from one application to another. In this FFT processor chip, external data pathways carry 24-bit two’s complement signed fractions per component of complex data, whereas internal data pathways carry 39-bit two’s complement signed fractions. Coefficients (or twiddle factors) are pre-calculated and stored in the coefficient ROM as 16-bit two’s complement signed fixed-point words.
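The convergent (round-half-to-even) rounding mentioned above can be illustrated with a minimal Python sketch. The function name `convergent_round` and the use of plain integers to stand in for two's complement words are assumptions made for illustration; the paper does not describe its rounding circuit at this level.

```python
def convergent_round(value: int, drop_bits: int) -> int:
    """Discard the lowest `drop_bits` bits of a signed integer, rounding
    to nearest and breaking exact ties towards the even result."""
    if drop_bits <= 0:
        return value
    half = 1 << (drop_bits - 1)
    quotient = value >> drop_bits                 # floor division by 2**drop_bits
    remainder = value & ((1 << drop_bits) - 1)    # always non-negative in Python
    if remainder > half or (remainder == half and (quotient & 1)):
        quotient += 1
    return quotient


# Hypothetical use: reduce a 39-bit internal word to 24 bits (drop 15 bits).
narrow = convergent_round(123_456_789_012, 15)
```

Ties are rounded up half of the time and down the other half, which is why this scheme introduces no systematic offset, unlike always rounding halves upwards.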

Integrating measurement systems on silicon is most desirable since such systems are cheap, fast, reliable and accurate. Digital components give the same reading every time, ensuring predictability and repeatability. Production costs are lower than those of analog devices, which must be individually adjusted. What is more, integrated systems are far less sensitive to environmental changes such as humidity, temperature and age than their analog counterparts.

In the following sections of the paper we briefly describe the Fast Fourier Transform, the architectural design of the processor and simulation results.

III. THE FAST FOURIER TRANSFORM (FFT)

The N-point DFT of a sequence x(k) is defined as [10]:

$$X(n) = \sum_{k=0}^{N-1} x(k)\,W_N^{nk}, \qquad n = 0, 1, \ldots, N-1 \tag{1}$$

and the IDFT is defined as:

$$x(k) = \frac{1}{N}\sum_{n=0}^{N-1} X(n)\,W_N^{-nk}, \qquad k = 0, 1, \ldots, N-1 \tag{2}$$

where $W_N = e^{-j2\pi/N}$.

Figure 1: Signal Flowgraph for the DIT-FFT Butterfly [11].

As shown in Figure 1 [11], the DIT-FFT butterfly takes a pair of input data values “A” and “B” and produces a pair of outputs “A’” and “B’”, where:

$$A' = \tfrac{1}{2}\left(A + B\,W_N^{k}\right) \tag{3}$$

$$B' = \tfrac{1}{2}\left(A - B\,W_N^{k}\right) \tag{4}$$

The scaling factor of ½ is required to avoid overflow with fixed-point arithmetic. In general, the input samples as well as the twiddle factors are complex and can be expressed as:

$$A = x + jX \tag{5}$$

$$B = y + jY \tag{6}$$

$$W_N^{k} = e^{-j2\pi k/N} = \cos\left(\frac{2\pi k}{N}\right) - j\sin\left(\frac{2\pi k}{N}\right) \tag{7}$$

The DIT butterfly structure differs from the DIF butterfly in that multiplication is done before addition. Both algorithms require the same number of complex multiply and add operations. At first sight it appears that there is little to choose between the two. However, careful analysis shows that they do in fact have important differences. The DIT-FFT butterfly is best suited for silicon implementation, since it makes more efficient use of the multiply-accumulate unit (MAC) and is less prone to overflow. The largest output one can get from a scaled DIT-FFT butterfly is 1.2, whereas it is 1.4 for the DIF-FFT butterfly.
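The scaled DIT butterfly of equations (3) and (4), and the overflow comparison with the DIF butterfly, can be sketched in Python as follows. The function names are hypothetical, and the scaled DIF form used here (add and subtract first, then multiply the difference by the twiddle factor) is an assumption, since the paper only gives the DIT equations.

```python
import cmath
import math


def dit_butterfly(a: complex, b: complex, w: complex):
    """Scaled radix-2 DIT butterfly: multiply B by the twiddle, then add/subtract."""
    t = b * w
    return 0.5 * (a + t), 0.5 * (a - t)


def dif_butterfly(a: complex, b: complex, w: complex):
    """Scaled radix-2 DIF butterfly (assumed form): add/subtract, then multiply."""
    return 0.5 * (a + b), 0.5 * (a - b) * w


# Worst case for inputs whose real and imaginary parts lie in [-1, 1]:
w = cmath.exp(-1j * math.pi / 4)
dit_peak = dit_butterfly(1 + 1j, 1 + 1j, w)[0].real    # (1 + sqrt(2)) / 2
dif_peak = dif_butterfly(1 + 1j, -1 - 1j, w)[1].real   # sqrt(2)
```

With these inputs the DIT output component reaches about 1.21 and the DIF output component about 1.41, in line with the figures of 1.2 and 1.4 quoted in the text.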

In the DIT approach, the initial DFT is divided into two transforms, one consisting of a transform of the even samples and the other of a transform of the odd samples. This process is repeated until the initial transform is reduced to a set of two-point transforms of the initial data. An in-place FFT implementation allows the results of each FFT butterfly to replace its inputs. In order to use an in-place algorithm it is necessary to re-order either the input data array or the output array. This re-ordering is simply arranged by reversing the address bits. Before starting to calculate the DFT, the input data is ordered such that its address is bit-reversed; that is, if the binary address of the required data is $110_2$ then its bit-reversed version is $011_2$. Figure 2 shows the signal flowgraph for an 8-point DIT-FFT with input scrambling [11].
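The address bit reversal described above can be sketched in a few lines of Python. The helper name `bit_reverse` is hypothetical; in hardware the same effect is obtained simply by wiring the address bits in reverse order.

```python
def bit_reverse(index: int, width: int) -> int:
    """Return `index` with its lowest `width` bits in reversed order."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Input order required by an 8-point DIT-FFT with input scrambling.
scrambled_order = [bit_reverse(i, 3) for i in range(8)]
```

For the example in the text, `bit_reverse(0b110, 3)` gives `0b011`.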

An inverse DFT (IDFT) is easily computed without any major change in the algorithm. The only extra facility required is the ability to conjugate $W_N$ by negating its imaginary component. This is implemented in hardware. There are two main algorithms for implementing the FFT, i.e. for changing an N-point DFT into two N/2-point DFTs: the Cooley and Tukey algorithm and the Sande and Tukey algorithm. The Cooley and Tukey algorithm is referred to as “Decimation in Time (DIT)” whereas the Sande and Tukey algorithm is referred to as “Decimation in Frequency (DIF)” [11].


These $(N/2)\log_2 N$ complex multiplications and $N\log_2 N$ complex additions compare with $N^2$ complex multiplications and $N(N-1)$ complex additions for the direct DFT [11].


The operation of the processor is first partitioned into three main processes: Data Input, FFT Computation and Data Output. This partitioning is depicted in Figure 4.


IV. ARCHITECTURAL DESIGN OF THE FFT PROCESSOR


Figure 2: Signal Flowgraph for 8-point DIT-FFT with input scrambling [11]

As can be seen from Figure 2, the 8-point DIT algorithm consists of three butterfly stages. To the left, we have eight input data samples. Input data is multiplied by the twiddle factor $W_N^{k}$. The solid dots represent addition/subtraction. The outputs are in natural order. Generally, an N-point DIT-FFT algorithm consists of $\log_2 N$ stages, each stage containing N/2 butterfly operations [11].
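The stage structure described above can be sketched as an in-place radix-2 DIT-FFT in Python, checked against a direct DFT. This is a minimal floating-point illustration, not the processor's implementation: the function names are hypothetical, and the ½ scaling per butterfly used on the chip is omitted so that the result can be compared directly with the DFT definition.

```python
import cmath


def bit_reverse(index: int, width: int) -> int:
    """Return `index` with its lowest `width` bits in reversed order."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def dft(x):
    """Direct N-point DFT, used only as a reference."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


def fft_dit(x):
    """Radix-2 DIT-FFT: bit-reversed input order, natural output order."""
    n = len(x)                       # assumed to be a power of two
    width = n.bit_length() - 1       # log2(N) stages
    data = [x[bit_reverse(i, width)] for i in range(n)]
    span = 1                         # distance between the two butterfly inputs
    while span < n:
        step = n // (2 * span)       # twiddle exponent increment in this stage
        for start in range(0, n, 2 * span):
            for k in range(span):
                w = cmath.exp(-2j * cmath.pi * k * step / n)
                a = data[start + k]
                t = data[start + k + span] * w   # multiply before add (DIT)
                data[start + k] = a + t
                data[start + k + span] = a - t
        span *= 2
    return data


samples = [complex(i, -i) for i in range(8)]
spectrum = fft_dit(samples)
```

For N = 8 the `while` loop runs three times, with four butterflies per pass, matching the three stages of Figure 2.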


The FFT processor architecture consists of a single radix-2 DIT-FFT butterfly (referred to as the butterfly processing element), a dual-port FIFO RAM, a coefficient ROM, a controller and an address generation unit. Data pathways carry 24-bit two’s complement signed fractions. Coefficients are stored as 16-bit two’s complement signed fixed-point words. The block diagram of the FFT processor is depicted in Figure 5.


The processing cycle starts with the Data Input process, during which sampled data is read in and stored in memory. During the FFT Computation process, the FFT or IFFT is computed on the stored data. During the Output process results of the FFT Computation process are read out to the outside world. These processes are then mapped to hardware resources.


Figure 4: Three sub-processes of the DIT-FFT Algorithm


An in-place algorithm makes efficient use of available memory as the transformed data overwrites the input data. However, the indexing required to determine which location in memory to fetch the input data for each butterfly is quite complex.


Figure 3: Signal Flowgraph for 8-point Modified DIT-FFT with output scrambling.


The algorithm used in this FFT processor implementation is a modified version of the DIT-FFT algorithm with inputs in natural order and outputs in bit-reversed order, i.e. output scrambling. This input and output configuration is required if the processor is to be used for filtering applications. This is shown in Figure 3.
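A natural-order-input, bit-reversed-output DIT-FFT of this kind can be sketched in Python as below. The pairing of data words with a spacing of $N/2^s$ in stage s follows the index expressions given later in the paper; the twiddle exponent rule (the bit-reversed group number times the spacing) is derived here by relabelling the nodes of the standard DIT flowgraph and is an assumption, as are all function names. The ½ scaling per butterfly is omitted so the output can be checked against a direct DFT.

```python
import cmath


def bit_reverse(index: int, width: int) -> int:
    """Return `index` with its lowest `width` bits in reversed order."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def dft(x):
    """Direct N-point DFT, used only as a reference."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


def fft_dit_output_scrambled(x):
    """In-place DIT-FFT: natural-order input, bit-reversed-order output."""
    n = len(x)                          # assumed to be a power of two
    stages = n.bit_length() - 1
    data = list(x)
    for s in range(1, stages + 1):
        ns = n >> s                     # spacing between butterfly inputs
        for butterfly in range(n // 2):
            group = butterfly // ns
            i0 = 2 * ns * group + butterfly % ns
            i1 = i0 + ns
            exponent = bit_reverse(group, s - 1) * ns   # assumed twiddle rule
            w = cmath.exp(-2j * cmath.pi * exponent / n)
            t = data[i1] * w            # multiply before add (DIT butterfly)
            data[i0], data[i1] = data[i0] + t, data[i0] - t
    return data                         # data[q] holds X(bit_reverse(q))


samples = [complex(i, 2 - i) for i in range(8)]
scrambled = fft_dit_output_scrambled(samples)
unscrambled = [scrambled[bit_reverse(m, 3)] for m in range(8)]
```

For N = 8 this rule gives twiddle exponents 0, 0, 0, 0 in stage 1, then 0, 0, 2, 2 in stage 2 and 0, 2, 1, 3 in stage 3.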


Figure 5: Block Diagram Representation of the FFT processor.

A. Butterfly Processing Element

The butterfly is the basic operator of the FFT. It computes a two-point FFT: it takes two data words from memory, computes the butterfly, and writes the results back to the memory locations of the inputs, since an in-place algorithm is used. The butterfly processing element computes one butterfly operation every four cycles. It consists of one parallel-parallel multiplier and two adders. The first adder is 39 bits wide and sums the cross products of the complex multiplication. The architecture is depicted in Figure 6.

The computational advantages of the FFT may be illustrated by considering the FFT algorithms of Figures 2 and 3. These figures show that the N-point FFT contains $\log_2 N$ stages, each containing N/2 butterflies. Hence there are $(N/2)\log_2 N$ butterfly computations to perform. Since each butterfly involves one complex multiplication and two complex additions, the overall N-point FFT contains $(N/2)\log_2 N$ complex multiplications and $N\log_2 N$ complex additions.
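The operation counts for the radix-2 FFT and for the direct DFT can be tabulated with a short Python sketch; the function names are hypothetical and the counts simply evaluate the expressions quoted in the paper.

```python
def fft_op_counts(n_points: int) -> dict:
    """Operation counts for a radix-2 FFT of a power-of-two length."""
    stages = n_points.bit_length() - 1          # log2(N)
    butterflies = (n_points // 2) * stages
    return {
        "butterflies": butterflies,
        "complex_mults": butterflies,           # one per butterfly
        "complex_adds": 2 * butterflies,        # two per butterfly
    }


def dft_op_counts(n_points: int) -> dict:
    """Operation counts for direct evaluation of the DFT sum."""
    return {
        "complex_mults": n_points * n_points,
        "complex_adds": n_points * (n_points - 1),
    }


fft_256 = fft_op_counts(256)
dft_256 = dft_op_counts(256)
```

For N = 256 this gives 1024 complex multiplications for the FFT against 65536 for the direct DFT, a factor of 64.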


B. Address Generation Unit (AGU)

The purpose of the address generation unit is to provide the RAM and the coefficient ROM with the correct addresses. It also keeps track of which butterfly is being computed in which stage. For a 256-point complex FFT there are 8 stages, each consisting of 128 butterflies. In addition, since address generation differs between the input, output and FFT computation processes, it keeps track of the mode of operation of the chip and generates the required address. The mode of operation is supplied by the controller. A block-level description of the AGU is shown in Figure 7.

Figure 6: Butterfly Processing Element Architecture.


The multiplier forms the partial products of the complex multiplication and produces a 39-bit signed-fraction result. This is followed by the first adder, which sums the cross products of the complex multiplication. Its output is rounded to a 25-bit result. The second adder, which is 26 bits wide, produces the sum and difference outputs of the butterfly operation.
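The four real products and two cross-product sums that make up the complex multiplication of B by the twiddle factor can be sketched with Python integers standing in for the fixed-point words. The constants `DATA_FRAC` and `COEF_FRAC`, the helper `to_fixed` and the function `twiddle_product` are assumptions for illustration; the rounding to 25 bits and the second adder are not modelled.

```python
import cmath
import math

DATA_FRAC = 23   # 24-bit signed fraction: sign bit plus 23 fraction bits
COEF_FRAC = 15   # 16-bit signed fraction: sign bit plus 15 fraction bits


def to_fixed(value: float, frac_bits: int) -> int:
    """Quantize a real number in [-1, 1) to a signed fixed-point integer."""
    return int(round(value * (1 << frac_bits)))


def twiddle_product(y: int, Y: int, cos_q: int, sin_q: int):
    """(y + jY)(cos - j sin) from four real products and two sums.

    Each product of a 24-bit and a 16-bit signed word fits in 39 bits;
    the results carry DATA_FRAC + COEF_FRAC fraction bits."""
    real = y * cos_q + Y * sin_q
    imag = Y * cos_q - y * sin_q
    return real, imag


phi = 2 * math.pi * 3 / 256                      # twiddle angle for k = 3, N = 256
y_q, Y_q = to_fixed(0.25, DATA_FRAC), to_fixed(-0.5, DATA_FRAC)
re_q, im_q = twiddle_product(y_q, Y_q,
                             to_fixed(math.cos(phi), COEF_FRAC),
                             to_fixed(math.sin(phi), COEF_FRAC))
scale = float(1 << (DATA_FRAC + COEF_FRAC))
approx = complex(re_q / scale, im_q / scale)
exact = (0.25 - 0.5j) * cmath.exp(-1j * phi)
```

The difference between `approx` and `exact` here comes only from quantizing the coefficients to 16 bits.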

[Table 2 sequence: in cycles C0 to C3 the RAM supplies y, Y, x and X in turn while the coefficient ROM alternates between cosΦ and sinΦ. The remaining columns give the multiplier output, the first adder output (YcosΦ − ysinΦ and ycosΦ + YsinΦ, each after a settling cycle), the second adder output and the RAM write of x’, X’, y’ and Y’ for the previous, current and next butterfly.]

This type of implementation of the butterfly processor increases computational speed at the cost of increased silicon area relative to using a serial-parallel multiplier. However, bearing the length of the transform in mind, this trade-off is cost effective for achieving high throughput and a high speed of operation. The butterfly processing element takes four cycles to compute a two-point FFT. It has a latency of five cycles. Three of these arise because three input components (y, Y and x) are required before an output can be computed, and two are used to pipeline the RAM read and write operations. Thus, the write and access times of the RAM are not on the critical path of the processor. The target clock frequency of the processor is 40 MHz, which results in a butterfly computation time of 100 ns. Allowing for the pipeline delay, the total computation time for a 256-point complex FFT is 102.4 µs, excluding data I/O processes. The butterfly processing element sequence is shown in Table 2.
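The quoted computation time follows from the butterfly count and the clock rate, as the following Python sketch shows. The function name `fft_time_us` is hypothetical, and the few cycles of pipeline latency are ignored, which is an assumption.

```python
def fft_time_us(n_points: int = 256,
                clock_hz: float = 40e6,
                cycles_per_butterfly: int = 4) -> float:
    """Computation time in microseconds for a single-butterfly processor."""
    stages = n_points.bit_length() - 1             # log2(N) = 8 for N = 256
    butterflies = (n_points // 2) * stages         # 128 * 8 = 1024
    cycles = butterflies * cycles_per_butterfly    # 4096 clock cycles
    return cycles / clock_hz * 1e6


total_us = fft_time_us()
```

At 40 MHz each cycle lasts 25 ns, so one butterfly takes 100 ns and 1024 butterflies take 102.4 µs.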


Figure 7: Block diagram representation of AGU

The butterfly generator keeps track of which butterfly is being computed in a particular stage. It is basically a nine-bit up counter, since for a 256-point complex FFT there are 128 butterflies per stage and 4 data words per butterfly (2 real and 2 imaginary). It generates three signals called “IOdone”, “stage done” and “butterfly”. “IOdone” is generated when the “butterfly” count is 512. This informs the controller that either the Data Input or the Data Output process is finished. “Butterfly” is the nine-bit wide counter output; it is also used for addressing the RAM during the Input and Output processes and for providing basic timing for the FFT process. The “stage done” signal is generated when the current “butterfly” count is 128; it increments the stage generator by one. The stage generator keeps track of the current stage in the FFT computation and supplies the base index generator with the number of the stage currently being computed. For a 256-point FFT there are eight stages, hence the stage generator is basically a three-bit counter which is incremented once every 128 butterfly counts (by the “stage done” signal). It generates a signal called “FFTdone” when the stage count is eight. This informs the controller that the FFT computation process is done, forcing the FFT processor to start the data output process.
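The interplay of the butterfly and stage counters during FFT mode can be sketched as a Python generator. This is a behavioural illustration only: the name `fft_schedule` is hypothetical, the four clock cycles per butterfly are collapsed into one step, and the flags are raised on the last butterfly of a stage rather than when a hardware count of 128 is reached, which is an assumption about timing.

```python
def fft_schedule(n_points: int = 256):
    """Yield (stage, butterfly, stage_done, fft_done) for each butterfly."""
    stages = n_points.bit_length() - 1      # 8 stages for 256 points
    per_stage = n_points // 2               # 128 butterflies per stage
    for stage in range(1, stages + 1):
        for butterfly in range(per_stage):
            stage_done = butterfly == per_stage - 1
            fft_done = stage_done and stage == stages
            yield stage, butterfly, stage_done, fft_done


events = list(fft_schedule())
```

Iterating over `events` visits every butterfly of the 256-point transform once, with “stage done” raised eight times and “FFTdone” once.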

Table 2: Butterfly Processing Element Sequence Table

The IO Address Generator is responsible for generating addresses for the RAM during the data input and output processes. During the data input process the output of the butterfly generator, “butterfly”, can be used for addressing 512 locations in the RAM. However, during the data output process data should be bit-reversed while being written to the outside world. Once in the output process, the bit-reversed address is selected by the muxes in the AGU. No extra hardware is required for implementing bit-reversing.

The base index generator is responsible for generating addresses during the FFT computation mode. FFT mode address generation is quite complex. The butterfly has two complex data inputs, A and B. A is referred to as “index0” and B as “index1”. “Index1” can be calculated from “index0” by [8]:

$$\text{index1} = \text{index0} + N_s \tag{8}$$

where $N_s$ is the index spacing, which can be expressed as $N/2^s$, where s is the stage number and N is the transform length. Also, “index0” can be expressed as [8]:

$$\text{index0} = 2 N_s\,(\text{butterfly DIV } N_s) + \text{butterfly MOD } N_s \tag{9}$$

where “butterfly” consists of the first seven bits of the butterfly generator, i.e. butterfly = b6b5b4b3b2b1b0, and “index0” is an 8-bit wide RAM address. Table 3 shows the calculated “index0” for a 256-point FFT computation for each stage [8].

Stage:              1   2   3   4   5   6   7   8
index0 bit 7 (MSB): 0   b6  b6  b6  b6  b6  b6  b6
index0 bit 6:       b6  0   b5  b5  b5  b5  b5  b5
index0 bit 5:       b5  b5  0   b4  b4  b4  b4  b4
index0 bit 4:       b4  b4  b4  0   b3  b3  b3  b3
index0 bit 3:       b3  b3  b3  b3  0   b2  b2  b2
index0 bit 2:       b2  b2  b2  b2  b2  0   b1  b1
index0 bit 1:       b1  b1  b1  b1  b1  b1  0   b0
index0 bit 0 (LSB): b0  b0  b0  b0  b0  b0  b0  0
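The relationship between the index expressions (8) and (9) and the bit pattern of Table 3 can be checked with a short Python sketch. The function names are hypothetical; `index0_by_bit_insertion` models the observation that “index0” is the 7-bit “butterfly” value with a zero inserted at bit position $\log_2 N_s$.

```python
def index_pair(butterfly: int, stage: int, n_points: int = 256):
    """Return (index0, index1) from the DIV/MOD expressions for one stage."""
    ns = n_points >> stage                       # index spacing Ns = N / 2**stage
    index0 = 2 * ns * (butterfly // ns) + butterfly % ns
    return index0, index0 + ns


def index0_by_bit_insertion(butterfly: int, stage: int, n_points: int = 256) -> int:
    """Insert a zero bit into `butterfly` at bit position log2(Ns)."""
    pos = (n_points >> stage).bit_length() - 1   # log2(Ns)
    high = butterfly >> pos
    low = butterfly & ((1 << pos) - 1)
    return (high << (pos + 1)) | low


# Both views agree for every butterfly of every stage of a 256-point FFT.
consistent = all(
    index_pair(b, s)[0] == index0_by_bit_insertion(b, s)
    for s in range(1, 9)
    for b in range(128)
)
```

Because the inserted bit of “index0” is always zero, adding $N_s$ is the same as setting that bit to one, which is how “index1” is obtained.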

where butterfly = b6b5b4b3b2b1b0

Table 3: Binary representation of “index0” in a 256-point DIT-FFT [8].

As shown in Table 3, “index0” can be derived by simply re-arranging the bits in “butterfly” and inserting zeros on the leading diagonal. “Index1” can be obtained from “index0” by replacing the zeros on the leading diagonal with ones [8]. This means that with a slight modification to the base index generator we can generate “index1” as well.

C. Controller

The sequence of events is determined by the controller, depending on the feedback it receives from the surrounding units. Basically the controller is a Moore machine with outputs depending on the values of the next states. The input signal “INIT” is an external asynchronous signal used to reset the FFT processor. The “stage done” signal is generated by the butterfly generator and indicates that a particular stage of the FFT computation is done. Another input signal, “IOdone”, informs the controller that either the input or the output process is done. This is also generated by the butterfly generator. Signal “FFTdone” indicates that the FFT computation process is complete. Signal “ipready” means that data is ready to be written to the memory, whereas “opready” means that data is ready to be read out from the memory. The output signals “input”, “output” and “FFT” mode indicate which process the FFT processor is currently in. This information is vital, since address and data generation differ in each of these modes. Signals “C0”, “C1”, “C2” and “C3” are the four cycles required for the FFT computation process. As explained before, the butterfly processing element takes four cycles to calculate a butterfly operation. Signal “INC” is used to increment the butterfly generator so that it can iterate over all butterflies in a stage. “CLEAR”, on the other hand, resets the butterfly and stage generators. This is done once every 128 cycles during FFT mode and once every 512 cycles in the input and output modes. The butterfly and stage generators are also cleared before and after each mode of operation. Signal “ENBW” enables the RAM for writing, whereas “ENBOR” enables the RAM for reading. “ENBW” is generated only during the input mode and “ENBOR” only during the output mode.

As the FFT processor was designed and optimized for performing high-speed sum-of-products operations, it is easily deployable in a variety of DSP-based, sum-of-products-intensive instrumentation and measurement applications, such as correlation, convolution and digital filtering. The processor is compiled using 0.7 µm CMOS technology. The size of the chip (including pads) is 3.7 mm by 4.1 mm, i.e. 15.17 mm². The size without the pads is 2.7 mm by 3.2 mm, i.e. 8.64 mm². The overall FFT chip plot can be seen in Figure 8.

Figure 8: FFT Chip plot

V. SIMULATION

The whole architecture is simulated at the logic level using simulation models generated with Cascade Design Automation EPOCH. These models included extracted and back-annotated capacitive track-load models. The MENTOR Quicksim package was used to carry out the simulations.


VI. CONCLUSION

An FFT processor architecture optimised for speed and area has been designed. The algorithm used was a modified version of the DIT-FFT with the inputs in natural order and the outputs in bit-reversed order. The architecture consisted of a bit-parallel radix-2 butterfly, a dual-port FIFO RAM, an address generation unit and the controller. Separate memories are used for storing the data and the coefficients. Although the processor designed is quite small and fast, there are some improvements that can be made. Most of the cells used to build the FFT processor have been optimized for speed rather than area and power consumption. These blocks can be redesigned for reduced area and power consumption. Investigation into the use of more than one butterfly processing element is also recommended. At present we have an FFT processor architecture implemented in a 0.7 µm CMOS technology. The size of the chip (including pads) is 3.7 mm by 4.1 mm, i.e. 15.17 mm². The size without the pads is 2.7 mm by 3.2 mm, i.e. 8.64 mm². The FFT processor is capable of computing a 256-point complex FFT in 102.4 µs, excluding data input and output processes. The chip operates with a clock frequency of 40 MHz.

ACKNOWLEDGEMENTS

The authors wish to express their thanks to the Overseas Research Students Award Scheme (ORS) for supporting this work and to Cascade Design Automation for the EPOCH silicon compilation tool.

REFERENCES

[1] Blasco-Gimenez, R., G. M. Asher, M. Summer and K. J. Bradley, “Performance of FFT-rotor slot harmonic speed detector for sensorless induction motor drivers”, IEE Proceedings - Electric Power Applications, vol. 143, iss. 3, pp. 258-68, May 1996.
[2] Flikkema, P. G. and S. G. Johnson, “Vehicle collision warning and avoidance system using real-time FFT”, IEEE 46th Vehicular Technology Conference. Mobile Technology for the Human Race (Cat. No. 96CH35894), vol. 3, pp. 1820-4, 1996.
[3] Zimmerman, G. A., M. F. Garyantes and M. J. Grimm, “A 640 MHz 32 megachannel real-time polyphase-FFT spectrum analyzer”, Conference Record of the Twenty-Fifth Asilomar Conference on Signals, Systems and Computers (Cat. No. 91Ch3112-0), vol. 1, pp. 106-10, 1991.
[4] Lecce, V. D. and D. E. Sciascio, “A VLSI Implementation of a Novel Bit-Serial Butterfly Processor for FFT”, Proceedings, Advanced Computer Technology, Reliable Systems and Applications, Proc. 5 Eur. Comput. Conf. Adv. Comput. Technol. Reliable. Syst. Appl. Comp Euro’91, no. 1991, pp. 875-879, 1991.
[5] Chen, T. and L. Zho, “An Expandable Column FFT Architecture Using Circuit Switching Networks”, Journal of VLSI Signal Processing, vol. 6, no. 3, pp. 243-257, Dec. 1993.
[6] Szwarc, V., L. Desormeaux, W. Wong, S. P. C. Yeung, H. C. Chan and A. T. Kwasnievski, “A Chipset for Pipeline and Parallel Pipeline FFT Architectures”, Journal of VLSI Signal Processing, vol. 6, no. 3, pp. 253-265, Dec. 1994.
[7] Storn, R., “Radix-2 FFT-Pipeline Architecture with Reduced Noise to Signal Ratio”, IEE Proceedings - Vision, Image and Signal Processing, vol. 141, no. 2, pp. 81-88, Apr. 1994.
[8] Melander, J., T. Widhe, P. Sanbarg, K. Palmkvist, M. Vesterbacka and L. Wanhammar, “Implementation of a Bit-Serial FFT Processor With a Hierarchical Control Structure”, Proceedings - EECCTD’95 European Conference on Circuit Theory and Design, I.T.U., pp. 423-426, Sept. 1995.
[9] Sohie, R. L. G., “Implementation of Fast Fourier Transforms on Motorola’s DSP 56000/DSP56001 and DSP 96002 Digital Signal Processors”, Motorola Inc, 1991, ISBN 0-20154413-X.
[10] Proakis, B. J. and D. G. Manolakis, “Digital Signal Processing: Principles, Algorithms and Applications”, New York: Macmillan Publishing Company, Second Edition, 1992, ISBN: 0-02-396815-X.
[11] Jervis, W. B. and E. C. Ifeachor, “Digital Signal Processing: A Practical Approach”, Addison-Wesley Publishing Company Inc., 1993, ISBN 0-201-54413-X.
