Implementation of an FFT on the FPGA of USRP2 boards
J.L. Buthler, M. Buhl, G. Berardinel, A.F. Cattoni (presenter)
Aalborg University, Denmark
Software Defined Radio (SDR) testbeds can greatly benefit the research environment due to their flexibility and reconfigurability.
In SDR development, the Universal Software Radio Peripheral (USRP) by Ettus Research has achieved widespread acceptance:
• Cost-effectiveness, which makes it suited to the realization of large-scale testbeds
• Support for a large set of interchangeable Radio-Frequency (RF) front-ends
• Connectivity with the host PC through the Gigabit Ethernet port, ensuring compatibility with modern commercial off-the-shelf PCs
• User-space interfacing through the Universal Hardware Drivers (UHD), in both Windows and Linux OS
• Support for the open source GNU Radio software radio development framework
Motivation
• The current FPGA firmware only contains basic operations such as:
  • Communication between host PC and USRP
  • Filtering
  • Analog/digital conversion
  • Up/down sampling of the RF signal
• Implementing further digital signal processing on the remaining resources can boost data rate performance
• Two main reasons for focusing on the FFT:
  • It avoids additional communication between host PC and USRP, which would decrease the data rate due to load on the Ethernet connection
  • It is one of the computationally heavy tasks of modern communication chains, such as LTE, and can theoretically gain up to N/2 times the speed through parallel processing (N being the number of bins in the FFT)
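The N/2 figure follows from the structure of a radix-2 FFT, where each of the log2(N) stages consists of N/2 butterflies that do not depend on each other. A minimal sketch of that count, assuming a radix-2 decomposition (the function name and dictionary keys are illustrative, not taken from the firmware):

```python
def radix2_work(n: int) -> dict:
    """Count stages and butterflies of a radix-2 FFT of length n (a power of two)."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two >= 2")
    stages = n.bit_length() - 1          # log2(n)
    per_stage = n // 2                   # independent butterflies in every stage
    return {
        "stages": stages,
        "butterflies_per_stage": per_stage,
        "serial_steps": stages * per_stage,   # one butterfly unit, reused
        "parallel_steps": stages,             # n/2 butterfly units side by side
    }


work = radix2_work(1024)
speedup = work["serial_steps"] / work["parallel_steps"]
print(work, speedup)
```

For N = 1024 the ratio of serial to fully parallel steps is 512, i.e. N/2. Reaching it would require N/2 butterfly units, which is why the resource budget matters.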
Challenges
• The implementation should be compatible with the existing firmware
• New versions of the USRP firmware use most of the external RAM for the timestamp feature (97% on the USRP2 and 50% on the USRP N200)
• The current dataflow is serialized, but the FFT needs to operate on a packet of data
• The 100 MHz clock sets high requirements for the amount of parallelization/pipelining
• Parallelization requires a lot of multipliers (at least 4 for every 2 input samples)
• Pipelining requires memory and multipliers
• The FFT from Xilinx's Core Generator uses many resources and does not adapt intuitively to the existing dataflow of the FPGA firmware
• The FFT has to be designed from scratch with resource efficiency in mind
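The "4 multipliers for every 2 input samples" figure matches a radix-2 butterfly, in which one complex twiddle multiplication expands into four real multiplications. A sketch in integer arithmetic, assuming Q15 twiddle factors (the scaling convention and all names are assumptions for illustration; the slides do not show the actual butterfly):

```python
def butterfly_q15(a, b, w):
    """Radix-2 butterfly on integer (re, im) pairs; w is a Q15 twiddle factor."""
    ar, ai = a
    br, bi = b
    wr, wi = w
    # Complex product b * w: four real multiplications and two additions.
    # The shift by 15 removes the Q15 scaling of the twiddle factor.
    tr = (br * wr - bi * wi) >> 15
    ti = (br * wi + bi * wr) >> 15
    # Two outputs from two inputs: a + b*w and a - b*w.
    return (ar + tr, ai + ti), (ar - tr, ai - ti)


# The twiddle W = -j is (0, -32768) in Q15; it rotates b = 100 + 50j to 50 - 100j.
top, bottom = butterfly_q15((1000, 2000), (100, 50), (0, -32768))
print(top, bottom)
```

Processing several butterflies per clock cycle multiplies this cost, which is what makes hardware multipliers the scarce resource when parallelizing.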
Solution
• A general data processing module was designed to adapt to the dataflow of the current firmware
• Data is intercepted between the existing data processing block (CIC filter) and the VITA module (which handles timestamping and adds other metadata to the samples)
Solution
• The module is transparent to the existing firmware, since it adapts to the dataflow using the strobe signal
• It uses a pipeline to collect samples for processing
• The grey box is a setting register reused from the original firmware; the plan is to use it to obtain the desired FFT length and then let the module determine the cyclic prefix (CP) length itself
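As a purely behavioural illustration, a strobe-gated collector that turns the serialized sample stream into blocks could be modelled as follows. The class, method, and signal names are hypothetical; the real module is FPGA logic and its interface is not given in the slides.

```python
class StrobeCollector:
    """Collect samples that arrive with the strobe high; emit a block when full."""

    def __init__(self, block_len):
        self.block_len = block_len
        self._buf = []

    def clock(self, sample, strobe):
        """Model one clock cycle. Returns a complete block, or None."""
        if not strobe:
            return None                      # no valid sample on this cycle
        self._buf.append(sample)
        if len(self._buf) == self.block_len:
            block, self._buf = self._buf, []  # hand over the block, start a new one
            return block
        return None


collector = StrobeCollector(block_len=4)
blocks = []
for cycle in range(16):
    strobe = cycle % 2 == 0                  # assume a valid sample every second cycle
    out = collector.clock(sample=cycle, strobe=strobe)
    if out is not None:
        blocks.append(out)
print(blocks)
```

Because the collector only reacts when the strobe is high, it follows whatever sample rate the upstream block produces, which is the sense in which such a module stays transparent to the surrounding dataflow.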
Solution
• Uses the Cooley-Tukey FFT algorithm
• Resource optimized
• 16-bit fixed-point operations
• Able to support different lengths without a new firmware image having to be loaded
• Supports FFT lengths of up to 1024 using the 100 MHz clock
• Uses dual-access RAM to load the I and Q samples simultaneously
• Scaling by 1/N to avoid overflow
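To make the fixed-point and 1/N points concrete, here is a software sketch of a radix-2 Cooley-Tukey FFT on integers. It assumes Q15 twiddle factors, a decimation-in-time ordering, and a right shift by one bit after every stage so that the total scaling is 1/N; these are common choices, not details confirmed by the slides, and all names are illustrative.

```python
import cmath
import math


def to_q15(x):
    """Quantize x in [-1, 1] to a signed 16-bit Q15 integer (saturating)."""
    return max(-32768, min(32767, round(x * 32768)))


def fft_fixed_scaled(re, im):
    """Radix-2 DIT FFT on integers, halving after each stage (result ~ DFT / N)."""
    n = len(re)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two")
    re, im = list(re), list(im)
    bits = n.bit_length() - 1
    for i in range(n):                        # bit-reversal permutation
        j = int(format(i, f"0{bits}b")[::-1], 2)
        if i < j:
            re[i], re[j] = re[j], re[i]
            im[i], im[j] = im[j], im[i]
    size = 2
    while size <= n:                          # log2(n) butterfly stages
        half = size // 2
        for k in range(half):
            angle = -2 * math.pi * k / size
            wr, wi = to_q15(math.cos(angle)), to_q15(math.sin(angle))
            for start in range(0, n, size):
                i, j = start + k, start + k + half
                tr = (re[j] * wr - im[j] * wi) >> 15
                ti = (re[j] * wi + im[j] * wr) >> 15
                # ">> 1" is the per-stage scaling that keeps values in range
                re[i], re[j] = (re[i] + tr) >> 1, (re[i] - tr) >> 1
                im[i], im[j] = (im[i] + ti) >> 1, (im[i] - ti) >> 1
        size *= 2
    return re, im


n = 8
x_re = [round(16000 * math.cos(2 * math.pi * t / n)) for t in range(n)]
x_im = [0] * n
out_re, out_im = fft_fixed_scaled(x_re, x_im)

# Floating-point reference: DFT of the same integer input, divided by N
ref = [sum(x_re[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
       for k in range(n)]
max_err = max(max(abs(out_re[k] - ref[k].real), abs(out_im[k] - ref[k].imag))
              for k in range(n))
print(out_re, out_im, max_err)
```

The test tone lands in bins 1 and 7 with roughly half its amplitude, and the result differs from the floating-point reference by only a few least-significant bits. Scaling at every stage trades some precision for the guarantee that intermediate values never outgrow the word length.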
Results – Xilinx coregen vs. Proposed design

Specification        | Xilinx Core Gen. | Proposed design | Required
FFT size             | 1024             | 1024            | 1024
Input data width     | 16               | 16              | 16
Twiddle factor width | 16               | 16              | 16
CP insertion         | No               | No              | Yes
Speed [Msps]         | ~30              | ~16             | 14.336
Latency [µs]         | 34               | 64              | 71
Scaled               | Yes              | Yes             | Yes
Data format          | Fixed point      | Fixed point     | Fixed point
Optimized for        | Resource usage   | Resource usage  | Resource usage

Resource         | Xilinx Core Gen. | Proposed design | Remaining on the USRP2
Slice            | 1909             | 1252            | 11878
Slice Flip Flops | 2816             | 279             | 23756
4 input LUTs     | 2796             | 2331            | 23756
BRAMs            | 7                | 4               | 2
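The latency figures for the 1024-point case are consistent with the time needed to stream one 1024-sample block at the quoted rate. A quick check, where the relation latency ≈ N / rate is an inference from the numbers rather than something stated in the slides:

```python
def block_time_us(n_samples, rate_msps):
    """Time to stream n_samples at rate_msps; samples / (Msamples/s) = microseconds."""
    return n_samples / rate_msps


rates_msps = {
    "Xilinx Core Gen.": 30.0,   # "~30" in the table
    "Proposed design": 16.0,    # "~16" in the table
    "Required": 14.336,
}
times_us = {name: block_time_us(1024, rate) for name, rate in rates_msps.items()}
for name, t in times_us.items():
    print(f"{name}: {t:.1f} us")
```

Rounded to whole microseconds, these match the 34, 64 and 71 µs in the table, so the proposed design meets the required rate with some margin.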
Results – Large resource increase if a 2048-point FFT is to be realized

Specification        | Xilinx Core Gen.
FFT size             | 2048
Input data width     | 16
Twiddle factor width | 16
CP insertion         | No
Speed [Msps]         | ~30
Latency [µs]         | 34
Scaled               | Yes
Data format          | Fixed point
Optimized for        | Resource usage

Resource         | Xilinx Core Gen. | Remaining on the USRP2
Slice            | 4307             | 11878
Slice Flip Flops | 6329             | 23756
4 input LUTs     | 5905             | 23756
BRAMs            | 11               | 2
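The resource figures from the two results slides can be compared against what remains on the USRP2 with a few lines of code. The sketch below copies the numbers from the tables and reads them literally; the dictionary layout and function name are illustrative.

```python
REMAINING_USRP2 = {"Slice": 11878, "Slice Flip Flops": 23756,
                   "4 input LUTs": 23756, "BRAMs": 2}

DESIGNS = {
    "Xilinx Core Gen. (1024)": {"Slice": 1909, "Slice Flip Flops": 2816,
                                "4 input LUTs": 2796, "BRAMs": 7},
    "Proposed design (1024)": {"Slice": 1252, "Slice Flip Flops": 279,
                               "4 input LUTs": 2331, "BRAMs": 4},
    "Xilinx Core Gen. (2048)": {"Slice": 4307, "Slice Flip Flops": 6329,
                                "4 input LUTs": 5905, "BRAMs": 11},
}


def shortfalls(required, available):
    """Return {resource: missing amount} for every resource that does not fit."""
    return {name: need - available[name]
            for name, need in required.items() if need > available[name]}


report = {design: shortfalls(req, REMAINING_USRP2) for design, req in DESIGNS.items()}
for design, missing in report.items():
    print(design, missing or "fits")
```

Under this reading, logic resources are plentiful in every case and block RAM is the only resource that falls short, which is in line with the concern about the memory used by the timestamp feature.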
Conclusions
• It is feasible to implement new data processing features on the remaining fabric of the USRP2; however, the memory usage of the "timestamp" feature is quite high
• A resource-optimized FFT has been designed which supports an FFT length of up to 1024, given the current FPGA clock frequency
• Xilinx coregen is faster than the proposed design; however, this speed is unnecessary and comes at the cost of even further RAM usage
• The USRP N200 provides more available RAM and should be the target for implementation