Available online at www.sciencedirect.com
ScienceDirect Procedia Technology 10 (2013) 856 – 865
International Conference on Computational Intelligence: Modeling Techniques and Applications (CIMTA) 2013
Evaluation of Power Efficient FIR Filter for FPGA Based DSP Applications
Subhankar Bhattacharjee a, Sanjib Sil b, Amlan Chakrabarti c,*
a,b Dept-ECE, Techno India College of Technology, Newtown, Rajarhat, Kolkata-700156, India
c A.K.Choudhury School of Information Technology, University of Calcutta, Kolkata, India
Email: a [email protected], b [email protected], c [email protected]
Abstract
This paper describes the design and implementation of low power FIR filters for digital signal processing (DSP) applications, using Xilinx 6V1X130T1FF1156 (Virtex-6 Low Power) field programmable gate array (FPGA) devices. DSP is a highly demanding application domain in present-day technology, in which the demands for enhanced performance and reduced resource utilization have increased over the years. Recent advancements in FPGA design technology, through the incorporation of DSP functional blocks along with inherent FPGA features such as high flexibility through reconfiguration, reusability, moderate cost and feature extension, have resulted in FPGAs becoming the preferred platform for evaluating and implementing DSP. In this work we have implemented various forms of FIR filter on an FPGA and compared their performance in terms of delay, frequency of operation, resource utilization and power. To the best of our knowledge, our work is the first of its kind with respect to Virtex-6 FPGA devices. Our research paves the way for selecting the most suitable FIR filter architecture for DSP implementation using a Virtex-6 FPGA.
© 2013 The Authors. Published by Elsevier Ltd. Open access under CC BY-NC-ND license. Selection and peer-review under responsibility of the University of Kalyani, Department of Computer Science & Engineering.
Keywords: Finite impulse response (FIR) filter; low power design; FPGA; DSP; Xilinx Xpower
* Corresponding author. E-mail address: [email protected]
2212-0173 © 2013 The Authors. Published by Elsevier Ltd. Open access under CC BY-NC-ND license. Selection and peer-review under responsibility of the University of Kalyani, Department of Computer Science & Engineering. doi:10.1016/j.protcy.2013.12.431
1. Introduction
The finite impulse response (FIR) filter is the most fundamental circuit used to build DSP hardware [1]. FIR digital filters find extensive application in mobile communication systems, for example in channel equalization, matched filtering and pulse shaping, owing to their absolute stability and linear phase properties [14]. Since these circuits perform key operations in present-day DSP, their speed and power optimization are crucial quality factors in high performance DSP applications. DSP applications typically require a trade-off between power consumption and speed, hence there is a strong need for low power [6,7,13,14], high speed digital filter circuits.
Nowadays, the FPGA is a much favoured platform for digital VLSI design. This is due to the high flexibility, reusability, low power, moderate cost, easy upgrading (through the use of hardware description languages (HDLs)) and feature extension (as long as the FPGA is not exhausted) that FPGAs offer. Field programmable gate arrays (FPGAs) provide a configurable structure through an array of adjustable logic modules interconnected by programmable routing resources and surrounded by programmable input/output blocks.
In this work we present the FPGA based design and implementation of low power FIR circuits comparable with peer research works [15,16] for DSP applications, together with their performance analysis based on resource usage, delay and power. The results are based on the implementation of different types of FIR [1] circuits on a Virtex-6 FPGA. Although related work exists on the design of FIR circuits in the VLSI domain for DSP applications [5, 6, 7, 8, 11], FPGA based design issues were not considered there. The works on FPGA based design of FIR circuits [9, 13, 16] focused more on custom designs and their realization, and did not present a detailed analysis of the different FIR circuits in terms of the FPGA design metrics of resource utilization, delay and power.
In [15], the authors discussed FIR circuits for FPGA based design, but no power analysis results were presented. In our research we focus on the design and implementation issues of the different forms of FIR circuits used in DSP applications, such as Direct Form (DF), Transpose Form (TF), DF2, TF2, DFPOLY, DFPOLYPIPE and TF2POLYPIPE, in terms of the FPGA design metrics. The work is extensive and provides a detailed analysis in the domain of FPGA based circuit design for DSP applications. The Virtex-6 low power FPGA [2] was chosen for implementing our designs, since it possesses DSP-support features. The paper is divided into six sections. After the introduction, Section 1.1 describes the design of the different forms of FIR circuits. Section 1.2 describes the different pipelined forms of FIR circuits. The simulation environment is outlined in Section 1.3. The analysis of the results is presented in Section 1.4, and concluding remarks are given in Section 2.
1.1. Design of basic FIR filter circuits
We have designed and implemented the following FIR filter circuits. The basic architectures of FIR filters of different forms, i.e. Direct Form (DF), Transpose Form (TF), Direct Form 2 (DF2) and Transpose Form 2 (TF2), can be implemented using identical basic multipliers, adders and delay units [17]. The transfer function of an FIR filter system in the z domain is given in Equation (1) [1].
$$H(z) = \sum_{k=0}^{M-1} b_k z^{-k} \qquad (1)$$
where
$$H(z) = \frac{Y(z)}{X(z)}$$
is the ratio of the output sequence to the input sequence.
The variables $b_k$ and $M$ in Equation (1) are the filter coefficients and the filter length respectively. This transfer function has a variety of realization structures. The FIR filters can be implemented using any of the conventional filter structures shown in Figure 1. The block diagrams in Figure 1 are cascaded realizations of an FIR filter with M = 5. The filter blocks are modular in nature, with regular data flow, and can easily be extended to a filter of any order by cascading the computational cell. The input and output cells of the filter may vary depending on the choice of filter structure.
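To illustrate how one transfer function can have several realization structures, the following minimal Python sketch models a direct form and a transposed form of Equation (1) in software. It is not the hardware design of this paper; the function names and the example coefficients are assumptions chosen for illustration only.

```python
from typing import List, Sequence


def fir_direct(b: Sequence[float], x: Sequence[float]) -> List[float]:
    """Direct form: a tapped delay line on the input feeds the multipliers."""
    delay = [0.0] * len(b)  # delay[k] holds x[n-k]
    y = []
    for sample in x:
        delay = [sample] + delay[:-1]  # shift the new sample into the delay line
        y.append(sum(bk * dk for bk, dk in zip(b, delay)))
    return y


def fir_transposed(b: Sequence[float], x: Sequence[float]) -> List[float]:
    """Transposed form: the input is broadcast to every multiplier and the
    delay registers sit between the adders."""
    m = len(b)
    state = [0.0] * (m - 1)  # registers between successive adders
    y = []
    for sample in x:
        y.append(b[0] * sample + (state[0] if state else 0.0))
        # Each register takes its own product plus the next register's old value.
        state = [
            b[k + 1] * sample + (state[k + 1] if k + 1 < len(state) else 0.0)
            for k in range(m - 1)
        ]
    return y


if __name__ == "__main__":
    coeffs = [1, 2, 3, 4, 5]          # hypothetical coefficients, M = 5
    impulse = [1, 0, 0, 0, 0, 0]
    print(fir_direct(coeffs, impulse))
    print(fir_transposed(coeffs, impulse))
```

Both functions return the same output sequence for the same input, which is the sense in which the structures are alternative realizations of one transfer function; they differ in where the delays sit, and therefore in the critical path of a hardware implementation.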
1.2. Design of pipeline FIR filter circuits
We have designed and implemented the following pipelined FIR filter circuits.
The poly phase technique is widely used in multi rate high speed filter design [5]. In our design we have implemented both types, i.e. poly phase as well as pipelined FIR filters. The designs of DFPOLY, DFPOLYPIPE and TF2POLYPIPE are shown in Figures 2, 3, 4 and 5.
The pipelining technique is used in hardware systems for concurrent processing. Concurrent processing of data divides the computational load between multiple processing elements, which in turn helps achieve high processing rates for large designs. Pipelining techniques can reduce the critical path delay of the system and can eliminate broadcasting and global interconnections within the design [5]. Hence, pipelining is the key strategy in our design.
Table I. Results of different forms of FIR filters implemented on Virtex-6 low power, 6V1X130T1FF1156
| Different Form of FIR filter | Maximum Frequency (MHz) | Delay (ns) | Number of Slice Registers (out of 160000) | Number of Slice LUTs (out of 80000) |
|---|---|---|---|---|
| DF | 57.003 | 17.543 | 290 | 1368 |
| TF | 118.793 | 8.418 | 498 | 1275 |
| DF2 | 115.554 | 8.654 | 497 | 1256 |
| TF2 | 58.624 | 17.058 | 288 | 1336 |
| DFPOLY | 79.014 | 12.656 | 515 | 1510 |
| DFPOLYPIPE | 139.392 | 7.174 | 1156 | 1617 |
| TF2POLYPIPE | 123.365 | 8.106 | 707 | 1356 |
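The poly phase idea mentioned in this section can be sketched in software as follows. This is a generic two-phase (even/odd) decomposition, assumed here because the figure captions refer to even and odd blocks; it is not a reproduction of the block diagrams in this paper, and the function names and test data are hypothetical.

```python
from typing import List, Sequence


def fir(b: Sequence[float], x: Sequence[float]) -> List[float]:
    """Reference FIR filter: y[n] = sum_k b[k] * x[n-k], zero initial state."""
    return [
        sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        for n in range(len(x))
    ]


def fir_polyphase2(b: Sequence[float], x: Sequence[float]) -> List[float]:
    """Two-phase poly phase FIR: even and odd sub-filters process the even
    and odd input samples, and their outputs are recombined."""
    if len(x) % 2 != 0:
        raise ValueError("this sketch assumes an even number of input samples")
    e0, e1 = list(b[0::2]), list(b[1::2])   # even and odd coefficient phases
    x0, x1 = list(x[0::2]), list(x[1::2])   # even and odd input phases
    e0x0, e0x1 = fir(e0, x0), fir(e0, x1)
    e1x0, e1x1 = fir(e1, x0), fir(e1, x1)
    y = []
    for m in range(len(x0)):
        delayed = e1x1[m - 1] if m >= 1 else 0  # one-sample delay on this branch
        y.append(e0x0[m] + delayed)             # y[2m]
        y.append(e0x1[m] + e1x0[m])             # y[2m+1]
    return y


if __name__ == "__main__":
    coeffs = [1, 2, 3, 4, 5]
    samples = [1, -2, 3, 0, 4, 1, 0, 2]
    print(fir_polyphase2(coeffs, samples) == fir(coeffs, samples))
```

Each sub-filter runs on half of the input samples, which is why a poly phase structure can sustain a higher overall sample rate for a given clock rate in a hardware implementation.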
1.3. Simulation environment
We implemented all the FIR circuits, i.e. Direct Form (DF), Transpose Form (TF), DF2, TF2, DFPOLY, DFPOLYPIPE and TF2POLYPIPE, on the Virtex-6 low power 6V1X130T1FF1156 device with the Xilinx ISE11.4 design environment. For circuit simulation we used Xilinx System Generator with Matlab 2009B, and for optimized power estimation we used the Xilinx Xpower tool. Xpower [2] estimates power based on the observation of dynamic power consumption in CMOS circuits due to switching activity. Each element (FF, BRAM, LUT and routing segment) that can switch has a capacitance model associated with it. Clock signals and primary input signals are assigned specific frequencies by the user. Xpower estimates power as a summation of the power consumed by each element in the design. The power consumed by each switching element in the design is given by:
$$P = C \cdot V^{2} \cdot E \cdot F \cdot 1000$$
where P is the power in mW, C is the capacitance in Farads, V is the voltage in Volts, E is the switching activity and F is the frequency in Hz.
Table II. Results of power estimation of different forms of FIR filters implemented on Virtex-6 low power, 6V1X130T1FF1156
| Different Form of FIR filter | Total Quiescent Power (Watt) | Total Dynamic Power (Watt) | Total Power (Watt) | Junction Temperature (°C) |
|---|---|---|---|---|
| DF | 0.92754 | 0.07027 | 0.99781 | 52.7 |
| TF | 0.92754 | 0.07047 | 0.99801 | 52.7 |
| DF2 | 0.92755 | 0.07061 | 0.99816 | 52.7 |
| TF2 | 0.92756 | 0.07131 | 0.99887 | 52.7 |
| DFPOLY | 0.92760 | 0.07311 | 1.0007 | 52.7 |
| DFPOLYPIPE | 0.92763 | 0.07463 | 1.00226 | 52.7 |
| TF2POLYPIPE | 0.92759 | 0.07289 | 1.00048 | 52.7 |
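The per-element power expression quoted in Section 1.3 and its summation over the design can be written out as a short Python sketch. The function names and the numeric values in the example are hypothetical and serve only to show the arithmetic; they are not measurements from this work and do not model the internals of the Xpower tool.

```python
from typing import Iterable, Tuple

# (capacitance in Farads, voltage in Volts, switching activity, frequency in Hz)
Element = Tuple[float, float, float, float]


def element_power_mw(capacitance_f: float, voltage_v: float,
                     switching_activity: float, frequency_hz: float) -> float:
    """P = C * V^2 * E * F * 1000, with P in mW."""
    return capacitance_f * voltage_v ** 2 * switching_activity * frequency_hz * 1000.0


def total_power_mw(elements: Iterable[Element]) -> float:
    """Sum the power of every switching element in the design."""
    return sum(element_power_mw(c, v, e, f) for c, v, e, f in elements)


if __name__ == "__main__":
    # Hypothetical element: 10 pF, 1.0 V, activity 0.25, 100 MHz (about 0.25 mW).
    example = [(10e-12, 1.0, 0.25, 100e6)]
    print(total_power_mw(example))
```

The expression makes the dependence explicit: dynamic power grows linearly with switching activity and frequency, and quadratically with voltage.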
1.4. Analysis of results
In this section we present the performance results of the architectures synthesized on the Virtex-6 low power FPGA and discuss the area, clock, latency and power results. The design space considered in this work consists of the seven hardware implementations listed in Table I. First, the basic 16-tap DF, TF, DF2 and TF2 FIR filter architectures are implemented; second, the 16-tap poly phase structure using the DF filter, i.e. DFPOLY, is implemented; and third, the DF and TF2 filter structures are pipelined within the poly phase architecture, giving DFPOLYPIPE and TF2POLYPIPE. The power estimates for the different FIR filter architectures are listed in Table II.
From the comparison in Figure 6, the FIR filter with the minimum delay is DFPOLYPIPE (7.174 ns), which has the highest usable frequency (139.392 MHz), and the filter with the maximum delay is DF (17.543 ns), which has the lowest usable frequency (57.003 MHz). A suitable design objective for high frequency applications is therefore an FIR filter circuit with minimum delay, and DFPOLYPIPE is the most suitable filter structure for such applications. Here the pipelining technique reduces the critical path delay and increases the clock frequency and throughput of the design.
From the comparison of registers and LUTs in Figure 7, the most resources are consumed by the DFPOLYPIPE filter (1156 slice registers and 1617 slice LUTs), while the DF filter is among the least demanding in registers (290); Table I shows lower counts for TF2 in registers (288) and for DF2 in LUTs (1256). From Figure 8 we also observe that the maximum total quiescent power and total dynamic power are consumed by the DFPOLYPIPE filter and the minimum by the DF filter. We conclude from these results that the DF FIR filter is the architecture best suited to minimum area and power, although its throughput and delay are poor.
The DFPOLYPIPE would be the most suitable architecture for standalone applications where high throughput or high speed is the primary concern, at the cost of increased area, power and latency. Different numbers of pipeline stages can be applied depending on the design requirements and the application. Pipelined FIR filters with different depths can be chosen to further increase the throughput of the design. Alternatively, the filter structures can be pipelined selectively to increase the throughput without sacrificing much area or power. Therefore, a trade-off between all the parameters can be achieved depending on the application.
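The trade-off discussed in this section can be quantified directly from the values in Table I and Table II. The Python sketch below is an illustrative post-processing script, not part of the original tool flow; the helper names are assumptions. It also checks an observation from Table I (not a statement made in the text): the tabulated delays appear to equal the reciprocal of the maximum frequency.

```python
# Values copied from Table I and Table II of this paper:
# name: (max frequency in MHz, delay in ns, total dynamic power in W, total power in W)
RESULTS = {
    "DF":          (57.003, 17.543, 0.07027, 0.99781),
    "TF":          (118.793, 8.418, 0.07047, 0.99801),
    "DF2":         (115.554, 8.654, 0.07061, 0.99816),
    "TF2":         (58.624, 17.058, 0.07131, 0.99887),
    "DFPOLY":      (79.014, 12.656, 0.07311, 1.0007),
    "DFPOLYPIPE":  (139.392, 7.174, 0.07463, 1.00226),
    "TF2POLYPIPE": (123.365, 8.106, 0.07289, 1.00048),
}


def delay_ns(freq_mhz: float) -> float:
    """Clock period in ns for a given frequency in MHz."""
    return 1000.0 / freq_mhz


def relative_change(new: float, base: float) -> float:
    """Fractional change of `new` with respect to `base`."""
    return (new - base) / base


def compare(name: str, base: str = "DF") -> dict:
    """Speed-up and power overhead of one architecture relative to another."""
    f_new, _, dyn_new, tot_new = RESULTS[name]
    f_base, _, dyn_base, tot_base = RESULTS[base]
    return {
        "speedup": f_new / f_base,
        "dynamic_power_increase": relative_change(dyn_new, dyn_base),
        "total_power_increase": relative_change(tot_new, tot_base),
    }


if __name__ == "__main__":
    for name, (freq, delay, _, _) in RESULTS.items():
        print(f"{name}: tabulated {delay} ns, computed {delay_ns(freq):.3f} ns")
    print(compare("DFPOLYPIPE"))
```

Under this reading of the tables, DFPOLYPIPE runs at roughly 2.4 times the clock frequency of DF for an increase in total estimated power of well under one percent, because the quiescent power dominates the total in every design.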
Figure 1. Block diagrams for basic FIR filter structures
Figure 2. Block diagram for poly phase FIR filter structure
Figure 3. Block diagram for DFPOLY FIR filter even or odd block
Figure 4. Block diagram for DFPOLYPIPE FIR filter even or odd block
Figure 5. Block diagram for TF2POLYPIPE FIR filter even or odd block
Figure 6. (a) Comparison of delay (ns) for different forms of FIR filter; (b) Comparison of frequency (MHz) for different forms of FIR filter
Figure 7. (a) Comparison of number of registers for different forms of FIR filter; (b) Comparison of number of LUTs for different forms of FIR filter
Figure 8. (a) Comparison of quiescent power (Watt) for different forms of FIR filter; (b) Comparison of dynamic power (Watt) for different forms of FIR filter
2. Conclusion
The design of low power, high speed FIR filters is always a challenge for DSP applications. Our study provides a comparison of the different FIR filter architectures implemented on a Virtex-6 FPGA. Peer research works [14, 15, 16, 17] compared the delay, area and power of pipelined FIR filters but did not consider the power consumption of non-pipelined FIR filters. In our research we have focused on both the delay and the power consumption of different types of pipelined and non-pipelined FIR filters on an FPGA. The Xilinx Virtex-6 (6V1X130T1FF1156, 40 nm technology) FPGA [2] family was chosen as the implementation platform because it provides special features to support DSP operations, a trend that seems to be common at present for FPGA chips. From our research we have observed that the pipelined FIR filter design uses more registers, i.e. more resources, and its power consumption is also higher, so it is suitable for high speed DSP applications. Where resource utilization and power dissipation are of greater concern than the speed of the DSP application, a non-pipelined FIR filter, i.e. the basic DF FIR filter, can be used.
References
1. J.G. Proakis and D.G. Manolakis, Digital Signal Processing: Principles, Algorithms and Applications. Upper Saddle River, NJ: Prentice-Hall, 1996.
2. http://www.xilinx.com/support/documentation/sw_manuals/xilinx13_4/ug440.pdf.
3. A.Shaw and M.Ahmed, “Pipelined recursive digital filters: a general look ahead scheme and optimal approximation,” IEEE Trans. On Circuits and Systems II: Analog & Digital Signal Processing, vol. 46, no. 11, pp. 1415-1420, Nov. 1999.
4. Chao-Huang Wei, Hsiang-Chieh Hsiao, Su-Wei Tsai, “FPGA Implementation of FIR Filter with smallest Processor,” IEEE NEWCAS conference, pp. 337-340, 19-22 June 2005.
5. R. A. Hawley, B. C. Wong, T.J. Lin, J. Laskowski, and H. Samueli, “Design techniques for silicon compiler implementations of high-speed FIR digital filters,” IEEE Journal of Solid-State Circuits, vol. 31, pp. 656-667, May 1996.
6. L.Wang and N.R.Shanbhag, “Low power filtering via adaptive error cancellation,” IEEE Tran. Signal Process., vol. 51, no. 2, pp. 575-583, Feb. 2003.
7. J.H.Chai, N.Banerjee, and K.Roy, “Variation aware low power synthesis methodology for fixed point FIR filters,” IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., vol. 28, pp. 87-97, 2009.
8. R.I.Hartley, “Subexpression sharing in filters using canonic signed digit multiplier,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 43, no. 10, pp. 677-688, Oct. 1996.
9. A.G.Dempster and M.D.Macleod, “Use of minimum adder multiplier blocks in FIR digital filters,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 42, no. 9, pp. 569-577, Sep. 1995.
10. Lakshmi Narayanan G. and Venkataramani B., “Optimization Techniques for FPGA-Based Wave Pipelined DSP Blocks,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 7, pp. 783-792, July 2005.
11. Y.Voronenko and M.Püschel, “Multiplierless multiple constant multiplication,” ACM Trans. Algorithms, vol. 3, pp. 1-38, May 2007.
12. M.Aktan, A.Yurdakul, and G.Dunder, “An algorithm for the design of low power hardware efficient fir filters,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 55, no. 7, pp. 1536-1545, Jul. 2008.
13. K.H.Chen and T.D.Chiueh, “A low power digit based reconfigurable FIR filter,” IEEE Trans. Circuits Syst. II, vol. 53, no. 8, pp. 617-621, Aug. 2006.
14. A.P.Vinod and E.Lai, “Low power and high speed implementation of FIR filters for software defined radio receivers,” IEEE Trans. Wireless Commun., vol. 5, no. 7, pp. 1669-1675, Jul. 2006.
15. A.P.Vinod and E.M.K.Lai, “An efficient coefficient partitioning algorithm for realizing low complexity digital filters,” IEEE Trans. Computer Aided Design Integr. Circuits Syst., vol. 24, no. 12, pp. 1936-1946, Dec. 2005.
16. R.Mahesh and A.P.Vinod, “New reconfigurable architectures for implementation FIR filters with low complexity,” IEEE Trans. Computer Aided Design Integr. Circuits Syst., vol. 29, no. 2, Feb. 2010.
17. Hourani, Y. Kim, S. Ocloo, and W. Alexander, “Automated hardware IP generation for digital signal processing applications,” in Signals, Systems and Computers, 2006. ACSSC '06. Fortieth Asilomar Conference on, (Pacific Grove, CA, USA), pp. 1156-1160, Nov. 2006.