Global Journal of Advanced Engineering Technologies,Vol3,Issue3-2014
ISSN: 2277-6370
DESIGN OF LOW POWER FIR FILTER
Sanjay Goyal, G.K.Singh, R.M.Mehra
Department of Electronics and Communication Engineering, School of Engineering and Technology, Sharda University, Greater Noida 201 310, India
Email:
[email protected]
Abstract: Low-power FIR filters have been implemented using two independent techniques. In the first category of FIR filters, pipelining is incorporated, which increases throughput (by increasing the operating frequency) at the cost of increased latency. If the goal is to maintain the same throughput, the pipelined filter can instead operate at the same frequency but at a lower supply voltage, resulting in a power reduction proportional to the square of the reduction in supply voltage. Parallel processing is also used so that multiple computations can be performed simultaneously. In the second category of FIR filters, architectural transforms in the form of bit-level optimizations are implemented to yield a significant reduction in power. Typical DSP applications in which the multiplier plays an important role include digital filtering, digital communication, and spectral analysis. In this category, three important types of multipliers are constructed: array, Wallace tree, and Booth multipliers. Based on extensive simulation and synthesis, it is concluded that the array multiplier has the largest delay and high power dissipation because of the large amount of circuitry in its structure. The Wallace tree multiplier has less delay but an irregular structure. The Booth multiplier is therefore used, since it provides higher performance (less area and reduced delay) at lower power dissipation than the simple array or Wallace tree multipliers. In the modified Booth encoding scheme, the number of partial products generated is halved, reducing area and power in comparison with both array and tree-based multipliers. Extensive simulation shows that the digital FIR filter designed using pipelining and the modified Booth encoding scheme is better in area and power dissipation than the conventional digital FIR filter designed using array and tree multipliers.
It is observed that a significant reduction in the number of slices and look-up tables (LUTs) has been achieved with respect to the direct-method realization. It was also found that, when the design is optimized for low power, the Booth multiplier with a carry save adder (CSA) has a shorter critical path delay than the Booth multiplier without a carry save adder. The Booth multiplier with a carry save adder is therefore the better choice as far as timing is concerned.
Index Terms: FIR Filter, Multiplier, Adder, Pipelining
I. INTRODUCTION
Much of the research and development effort in digital electronics has been oriented towards increasing the speed and complexity of single-chip digital systems. This has resulted in a powerful design technology, which has enabled the development of powerful personal workstations, sophisticated computer graphics, and multimedia capabilities such as real-time speech recognition and real-time video [1-5]. High-speed computation has thus become the expected norm for the average user, instead of being the province of a select few with access to a powerful mainframe. Processing power, or speed, has therefore been the metric characterizing the performance of a system. However, improvements in system performance generally come at the expense of silicon real estate. So, generally, the task of the VLSI designer has been to explore the area-time implementation space, attempting to strike a reasonable balance between these often conflicting objectives. Until recently, with the focus on area and speed, power considerations were often of secondary concern. The picture is, however, undergoing some radical changes. Power consumption of these high-performance systems has emerged as a major design concern, with several factors contributing to the growth of this important criterion. Some of the driving factors are:
(A) The remarkable success and growth of the portable consumer electronics market. Laptop computers, Personal Digital Assistants (PDAs), cellular phones, and pagers have enjoyed considerable success among consumers, and the market for these portable devices is projected to increase further. For these applications, power consumption has become a critical design concern and has perhaps superseded speed and area as the overriding implementation constraint.
(B) Portability is not the sole driving force behind the push for low power. There is strong pressure on producers of high-end products to reduce their power consumption as well. Trends indicate that a contemporary high-performance processor dissipates as much as 30 W. Assuming that the same trend continues, it can be extrapolated that a microprocessor clocked at 500 MHz would consume 315 W of power. The cost associated with packaging and cooling of such devices is becoming prohibitive. Since core power consumption must be dissipated through the packaging, increasingly expensive packaging and cooling strategies are required as chip power consumption increases. Consequently, there is a clear financial advantage in reducing the power consumed by high-performance systems.
(C) The issue of reliability comes in addition to cost. High-power systems tend to run hot, and high temperatures tend to exacerbate several silicon failure mechanisms, such as electromigration, junction fatigue, and gate dielectric breakdown. Every 10 degree Celsius rise in operating temperature roughly doubles a component's failure rate.
All the above factors have led to the expansion of the design space, with power becoming an equally important criterion as
area and speed [3-5]. Hence new design methodologies and tools are required to find optimal solutions in this three-dimensional design space.
II. LOW POWER DIGITAL DESIGN
There are three major sources of power dissipation in digital CMOS circuits [6-8], which are summarized in the following equation:
$$P_{total} = \alpha C_L V_{dd}^2 f_{clk} + I_{sc} V_{dd} + I_{leakage} V_{dd} \tag{1}$$
Here, the first term represents the dynamic, i.e. switching, component of power, where $C_L$ is the capacitive load, $V_{dd}$ is the supply voltage, $f_{clk}$ is the switching frequency, and $\alpha$ is the probability that a power-consuming transition occurs (the activity factor). The second term is the power dissipated due to the direct-path short-circuit current $I_{sc}$, generated when both NMOS and PMOS transistors are simultaneously active, conducting current directly from supply to ground. The last term is due to the leakage current $I_{leakage}$, which arises from substrate injection and subthreshold effects and is primarily determined by fabrication technology considerations.
The first term, which is due to the switching activity, dominates this power equation. In most well-designed CMOS circuits, under regular operating conditions, the dynamic component accounts for about 90% of the total power dissipation. This reveals the three degrees of freedom inherent in the low-power design space: voltage, physical capacitance, and activity. Optimizing for power therefore invariably involves an attempt to reduce one or more of these factors (assuming that a reduction in performance is not allowable). Barring a dramatic introduction of novel low-power design technology, it is commonly agreed that low-power digital design requires optimization at all levels of the design hierarchy, i.e. the technology, device, circuit, logic, architecture (structure), algorithm (behavior), and system levels.
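To make the role of the three terms concrete, the following Python sketch evaluates the switching, short-circuit, and leakage contributions for one operating point. It is an illustration only: the function name `cmos_power` and every numeric value in the example (activity factor, load capacitance, currents, clock rate) are assumptions chosen for readability, not measurements from this work, and the switching term is taken in the common form activity × capacitance × voltage squared × frequency.

```python
def cmos_power(alpha, c_load, v_dd, f_clk, i_sc=0.0, i_leak=0.0):
    """Return (dynamic, short_circuit, leakage, total) power in watts.

    alpha  : activity factor (probability of a power-consuming transition)
    c_load : switched load capacitance in farads
    v_dd   : supply voltage in volts
    f_clk  : switching frequency in hertz
    i_sc   : average short-circuit current in amperes (assumed input)
    i_leak : leakage current in amperes (assumed input)
    """
    dynamic = alpha * c_load * v_dd ** 2 * f_clk   # switching term
    short_circuit = i_sc * v_dd                    # direct-path term
    leakage = i_leak * v_dd                        # leakage term
    return dynamic, short_circuit, leakage, dynamic + short_circuit + leakage


if __name__ == "__main__":
    # Hypothetical operating points: same circuit at 3.3 V and at 2.7 V.
    for v in (3.3, 2.7):
        dyn, sc, leak, total = cmos_power(0.25, 1e-9, v, 50e6,
                                          i_sc=1e-3, i_leak=1e-6)
        print(f"Vdd={v} V  dynamic={dyn:.4f} W  total={total:.4f} W  "
              f"dynamic share={dyn / total:.1%}")
```

Because the switching term depends on the square of the supply voltage, lowering the supply is the most effective of the three knobs in this model, which is why the techniques that follow concentrate on making a lower supply voltage affordable.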
To address the challenge of reducing power, a multifaceted approach has to be adopted that attacks the problem on various fronts, some of which are pointed out below:
(1) Reducing chip and package capacitance: This can be achieved through process development such as SOI with fully depleted wells, process scaling to sub-micron device sizes, and advanced interconnect substrates such as Multi-Chip Modules (MCM). This approach can be very effective but is very expensive.
(2) Voltage scaling: This approach is the most promising route to greater levels of power reduction, but it often requires techniques to compensate for the decreased system performance. Parallel architectures, pipelining, or hardware replication provide the mechanism for this trade-off by maintaining throughput levels using slower device speeds, thus permitting low-voltage operation.
(3) Using power management strategies: The power savings that can be achieved by various static and dynamic power management techniques are very application dependent, but
can be significant. This includes self-adjusting and adaptive circuit architectures that can respond effectively to environmental changes as well as to varying data statistics.
(4) Employing an integrated design methodology: This requires the development of power-conscious techniques and tools for behavioral synthesis, logic synthesis, and layout optimization. Prime requirements for this are accurate and efficient estimation of the power cost of alternative implementations and the ability to minimize the power dissipation subject to given performance constraints.
III. TECHNIQUES FOR POWER REDUCTION IN FIR FILTERS
A. PIPELINING
Pipelining transformation leads to a reduction in the critical path, which can be exploited either to increase the clock speed (or sample speed) or to reduce the power consumption at the same speed. There are two main advantages of using pipelining: (a) higher speed and (b) lower power. The propagation delay is associated with the charging and discharging of the various gate and stray capacitances in the critical path. For CMOS circuits, the propagation delay can be written as
$$T_{pd} = \frac{C_{charge} V_{dd}}{k (V_{dd} - V_t)^2} \tag{2}$$
where $C_{charge}$ denotes the capacitance to be charged/discharged in a single clock cycle, i.e., the capacitance along the critical path, $V_{dd}$ is the supply voltage, and $V_t$ is the threshold voltage. Parameter $k$ is a function of the technology parameters $\mu$, $W/L$, and $C_{ox}$. The power consumption of a CMOS circuit can be estimated using the following equation:
$$P = C_{total} V_{dd}^2 f \tag{3}$$
where $C_{total}$ denotes the total capacitance of the circuit, $V_{dd}$ is the supply voltage, and $f$ is the clock frequency of the circuit. Pipelining can be used to reduce the power consumption of a FIR filter. Let
$$P_{ref} = C_{total} V_{dd}^2 f \tag{4}$$
represent the power consumed in the original filter. It should be noted that $f = 1/T_{ref}$, where $T_{ref}$ is the clock period of the reference filter.
Now consider an M-level pipelined system, where the critical path is reduced to $1/M$ of its original length and the capacitance to be charged/discharged in a single clock cycle is reduced to $C_{charge}/M$. Notice that the total capacitance does not change. If the same clock speed is maintained, i.e. the clock frequency $f$ is maintained, only a fraction of the original capacitance, $C_{charge}/M$, is being charged/discharged in the same amount of
time that was previously needed to charge/discharge the capacitance $C_{charge}$. This implies that the supply voltage $V_{dd}$ can be reduced to $\beta V_{dd}$, where $\beta$ is a positive constant less than 1. Hence the power consumption of the pipelined filter will be
$$P_{pip} = C_{total} \beta^2 V_{dd}^2 f = \beta^2 P_{ref} \tag{5}$$
Therefore, the power consumption of the pipelined system is reduced by a factor of $\beta^2$ compared with the reference system.
The power consumption reduction factor $\beta$ can be determined by examining the relationship between the propagation delay of the original filter and that of the pipelined filter. The propagation delay of the original filter is given by
$$T_{ref} = \frac{C_{charge} V_{dd}}{k (V_{dd} - V_t)^2} \tag{6}$$
The propagation delay of the pipelined filter is given by
$$T_{pip} = \frac{(C_{charge}/M)\, \beta V_{dd}}{k (\beta V_{dd} - V_t)^2} \tag{7}$$
It should be noted that the clock period, $T_{clk}$, is usually set equal to the maximum propagation delay, $T_{pd}$, in a circuit. Since the same clock speed is maintained for both filters, equating (6) and (7) gives the following quadratic equation, which can be solved for $\beta$:
$$M (\beta V_{dd} - V_t)^2 = \beta (V_{dd} - V_t)^2 \tag{8}$$
Once $\beta$ is obtained, the reduced power consumption of the pipelined filter can be computed using Eq. 5.
Fig. 2 Pipelined FIR Filter
The pipelined FIR filter is shown in Fig. 2. In pipelining, any operation along the critical path is broken into smaller and quicker operations with registers between the levels, in order to obtain a shorter critical path or an increase in the operating frequency, which leads to higher throughput. The total execution time for each individual instruction is not altered by pipelining. Pipelining does not accelerate instruction execution time, but it does accelerate program execution time by increasing the number of instructions finished per cycle [9-13].
B. BIT-LEVEL OPTIMIZATIONS
The basic FIR operation consists of a series of multiply-add operations, which makes it important to consider the computational complexity and power of these underlying arithmetic operations when looking for power reduction in the FIR computation. These issues can be treated as bit-level optimizations for reducing power, following the architectural-level filter optimizations discussed in the previous section. Different types of multiplication and addition schemes have been investigated in the literature for minimum power consumption. It is also important to take the delay into account, since a desired filter throughput must be maintained.
Fig. 1 Direct Form FIR Filter
We therefore require a general multiplication scheme for the filter taps, which can be used for any new set of filter coefficients. Since the area and delay of a multiplier, and hence its power, are considerably larger than those of an adder, the issue of having an optimized multiplier comes first. We analyzed several different types of multiplication schemes on the grounds of area, delay, and power consumption.
IV. SIMULATION AND SYNTHESIS OF FIR FILTER
Most of the simulations have been done in Verilog HDL. Verilog HDL is one of the most widely used hardware description languages for modeling digital systems in the dataflow, behavioral, and RTL styles of modeling [14, 22]. It is used primarily for design, verification, and implementation, and it allows different levels of abstraction to be mixed in the same model. Most popular synthesis tools support Verilog HDL, which makes it the language of choice for designers. The simulation is performed using Modelsim 10.2c and the synthesis is done using Xilinx ISE software (version 14.4) for the Virtex5 FPGA (Field Programmable Gate Array) family (XC5VLX20T).
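Alongside HDL simulation, a small software reference model can be useful for checking multiplier behaviour. The Python sketch below is not the Verilog used in this work; it is a minimal illustration, under stated assumptions, of the radix-4 modified Booth recoding used for the filter multipliers, showing that an n-bit two's-complement multiplier is recoded into n/2 signed digits and therefore needs only n/2 partial products. The function names `booth_radix4_digits` and `booth_multiply` are hypothetical, and the default 4-bit word length is chosen to match the word length of the filter considered here.

```python
def booth_radix4_digits(multiplier, n_bits):
    """Recode a signed n_bits multiplier into n_bits/2 digits in {-2,-1,0,1,2}.

    Digit i has weight 4**i, so the recoded value equals the multiplier.
    """
    if n_bits % 2 != 0:
        raise ValueError("n_bits must be even for radix-4 recoding")
    if multiplier not in range(-(2 ** (n_bits - 1)), 2 ** (n_bits - 1)):
        raise ValueError("multiplier does not fit in n_bits two's complement")
    y = multiplier % (2 ** n_bits)      # two's-complement bit pattern
    digits = []
    prev = 0                            # implicit bit to the right of bit 0
    for i in range(0, n_bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)   # overlapping 3-bit group
        prev = b1
    return digits


def booth_multiply(multiplicand, multiplier, n_bits=4):
    """Multiply by summing one partial product per Booth digit."""
    digits = booth_radix4_digits(multiplier, n_bits)
    partial_products = [d * multiplicand * 4 ** i
                        for i, d in enumerate(digits)]
    return sum(partial_products)


if __name__ == "__main__":
    print("digits for 7:", booth_radix4_digits(7, 4))
    print("-3 * 7 =", booth_multiply(-3, 7))
```

In hardware, each digit selects 0, ±1, or ±2 times the multiplicand, which needs only a shift and an optional negation; the model above reproduces the arithmetic, not the circuit structure.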
Fig. 3 Simulation Result for Direct FIR Filter
Fig. 4 Simulation Result for Pipelined FIR Filter
We formulated the design of both the reference direct FIR filter shown in Figure 1 and our proposed low-power pipelined FIR filter architecture shown in Figure 2 in parameterizable Verilog code, which was then simulated and synthesized. We took a filter length L=4 for our design. The word length for both the coefficients and the input data is 4 bits.
Reference FIR Architecture: The reference FIR architecture was implemented with 4 taps, thus realizing a length-4 direct FIR filter.
Area (Number of Look Up Tables): 61
Pipelined FIR Architecture: The same filter was realized using pipelining. The design employs the modified Booth multipliers, whose area results have been presented earlier.
Area (Number of Look Up Tables): 52
As expected, the area of our filter implementation is smaller than that of the reference filter. This is because of the pipelining and Booth encoding involved in the architecture, which also leads to the reduction in computational complexity and power in the filter. The above results are summarized in Table 1.
Table 1 Area analysis of Multiplier and Filter
Component     Area (LUTs) of standard design     Area (LUTs) of our design     Comments
Multiplier    17                                 13                            Reduction of 23.5 %
Filter        61                                 52                            Reduction of 14.7 %
After estimating the area of the two filter implementations, we make an architectural-level power estimation of our design and compare it with the reference architecture. Because no specialized power estimation tool was available for our work, it was not possible to obtain the exact power dissipation of our design. However, since our design relies on several architectural transforms, we can make a fairly accurate estimate of the power dissipated by considering the impact of these transforms on the factors that affect power, namely the effective switched capacitance, the frequency of operation, and the required supply voltage.
Table 2 Power analysis in FIR filter
Parameter                 Reference Filter     Pipelined Filter
Effective capacitance     $C_{ref}$            1.17 $C_{ref}$
Supply voltage            3.3 V                2.7 V
Frequency                 $f_{ref}$            (1/4) $f_{ref}$
Power                     $P_{ref}$            0.19 $P_{ref}$
Power reduction                                81 %
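The architectural power estimate can be reproduced with a few lines of arithmetic. The Python sketch below does two things, both under stated assumptions. First, `solve_beta` solves the quadratic of Eq. (8) for the supply scaling factor; the pipelining level and the threshold voltage passed to it in the example are hypothetical, since this work does not state a threshold voltage. Second, `power_ratio` multiplies the capacitance, voltage-squared, and frequency factors, assuming that the three scaling entries of Table 2 (1.17, 3.3 V to 2.7 V, and 1/4) are the capacitance, supply voltage, and frequency factors respectively. Both function names are illustrative.

```python
import math


def solve_beta(m_levels, v_dd, v_t):
    """Solve M*(b*Vdd - Vt)**2 = b*(Vdd - Vt)**2 for b (Eq. 8).

    Returns the larger root, which is the one with b*Vdd above Vt,
    i.e. the scaled supply still exceeds the threshold voltage.
    """
    if not (m_levels >= 1 and v_dd > v_t > 0):
        raise ValueError("need m_levels >= 1 and v_dd > v_t > 0")
    # Expanded form: a*b**2 + lin*b + c = 0
    a = m_levels * v_dd ** 2
    lin = -(2 * m_levels * v_dd * v_t + (v_dd - v_t) ** 2)
    c = m_levels * v_t ** 2
    disc = lin * lin - 4 * a * c
    return (-lin + math.sqrt(disc)) / (2 * a)


def power_ratio(cap_factor, v_ref, v_new, freq_factor):
    """Relative power from capacitance, supply voltage and frequency scaling."""
    return cap_factor * (v_new / v_ref) ** 2 * freq_factor


if __name__ == "__main__":
    # Hypothetical example: 4-level pipeline, 3.3 V supply, assumed Vt = 0.7 V.
    beta = solve_beta(4, 3.3, 0.7)
    print(f"beta = {beta:.3f}, scaled supply = {beta * 3.3:.2f} V, "
          f"power factor beta^2 = {beta ** 2:.3f}")
    # Scaling factors read from Table 2.
    ratio = power_ratio(1.17, 3.3, 2.7, 0.25)
    print(f"pipelined / reference power = {ratio:.3f} "
          f"(reduction of {1 - ratio:.1%})")
```

With the Table 2 entries the product evaluates to roughly 0.196, in line with the 0.19 and the approximately 81% reduction reported in the table.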
V. CONCLUSION
We have presented a low-power realization of the FIR filter using pipelining and the modified Booth multiplier. We have analyzed the computational complexity of these architectures and found it to be lower than that of the direct form FIR filter. The reduced computational complexity of the pipelined architecture enables a reduction in the frequency and hence in the supply voltage, thus significantly reducing the power dissipation. The results show that, with the pipelined architecture, the power can be reduced at the architectural level by as much as 81% while maintaining the same throughput rate. Power minimization is an integrated activity and can be performed at various stages of the design process. We considered architectural-level transforms for power minimization in the FIR filter computation. In future, a fully parallel pipelined version can be implemented to achieve peak performance, and A/D and D/A converters can be interfaced within the FPGA. The design can be optimized in terms of the area occupied on the chip, and a combination of parallel and pipelining techniques can be used in designing FIR filters for further optimization and lower power consumption.
REFERENCES
[1] A. Sukumaran and S. Pavan, "Low Power Design Techniques for Single-Bit Audio Continuous-Time Delta Sigma ADCs Using FIR Feedback," IEEE Journal of Solid-State Circuits, vol. PP, no. 99, pp. 1-11, 2014.
[2] J.S. Rani and C. Sai Phalghun, "FPGA based partial reconfigurable FIR filter design," 2014 IEEE International Advance Computing Conference (IACC), pp. 789-792, 21-22 Feb. 2014.
[3] E.A.B. da Silva, L. Lovisolo, A.J.S. Dutra, and P.S.R. Diniz, "FIR Filter Design Based on Successive Approximation of Vectors," IEEE Transactions on Signal Processing, vol. 62, no. 15, pp. 3833-3848, Aug. 1, 2014.
[4] A. Umasankar and N. Vasudevan, "Design and analysis of various slice reduction algorithm for low power and area efficient FIR filter," 2013 International Conference on Current Trends in Engineering and Technology (ICCTET), pp. 259-263, 3 July 2013.
[5] N.C. Sendhilkumar and E. Logashanmugam, "Design of high speed and low power new reconfigurable FIR filter for DSP applications," 2013 International Conference on Current Trends in Engineering and Technology (ICCTET), pp. 181-183, 3 July 2013.
[6] John G. Proakis and Dimitris G. Manolakis, "Digital Signal Processing: Principles, Algorithms, and Applications", 4th Edition, Pearson Prentice Hall, 2007.
[7] Sanjit K. Mitra, "Digital Signal Processing: A Computer-Based Approach", Tata McGraw-Hill Publishing Company Limited, 2003.
[8] Uwe Meyer-Baese, "Digital Signal Processing with Field Programmable Gate Arrays", 3rd Edition, Springer Publication, 2007.
[9] Keshab K. Parhi, "VLSI Digital Signal Processing Systems: Design and Implementation", A Wiley-Interscience Publication, New York.
[10] Jan M. Rabaey and Massoud Pedram, "Low Power Design Methodologies", Kluwer Academic Publisher, 2005.
[11] Anantha P. Chandrakasan, "Minimizing Power Consumption in CMOS Circuits", Department of EECS, University of California at Berkeley.
[12] Samir Palnitkar, "Verilog HDL: A Guide to Digital Design and Synthesis", SunSoft Press, 1996.
[13] J. Bhasker, "Verilog HDL Synthesis: A Practical Primer", Star Galaxy Publishing, 1997.
[14] M. Morris Mano and Michael D. Ciletti, "Digital Design with An Introduction to the Verilog HDL", 5th Edition, Pearson Prentice Hall, 2011.
[15] John L. Hennessy and David A. Patterson, "Computer Architecture: A Quantitative Approach", 3rd Edition, Morgan Kaufmann Publishers.
[16] Sung-Mo Kang and Yusuf Leblebici, "CMOS Digital Integrated Circuits: Analysis and Design", 3rd Edition, Tata McGraw-Hill Education Private Limited, New Delhi, 2011.
[17] Gary Yeap, "Practical Low Power Digital VLSI Design", Kluwer Academic Publisher, 1998.
[18] Fausto Pedro, "Digital Filters", Intech, Janeza Trdine 9, Croatia, 2011.
[19] A.T. Erdogan and T. Arslan, "High Throughput FIR Filter Design for Low Power SoC Applications", 9th Int. Conf. on VLSI Design, Jan 2010.
[20] Jongsun Park, Woopyo Jeong, and Hunsoo Choo, "High Performance and Low Power FIR Filter Design Based on Sharing Multiplier", ISLPED, 02 August, 2012.
[21] K. Hwang, "Computer Arithmetic: Principles, Architecture, and Design", New York, Wiley, 1979.
[22] James M. Lee, "Verilog Quickstart: A Practical Guide to Simulation and Synthesis in Verilog", 3rd Edition, Kluwer Academic Publisher, New York, 2002.