Implementation of Low Complexity FIR Filters using a ... - CiteSeerX

Implementation of Low Complexity FIR Filters using a Minimum Spanning Tree
Henrik Ohlsson, Oscar Gustafsson, and Lars Wanhammar
Department of Electrical Engineering, Linköping University, SE-581 83 Linköping, SWEDEN
E-mail: {henriko, oscarg, larsw}@isy.liu.se

Abstract: In this paper we propose a method for implementing multiple constant multiplications, as used in, for example, FIR filters. The method is a shifted difference coefficient method in which the differences are selected using a minimum spanning tree. By finding a minimum spanning tree of an undirected graph corresponding to the coefficients, an implementation of a multiple constant multiplication block with low arithmetic complexity is obtained. A minimum spanning tree can be found in polynomial time, which makes the proposed method computationally efficient. We also propose that the differences be computed on odd coefficients only. This further reduces the number of adders in an implementation compared to other difference coefficient methods. Several stages of differences may also be used, i.e., a set of differences is used to compute a new set of higher-order differences. We show that the proposed method gives optimal, or close to optimal, results with respect to the number of additions required for a number of FIR filter implementations.

I. INTRODUCTION
FIR filters are commonly used DSP algorithms, and power consumption is the dominating issue in many applications. To reduce the power consumption, implementation of FIR filters with low arithmetic complexity has been studied thoroughly [1]–[8]. In this paper a method that uses a minimum spanning tree for implementation of an FIR filter is proposed. We focus on FIR filters, but the method is generally applicable to the computation of multiple constant multiplications or its transpose. The latter corresponds to the sum-of-products problem.

In this paper the method is applied to the transposed FIR filter, where all the multiplications are computed on the same input, as shown in Fig. 1. Only the cost of additions and subtractions is considered. A shift operation is considered to be free. This is relevant in the bit-parallel arithmetic case, where shifts may be hardwired and the only circuits required are adders and subtractors. Also, since the implementation cost of a subtractor is similar to that of an adder, subtractors are referred to as adders, and the number of adders required is taken as the implementation cost.

Fig. 1. Nth-order transposed FIR filter (input x(n), coefficient multipliers hN, ..., h0, delay elements T, output y(n)).
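As a point of reference, the transposed direct form of Fig. 1 can be sketched in a few lines of Python. The function name `transposed_fir` and the integer example coefficients are illustrative assumptions and are not taken from the paper. The sketch only shows that every coefficient multiplies the same input sample, which is the property that lets the multiplications share adders.

```python
def transposed_fir(h, x):
    """Filter the sequence x with coefficients h using the transposed direct form.

    Hypothetical helper for illustration: h[0] is the tap nearest the output.
    """
    state = [0] * (len(h) - 1)          # contents of the delay elements T
    y = []
    for sample in x:
        # All multiplications are computed on the same input sample.
        products = [c * sample for c in h]
        y.append(products[0] + (state[0] if state else 0))
        # Move the partial sums one delay element towards the output.
        for k in range(len(state)):
            upstream = state[k + 1] if k + 1 < len(state) else 0
            state[k] = products[k + 1] + upstream
    return y


# Feeding a unit impulse returns the coefficients themselves.
impulse_response = transposed_fir([4, 41, 57], [1, 0, 0, 0])
```

The list `products` is the multiple constant multiplication block; the remaining additions belong to the filter structure and are not part of the adder cost discussed in this paper.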

A. Difference Coefficient FIR Filters
In, e.g., a lowpass FIR filter, the magnitude of the coefficients varies slowly, that is, adjacent coefficients have approximately the same magnitude. This can be exploited in a structure where only the differences between the coefficients are realized. Since the differences between adjacent coefficients may be small, the corresponding multipliers can be implemented at a lower cost, in terms of the number of adders, than in a direct implementation of the filter.

It is often possible to find difference coefficients with lower implementation cost by selecting the differences differently. One method is to sort the coefficients with respect to magnitude before computing the differences [1], [8]. Graph-based methods, such as finding a Hamiltonian path or a minimum spanning tree (MST), have also been proposed for computation of the differences [5], [7].

In this paper we use an MST computed from an undirected graph to find differences with a minimal implementation cost in each stage. By computing the MST on a set of odd integer coefficients only, as proposed in [7], an efficient implementation is obtained. It is only necessary to consider odd integer coefficients since every even number can be derived from an odd integer and a multiplication by a power of two, i.e., a shift operation. This yields a reduced implementation cost of the differences compared to other graph-based methods.

All the methods discussed above can be seen as a class of algorithms that solve the multiple constant multiplication problem using differences between the coefficients. Other methods that solve this problem are, for example, multiplier blocks [2] and subexpression sharing [3], [4], [6].

II. GRAPH REPRESENTATION
Graph representation of multiple constant multipliers, and the use of minimum spanning trees for implementing them with low arithmetic complexity, have previously been proposed [5], [7].

A. Undirected Graph
A coefficient set and the relations between the coefficients can be represented using an undirected graph, where each vertex corresponds to one of the filter coefficients, $h_i$. The graph is fully connected, i.e., there are edges between all pairs of coefficients. Each edge corresponds to the difference between the two coefficients and is assigned a cost, $c_{h_i,h_j}$, corresponding to the minimal cost of implementing the difference between the coefficients $h_i$ and $h_j$. In Fig. 2 an undirected graph for four coefficients is shown.

Fig. 2. Undirected graph for four coefficients, with vertices $h_0$ to $h_3$ and an edge cost $c_{h_i,h_j}$ on every edge.

B. Minimum Spanning Tree
An MST of a graph with weighted edges is a set of edges that connects all vertices while minimizing the sum of the edge weights, which in this case corresponds to minimizing the implementation cost of the required differences. Finding an MST is a well-studied problem with several polynomial-time algorithms available [9], [10]. An MST can be found using a greedy algorithm, so the execution time required for computing it is short. Note that, in the general case, a graph may have several MSTs.

Using an MST for selecting the differences often yields a solution where one or several of the original coefficients are used to form two or more differences. It is possible to find a set of differences in which each coefficient is used only once to compute a difference by finding a minimum-cost Hamiltonian path in the graph instead of an MST [5]. A Hamiltonian path visits each vertex exactly once, and finding such a path with minimum cost is an NP-hard problem. Moreover, since a Hamiltonian path is itself a spanning tree, it can at best yield a solution as good as an MST, never a better one.

The depth of an MST, i.e., the path from the root to the leaf with the highest number of vertices between them, gives the critical path of the difference stage. However, finding an MST with a constrained depth is also an NP-hard problem.

III. PROPOSED METHOD
The proposed method is applied to the transposed sum-of-products structure, in which all multiplications are computed on the same input. This makes it possible to derive an undirected graph for the coefficients. The method uses the fact that every even coefficient can be derived from a multiplication between a power of two and an odd integer coefficient. A multiplication by a power of two corresponds to a shift operation and, as discussed above, such operations do not require any extra hardware. The only cost that needs to be considered is the cost of implementing the odd integer multiplier. Hence, it is sufficient to consider only odd integer coefficients in the graph.

A. Computing Difference Coefficients
The first step in the proposed method is to compute the odd integer coefficients that correspond to the original coefficients. These coefficients become the vertices of the undirected graph. Some of these odd coefficients may be equal, and such duplicates are removed from the graph. Also, the coefficient 1 is added to the coefficient set if it is not already included. This coefficient corresponds to the input of the structure and can always be used to form differences without any extra implementation cost. The edge between two vertices corresponds to the difference between the two coefficients of the vertices. A difference is obtained as

$$h^{\mathrm{diff}}_{i,j} = 2^{m} h_i \pm 2^{n} h_j \qquad (1)$$

where $h^{\mathrm{diff}}_{i,j}$ is a difference between the coefficients $h_i$ and $h_j$, i and j are the coefficient indices of the two vertices connected by the considered edge, and m and n are positive integers. By introducing shift operations on the coefficients when computing the differences, as shown in (1), a large number of possible differences is obtained for each pair of coefficients. For each of these differences the implementation cost is computed, and each edge is assigned the difference, or differences, with the lowest implementation cost.

Here we consider CSD multipliers for the implementation of the differences. In this case the cost of a difference coefficient is the number of adders required when it is implemented using a minimal signed-digit representation, such as CSD [7], [11]. The cost of computing an original coefficient is the cost of the difference coefficient, here the number of nonzero digits in the difference minus one, plus one adder for forming the original coefficient from the difference.

An optimal solution, in terms of adders required, is obtained if the cost of each edge of the MST is one. In such cases every unique coefficient can be realized using one adder only, and the total number of adders is equal to the number of unique coefficients.

B. Multi-Stage Differences
Computing the differences from the original coefficients results in a new coefficient set, composed of the differences with a minimal implementation cost. This set of coefficients, the first-order differences, can be used to compute a second set of differences, the second-order differences. A new graph can be formed and an MST for this graph can be computed.

When several stages of differences are considered, a difference may appear in more than one difference stage. In such cases the difference only needs to be computed once and can then be reused in subsequent difference stages.
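The edge cost described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the search bound `max_shift=12`, and the choice to let shifts start at zero (which yields the same odd differences as (1) once common factors of two are removed) are all assumptions introduced here. The CSD digit count uses the standard non-adjacent-form recoding.

```python
def csd_nonzero_digits(n):
    """Number of nonzero digits in the CSD (non-adjacent form) of n > 0."""
    count = 0
    while n:
        if n & 1:
            n -= 2 - (n & 3)   # digit is +1 if n % 4 == 1, -1 if n % 4 == 3
            count += 1
        n >>= 1
    return count


def odd_part(n):
    """Divide out all factors of two, since shifts are free."""
    while n % 2 == 0:
        n //= 2
    return n


def best_differences(a, b, max_shift=12):
    """Minimal edge cost between distinct odd coefficients a and b.

    Candidates follow the form of (1): |2^s * a +/- b| and |a +/- 2^s * b|.
    Cost = (nonzero CSD digits of the difference - 1) + 1 adder.
    Returns (cost, sorted list of odd differences reaching that cost).
    """
    candidates = set()
    for s in range(max_shift + 1):
        for u, v in ((a << s, b), (a, b << s)):
            candidates.add(odd_part(u + v))
            if u != v:
                candidates.add(odd_part(abs(u - v)))
    cost = min(csd_nonzero_digits(d) for d in candidates)
    return cost, sorted(d for d in candidates if csd_nonzero_digits(d) == cost)
```

For instance, `best_differences(1, 41)` gives cost 2 with the differences 5, 9 and 33, since 41 = 1 + 8·5 = 32 + 9 = 8 + 33, while `best_differences(1, 17)` gives cost 1 because 17 = 16 + 1.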

C. Example

As an example of the proposed method, consider the set of coefficients α = [4, 41, 57, 64, 68, 114]. From these coefficients the set of odd coefficients, α_odd = [1, 17, 41, 57], is obtained by dividing each of the original coefficients by the largest possible power of two and removing any duplicate coefficients. From the odd coefficients the undirected graph shown in Fig. 3 is derived. On each edge the minimal cost of realizing a difference, as well as the differences of this cost, are shown. A minimum spanning tree of this graph is shown in Fig. 4. For this graph, four different MSTs are possible.

In Fig. 5 the implementation corresponding to the MST in Fig. 4 is given. The relation between the MST and the implementation is illustrated by the bold interconnects in Fig. 5. As can be seen, the differences 1 and 5 are required as inputs to the first difference stage to compute the multiplications with the original coefficients. A second difference stage is used to compute the difference 5, and the total number of adders required for this implementation is four.

The difference 5 could have been replaced by 9 or 33, yielding another solution with a minimum implementation cost. In this example the same implementation cost is obtained. However, it may be the case that one of the possible differences is already available, either as a coefficient or as a difference from a previous stage. In such cases, the previous result can be reused and the total cost can be reduced.
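The example can be retraced with a short sketch. The edge costs below are those of Fig. 3 as read here: the edges (1, 17) and (41, 57) have cost one and the remaining four edges have cost two. Kruskal's algorithm is used as one possible greedy MST algorithm; the paper does not state which algorithm the authors used, and all function names are illustrative assumptions.

```python
from itertools import combinations


def odd_coefficient_set(coefficients):
    """Odd parts of the coefficients, duplicates removed, with 1 always included."""
    odd = {1}
    for c in coefficients:
        c = abs(c)
        while c and c % 2 == 0:
            c //= 2
        if c:
            odd.add(c)
    return sorted(odd)


def kruskal(vertices, edge_costs):
    """Greedy MST: repeatedly add the cheapest edge that joins two components."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    tree = []
    for (u, v), _cost in sorted(edge_costs.items(), key=lambda item: item[1]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


def is_spanning_tree(vertices, edges):
    """True if the edges connect all vertices without forming a cycle."""
    accepted = kruskal(vertices, {e: 0 for e in edges})
    return len(edges) == len(vertices) - 1 and len(accepted) == len(edges)


alpha = [4, 41, 57, 64, 68, 114]
vertices = odd_coefficient_set(alpha)

# Edge costs of the example graph (assumed reading of Fig. 3).
edge_costs = {(1, 17): 1, (41, 57): 1, (1, 41): 2,
              (17, 41): 2, (1, 57): 2, (17, 57): 2}

mst = kruskal(vertices, edge_costs)
total_adders = sum(edge_costs[e] for e in mst)

# Count every spanning tree that reaches the same minimal weight.
n_mst = sum(
    1
    for edges in combinations(edge_costs, len(vertices) - 1)
    if is_spanning_tree(vertices, edges)
    and sum(edge_costs[e] for e in edges) == total_adders
)
```

With these costs the tree weight is four, matching the adder count stated above, and the brute-force count confirms that four spanning trees share this weight.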

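Fig. 3. An undirected graph for the coefficients α = [1, 17, 41, 57]. Edge labels (c = minimal cost, d = differences of that cost): (1, 17): c=1, d=[1]; (1, 41): c=2, d=[5, 9, 33]; (1, 57): c=2, d=[7, 65]; (17, 41): c=2, d=[3, 7, 65]; (17, 57): c=2, d=[5]; (41, 57): c=1, d=[1].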

Fig. 5. Implementation corresponding to the MST in the example.

Fig. 4. A minimum spanning tree for the graph in Fig. 3. Edges: (1, 17): c=1, d=[1]; (1, 41): c=2, d=[5, 9, 33]; (41, 57): c=1, d=[1].

IV. EXAMPLE FILTER IMPLEMENTATION
As an example, the 63rd-order linear-phase FIR filter from [12] is considered. Its impulse response is symmetric, and the symmetry is used to reduce the computational complexity of the filter implementation. The example filter is implemented using the transposed direct form structure.

This filter has 22 unique odd coefficients. An MST for the first difference stage of this filter is shown in Fig. 6. From the figure it can be seen that the differences of the first stage are 1, 33 or 255, and 511. All differences of cost one are realized in the first difference stage, yielding a cost of 22 adders. As can be seen from the MST in Fig. 6, 33 is one of the original coefficients and is realized in the first difference stage with cost one. Hence, it is not necessary to compute this difference in the second stage, since it can be taken directly from the first stage without any extra cost. The only difference that needs to be realized in the second difference stage is therefore 511. The cost of implementing this difference in the second stage is one (511 = 512 - 1).

The total cost, in terms of the number of adders, for implementing the multipliers in this filter using our proposed method is then 22 + 1 = 23. This can be compared with the optimal solution of one adder per unique coefficient, which in this case is 22. For comparison, the number of adders required for a CSD implementation [11] is 48, the number of adders using a subexpression sharing method [4] is 23, and the number of adders required in a multiplier block implementation [2] is 22.

V. EVALUATION OF THE METHOD

Fig. 6. Minimum spanning tree for the first difference stage of the example filter.

To evaluate our proposed method we compare three FIR filters, implemented using different methods. The comparison is performed with respect to the number of adders required for the implementation of the multipliers in the filters. The considered filters are a ninth-order lowpass filter from [1], a 66th-order halfband highpass filter from [13], [14], and the 63rd-order filter considered in the example above. Our proposed method is compared with an implementation using CSD representation of the coefficients [11], a subexpression sharing method (Pasko) [4], and a multiplier block implementation (Dempster) [2]. The latter yields the optimal result, in terms of the number of adders required, for the three filters considered here.

The results, summarized in Table I, show that a significant reduction of the implementation cost is obtained using our proposed method compared to a CSD implementation. For the ninth-order filter, the proposed method gives an optimal solution, while for the two other filters near-optimal solutions are obtained. The multiplier block method yields the optimal result in all three cases, while the subexpression sharing method yields the same result as our proposed method. However, our method solves the problem in polynomial computation time, yielding faster execution times compared to the other methods. Hence, the method may be included in the selection (quantization) of the filter coefficients.

TABLE I
ADDER COST FOR THE MULTIPLICATIONS

Filter        CSD   Pasko [4]   Dempster [2]   Proposed Method
Ninth-order   5     3           3              3
66th-order    24    12          11             12
63rd-order    48    23          22             23
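To put the numbers in Table I into perspective, the relative saving over CSD and the distance to the optimal adder count can be computed directly from the table entries. The sketch below only performs this arithmetic; the variable names are illustrative, and the optimal count is taken to be the multiplier block column, as stated in the text.

```python
# Adder counts copied from Table I: (CSD, Pasko [4], Dempster [2], proposed).
table = {
    "Ninth-order": (5, 3, 3, 3),
    "66th-order": (24, 12, 11, 12),
    "63rd-order": (48, 23, 22, 23),
}

# Reduction of the proposed method relative to CSD, in percent.
savings = {
    name: round(100.0 * (csd - proposed) / csd, 1)
    for name, (csd, _pasko, _dempster, proposed) in table.items()
}

# Extra adders of the proposed method compared to the optimal count.
gaps = {
    name: proposed - dempster
    for name, (_csd, _pasko, dempster, proposed) in table.items()
}

for name in table:
    print(f"{name}: {savings[name]}% fewer adders than CSD, {gaps[name]} above optimal")
```

The proposed method thus removes between 40% and about 52% of the CSD adders for these three filters and is at most one adder above the optimal count.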

VI. CONCLUSIONS
In this paper we propose a graph-based method for implementing multiple constant multiplications, as used in, for example, FIR filters, with low implementation cost in terms of adders. By computing a minimum spanning tree for an undirected graph of the odd coefficient set, a set of difference coefficients with a minimal implementation cost is obtained. The method can be applied in several stages, i.e., new differences can be computed from a set of previously computed differences. We compare the proposed method to an optimal method and show that the proposed method yields optimal, or near-optimal, results when implementing three different FIR filters. The method has a polynomial computation time, which gives fast execution times. This makes it possible to include the method in the search for binary filter coefficients.

REFERENCES
[1] K. Nakayama, "Permuted difference coefficient realization of FIR digital filters," IEEE Trans. Acoust., Speech, Sig. Proc., vol. 30, no. 2, April 1982.
[2] A. G. Dempster and M. D. Macleod, "Use of minimum-adder multiplier blocks in FIR digital filters," IEEE Trans. Circuits Syst.–II, vol. 42, no. 9, pp. 569–577, Sep. 1995.
[3] R. I. Hartley, "Subexpression sharing in filters using canonic signed digit multipliers," IEEE Trans. Circuits Syst.–II, vol. 43, pp. 677–688, Oct. 1996.
[4] R. Pasko, P. Schaumont, V. Derudder, S. Vernalde, and D. Durackova, "A new algorithm for elimination of common subexpressions," IEEE Trans. Computer-Aided Design Integrated Circuits, vol. 18, no. 1, pp. 58–68, Jan. 1999.
[5] K. Muhammad and K. Roy, "A graph theoretic approach for synthesizing very low-complexity high-speed digital filters," IEEE Trans. Computer-Aided Design, vol. 21, no. 2, Feb. 2002.
[6] M. Martínez-Peiró, E. Boemo, and L. Wanhammar, "Design of high speed multiplierless filters using a nonrecursive signed common subexpression algorithm," IEEE Trans. Circuits Syst.–II, vol. 49, no. 3, pp. 196–203, Mar. 2002.
[7] O. Gustafsson and L. Wanhammar, "A novel approach to multiple constant multiplication using minimum spanning trees," in Proc. IEEE Midwest Symp. Circuits Syst., Tulsa, OK, Aug. 4–7, 2002, vol. 3, pp. 652–655.
[8] H. Ohlsson, O. Gustafsson, and L. Wanhammar, "A shifted permuted difference method," to appear in Proc. IEEE Symp. Circuits Syst., Vancouver, Canada, May 23–26, 2004.
[9] C. Bazlamaçcı and K. Hindi, "Minimum-weight spanning tree algorithms: A survey and empirical study," Computers & Operations Research, vol. 28, no. 8, pp. 767–785, 2001.
[10] E. Lawler, Combinatorial Optimization: Networks and Matroids, Dover Publications, 2001.
[11] L. Wanhammar, DSP Integrated Circuits, Academic Press, San Diego, 1999.
[12] Y. C. Lim and S. R. Parker, "Discrete coefficient FIR digital filter design based upon an LMS criteria," IEEE Trans. Circuits Syst., vol. 30, pp. 723–739, June 1983.
[13] H. Ohlsson and L. Wanhammar, "A digital down converter for a wideband radar receiver," in Proc. National Conf. Radio Science, Stockholm, Sweden, June 10–13, 2002, pp. 478–481.
[14] H. Ohlsson, Studies on implementation of digital filters with high throughput and low power consumption, Thesis no. 1031, Linköping University, 2003.
