

A Low Power FIR Filter Design Technique Using Dynamic Reduced Signal Representation

Zhan Yu, Meng-Lin Yu, Kamran Azadet and Alan N. Willson, Jr.

Integrated Circuits and Systems Laboratory, University of California, Los Angeles, CA 90095 ({zhanyu, willson}@icsl.ucla.edu)
Bell Labs, Lucent Technologies, Holmdel, NJ 07733 ({yu, ka}@research.bell-labs.com)

Abstract: While arithmetic circuits using 2’s complement representation are easy to implement, it is well-known that the sign-extension bits of a 2’s complement number cause high switching activity in digital arithmetic circuits. Such switching is undesirable in low power applications. In this work, we exploit the sign-extension property of a 2’s complement number and propose a reduced representation of 2’s complement numbers to avoid sign-extension. Instead of having the high switching activity at the MSB side of the datapath, the proposed number representation avoids switching of the MSBs altogether, and therefore reduces the power dissipation in digital arithmetic circuits. With the proposed technique, the maximum magnitude of a 2’s complement number is detected and a reduced representation is dynamically generated to represent the signal. There is a constant error introduced by the reduced representation and such error is compensated accordingly. The proposed signal representation is particularly useful in digital filters where the coefficients are slowly varying and have small magnitudes. Our experimental results have shown a 38% power saving using the proposed technique.

I. Introduction

The implementation of arithmetic circuits is very important in digital signal processing and communications applications. Design techniques that reduce the power dissipation in digital arithmetic circuits have been widely explored by researchers. The 2’s complement number representation has been widely used in arithmetic circuit design due to the ease of implementation of arithmetic functions. However, it is also well known that when a 2’s complement number switches between a positive and a negative value, large signal transition activity occurs in the most-significant bits (MSBs) of the data-path. Low power techniques [1] that exploit the use of other types of signal representation, such as sign-magnitude and signed-digit, have also been investigated [2], [3], [4]. However, the easy-to-implement properties of 2’s complement arithmetic circuits are lost in these techniques. Due to the many advantages of the 2’s complement number representation, we propose to use a reduced representation of 2’s complement numbers to reduce the switching activity in the MSBs of a signal. The easy-to-implement properties of 2’s complement numbers are preserved, but the high switching activity in the most-significant bits is avoided.

In the following section, we first review a well-known property of 2’s complement numbers, the sign-extension. This property is used to derive our proposed reduced representation of a 2’s complement number. Next, in Section III, we discuss how to use the proposed reduced representation in the design of certain digital arithmetic circuits. Finally, in Section IV, we report experimental results to show the effectiveness of our method, and we conclude the paper in Section V.

This research was supported by a grant from Lucent Technologies.
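The switching behavior described in the Introduction can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it encodes a small signal that keeps changing sign as 2’s complement words and counts how many bit positions toggle between consecutive samples. The 16-bit word length, the sample values, and the helper names `to_twos` and `toggles` are assumptions chosen only for illustration.

```python
def to_twos(value: int, width: int) -> int:
    """Return the unsigned bit pattern of `value` in `width`-bit 2's complement."""
    return value & ((1 << width) - 1)


def toggles(a: int, b: int) -> int:
    """Count the bit positions that differ between two bit patterns."""
    return bin(a ^ b).count("1")


WIDTH = 16                     # assumed data-path word length
LOW = 3                        # 3 bits are enough for values in [-4, 3]
samples = [3, -2, 1, -3, 2]    # assumed small signal that alternates in sign

words = [to_twos(s, WIDTH) for s in samples]
pairs = list(zip(words, words[1:]))
low_mask = (1 << LOW) - 1

# Toggles per step over the whole word, in the 13 sign-extension bits,
# and in the 3 low bits that actually carry the information.
per_step = [toggles(a, b) for a, b in pairs]
msb_toggles = [toggles(a >> LOW, b >> LOW) for a, b in pairs]
lsb_toggles = [toggles(a & low_mask, b & low_mask) for a, b in pairs]

print(per_step, msb_toggles, lsb_toggles)
```

In this example every sign change toggles all 13 upper bits, although the signal itself fits in 3 bits. These are the transitions that the reduced representation is meant to remove.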

II. Reduced Representation of 2’s Complement Numbers

An $N$-bit 2’s complement number $X$ is represented by $N$ bits: $x_{N-1} x_{N-2} \cdots x_0$. The most significant bit (MSB) $x_{N-1}$ is the sign-bit, where $x_{N-1} = 0$ implies $X$ is a positive number and $x_{N-1} = 1$ indicates $X$ is negative. If $X$ has a magnitude less than $2^{m-1}$, bit $x_{m-1}$ is the sign-bit. Bits $x_{N-1}$ to $x_m$ simply repeat $x_{m-1}$ and form a string of 0s or 1s, which we call sign-extension. We can rewrite $X$ as $x_{m-1} \cdots x_{m-1} x_{m-1} x_{m-2} \cdots x_0$. Since the sign-extension is composed of repeated sign-bits, the information contained in the sign-extension is redundant.

It is well-known that a 2’s complement number with sign-extension could be represented by the sum of an $m$-bit vector and a constant vector which has a string of 1s from bit $m-1$ to bit $N-1$ at the MSB side [5]. This property is illustrated in Fig. 1. We call the constant vector a compensation vector and denote it as $C_{N-1,m-1}$. The binary vector $\bar{x}_{m-1} x_{m-2} \cdots x_0$ is an $m$-bit reduced representation of the 2’s complement number. Such a representation is more efficient than the $N$-bit representation with sign-extension. This property is often used in building 2’s complement addition circuits to reduce the number of adder cells used at the MSB side. When we need to add up several such $N$-bit 2’s complement numbers, we only need to build the circuitry to add the $m$-bit vectors and the sum of the compensation vectors, which could be pre-computed as a single constant vector.

Fig. 1. Compensation Vector for 2’s Complement Number. (Over bit positions $N-1$ down to 0, the sign-extended number is shown as the sum of the compensation vector $C_{N-1,m-1}$ and the $m$-bit reduced representation.)

When a 2’s complement number switches between a positive value and a negative value, all of its sign-extension bits are switching. This is very undesirable in low power circuit design. For a reduced representation, the MSBs are not switching at all, since they are constant. In this work, we further exploit this property in low power arithmetic circuit design.

III. Low Power FIR Filter Using Dynamic Reduced Representation

In this section, we describe a general technique in low power arithmetic circuit design that exploits the sign-extension property of a 2’s complement number by using the proposed reduced representation. For simplicity, we discuss the application of our technique to the implementation of an adaptive FIR filter with 5-level symbol inputs, but our technique is general and applies to other kinds of input signals.

A. Adaptive FIR Filter with 5-level Inputs

For a 5-level signal, each symbol takes one of the values in $\{-2, -1, 0, +1, +2\}$. The symbols could be represented by 3-wire signals according to Table I.

TABLE I. Wire Coding.

Symbol   -2   -1    0   +1   +2
zero      0    0    1    0    0
shift     1    0    0    0    1
comp      1    1    0    0    0

The number of taps in an adaptive filter could be large, ranging from tens to over a hundred, depending on the channel characteristic. Each tap computes the product of the data symbol and a filter coefficient. The results are added in the accumulation path. Because the data symbols take values in $\{-2, -1, 0, +1, +2\}$, such multiplication is very simple. Assuming the coefficients are $N$-bit 2’s complement numbers, we call the product of the data symbol and the coefficient a partial-product, which is represented by $N+1$ binary bits. The partial-product generation circuit is illustrated in Fig. 2; it is a parallel structure that does not require carry-propagation.

Fig. 2. Partial-Product Generation. (Inputs: coefficient bits $h_{N-1}, \ldots, h_0$ and the zero, shift and comp wires. Outputs: partial-product bits $p_N, \ldots, p_0$ and $p_c$.)

The MSB ($p_N$) of the partial-product is a negated sign-bit. This is because the accumulation path of the FIR filter requires a longer word-length, and the compensation vector can be applied at the MSB side to reduce the number of adder cells [5]. Because the output of the FIR filter requires more bits than a single partial-product, a compensation vector is required for each tap. The sum of all compensation vectors, denoted as $C_{total}$, should be added at the end of the filter accumulation path, as in Fig. 3.

The partial-products are added to the filter accumulation path. An FIR filter could be implemented as a direct-form FIR filter, which requires fewer registers and therefore has lower power dissipation, or as a transposed-form FIR filter, which allows the maximum clock rate at the cost of more registers and more power consumption. We choose a hybrid-form implementation [6], which finds a good speed and power trade-off between the direct-form and the transposed-form implementations. Fig. 3 shows a hybrid-form FIR filter with 3 taps per hybrid section. The adders in the accumulation path could be implemented using carry-save arithmetic. For each hybrid section, a Wallace tree could be used to perform the addition operation [6].

Fig. 3. Hybrid-form FIR Filter. (Nine taps, 0 to 8, each with a partial-product generator; hybrid section 3-5 is marked on the accumulation path, and $C_{total}$ is added at the output.)

B. Partial-Product Generation Using Dynamic Reduced Signal Representation

In an adaptive filter design, a full word-length multiplier must be built to accommodate the possibly large coefficient dynamic range during the adaptation phase. In an adaptive filter with $N$-bit coefficients and 5-level symbol inputs, an $(N+1)$-bit partial-product generator is built. However, after the coefficients of an adaptive filter have converged, some coefficients take values with small magnitude. Fig. 4 shows a typical response of an adaptive filter after convergence. Note that many taps have very small coefficients in comparison to the main tap.

Fig. 4. A Typical Adaptive Filter Impulse Response. (Plot titled "Adaptive FIR Filter Response"; vertical axis from 0 to 1.2, horizontal axis: taps 0 to 60.)

For small coefficients, the corresponding partial-products have small magnitudes and long sign-extensions if represented by 2’s complement numbers. We seek to generate partial-products using the reduced representation to avoid the sign-extension switching of a 2’s complement representation. Assume we detect that the maximum magnitude of the coefficient is less than $2^{m-2}$. We know the corresponding partial-product has a magnitude less than $2^{m-1}$. Therefore, the bit $p_{m-1}$ is the sign-bit, and an $m$-bit reduced representation of the partial-product can be generated accordingly. This reduced representation is associated with a compensation vector whose string of 1s starts at bit $m-1$, instead of the original compensation vector, whose string of 1s starts at bit $N$. Assume we can generate the bits of the difference between these two vectors as a control signal in partial-product generation (the $c$ signals in Fig. 5; we will discuss the generation of this control signal in Section III-D). A low power partial-product generator could then be built to output a reduced representation of the partial-product. Such a low-power partial-product generator is illustrated in Fig. 5, and it still keeps the parallelism of the standard partial-product generation scheme in Fig. 2.

Fig. 5. Low-Power Partial-Product Generation. (The compensation vector bits $c_N, \ldots, c_0$ of $C_{N-1,m-1}$, with $c_{m-1} = 1$ and $c_{m-2} = 0$, control multiplexers that force the partial-product bits above $p_{m-1}$ to 0.)

C. Dynamic Computation of Compensation Vector

Using the reduced representation in partial-product generation introduces a constant error in each tap. This error corresponds to the difference between the original compensation vector and the new compensation vector, which is a string of 1s from bit $m-1$ to bit $N-1$: $C_{N-1,m-1}$. As the adaptive filter updates the coefficients, this error also changes. Also, since there are registers in the accumulation path, it may take several clock cycles for this error to propagate to the output of the filter. This error must be computed and corrected dynamically.

Since this constant error is introduced in the accumulation path of the FIR filter, we should build a compensation vector correction path that imitates the error propagation in the accumulation path. To simplify the error introduced, we detect the maximum magnitude of all the filter taps in one hybrid section, instead of detecting the magnitude of each individual filter coefficient. Therefore, all the taps in the same hybrid section use the same sign-bit position, and introduce the same compensation vector error. Assume that all coefficients in one hybrid section have a magnitude less than $2^{m-2}$; this hybrid section then introduces an error equal to the number of taps in the section times $C_{N-1,m-1}$. (Usually, for the balance of timing and the modularity of a design, the number of taps in each hybrid section is the same.) Therefore, the multiplication could be moved to the end of the compensation vector correction path. For the 9-tap hybrid filter in Fig. 3, the compensation vector computation path is illustrated in the lower part of Fig. 6, where $c^{0-2}$, $c^{3-5}$ and $c^{6-8}$ are the compensation vector differences introduced by one tap in the hybrid sections covering taps 0-2, 3-5 and 6-8, respectively. The multiplication at the end of the compensation vector computation path is a multiplication with a constant (the number of taps per hybrid section) and can easily be implemented with a shift and add. Whenever a filter coefficient is updated, the corresponding compensation vector of the hybrid section should be updated.
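The bookkeeping for one hybrid section can be checked with a short Python sketch. This is an illustration under stated assumptions rather than the authors' circuit. The 16-bit output word-length `W`, the example coefficients and symbols, and the helper names `ones`, `reduced` and `section_width` are hypothetical. The 10-bit coefficient word-length follows Section IV and the 3 taps per section follow Fig. 3. The sketch assumes the bound $|h| < 2^{m-2}$ for a shared $m$-bit reduced partial-product, and it ignores the pipeline registers, so it only checks the arithmetic identity, not the timing.

```python
def ones(hi: int, lo: int) -> int:
    """Bit vector with 1s from bit `lo` to bit `hi` inclusive (0 if hi < lo)."""
    return (1 << (hi + 1)) - (1 << lo) if hi >= lo else 0


def reduced(value: int, m: int) -> int:
    """m-bit reduced representation: m-bit 2's complement with the sign-bit negated."""
    assert -(1 << (m - 1)) <= value < (1 << (m - 1)), "value does not fit in m bits"
    return (value & ((1 << m) - 1)) ^ (1 << (m - 1))


def section_width(coeffs, n: int) -> int:
    """Smallest m with |h| < 2**(m-2) for every coefficient, capped at n + 1."""
    m = 2
    while m < n + 1 and any(abs(h) >= (1 << (m - 2)) for h in coeffs):
        m += 1
    return m


N = 10                   # coefficient word-length (Section IV)
W = 16                   # assumed output word-length of the accumulation path
coeffs = [5, -11, 3]     # assumed small, converged coefficients of one section
symbols = [2, -1, -2]    # assumed 5-level input symbols
taps = len(coeffs)

m = section_width(coeffs, N)       # shared sign-bit position is bit m-1
c_original = ones(W - 1, N)        # per-tap vector for full (N+1)-bit partial-products
c_delta = ones(N - 1, m - 1)       # C_{N-1,m-1}: per-tap error of the reduced form

partials = [reduced(s * h, m) for s, h in zip(symbols, coeffs)]
acc = sum(partials)                # accumulation path sees only m-bit words

# Output = accumulation path + original compensation + dynamic compensation.
out = (acc + taps * c_original + taps * c_delta) % (1 << W)
expected = sum(s * h for s, h in zip(symbols, coeffs)) % (1 << W)

upper_bits_idle = all(p >> m == 0 for p in partials)
print(m, c_delta, out, expected, upper_bits_idle)
```

With these example values the shared width is $m = 6$, and the bits above $p_{m-1}$ of every partial-product stay at 0. Adding the per-section correction (the number of taps times $C_{N-1,m-1}$) together with the original compensation restores the exact filter sum modulo $2^W$.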
Since the compensation vector computation path has the same number of registers as the accumulation path, the change in the compensation vector will correctly propagate to the output and synchronize with the filter output. The filter output should be the sum of three values: the output of the filter accumulation path, the original compensation vector $C_{total}$ and the dynamic compensation vector $C_{dynamic}$. The overall filter diagram is shown in Fig. 6.

Fig. 6. Low Power Filter with Dynamic Reduced Representation. (Nine taps, 0 to 8, each with a partial-product generator; below the accumulation path, the compensation path combines $c^{0-2}$, $c^{3-5}$ and $c^{6-8}$, multiplies by 3 to form $C_{dynamic}$, and adds it with $C_{total}$ at the output.)

D. Coefficient Maximum Magnitude Detection

Since each hybrid section shares one compensation vector, we should detect the maximum magnitude of all the coefficients in the hybrid section. Such a circuit is essentially a leading-zero or leading-one detector. Instead of building a leading-zero/one detector for each coefficient, we can compute the maximum magnitude of all the coefficients in a single hybrid section in a parallel fashion. The computation circuit is illustrated in Fig. 7 with a 6-bit 2-tap example. One leading-zero detector is built instead of two (one for each tap). In Fig. 7, the XNOR-gates for each coefficient detect the start of the sign-extension for one coefficient. The AND-gates detect the maximum magnitude of all the coefficients in the hybrid section. Finally, a leading-zero detector generates the compensation vector.

Fig. 7. Maximum Magnitude Detection. (Two 6-bit coefficients $h_5 \cdots h_0$ feed XNOR gates, AND gates and a leading-zero detector, which outputs the bits $c_5 \cdots c_0$ of the compensation vector $C^{0-1}_{N-1,m-1}$.)

E. Power Overhead

Using the dynamic reduced representation reduces the signal switching in the MSBs of the filter accumulation path; however, extra circuits are needed to detect the magnitude of the coefficients and to dynamically compute the compensation vectors. Such computation causes power dissipation. However, this power overhead occurs only when the coefficients are updating. In many adaptive equalization systems, frequent coefficient updating is only needed during the adaptation phase. For a slowly varying channel, slow updating is adequate once the coefficients reach convergence. In such a case, the power overhead is negligible. The proposed technique is thus particularly suitable for slowly-varying systems.

IV. Experimental Results

An adaptive filter with 5-level inputs has been synthesized in 0.5-µm technology. The coefficient word-length is 10 bits. We used a hybrid-form FIR filter with 8 taps per hybrid section and a clock rate of 10 MHz. The coefficients are updated every 200 symbols. Table II shows the power dissipation of the system. Over 38% of the power dissipation can be saved when using the proposed technique.

TABLE II. Experimental Results.

Circuit                                      2’s complement   Reduced Rep.   Power Savings
Filter Data-path                             239.9 mW         144.4 mW       39.8%
Coefficient Magnitude Detection and Update   110.8 µW         165.6 µW
Dynamic Compensation Vector Computation      -                2.15 mW
Total Power                                  240.0 mW         146.7 mW       38.9%

V. Conclusions

This work has exploited the sign-extension property of a 2’s complement number. A reduced representation for 2’s complement numbers has been proposed to avoid sign-extension and the switching of the sign-extension bits. The maximum magnitude of a 2’s complement number is detected and its reduced representation is dynamically generated to represent the signal. A constant error is introduced by the reduced representation and this error is also dynamically compensated. The proposed signal representation is particularly useful in digital filters where the coefficients are slowly varying and have small magnitudes. The proposed technique has been implemented in an adaptive filter and shown to reduce power dissipation by over 38%.

References

[1] A.P. Chandrakasan and R.W. Brodersen, “Minimizing power consumption in digital CMOS circuits,” Proceedings of the IEEE, pp. 498–523, Apr. 1995.
[2] C.V. Schimpfle, S. Simon, and J.A. Nossek, “Low power CORDIC implementation using redundant number representation,” in Proceedings, IEEE International Conference on Applications-Specific Systems, Architectures and Processors, July 1997, pp. 154–161.
[3] M. Zheng and A. Albicki, “Low power and high speed multiplication design through mixed number representations,” in Proceedings, International Conference on Computer Design: VLSI in Computers and Processors, Oct. 1995, pp. 566–570.
[4] C. Nagendra, R.M. Owens, and M.J. Irwin, “Unifying carry-sum and signed-digit number representations for low power,” in Proceedings, 1995 International Symposium on Low Power Design, April 1995, pp. 15–20.
[5] M. Roorda, “Method to reduce the sign bit extension in a multiplier that uses the modified Booth algorithm,” Electronics Letters, vol. 22, pp. 1061–1062, Sep. 1986.
[6] K. Azadet and C.J. Nicol, “Low-power equalizer architectures for high-speed modems,” IEEE Communications Magazine, pp. 118–126, Oct. 1998.
