Design of Optimized CIC Decimator and Interpolator


Design of Optimized CIC Decimator and Interpolator in FPGA

Ramesh Bhakthavatchalu¹, Karthika V.S., Lekshmi Ramesh, Budhota Aamani
Dept. of Electronics and Communication Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, Kollam-690525, Kerala, India
¹ [email protected]

Abstract: Cascaded Integrator Comb (CIC) filters are extensively used in multirate signal processing as filters for both decimation and interpolation. This paper analyzes an optimized architecture and the implementation aspects of a decimator and an interpolator using CIC filters, and compares the results obtained in hardware and in simulation. The hardware is synthesized in FPGA and verified against Modelsim and Matlab simulation results. CIC filters function as efficient anti-aliasing filters before the downsampling of signals in the decimation process and as anti-imaging filters after the upsampling of signals in the interpolation process. This paper also discusses pipelining, throughput and area reduction techniques, and analyzes performance with respect to the number of stages (N) and the rate change factor (R) of the filter.

Index Terms: CIC Filter; Decimator; Interpolator; Multirate; Modelsim; FPGA.

I. INTRODUCTION

FPGAs have revolutionized the field of digital signal processing. The large hierarchy of programmable logic blocks within the FPGA gives great re-configurability together with speed. Once programmed, the FPGA may not provide the flexibility of a processor, but it offers better speed, which is required for many DSP applications. In many applications, for example in a communications system, the required signal may be only a few kHz wide but centered at a very high frequency. Sampling such a signal according to the Nyquist criterion, i.e. at twice the highest input frequency, leads to a high data rate, and processing a high data rate signal is a difficult task. Reducing the data rate of such signals eases the processing significantly. Two parts of a communications system may also work at different rates, which requires a rate change process. This is achieved by the use of a decimator or an interpolator. In cases where the decimation or interpolation rates are very high, implementation using finite impulse response (FIR) filters might be costly due to the large number of filter taps required. CIC filters, which are an optimized class of FIR filters introduced by Hogenauer [1], provide a very efficient means of implementing these filter functions without the requirement of multipliers. This paper discusses the architecture of CIC filters, the implementation aspects of the decimation and interpolation processes using CIC filters, and a comparison between the various implementation methods of a decimator and interpolator using the results from Xilinx FPGA synthesis.

II. THEORY OF CIC

CIC filters are derived from the standard moving averager, which is a simple form of an FIR filter.

Fig 1. (a) Standard moving average filter (b) Recursive running sum filter (c) CIC version of a D-point averaging filter.

The moving average filter requires D-1 summations to produce an output sample. The time domain equation of a standard moving average filter is

$$ y(n) = \frac{1}{D}\,[x(n) + x(n-1) + x(n-2) + \dots + x(n-D+1)] $$

The z-domain transfer function H(z) of a standard moving average filter is

$$ H(z) = \frac{1}{D}\,[1 + z^{-1} + z^{-2} + \dots + z^{-D+1}] $$

The recursive running sum filter depicted in Fig 1.(b) is an equivalent form of the standard moving average filter. The recursive running sum filter is computationally more efficient than the moving average filter, as it requires only two additions to produce an output sample. The time domain equation of the recursive filter is

$$ y(n) = \frac{1}{D}\,[x(n) - x(n-D)] + y(n-1) $$

The z-domain transfer function H(z) of the recursive filter is

$$ H(z) = \frac{1}{D}\,\frac{1 - z^{-D}}{1 - z^{-1}} $$
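The equivalence of the two forms can be checked numerically. The following is a minimal Python sketch, assuming zero initial conditions (samples before n = 0 are taken as zero); the function names `moving_average` and `recursive_running_sum` are illustrative and do not come from the paper.

```python
def moving_average(x, D):
    """Direct form: add the D most recent samples, then scale by 1/D."""
    y = []
    for n in range(len(x)):
        window = [x[n - k] for k in range(D) if n - k >= 0]
        y.append(sum(window) / D)
    return y


def recursive_running_sum(x, D):
    """Recursive form: y(n) = (x(n) - x(n-D)) / D + y(n-1)."""
    y = []
    prev = 0.0
    for n in range(len(x)):
        delayed = x[n - D] if n >= D else 0  # x(n-D), zero before the start
        prev = (x[n] - delayed) / D + prev
        y.append(prev)
    return y


x = [1, 4, 2, 8, 5, 7, 3, 6, 9, 0]
direct = moving_average(x, 4)
recursive = recursive_running_sum(x, 4)
print(direct)
print(recursive)
```

Both functions should print the same sequence. The recursive version performs one subtraction and one addition per output sample regardless of D, whereas the direct version performs D-1 additions.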

978-1-4673-5090-7/13/$31.00 ©2013 IEEE

III. CIC FILTER STRUCTURE

If the delay line representation is condensed and the scaling factor 1/D is ignored in the recursive running sum filter depicted in Fig 1.(b), the classic form of a first-order CIC filter is obtained, whose structure is shown in Fig 1.(c). The CIC filter has two sections: the feed forward portion, which is called the comb section, and the feedback portion, which is called the integrator. The D in the comb section represents the differential delay. The comb section subtracts the input sample delayed by D from the current input sample. The integrator performs the operation of an accumulator. The time domain equation of the CIC filter is

$$ y(n) = [x(n) - x(n-D)] + y(n-1) $$

The z-domain transfer function H_cic(z) of the CIC filter is

$$ H_{cic}(z) = \frac{1 - z^{-D}}{1 - z^{-1}} $$

The time domain impulse response of a single stage CIC filter is depicted in Fig 2. The positive impulse from the comb filter results in the all-ones output at the integrator. Then the negative impulse from the comb filter, which arrives D samples later, causes subsequent output samples at the integrator to be zero. Though the CIC filter has a recursive nature, it has a finite impulse response. Hence it is an FIR filter.

[Plot residue from Fig 2: comb impulse response; vertical axis: Amplitude, horizontal axis: Time.]
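The impulse response behaviour of the single-stage CIC filter can be reproduced with a few lines of Python. This is only a sketch with D = 5 and zero initial state; the helper names `comb` and `integrator` are assumptions made for illustration.

```python
def comb(x, D):
    """Feed-forward section: x(n) - x(n-D), with zeros before n = 0."""
    return [x[n] - (x[n - D] if n >= D else 0) for n in range(len(x))]


def integrator(x):
    """Feedback section: running accumulator y(n) = x(n) + y(n-1)."""
    y, acc = [], 0
    for v in x:
        acc += v
        y.append(acc)
    return y


D = 5
impulse = [1] + [0] * 9
comb_out = comb(impulse, D)     # +1 at n = 0 and -1 at n = D
cic_out = integrator(comb_out)  # D ones, then zeros
print(comb_out)
print(cic_out)
```

The accumulator holds the value 1 until the delayed negative impulse arrives and returns it to 0, so the response lasts exactly D samples even though the structure contains feedback.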

Fig 3. Characteristics of a single-stage CIC filter when D = 5: (a) Magnitude response (b) Phase response; (c) Pole/zero locations
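The characteristics summarized in Fig 3 follow from evaluating H_cic(z) = (1 - z^-D) / (1 - z^-1) on the unit circle, z = e^{jw}. The sketch below is a hedged numerical check for D = 5. It evaluates the equivalent finite sum 1 + z^-1 + ... + z^-(D-1) to avoid the 0/0 form at w = 0, and compares the magnitude with sin(wD/2)/sin(w/2). The function name `cic_response` and the test frequency 0.3 rad/sample are arbitrary choices.

```python
import cmath
import math


def cic_response(w, D):
    """H(e^{jw}) as the finite sum of e^{-jwk}, k = 0..D-1."""
    return sum(cmath.exp(-1j * w * k) for k in range(D))


D = 5
dc_gain = abs(cic_response(0.0, D))

# The comb zeros sit at w = 2*pi*k/D; the one at k = 0 is cancelled by the pole.
nulls = [2 * math.pi * k / D for k in range(1, D)]
null_magnitudes = [abs(cic_response(w, D)) for w in nulls]

w = 0.3
closed_form = abs(math.sin(w * D / 2) / math.sin(w / 2))
numeric = abs(cic_response(w, D))
print(dc_gain, max(null_magnitudes), numeric, closed_form)
```

The magnitude at w = 0 should come out equal to D, and the response should vanish (up to rounding) at the remaining D-1 zero locations.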

Fig 2. Single-stage CIC filter time-domain responses when D = 5

The frequency magnitude and linear-phase response of a D = 5 CIC filter are depicted in Fig 3.

Perfect pole-zero cancellation can be obtained in CIC filters. The z-plane pole/zero characteristics of a CIC filter with differential delay D = 5 are shown in Fig 3.(c). The comb filter generates D zeros which are equally spaced around the unit circle, and the integrator generates a single pole at z = 1, which cancels the zero at that point. The frequency response of a CIC filter can be obtained from the z-domain transfer function by setting z = e^{jω}, resulting in a sin(x)/x-like low pass filter characteristic centered at 0 Hz [5]:

$$ H_{cic}(e^{j\omega}) = \frac{1 - e^{-j\omega D}}{1 - e^{-j\omega}} = e^{-j\omega (D-1)/2}\,\frac{\sin(\omega D/2)}{\sin(\omega/2)} $$

The magnitude of H_cic(e^{jω}) at 0 Hz gives the DC gain of the CIC filter. Direct substitution gives an indeterminate form:

$$ |H_{cic}(e^{j\omega})|_{\omega=0} = \left|\frac{\sin(0)}{\sin(0)}\right| = \frac{0}{0} $$

Using L'Hospital's rule,

$$ |H_{cic}(e^{j\omega})|_{\omega=0} = \left.\frac{(D/2)\cos(\omega D/2)}{(1/2)\cos(\omega/2)}\right|_{\omega=0} = D $$

So, the DC gain of a CIC filter is equal to the comb filter delay D.

IV. CIC DECIMATOR

Downsampling by a factor R is a process where every Rth sample is retained while the R-1 samples in between are discarded. The output sample rate hence decreases by a factor of R. Decimation is a process where anti-aliasing filtering precedes the downsampling process. The CIC decimator is obtained by interchanging the order of the comb and the integrator sections in the CIC filter structure, such that the comb section comes first, followed by the downsampler. Interchanging the comb and integrator section does not affect the functioning of the filter as it is linear. Fig 4 depicts the CIC decimator structure.
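A behavioural sketch of decimation is given below: the signal is filtered by a single-stage CIC filter and every R-th sample is kept. Both section orders are compared to illustrate that the order does not change the result. The code assumes zero initial state and integer samples; `cic_decimate` and its `comb_first` flag are hypothetical names, and the sketch does not claim to reproduce the exact structure of Fig 4.

```python
def comb(x, D):
    """x(n) - x(n-D), with zeros before n = 0."""
    return [x[n] - (x[n - D] if n >= D else 0) for n in range(len(x))]


def integrator(x):
    """Running accumulator."""
    y, acc = [], 0
    for v in x:
        acc += v
        y.append(acc)
    return y


def cic_decimate(x, D, R, comb_first=True):
    """Filter with a first-order CIC, then keep every R-th sample."""
    if comb_first:
        filtered = integrator(comb(x, D))
    else:
        filtered = comb(integrator(x), D)
    return filtered[::R]


x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, 5, -8]
y_a = cic_decimate(x, D=4, R=2, comb_first=True)
y_b = cic_decimate(x, D=4, R=2, comb_first=False)
print(y_a)
print(y_b)
```

With 12 input samples and R = 2 the output holds 6 samples, and the two orderings agree sample for sample.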


Fig 4: CIC Decimator

Fig 7. First-order, D =10, R = 4, interpolating CIC filter spectra: (a) Input spectrum before interpolation (b) Output spectral images after upsampling (c) Output spectrum after anti-imaging filtering.

Fig 5. Frequency magnitude response of a first-order, D = 10, decimating CIC filter: (a) Response of input signal (b) Response after anti-aliasing filtering (c) Response after downsampling, R = 5.

V. CIC INTERPOLATOR

Upsampling by a factor R is the process of inserting R-1 zero-valued samples between the original samples in order to increase the sampling rate. The output sample rate increases by a factor R. Upsampling by R adds to the original signal R-1 undesired spectral images, which are centered at multiples of the original sampling rate. CIC filters are used as anti-imaging filters for interpolated signals in order to remove the unwanted spectral images. The comb section precedes the integrator in a CIC interpolator. Fig 6 depicts the CIC interpolator structure.

Fig 6. CIC Interpolator

VI. IMPLEMENTATION ISSUES

In CIC filters, interchanging the order of the comb section and the integrator section does not affect the functionality of the filter, as it is linear. However, placing the comb section on the side of the filter that operates at the lower sample rate reduces the memory required for the delay line. The differential delay of the comb section is then reduced to N = D/R, and the comb section runs at a lower clock frequency. This implementation is depicted in Fig 8.
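The effect of moving the comb to the low-rate side can be illustrated for the decimator. In this minimal sketch, assuming D is a multiple of R and zero initial state, a comb with delay D before the downsampler is compared with a comb with delay N = D/R after it. The function names are illustrative only.

```python
def comb(x, delay):
    """x(n) - x(n-delay), with zeros before n = 0."""
    return [x[n] - (x[n - delay] if n >= delay else 0) for n in range(len(x))]


def integrator(x):
    """Running accumulator."""
    y, acc = [], 0
    for v in x:
        acc += v
        y.append(acc)
    return y


def decimate_comb_at_high_rate(x, D, R):
    """Integrator and comb (delay D) both run at the input rate."""
    return comb(integrator(x), D)[::R]


def decimate_comb_at_low_rate(x, N, R):
    """Integrator at the input rate, comb (delay N = D/R) after downsampling."""
    return comb(integrator(x)[::R], N)


D, R = 8, 4
N = D // R
x = [5, -2, 7, 1, 0, 3, -4, 6, 2, 2, -1, 8, 9, -3, 4, 1, 0, 5, -6, 7]
high_rate = decimate_comb_at_high_rate(x, D, R)
low_rate = decimate_comb_at_low_rate(x, N, R)
print(high_rate)
print(low_rate)
```

The two outputs should match, while the low-rate comb stores only N = 2 samples instead of D = 8 and is updated once every R input samples.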

Fig 8. Single-stage CIC filter implementations for interpolation and decimation

The reduced memory requirements and the lower clock frequency at the comb section reduce the hardware power consumption. For an M-stage CIC decimator the DC gain is (NR)^M, and for an M-stage interpolator it is (NR)^M / R. Hence bit growth occurs at the output, and the output bit width must be large enough to accommodate the bit growth occurring in the internal stages. The bit width of the output data of a CIC decimator is

$$ B_{max} = B_{in} + \log_2 (RN)^{M} $$

The bit width of the output data of a CIC interpolator is

$$ B_{max} = B_{in} + \log_2 \frac{(RN)^{M}}{R} $$

where B_in is the bit width of the input data [5].
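The gain expressions can be checked by driving M-stage structures with a constant input and reading the settled output. The sketch below assumes the common arrangement with integrators at the high rate and combs (delay N) at the low rate, zero initial state, and R = 4, N = 1, M = 3, B_in = 8 as example values. Rounding the logarithm up with `math.ceil` is an assumption for gains that are not powers of two; all function names are hypothetical.

```python
import math


def comb(x, delay):
    return [x[n] - (x[n - delay] if n >= delay else 0) for n in range(len(x))]


def integrator(x):
    y, acc = [], 0
    for v in x:
        acc += v
        y.append(acc)
    return y


def cic_decimator(x, R, N, M):
    """M integrators, downsample by R, then M combs with delay N."""
    for _ in range(M):
        x = integrator(x)
    x = x[::R]
    for _ in range(M):
        x = comb(x, N)
    return x


def cic_interpolator(x, R, N, M):
    """M combs with delay N, zero-stuff by R, then M integrators."""
    for _ in range(M):
        x = comb(x, N)
    up = []
    for v in x:
        up.extend([v] + [0] * (R - 1))
    for _ in range(M):
        up = integrator(up)
    return up


def output_bits(b_in, gain):
    """B_max = B_in + log2(gain), rounded up (rounding is an assumption)."""
    return b_in + math.ceil(math.log2(gain))


R, N, M, B_IN = 4, 1, 3, 8
ones = [1] * 64
dec_gain = cic_decimator(ones, R, N, M)[-1]     # settled output for a DC input
int_gain = cic_interpolator(ones, R, N, M)[-1]
dec_bits = output_bits(B_IN, dec_gain)
int_bits = output_bits(B_IN, int_gain)
print(dec_gain, int_gain, dec_bits, int_bits)
```

Python integers do not overflow, so the unbounded growth inside the decimator's integrators does not disturb the result here. In fixed-width hardware the registers must be sized according to the B_max expressions.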

A. Optimized CIC Decimator: An efficient implementation technique that reduces bit growth in a CIC decimator is the non-recursive decimator [2].
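The idea behind the non-recursive form can be checked on polynomial coefficients. The sketch below assumes a differential delay of 1 and a rate R = 2^J. Under those assumptions, the L-stage running-sum polynomial (1 + z^-1 + ... + z^-(R-1))^L should equal the product of the factors (1 + z^-(2^j))^L for j = 0 .. J-1, none of which needs an integrator. The helper names are illustrative and not taken from [2].

```python
def poly_mul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_pow(p, L):
    """Raise a polynomial to the integer power L."""
    out = [1]
    for _ in range(L):
        out = poly_mul(out, p)
    return out


def running_sum_form(R, L):
    """(1 + z^-1 + ... + z^-(R-1))^L"""
    return poly_pow([1] * R, L)


def factored_form(J, L):
    """Product over j = 0..J-1 of (1 + z^-(2^j))^L"""
    out = [1]
    for j in range(J):
        factor = [1] + [0] * (2 ** j - 1) + [1]
        out = poly_mul(out, poly_pow(factor, L))
    return out


J, L = 3, 2
R = 2 ** J
recursive_coeffs = running_sum_form(R, L)
nonrecursive_coeffs = factored_form(J, L)
stage_gain = sum(poly_pow([1, 1], L))  # DC gain of one (1 + z^-1)^L factor
print(recursive_coeffs == nonrecursive_coeffs, stage_gain)
```

Each factor has a DC gain of 2^L, which corresponds to a growth of L bits per factor.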

Bit growth occurs due to the presence of integrators. The transfer function of a recursive CIC decimator is

$$ H(z) = \left[\frac{1 - z^{-R}}{1 - z^{-1}}\right]^{L} $$

where R is the downsampling rate and L is the number of stages. The differential delay (D) is one. The transfer function can be expressed as

$$ H(z) = \left[\sum_{n=0}^{R-1} z^{-n}\right]^{L} $$

The above expression is further expanded as

$$ H(z) = (1 + z^{-1} + z^{-2} + \dots + z^{-R+1})^{L} $$

If the downsampling rate R can be expressed as a power of 2, then by means of polynomial factoring the above expression can be factored as

$$ H(z) = (1 + z^{-1})^{L}\,(1 + z^{-2})^{L}\,(1 + z^{-4})^{L} \cdots (1 + z^{-2^{J-1}})^{L} $$

where R is expressed as the J-th power of 2. The non-recursive decimator structure is depicted in Fig 9.

Fig 9. Non-recursive decimator

The advantage of the non-recursive decimator is that the integrator section is removed, which reduces the bit growth occurring at the intermediate stages of the filter. In this structure, bit growth occurs at the rate of only L bits per stage.

B. Optimized CIC Interpolator: An efficient implementation technique for the CIC interpolator is to use the hold interpolator. In a multistage CIC interpolator the core structure is a single stage interpolator [2]. The single stage interpolator is depicted in Fig 10.

Fig 10. Hold Interpolator

The function of the innermost single stage interpolator is to hold the input sample value for R-1 clock cycles, where R denotes the upsampling rate. Hence the core structure can be replaced by a hold interpolator, which performs the operation of repeating the input samples R-1 times. This implementation technique is efficient in terms of hardware usage, as the number of adders and delay elements required is reduced without compromising on the speed of the design.

Fig 11.(a) Two stage CIC interpolator.

Fig 11.(b) The CIC interpolator implemented using the hold interpolator.

Fig 11.(a) depicts a two stage CIC interpolator, where the upsampling rate is 4. Fig 11.(b) depicts the optimized CIC interpolator. The innermost single stage interpolator is replaced by a hold interpolator which repeats the input sample values 3 times.

VII. MATLAB VERIFICATION RESULTS

The DC gain for an M stage CIC interpolator is (NR)^M / R. Hence the maximum bit width of the output data is given by

$$ B_{max} = \log_2 \frac{(RN)^{M}}{R} + B_{in} $$

Table I shows the simulation results (obtained from Matlab and Xilinx ISE) for the CIC interpolator when the number of stages and the upsampling rate are varied, for a differential delay of 1 and an input bit width of 8 (sign bit = 1, number of integer bits = 3, number of fractional bits = 4).

TABLE I. OUTPUT BIT WIDTH REQUIREMENTS FOR VARYING RATE CHANGE FACTOR AND NUMBER OF STAGES IN A CIC INTERPOLATOR

| Input bit width | Rate change factor (varied) | Differential delay | Number of stages (varied) | Output bit width |
|---|---|---|---|---|
| 8 | 2 | 1 | 2 | 9 |
| 8 | 2 | 1 | 3 | 10 |
| 8 | 4 | 1 | 2 | 10 |
| 8 | 4 | 1 | 3 | 12 |
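The hold-interpolator substitution described for Fig 11 can also be checked numerically. The following is a minimal sketch, assuming a differential delay of 1, an upsampling rate of 4, and zero initial state. A conventional two-stage interpolator (two combs, zero-stuffing, two integrators) is compared with the version in which the inner comb, zero-stuffer and integrator are replaced by a sample-and-hold. The function names are hypothetical.

```python
def comb(x, delay=1):
    return [x[n] - (x[n - delay] if n >= delay else 0) for n in range(len(x))]


def integrator(x):
    y, acc = [], 0
    for v in x:
        acc += v
        y.append(acc)
    return y


def zero_stuff(x, R):
    """Insert R-1 zeros after every sample."""
    out = []
    for v in x:
        out.extend([v] + [0] * (R - 1))
    return out


def hold(x, R):
    """Output every input sample R times in a row."""
    return [v for v in x for _ in range(R)]


def interpolator_two_stage(x, R):
    """comb -> comb -> zero-stuff -> integrator -> integrator"""
    return integrator(integrator(zero_stuff(comb(comb(x)), R)))


def interpolator_with_hold(x, R):
    """comb -> hold -> integrator (inner stage replaced by the hold)."""
    return integrator(hold(comb(x), R))


R = 4
x = [2, -3, 5, 7, -1, 0, 4, 6]
inner_stage = integrator(zero_stuff(comb(x), R))
reference = interpolator_two_stage(x, R)
optimized = interpolator_with_hold(x, R)
print(inner_stage == hold(x, R), reference == optimized)
```

The first comparison shows that a single stage with unit differential delay simply repeats each input sample. The second shows that the two-stage outputs agree, while the hold-based version uses one comb and one integrator fewer.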

Fig 12. Error plot for a CIC interpolator with number of stages = 2, upsampling rate = 4 and differential delay = 1.

Fig 12.(a) and (b) show that when the output bit width is less than 10 there is an error. Fig 12.(c) shows that an output bit width of 10 bits accommodates the bit growth, and hence the error value is 0. Fig 12.(d) shows that if the output bit width is increased beyond 10 bits, the error value remains 0.

VIII. XILINX SYNTHESIS RESULTS

A. CIC Decimator:

The tables below compare the device utilization summary of the device xc3s100e -5 tq144 for a non-recursive CIC decimator (optimized) and a recursive CIC decimator (un-optimized).

TABLE II. DEVICE UTILIZATION SUMMARY FOR DECIMATOR, NUMBER OF STAGES = 2, DOWNSAMPLING RATE = 4 AND DIFFERENTIAL DELAY = 1.

| Logic utilization | Optimized | Un-optimized |
|---|---|---|
| Number of slice flip flops | 53 | 63 |
| Number of 4 input LUTs | 43 | 50 |
| Number of occupied slices | 28 | 38 |
| Number of slices containing only related logic | 28 | 38 |
| Number of slices containing unrelated logic | 0 | 0 |
| Total number of 4 input LUTs | 53 | 50 |
| Number of bonded IOBs | 22 | 23 |
| Number of BUFGMUXs | 1 | 1 |

TABLE III. DEVICE UTILIZATION SUMMARY FOR DECIMATOR, NUMBER OF STAGES = 3, DOWNSAMPLING RATE = 4 AND DIFFERENTIAL DELAY = 1.

| Logic utilization | Optimized | Un-optimized |
|---|---|---|
| Number of slice flip flops | 92 | 115 |
| Number of 4 input LUTs | 72 | 84 |
| Number of occupied slices | 49 | 65 |
| Number of slices containing only related logic | 49 | 65 |
| Number of slices containing unrelated logic | 0 | 0 |
| Total number of 4 input LUTs | 72 | 86 |
| Number used as logic | 72 | 84 |
| Number used as a routethru | 0 | 2 |
| Number of bonded IOBs | 24 | 25 |
| Number of BUFGMUXs | 1 | 1 |

The number of slice flip flops and occupied slices is lower in the optimized design than in the recursive design.

B. CIC Interpolator:

The tables below compare the device utilization summary of the device xc3s100e -5 tq144 for a CIC interpolator using a hold interpolator (optimized) and a normal CIC interpolator (un-optimized).

TABLE IV. DEVICE UTILIZATION SUMMARY FOR INTERPOLATOR, NUMBER OF STAGES = 2, UPSAMPLING RATE = 2 AND DIFFERENTIAL DELAY = 1.

| Logic utilization | Optimized | Un-optimized |
|---|---|---|
| Number of slice flip flops | 39 | 85 |
| Number of 4 input LUTs | 21 | 45 |
| Number of occupied slices | 21 | 45 |
| Number of slices containing only related logic | 21 | 45 |
| Number of slices containing unrelated logic | 0 | 0 |
| Total number of 4 input LUTs | 21 | 45 |
| Number of bonded IOBs | 22 | 22 |
| Number of BUFGMUXs | 1 | 1 |

Minimum period for the optimized interpolator: 3.521ns (Maximum Frequency: 284.010MHz).

Minimum period for the un-optimized interpolator: 3.544ns (Maximum Frequency: 282.171MHz).

TABLE V. DEVICE UTILIZATION SUMMARY FOR INTERPOLATOR, NUMBER OF STAGES = 2, UPSAMPLING RATE = 4 AND DIFFERENTIAL DELAY = 1.

| Logic utilization | Optimized | Un-optimized |
|---|---|---|
| Number of slice flip flops | 40 | 86 |
| Number of 4 input LUTs | 23 | 47 |
| Number of occupied slices | 21 | 45 |
| Number of slices containing related logic | 21 | 45 |
| Number of slices containing unrelated logic | 0 | 0 |
| Total number of 4 input LUTs | 23 | 47 |
| Number of bonded IOBs | 22 | 22 |
| Number of BUFGMUXs | 1 | 1 |

Minimum period for the optimized interpolator: 3.521ns (Maximum Frequency: 284.010MHz). Minimum period for the un-optimized interpolator: 3.544ns (Maximum Frequency: 282.171MHz).

TABLE VI. DEVICE UTILIZATION SUMMARY FOR INTERPOLATOR, NUMBER OF STAGES = 3, UPSAMPLING RATE = 4 AND DIFFERENTIAL DELAY = 1.

| Logic utilization | Optimized | Un-optimized |
|---|---|---|
| Number of slice flip flops | 73 | 122 |
| Number of 4 input LUTs | 47 | 71 |
| Number of occupied slices | 38 | 63 |
| Number of slices containing related logic | 38 | 63 |
| Number of slices containing unrelated logic | 0 | 0 |
| Total number of 4 input LUTs | 47 | 71 |
| Number of bonded IOBs | 22 | 22 |
| Number of BUFGMUXs | 1 | 1 |

Minimum period for the optimized interpolator: 3.544ns (Maximum Frequency: 282.171MHz). Minimum period for the un-optimized interpolator: 3.544ns (Maximum Frequency: 282.171MHz).

The results show that the hardware usage of the optimized CIC interpolator is considerably lower than that of the un-optimized CIC interpolator (for example, 40 versus 86 slice flip flops for two stages at a rate change factor of 4). Both designs also work at approximately the same frequency.

IX. CONCLUSION

In this paper the CIC decimator and interpolator structures are optimized in terms of the hardware required for their implementation. The number of stages and the rate change factor are varied, and the results are verified in Modelsim, Xilinx ISE FPGA synthesis and Matlab. The decimator and interpolator are implemented with different numbers of stages in a Xilinx Spartan 3 FPGA, their device utilization summaries are reported, and the designs are verified in hardware.

X. ACKNOWLEDGEMENT

The authors express their sincere gratitude to Amrita Vishwa Vidyapeetham for providing valuable resources in the VLSI labs and digital library.

XI. REFERENCES

[1] Eugene B. Hogenauer, "An Economical Class of Digital Filters for Decimation and Interpolation," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-29, no. 2, pp. 155-162, Apr. 1981.
[2] Ricardo A. Losada and Richard Lyons, "Reducing CIC Filter Complexity," IEEE Signal Processing Magazine, pp. 124-126, July 2006.
[3] Ricardo A. Losada, "Digital Filters with Matlab," The Mathworks Inc., May 2008.
[4] Ramesh Bhakthavatchalu, Karthika V.S., Lekshmi Ramesh, "Design and Implementation of Improved Attenuation CIC Decimator and Interpolator in FPGA," International Journal of Recent Trends in Engineering and Technology, ACEEE, vol. 6, no. 2, pp. 18-22, Nov. 2011.
[5] Anna Engelbert, Carl Hallqvist, "Computable efficient recursive filters," pp. 14-23, Chalmers University of Technology, Sweden, 2008.
[6] Xilinx LogiCORE IP CIC Compiler v3.0, DS845, June 22, 2011.
[7] B. A. Shenoi, "Introduction to Digital Signal Processing and Filter Design," John Wiley and Sons, Inc., New Jersey, 2006.
[8] J. Mitola, "The Software Radio Architecture," IEEE Communications Magazine, vol. 33, no. 5, pp. 26-38, May 1995.
[9] Fredric J. Harris, "Multirate Signal Processing for Communication Systems," 2004.
[10] Uwe Meyer-Baese, "Digital Signal Processing with Field Programmable Gate Arrays," Springer-Verlag New York, Inc., Secaucus, NJ, USA, pp. 70-75, 2008.
[11] Milos D. Ercegovac, Tomas Lang, "Digital Arithmetic," pp. 631-664, Morgan Kaufmann Publishers, San Francisco, USA.
[12] Esteban O. Garcia, Rene Cumplido, Miguel Arias, "Pipelined CORDIC Design on FPGA for a Digital Sine and Cosine Waves Generator," Proceedings of 3rd International Conference on Electrical and Electronics Engineering, pp. 1-4, Sept. 2006.
