new radix-based fht algorithm for computing the discrete hartley ...

NEW RADIX-BASED FHT ALGORITHM FOR COMPUTING THE DISCRETE HARTLEY TRANSFORM

M. T. Hamood and S. Boussakta
School of Electrical, Electronic and Computer Engineering, Newcastle University, Newcastle upon Tyne, NE1 7RU, England, UK
{m.t.hamood, s.boussakta}@ncl.ac.uk

ABSTRACT
In this paper, a new fast Hartley transform (FHT) algorithm, radix-2², suitable for pipeline implementation of the discrete Hartley transform (DHT) is presented. The proposed algorithm is developed by integrating two stages of the twiddle factor decomposition into a single butterfly and applying the multidimensional index mapping technique. The radix-2² algorithm combines the simple and regular butterfly structure of a radix-2 algorithm with the reduced number of twiddle factor multiplications of a radix-4 algorithm and, unlike radix-4, can be applied to any power-of-two transform length, with simple bit reversal for ordering the output sequence. The performance of the algorithm is analyzed and the numbers of multiplications and additions are calculated. Furthermore, a method for reducing the number of multiplications and additions is proposed, which noticeably improves the arithmetic complexity compared with existing FHT algorithms.

Index Terms: Fast algorithms, discrete Hartley transform (DHT), radix-2² algorithm.

1. INTRODUCTION

The discrete Hartley transform (DHT) [1, 2] has proved to be an efficient alternative to the discrete Fourier transform (DFT) for real data applications [3]. The main advantage of the DHT over the DFT is that its kernel is real, so it avoids complex arithmetic. Furthermore, the DHT is an involutory transform: it is its own inverse and, apart from a scale factor, there is no need to distinguish between the forward and inverse transforms.

Various fast Hartley transform (FHT) algorithms [4-9] have been introduced to compute the transform at high speed to meet the requirements of real-time applications. The first FHT algorithm, reported by Bracewell [4], performs the DHT with complexity proportional to $N\log_2 N$ using a radix-2 decimation-in-time (DIT) algorithm. Sorensen et al. [9] developed a complete set of FHT algorithms using the index mapping technique [10], implemented the algorithms for both the DIT and decimation-in-frequency (DIF) approaches, and verified that the well-known FFT algorithms can also be applied to the computation of the FHT. Among all of the FHT algorithms, the split-radix algorithm [9, 11, 12] appears to have the lowest arithmetic complexity. However, this algorithm has an inherently irregular butterfly structure, which makes it less efficient for pipeline implementation. This prompted Bracewell to develop the split-sequence FHT algorithm [13], which greatly reduces the structural complexity at the expense of more arithmetic operations.

In recent years, there has been growing interest in using a new FFT algorithm known as radix-2² [14-16], as well as its variants [17-20], in pipeline architectures. It combines the simple and regular butterfly structure of radix-2 with the reduced number of twiddle factor multiplications of radix-4. It is therefore desirable to generalize the radix-2² algorithm to other transforms such as the DHT.

In this paper, we propose the DIF radix-2² FHT algorithm, obtained by integrating two stages of the radix-2 signal flow graph into a single butterfly and applying the multidimensional linear index map. The performance of the algorithm is examined by analysing its arithmetic complexity and comparing it with existing FHT algorithms.

978-1-4577-0539-7/11/$26.00 ©2011 IEEE, ICASSP 2011

2. ALGORITHM DERIVATION

The DHT $X(k)$ for a real data sequence $x(n)$ of length $N$ is defined as [1]:

$$X(k) = \sum_{n=0}^{N-1} x(n)\,\mathrm{cas}(\theta n k) \tag{1}$$

where $\mathrm{cas}(\cdot) = \cos(\cdot) + \sin(\cdot)$, $\theta = 2\pi/N$, and the transform length $N$ is assumed to be an integer power of two. The development of the radix-2² algorithm begins by considering the first two stages of the decomposition in the radix-2 DIF DHT together. Applying the 3-dimensional linear index map,

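$$n = \frac{N}{2}n_1 + \frac{N}{4}n_2 + n_3, \qquad \{\,n_1, n_2 = 0, 1;\ 0 \le n_3 \le \tfrac{N}{4}-1\,\}$$

$$k = k_1 + 2k_2 + 4k_3, \qquad \{\,k_1, k_2 = 0, 1;\ 0 \le k_3 \le \tfrac{N}{4}-1\,\} \tag{2}$$

The common factor map (CFM) [21] of (1) has the form:

$$X(k_1+2k_2+4k_3) = \sum_{n_3=0}^{N/4-1}\sum_{n_2=0}^{1}\sum_{n_1=0}^{1} x\!\left(\tfrac{N}{2}n_1+\tfrac{N}{4}n_2+n_3\right)\mathrm{cas}\!\left(\theta\left(\tfrac{N}{2}n_1+\tfrac{N}{4}n_2+n_3\right)(k_1+2k_2+4k_3)\right) \tag{3}$$

Using the following cas identities (the second holds for integer $y$):

$$\mathrm{cas}(x+y) = \cos(x)\,\mathrm{cas}(y) + \sin(x)\,\mathrm{cas}(-y)$$

$$\mathrm{cas}(\theta(x+yN)) = \mathrm{cas}(\theta x) \tag{4}$$

Equation (3) can be written as:

$$X(k_1+2k_2+4k_3) = \sum_{n_3=0}^{N/4-1}\sum_{n_2=0}^{1}\sum_{n_1=0}^{1} x\!\left(\tfrac{N}{2}n_1+\tfrac{N}{4}n_2+n_3\right)\Big\{\cos(\pi n_1 k_1)\,\mathrm{cas}\!\left(\theta\left(\tfrac{N}{4}n_2+n_3\right)(k_1+2k_2+4k_3)\right) + \sin(\pi n_1 k_1)\,\mathrm{cas}\!\left(-\theta\left(\tfrac{N}{4}n_2+n_3\right)(k_1+2k_2+4k_3)\right)\Big\} \tag{5}$$

Since $\sin(\pi n_1 k_1) = 0$ and $\cos(\pi n_1 k_1) = (-1)^{n_1 k_1}$, only the first term survives. The basic idea of the new algorithm is to carry out the second-step decomposition on the remaining DHT coefficients, including the twiddle factor $\mathrm{cas}(\theta(\tfrac{N}{4}n_2+n_3)k_1)$ and the exceptional values in multiplication, before the next butterfly is constructed. Hence (5) becomes,

$$X(k_1+2k_2+4k_3) = \sum_{n_3=0}^{N/4-1}\sum_{n_2=0}^{1} B_{N/2}^{k_1}\!\left(\tfrac{N}{4}n_2+n_3\right)\mathrm{cas}\!\left(\theta\left(\tfrac{N}{4}n_2+n_3\right)(k_1+2k_2+4k_3)\right) \tag{6}$$

where the first butterfly structure $B_{N/2}^{k_1}$ is constructed as:

$$B_{N/2}^{k_1}\!\left(\tfrac{N}{4}n_2+n_3\right) = x\!\left(\tfrac{N}{4}n_2+n_3\right) + (-1)^{k_1}\,x\!\left(\tfrac{N}{4}n_2+n_3+\tfrac{N}{2}\right) \tag{7}$$

Decomposing the composite twiddle factor given in (6),

$$\mathrm{cas}\!\left(\theta\left(\tfrac{N}{4}n_2+n_3\right)(k_1+2k_2+4k_3)\right) = \cos\!\left(\theta\left(\tfrac{N}{4}n_2+n_3\right)(k_1+2k_2)\right)\mathrm{cas}(4\theta n_3 k_3) + \sin\!\left(\theta\left(\tfrac{N}{4}n_2+n_3\right)(k_1+2k_2)\right)\mathrm{cas}(-4\theta n_3 k_3) \tag{8}$$

The $\sin(\cdot)$ and $\cos(\cdot)$ terms in (8) can be further simplified, using $\theta\tfrac{N}{4} = \tfrac{\pi}{2}$ and the angle-sum formulas, to yield,

$$\cos\!\left(\theta\left(\tfrac{N}{4}n_2+n_3\right)(k_1+2k_2)\right) = \cos\!\left(\tfrac{\pi}{2}n_2(k_1+2k_2)\right)\cos(\theta n_3(k_1+2k_2)) - \sin\!\left(\tfrac{\pi}{2}n_2(k_1+2k_2)\right)\sin(\theta n_3(k_1+2k_2)) \tag{9}$$

$$\sin\!\left(\theta\left(\tfrac{N}{4}n_2+n_3\right)(k_1+2k_2)\right) = \cos\!\left(\tfrac{\pi}{2}n_2(k_1+2k_2)\right)\sin(\theta n_3(k_1+2k_2)) + \sin\!\left(\tfrac{\pi}{2}n_2(k_1+2k_2)\right)\cos(\theta n_3(k_1+2k_2)) \tag{10}$$

Also, using the following relations:

$$\sum_{n=0}^{N/4-1} x\!\left(n+\tfrac{N}{4}\right)\cos(\theta n)\,\mathrm{cas}(-\theta n k) = \sum_{n=0}^{N/4-1} x\!\left(\tfrac{N}{4}-n\right)\cos(\theta n)\,\mathrm{cas}(\theta n k) \tag{11}$$

$$\sum_{n=0}^{N/4-1} x\!\left(n+\tfrac{N}{4}\right)\sin(\theta n)\,\mathrm{cas}(-\theta n k) = -\sum_{n=0}^{N/4-1} x\!\left(\tfrac{N}{4}-n\right)\sin(\theta n)\,\mathrm{cas}(\theta n k) \tag{12}$$

Substituting (8)-(12) into (6) and expanding the summation over the index $n_2$, we get, after some manipulation:

$$X(k_1+2k_2+4k_3) = \sum_{n_3=0}^{N/4-1}\Big[H_{N/4}^{k_1 k_2}(n_3)\cos(\theta(k_1+2k_2)n_3) - H_{N/4}^{k_1 k_2}(-n_3)\sin(\theta(k_1+2k_2)n_3)\Big]\,\mathrm{cas}(4\theta n_3 k_3) \tag{13}$$

where the second butterfly structure $H_{N/4}^{k_1 k_2}$ is constructed as:

$$H_{N/4}^{k_1 k_2}(n_3) = B_{N/2}^{k_1}(n_3) + (-1)^{k_2}\Big[B_{N/2}^{k_1}\!\left(n_3+\tfrac{N}{4}\right)\cos\!\left(\tfrac{\pi}{2}k_1\right) + B_{N/2}^{k_1}\!\left(\tfrac{N}{4}-n_3\right)\sin\!\left(\tfrac{\pi}{2}k_1\right)\Big] \tag{14}$$

Equations (13) and (14) represent the general decomposition formula for the proposed radix-2² DIF algorithm; expanding them gives the desired output points. These points are obtained by substituting the values of $k_1$ and $k_2$ given in (2), so that (14) expands to:

$$H_{N/4}^{00}(n_3) = x(n_3) + x\!\left(n_3+\tfrac{N}{2}\right) + x\!\left(n_3+\tfrac{N}{4}\right) + x\!\left(n_3+\tfrac{3N}{4}\right) \tag{15}$$

$$H_{N/4}^{01}(n_3) = x(n_3) + x\!\left(n_3+\tfrac{N}{2}\right) - x\!\left(n_3+\tfrac{N}{4}\right) - x\!\left(n_3+\tfrac{3N}{4}\right) \tag{16}$$

$$H_{N/4}^{10}(n_3) = x(n_3) - x\!\left(n_3+\tfrac{N}{2}\right) + x\!\left(\tfrac{N}{4}-n_3\right) - x\!\left(\tfrac{3N}{4}-n_3\right) \tag{17}$$

$$H_{N/4}^{11}(n_3) = x(n_3) - x\!\left(n_3+\tfrac{N}{2}\right) - x\!\left(\tfrac{N}{4}-n_3\right) + x\!\left(\tfrac{3N}{4}-n_3\right) \tag{18}$$

Finally, by substituting (15)-(18) into (13), we get a set of 4 DHTs of length $N/4$. The superscript of $H$ is the pair $k_1 k_2$, so that $H^{10}$ belongs to the outputs $k = 4k_3+1$ and $H^{01}$ to the outputs $k = 4k_3+2$:

$$X(4k_3) = \sum_{n_3=0}^{N/4-1} H_{N/4}^{00}(n_3)\,\mathrm{cas}(4\theta n_3 k_3) \tag{19}$$

$$X(4k_3+1) = \sum_{n_3=0}^{N/4-1}\Big[H_{N/4}^{10}(n_3)\cos(\theta n_3) - H_{N/4}^{10}(-n_3)\sin(\theta n_3)\Big]\,\mathrm{cas}(4\theta n_3 k_3) \tag{20}$$

$$X(4k_3+2) = \sum_{n_3=0}^{N/4-1}\Big[H_{N/4}^{01}(n_3)\cos(2\theta n_3) - H_{N/4}^{01}(-n_3)\sin(2\theta n_3)\Big]\,\mathrm{cas}(4\theta n_3 k_3) \tag{21}$$

$$X(4k_3+3) = \sum_{n_3=0}^{N/4-1}\Big[H_{N/4}^{11}(n_3)\cos(3\theta n_3) - H_{N/4}^{11}(-n_3)\sin(3\theta n_3)\Big]\,\mathrm{cas}(4\theta n_3 k_3) \tag{22}$$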
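As a sanity check on the structure of this derivation, the following Python sketch compares a direct evaluation of the DHT definition with a single radix-4-type DIF split into four DHTs of length N/4. It is an illustration under stated assumptions, not the authors' butterfly: the pre-combination of samples is written directly from the identity cas(x+y) = cos(x)cas(y) + sin(x)cas(-y) and the periodicity of cas, rather than through the B and H butterflies, and the names `cas`, `dht` and `dht_quarter_split` are hypothetical.

```python
import math


def cas(angle):
    """Hartley kernel: cas(a) = cos(a) + sin(a)."""
    return math.cos(angle) + math.sin(angle)


def dht(x):
    """Direct O(N^2) evaluation of X(k) = sum_n x(n) cas(2*pi*n*k/N)."""
    n = len(x)
    theta = 2.0 * math.pi / n
    return [sum(x[j] * cas(theta * j * k) for j in range(n)) for k in range(n)]


def dht_quarter_split(x):
    """One DIF step: outputs X(4*k3 + m), m = 0..3, from four length-N/4 DHTs.

    Assumption: this is a plain restatement of the index map
    n = n3 + q*N/4, k = m + 4*k3, not the paper's optimised butterfly.
    """
    n = len(x)
    if n % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    quarter = n // 4
    theta = 2.0 * math.pi / n
    out = [0.0] * n
    for m in range(4):
        c = [0.0] * quarter  # samples weighted by the cosine twiddle
        s = [0.0] * quarter  # samples weighted by the sine twiddle
        for n3 in range(quarter):
            for q in range(4):
                phase = theta * m * n3 + 0.5 * math.pi * m * q
                sample = x[n3 + q * quarter]
                c[n3] += sample * math.cos(phase)
                s[n3] += sample * math.sin(phase)
        # cas(-y) terms are folded onto cas(+y) by reversing the index mod N/4.
        folded = [c[n3] + s[(-n3) % quarter] for n3 in range(quarter)]
        out[m::4] = dht(folded)
    return out


if __name__ == "__main__":
    data = [math.sin(0.7 * i) + 0.1 * i for i in range(16)]
    direct = dht(data)
    split = dht_quarter_split(data)
    print(max(abs(a - b) for a, b in zip(direct, split)))
```

The two results should agree to rounding error. Applying `dht` twice returns N times the input, which is the involution property noted in the introduction.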


Combining eight points together gives an in-place butterfly of the radix-2² FHT algorithm, as shown in Fig. 1.

[Figure 1: signal flow graph with inputs x(n), x(N/4-n), x(n+N/4), x(N/2-n), x(n+N/2), x(3N/4-n), x(n+3N/4), x(N-n), the correspondingly indexed outputs X(·), and twiddle factors Cos(T), Sin(T), Cos(2T), Sin(2T), Cos(3T), Sin(3T).]

Figure 1 An in-place butterfly of the radix-2² FHT DIF algorithm, where θ = 2π/N; solid and dotted lines stand for addition and subtraction respectively.

Applying the above procedure recursively to the remaining DHTs of length N/4, the complete radix-2² DIF FHT algorithm is obtained. The main structural advantage of this algorithm is that non-trivial multiplications appear only at every even stage, whereas the odd stages contain additions and subtractions only.

3. ARITHMETIC COMPLEXITY

In this section, the performance of the proposed algorithm is analyzed by calculating its number of multiplications and additions. The analysis is based on the butterfly structure of the proposed algorithm shown in Fig. 1 and the decomposition formulas (19)-(22). In general, the radix-2² algorithm needs $\log_2 N$ stages of butterfly computation. Each application of the decomposition uses $(3N/2-10)$ multiplications and $(11N/4-6)$ additions. In addition, four DHTs of length N/4 must be calculated, so the whole radix-2² algorithm satisfies the relations,

$$M(N) = 4M\!\left(\tfrac{N}{4}\right) + \tfrac{3N}{2} - 10 \tag{23}$$

$$A(N) = 4A\!\left(\tfrac{N}{4}\right) + \tfrac{11N}{4} - 6 \tag{24}$$

where M(N) and A(N) are the numbers of real multiplications and additions, respectively, needed by the radix-2² algorithm for a length-N DHT. It should be noted that, for power-of-four transform lengths, the arithmetic operations required by this algorithm are equal to those needed by the radix-4 algorithm [9]. However, owing to the symmetry properties of the DHT kernel, the arithmetic complexity of the proposed algorithm can be further improved. This symmetry is based on the fact that the twiddle factors of the DHT (sine and cosine values) are the same at an angle of $\pi/4$ (i.e. $\sin(\pi/4) = \cos(\pi/4) = 1/\sqrt{2}$).


Figure 2 Partial signal flow graph for (a) the radix-2² FHT algorithm and (b) the improved radix-2² FHT algorithm.

Such an arrangement allows optimization so that the number of multiplications and additions can be reduced. The improvement is illustrated by the structure shown in Fig. 2, which represents a partial signal flow graph extracted from the whole DHT graph at the points where the twiddle angle equals π/4. Fig. 2a and Fig. 2b can be shown to be equivalent. From Fig. 2a,

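$$\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{25}$$

and from Fig. 2b,

$$\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{26}$$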
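To make the step from (25) to (26) explicit, the two matrices of Fig. 2a can be multiplied out. This is a worked check under the assumption that the branch weights in Fig. 2a are $1/\sqrt{2}$, which is how (25) is read here:

$$\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1+1 & 1-1 \\ 1-1 & 1+1 \end{bmatrix} = \begin{bmatrix} \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix}$$

The cascade of two add/subtract butterflies therefore reduces to a pure scaling of each input by $\sqrt{2}$, with no additions left.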

Hence (25) is identical to (26), which means that Figs. 2a and 2b are equivalent. As can be seen from Fig. 2, each decomposition step saves 2 multiplications and 4 additions, and the saving applies recursively. Therefore, the complexity relations for the improved algorithm become:

$$M(N) = 4M\!\left(\tfrac{N}{4}\right) + \tfrac{3N}{2} - 12 \tag{27}$$

$$A(N) = 4A\!\left(\tfrac{N}{4}\right) + \tfrac{11N}{4} - 10 \tag{28}$$

The arithmetic complexities in (27) and (28) are recursive. To obtain the complexities for different transform lengths, initial values are needed. Here, the initial values are the numbers of operations needed by length-4 and length-8 DHTs: M(4)=0, A(4)=8, M(8)=2 and A(8)=22. Substituting the initial values M(4) and M(8) into (27), and A(4) and A(8) into (28), gives the computational complexities of the improved algorithm. A comparison has been made between this algorithm and the radix-2, radix-4 and split-sequence FHT algorithms in terms of the number of multiplications and additions, as shown in Figs. 3 and 4 respectively.

[Figure 3: plot of Number of Multiplications (×10^4, 0 to 4) against Transform Length N (0 to 4500) for Radix-2, Split-Sequence, Radix-4 and Radix-2².]

Figure 3 Comparison between radix-2², radix-2, split-sequence and radix-4 FHT algorithms in terms of multiplications.

[Figure 4: plot of Number of Additions (×10^4, 0 to 7) against Transform Length N (0 to 4500) for Radix-2, Split-Sequence, Radix-4 and Radix-2².]

Figure 4 Comparison between radix-2², radix-2, split-sequence and radix-4 FHT algorithms in terms of additions.

The results of this comparison show that the developed algorithm outperforms the radix-2 and split-sequence algorithms and is noticeably better than radix-4; unlike the latter, it can be applied to any power-of-two transform length.
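The recursions (27) and (28) are easy to tabulate. The sketch below is an illustration, not code from the paper: `improved_counts` is a hypothetical helper that seeds the recursion with the stated initial values M(4)=0, A(4)=8, M(8)=2, A(8)=22 and applies (27) and (28) to any larger power-of-two length. It reproduces only the improved radix-2² counts; the radix-2, radix-4 and split-sequence curves of Figs. 3 and 4 are not modelled.

```python
from functools import lru_cache

# Initial values quoted in the text: N -> (multiplications, additions).
INITIAL_COUNTS = {4: (0, 8), 8: (2, 22)}


@lru_cache(maxsize=None)
def improved_counts(n):
    """Return (M(n), A(n)) for the improved algorithm via (27) and (28)."""
    if n in INITIAL_COUNTS:
        return INITIAL_COUNTS[n]
    if n < 4 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 4")
    mults, adds = improved_counts(n // 4)
    # n is a multiple of 4 here, so both divisions are exact.
    return (4 * mults + 3 * n // 2 - 12, 4 * adds + 11 * n // 4 - 10)


if __name__ == "__main__":
    for exponent in range(4, 13):
        length = 2 ** exponent
        mults, adds = improved_counts(length)
        print(f"N={length:5d}  M={mults:7d}  A={adds:7d}")
```

For N = 16, for example, the recursion gives 12 multiplications and 66 additions.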


4. CONCLUSION

In this paper, we presented a new DHT algorithm that has a regular butterfly structure and is applicable to any $2^n$-point DHT, with simple bit reversal for ordering the output sequence. The algorithm has been developed by applying the multidimensional linear index decomposition technique and deriving the appropriate divide-and-conquer relations. Comparisons based on arithmetic operations have been carried out between the developed algorithm and the radix-2, split-sequence and radix-4 DHT algorithms. These comparisons show that the proposed radix-2² algorithm reduces the number of arithmetic operations and the structural complexity compared with similar algorithms.


5. REFERENCES

[1] R. N. Bracewell, "Discrete Hartley Transform," Journal of the Optical Society of America, vol. 73, pp. 1832-1835, 1983.
[2] R. N. Bracewell, "The Hartley transform," Oxford University Press, Inc., 1986.
[3] R. N. Bracewell, "Assessing the Hartley transform," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, pp. 2174-2176, 1990.
[4] R. N. Bracewell, "The fast Hartley transform," Proceedings of the IEEE, vol. 72, pp. 1010-1018, 1984.
[5] H. S. Hou, "The Fast Hartley Transform Algorithm," IEEE Transactions on Computers, vol. C-36, pp. 147-156, 1987.
[6] K. J. Jones, "Design and parallel computation of regularised fast Hartley transform," IEE Proceedings - Vision, Image and Signal Processing, vol. 153, pp. 70-78, 2006.
[7] C. Kwong and K. Shiu, "Structured fast Hartley transform algorithms," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 34, pp. 1000-1002, 1986.
[8] H. J. Meckelburg and D. Lipka, "Fast Hartley transform algorithm," Electronics Letters, vol. 21, pp. 341-343, 1985.
[9] H. Sorensen, D. Jones, C. Burrus, and M. Heideman, "On computing the discrete Hartley transform," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 33, pp. 1231-1238, 1985.
[10] C. S. Burrus and T. W. Parks, DFT/FFT and Convolution Algorithms: John Wiley and Sons, 1985.
[11] G. Bi, "Split radix algorithm for the discrete Hartley transform," Electronics Letters, vol. 30, pp. 1833-1835, 1994.
[12] P. Soo-Chang and W. Ja-Ling, "Split-radix fast Hartley transform," Electronics Letters, vol. 22, pp. 26-27, 1986.
[13] R. N. Bracewell, "Alternative to split-radix Hartley transform," Electronics Letters, vol. 23, pp. 1148-1149, 1987.
[14] O. Nibouche, S. Boussakta, M. Darnell, and M. Benaissa, "Algorithms and pipeline architectures for 2-D FFT and FFT-like transforms," Elsevier Digital Signal Processing Journal, vol. 20, pp. 1072-1086, 2010.
[15] H. Shousheng and M. Torkelson, "Designing pipeline FFT processor for OFDM (de)modulation," in URSI International Symposium on Signals, Systems, and Electronics, ISSSE 98, 1998, pp. 257-262.
[16] H. Shousheng and M. Torkelson, "Design and implementation of a 1024-point pipeline FFT processor," in Proceedings of the IEEE Custom Integrated Circuits Conference, 1998, pp. 131-134.
[17] A. Cortes, I. Velez, and J. F. Sevillano, "Radix-r^k FFTs: Matricial Representation and SDC/SDF Pipeline Implementation," IEEE Transactions on Signal Processing, vol. 57, pp. 2824-2839, 2009.
[18] Y. J. Oh and M. S. Lim, "New radix-2 to the 4th power pipeline FFT processor," IEICE Trans. on Electronics, vol. E88, pp. 1740-1746, 2005.
[19] J. Yunho, Y. Hongil, and K. Jaeseok, "New efficient FFT algorithm and pipeline implementation results for OFDM/DMT applications," IEEE Transactions on Consumer Electronics, vol. 49, pp. 14-20, 2003.
[20] O. Nibouche, S. Boussakta, and M. Darnell, "Pipeline Architectures for Radix-2 New Mersenne Number Transform," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 56, pp. 1668-1680, 2009.
[21] C. Burrus, "Index mappings for multidimensional formulation of the DFT and convolution," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 25, pp. 239-242, 1977.
