Multiplier Truncation in FPGA Based CWT

Yahya T. Qassim1, Tim Cutmore and David Rowlands
Centre of Wireless Monitoring and Applications, Griffith University, Brisbane, Australia
1 Email: [email protected]

Abstract: This paper addresses the need for multiplier truncation in an FPGA-based continuous wavelet transform (CWT) and compares the resulting hardware scalogram with a reference scalogram produced in Matlab. A method was developed to apply an appropriate truncation in the multiplier stage of the CWT. The Fast Fourier Transform (FFT) algorithm was used to compute the CWT at each time and scale. The design was written in VHDL and implemented with Altium Designer software targeting a Spartan 3AN FPGA. The results show that the hardware implementation achieves a high degree of accuracy: compared with the software scalogram, the hardware scalogram has an NMSE of 0.0013, an NAD of 0.0227 and an SC quality measure of 0.998.

Keywords: CWT; FFT; FPGA; scalogram quality; truncation.

I. INTRODUCTION

The continuous wavelet transform (CWT) has been widely used for complex signal processing and analysis. A popular application of the CWT is the extraction of features from nonstationary 1-D signals by representing them in a 2-D scalogram [1]. This representation is useful for detecting high-power regions of the signal localized in both time and frequency, which gives the CWT an advantage over Fourier Transform (FT) analysis. A representative application of the CWT is the analysis of biomedical signals such as electroencephalograms (EEGs), targeting abnormalities [2], a person's response to a stimulus [3], biofeedback [4] and other applications. The CWT of a time-domain signal X(t) using an analyzing wavelet ψ(t) is defined as follows [5]:

$$C(s, b) = \int_{-\infty}^{\infty} X(t)\, \psi^{*}_{s,b}(t)\, dt \qquad (1)$$

where C(s, b) is the CWT coefficient at time b and scale s, and ψ* is the complex conjugate of the basis wavelet function. Equation (1) represents the convolution between the input signal and the daughter wavelet at each time and scale. Sampled signals can be analyzed using the CWT according to the sampling frequency of the signal, in which case the CWT coefficients are approximated [6]. Because of the high complexity and detail of the CWT, most applied CWT analysis in the last few decades has been done off-line in software. The software approach makes it easy to change the parameters of the algorithms used. However, when real-time operation is needed, the CWT is usually avoided [7], because the convolutions it requires take significant computation time, which makes real-time operation difficult to achieve [7]. The Discrete Wavelet Transform (DWT), which has low computational requirements, produces a much lower scalogram resolution than the CWT, as implied in [7]. Therefore, there is a need to implement the CWT in hardware for fast computation and high scalogram resolution. One property that makes the FFT useful in signal analysis is that convolution in the time domain corresponds to simple multiplication in the frequency domain, and vice versa. For example, if the input signal is x(t) and the wavelet function is y(t), then the relation between the two is as follows [8]:

$$x(t) * y(t) \Leftrightarrow X(w) \times Y(w) \qquad (2)$$

where lower-case symbols denote time-domain signals and upper-case symbols denote their frequency-domain components. Applying (2) to (1) converts the convolution between the signal and the wavelet function into a multiplication of X(w) and Y(w). The FFT-based method for producing the CWT therefore transforms both the signal and the wavelet function to the frequency domain, multiplies them at each scale of the wavelet function, and takes the inverse FFT of each product to return to the time domain [5]. The results are the wavelet coefficients. In this paper, the complex Morlet wavelet is used as the mother wavelet. A hardware implementation of the FFT is limited by the input resolution (number of bits) and the signal length. The multiplication in this algorithm extends the number of bits allocated to the signal and the wavelet function. For example, if both multiplier inputs are represented with 16 bits/sample, the multiplier output has 32 bits/sample. The product therefore exceeds the input word length available at the inverse FFT. In addition, keeping finite word length signals throughout the design is necessary to reduce area and power consumption [9]. FPGA platforms are commonly used for first prototypes because they can be re-programmed, whereas Application Specific Integrated Circuits (ASICs) are suited to very high volume final designs [10]. Hence, an FPGA was chosen for the current implementation.
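The FFT-based procedure described above (transform the signal, multiply by the wavelet spectrum at each scale, inverse transform) can be sketched in floating point as follows. This is only an illustrative sketch, not the authors' Matlab or VHDL code: the Morlet spectrum, its normalization and the centre frequency omega0 = 6 follow the convention of [5] and are assumptions here, as are the function names.

```python
import numpy as np

def morlet_freq(omega, scale, omega0=6.0):
    """Fourier transform of a scaled Morlet wavelet (zero for omega <= 0).

    Assumed form, following the convention of [5]; it is real-valued,
    so the complex conjugate in (1) leaves it unchanged.
    """
    positive = (omega > 0).astype(float)
    return np.pi ** -0.25 * positive * np.exp(-0.5 * (scale * omega - omega0) ** 2)

def scale_for_frequency(freq, omega0=6.0):
    """Scale whose Morlet response peaks at `freq` (Fourier-period relation of [5])."""
    return (omega0 + np.sqrt(2.0 + omega0 ** 2)) / (4.0 * np.pi * freq)

def cwt_fft(x, dt, scales, omega0=6.0):
    """CWT coefficients, one row per scale, via multiplication in the frequency domain."""
    n = len(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)   # angular frequency of each FFT bin
    x_hat = np.fft.fft(x)                           # signal spectrum, computed once
    coeffs = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        norm = np.sqrt(2.0 * np.pi * s / dt)        # energy normalization per scale
        coeffs[i] = np.fft.ifft(x_hat * norm * morlet_freq(omega, s, omega0))
    return coeffs

# Example: a cosine with exactly 10 cycles in 1024 samples at a 1 ms interval.
n, dt = 1024, 1e-3
f_sig = 10.0 / (n * dt)
x = np.cos(2.0 * np.pi * f_sig * np.arange(n) * dt)
freqs = np.array([5.0, 8.0, f_sig, 12.5, 20.0])
coeffs = cwt_fft(x, dt, scale_for_frequency(freqs))
peak_index = int(np.argmax(np.abs(coeffs).mean(axis=1)))
```

In this example the time-averaged magnitude is largest in the row whose scale corresponds to the signal frequency, which is the behaviour a scalogram relies on.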

Previous work by the authors [11] outlined the concept of a hardware CWT for real-time applications on an FPGA platform. This paper details a methodology that truncates the output of the multiplication stage so that it fits the input word length of the inverse FFT. The resulting hardware CWT scalogram is compared with the scalogram of the same signal produced by software. The results show that the hardware solution can compute the CWT coefficients and achieve high scalogram resolution.


This paper is organized as follows: the applied methodology is presented in section II. Section III demonstrates the FPGA implementation of the design using Computer Aided Design (CAD) tools whereas the results are in section IV and the conclusions are in section V.


II. METHODOLOGY

This section describes the truncation method employed in the 16 bit multiplier stage (subsection A). The truncation required by the fixed word length leads to quantization errors; subsection B details some of their sources.

A. Multiplier truncation method

A 16x16 bit digital multiplication produces a 32 bit result, so a truncation procedure is used to keep the word length at 16 bits. This is necessary to prepare the result for the inverse FFT, which is computed by the same core. Fig. 1 shows the procedure used to maintain a finite word length in the multiplier stage of the design. Each signal sample is multiplied by the corresponding Morlet wavelet sample (both in the frequency domain), and the digital multiplier produces each product at double the input size. The word length used to represent the input signal and the wavelet function in the proposed design is 16 bits/sample. To satisfy the finite word length requirement, 16 of the 32 product bits are discarded so that the result is again a 16 bit word. The 8 least significant bits are the least important because they carry less numerical weight than the more significant bits and also hold more of the noise (fluctuations etc.). The 8 most significant bits are also discarded, since doing so does not affect the sign of the number, i.e. no overflow occurs in the final result. In addition, these 8 bits change less frequently than the lower-order bits. The final result is taken to be the middle 16 bits of the product; experiments showed that these bits give the best trade-off between accuracy and resolution. All numbers are represented in two's complement.

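Fig. 1. The multiplication and the truncation processes. [Diagram: a 16 bit signal sample and a 16 bit wavelet sample (both frequency domain) enter a 16 bit x 16 bit multiplier; of the 32 bit product, the 8 MSBs and the 8 LSBs are omitted and the middle 16 bits are the result.]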
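The bit selection of Fig. 1 can be illustrated with a small integer model. This is a sketch in Python rather than the authors' VHDL, and the function names are hypothetical. It keeps bits 23..8 of the 32 bit two's complement product, and it also shows that the result only stays free of overflow when the product is small enough for the 8 discarded MSBs to be pure sign extension.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of an integer as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def multiply_and_truncate(a, b, drop_lsb=8, out_bits=16):
    """Multiply two signed 16 bit samples and keep the middle 16 bits of the product."""
    product = a * b                       # exact product, representable in 32 bits
    shifted = product >> drop_lsb         # discard the 8 LSBs (arithmetic shift, rounds down)
    return to_signed(shifted, out_bits)   # discard the 8 MSBs by keeping 16 bits

def fits_without_overflow(a, b, drop_lsb=8, out_bits=16):
    """True when the discarded MSBs carry no information (sign extension only)."""
    shifted = (a * b) >> drop_lsb
    return -(1 << (out_bits - 1)) <= shifted < (1 << (out_bits - 1))

small_pos = multiply_and_truncate(1000, 2000)     # 2,000,000 / 256, rounded down
small_neg = multiply_and_truncate(-1000, 2000)    # rounds toward minus infinity
wrapped = multiply_and_truncate(20000, 20000)     # product too large: result wraps
```

The last call shows why the operand magnitudes matter: when both inputs are large, dropping the upper 8 bits changes the sign of the result, so the scheme relies on the products staying within the middle 16 bit window.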

B. Quantization errors

Quantization errors appear at different stages of the proposed design. These errors degrade the final shape of the CWT scalogram, where they appear as small ripples at the boundaries between scalogram colors. The likely sources are as follows. First, since the hardware operates on binary words, a quantization error is produced when the input signal and the wavelet function are approximated to a specific word length. Second, to keep a finite word length through the design stages, the truncation described in subsection II.A is applied to the product of the signal and the wavelet function in the frequency domain. This truncation produces a quantization error that is external to the FFT and IFFT stages [9]. Third, the Xilinx FFT core itself uses truncation in its butterfly stages to accommodate bit growth [12]. This inherent FFT truncation adds quantization error to the analyzed signal. Fig. 2 shows an original nonstationary signal and its reconstruction by the FFT core. Although the two signals appear to match, a quantization error still exists, as Fig. 3 shows. This error is produced by the inverse FFT process.

Fig. 2. Original and reconstructed EEG signal. [Plot: amplitude [uV] versus time [ms], 0 to 1000 ms; traces: original, reconstructed.]

Fig. 3. The quantization noise produced at the inverse FFT for signal reconstruction using the Xilinx FFT core. [Plot: amplitude [uV] versus time [ms], 0 to 1000 ms.]
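The first error source listed in subsection B, approximating a value to a fixed word length, can be illustrated numerically. The sketch below assumes a 16 bit word with 15 fractional bits (a common fixed-point layout, not stated in the paper) and compares truncation with rounding. It does not model the Xilinx FFT core.

```python
import numpy as np

def quantize(x, frac_bits=15, mode="truncate"):
    """Map values in [-1, 1) onto a signed fixed-point grid with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    scaled = np.asarray(x, dtype=float) * scale
    q = np.floor(scaled) if mode == "truncate" else np.round(scaled)
    q = np.clip(q, -scale, scale - 1)     # saturate to the 16 bit two's complement range
    return q / scale

rng = np.random.default_rng(1)
x = rng.uniform(-0.9, 0.9, 10_000)        # hypothetical test data, not the paper's EEG
lsb = 2.0 ** -15                          # weight of one least significant bit

err_trunc = x - quantize(x, mode="truncate")   # always in [0, 1 LSB)
err_round = x - quantize(x, mode="round")      # always within half an LSB of zero
```

Truncation is cheaper in hardware but its error is one-sided, with a mean of about half an LSB, whereas the rounding error is centred on zero. Repeating such a step at several stages is one way small ripples can accumulate in the final scalogram.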

III. DESIGN IMPLEMENTATION

The FFT-based CWT is implemented in Matlab for use as the reference, and the same algorithm is implemented in hardware on an FPGA for comparison [13]. An EEG signal of 1024 points with a 1 ms sampling interval is used to test the design. With this signal length, 37 wavelet scales are needed to represent the Morlet wavelet in the frequency domain. These scales cover the frequency range 0.5-500 Hz. The details of the FPGA implementation are given in the following subsection.

A. FPGA scheme description

The proposed design is implemented on a Spartan 3AN FPGA on the Altium Nanoboard 3000, using the Altium Designer software package for design and test [14]. Altium Designer was used to combine schematic and hardware description language (HDL) sources in one main design sheet. The Xilinx FFT core was imported in the Radix-4 configuration with the following settings: 16 bit width for data and phase factors, 1024 point transform length, scaled arithmetic, truncation as the rounding mode and natural-order output. VHDL is used to design the controllers needed to operate the design, such as the multiplier controller, which performs the multiplication and truncation shown in Fig. 1. When the design runs, the 1024 samples of the EEG signal are input to the FFT core. The FFT core loads these samples, computes the FFT of the signal, then unloads the real and imaginary parts of the signal in the frequency domain. Block memory buffers the real and imaginary parts of the transformed signal. The multiplier controller, shown zoomed in Fig. 4, then starts the multiplications between the transformed signal and the previously saved FFT of the Morlet wavelet function. This controller has three 16 bit inputs: two for the real and imaginary parts of the analyzed signal and one for the samples of the wavelet function (wav_10). Hence, two multipliers are used in parallel, one for the real part of the signal and one for the imaginary part. Directly after each multiplication, the truncation described above is performed to keep the result within the finite word length of 16 bits. The multiplier controller has three address buses: the first two are for the source memories and the third is for the result memory.

Fig. 4. CWT design schematic and the zoomed multiplier controller. [Zoomed block: U_mult_with_truncation (Multiplier_controller.vhd, VHDL entity mult_with_truncation). Inputs: clk, hand_shake2, im_inp[15..0], re_inp[15..0], rst, wav_10[15..0]. Outputs: address[11..0], address4[11..0], hand_shake3, im_out[15..0], re_out[15..0], write_enable, zmonit_im[31..0], zmonit_re[31..0], zsig_address[9..0].]

The result memory stores the multiplication results re_out and im_out (real and imaginary) when the write enable signal goes high. Handshake signals at the input and output of this controller serve as task-start and task-end signals for the other controllers in the design. Finally, the zmonit_im and zmonit_re outputs carry the multiplication results without truncation for signal monitoring and test. The next task is to load the FFT core with these products to produce the CWT. As the Morlet wavelet is evaluated at 37 different scales, the inverse FFT process is repeated 37 times. Each time, the FFT core produces the real and imaginary CWT coefficients at one wavelet scale. The design produces the complete set of wavelet coefficients in 1 ms at an operating frequency of 125 MHz.
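The data flow described above (two parallel multipliers with truncation, followed by one inverse FFT per scale) can be modelled in software as follows. This is a behavioural sketch under stated assumptions, not the authors' controller: the function name and the layout of the wavelet table are hypothetical, the wavelet spectrum is taken to be real-valued as the single wav_10 input suggests, and a floating-point `numpy.fft.ifft` stands in for the fixed-point Xilinx core.

```python
import numpy as np

def cwt_truncated_products(re_sig, im_sig, wav_table, drop_lsb=8):
    """Per-scale CWT rows from a 16 bit signal spectrum and 16 bit wavelet spectra.

    re_sig, im_sig : integer arrays holding the real and imaginary spectrum samples
    wav_table      : integer array of shape (n_scales, N), one wavelet spectrum per scale
    """
    re_sig = np.asarray(re_sig, dtype=np.int64)
    im_sig = np.asarray(im_sig, dtype=np.int64)
    rows = []
    for wav in np.asarray(wav_table, dtype=np.int64):
        # Two multipliers in parallel, each followed by the middle-16-bit selection.
        re_out = ((re_sig * wav) >> drop_lsb).astype(np.int16)
        im_out = ((im_sig * wav) >> drop_lsb).astype(np.int16)
        # One inverse transform per scale (floating point here, fixed point on the FPGA).
        rows.append(np.fft.ifft(re_out.astype(float) + 1j * im_out.astype(float)))
    return np.array(rows)

# A wavelet value of 256 acts as unity gain once the 8 LSBs are dropped.
re_sig = np.array([100, -50, 25, 0])
im_sig = np.array([0, 10, -10, 5])
wav_table = np.array([[256, 256, 256, 256], [0, 0, 0, 0]])
rows = cwt_truncated_products(re_sig, im_sig, wav_table)
```

With 37 rows in the wavelet table the loop body runs 37 times, matching the 37 inverse FFT passes described in the text.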

IV. RESULTS

A real-world EEG signal is chosen for the CWT analysis and test (Fig. 5). The FFT-based CWT is first calculated in Matlab and then compared with the hardware result. The software result is taken as the reference since its scalogram includes no quantization error.


A. Software result

Fig. 6 shows the scalogram of this EEG signal produced by the software implementation. Only the frequency range 1.95-40 Hz is shown, since it contains most of the EEG frequency components; however, the calculations involve all scales. In Fig. 6, the high-magnitude colors correspond to low-frequency components and the low-magnitude colors to high-frequency components.

Fig. 5. The EEG signal used for design testing. [Plot: test EEG signal, amplitude [uV] versus time [ms], 0 to 1000 ms.]

Fig. 6. CWT analysis of the EEG signal in Fig. 5: software implementation. [Scalogram: frequency [Hz] versus time [ms], with color scale.]

Fig. 7. CWT analysis of the EEG signal in Fig. 5: FPGA implementation. [Scalogram: frequency [Hz] versus time [ms], with color scale.]

The software scalogram shows, in graded colors, the CWT coefficients that reflect the amplitude and frequency content of the time series. The highest amplitude of the time series in Fig. 5, in the period 600-800 ms, is reflected by high color magnitude in the frequency range 3.5-5.5 Hz in Fig. 6.

B. Hardware result

Fig. 7 shows the result of analyzing the same EEG signal with the FPGA implementation of the CWT. The high-power coefficients still appear at the same time and frequency ranges. The close visual similarity of the hardware scalogram to the software scalogram on the same time-frequency-amplitude axes indicates the accuracy of the truncation used in the FPGA design. Some small differences are due to the quantization errors produced by the FFT core computations and the truncation of the multiplication products. These errors appear as ripples around the islands in the plot. This subjective comparison is supported by objective measures for result verification.

C. Objective comparison

In addition to the visual similarity, the difference between the software and hardware scalogram plots is tested using image quality measures that assess the quality of the hardware scalogram objectively. Three different measures were chosen to give a comprehensive indication of the degree of similarity between the hardware and software scalograms, so that a weakness in any individual measure can be offset by the others. These measures are [15]: (1) the Normalized Mean Square Error (NMSE), which is the most popular measure; (2) the Normalized Average Difference (NAD), for which lower values indicate a cleaner scalogram [16]; and (3) the Structural Content or Correlation (SC), which estimates the structural similarity between two scalogram plots. An SC value close to 1 indicates a better quality hardware scalogram, whereas large SC values mean poor scalogram quality [17]. Table I shows the results of these three tests, with the software scalogram as the reference. The three quality measures achieved are very close to the values obtained when two identical software-derived scalograms are compared, which indicates that the hardware and software results have a high degree of similarity.

TABLE I. THE DEGREE OF CLOSENESS BETWEEN THE SOFTWARE AND HARDWARE SCALOGRAMS IN THREE NORMALIZED METHODS. THE SOFTWARE SCALOGRAM IS THE REFERENCE.

Method    Same software scalogram    Hardware scalogram
NMSE      0                          0.0013
NAD       0                          0.0227
SC        1                          0.9989
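The three measures can be computed as sketched below. The paper does not spell out the formulas, so the definitions used here are the ones commonly given in the image quality literature (sum of squared differences over reference energy for NMSE, sum of absolute differences over sum of absolute reference values for NAD, ratio of reference energy to test energy for SC) and should be read as assumptions. The arrays in the example are made-up values, not the paper's scalograms.

```python
import numpy as np

def nmse(ref, test):
    """Normalized mean square error: 0 for identical inputs."""
    ref, test = np.asarray(ref, dtype=float), np.asarray(test, dtype=float)
    return np.sum((ref - test) ** 2) / np.sum(ref ** 2)

def nad(ref, test):
    """Normalized average (absolute) difference: 0 for identical inputs."""
    ref, test = np.asarray(ref, dtype=float), np.asarray(test, dtype=float)
    return np.sum(np.abs(ref - test)) / np.sum(np.abs(ref))

def structural_content(ref, test):
    """Ratio of reference energy to test energy: 1 for identical inputs."""
    ref, test = np.asarray(ref, dtype=float), np.asarray(test, dtype=float)
    return np.sum(ref ** 2) / np.sum(test ** 2)

# Hypothetical 2x2 "scalograms" that differ in one pixel.
software = np.array([[1.0, 2.0], [3.0, 4.0]])
hardware = np.array([[1.0, 2.0], [3.0, 5.0]])
scores = (nmse(software, hardware), nad(software, hardware),
          structural_content(software, hardware))
identical = (nmse(software, software), nad(software, software),
             structural_content(software, software))
```

Comparing an array with itself reproduces the "Same software scalogram" column of Table I (0, 0, 1) under these definitions.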

V. CONCLUSIONS

The results in Figs. 6-7 and Table I show that the hardware implementation produces a scalogram highly similar to the error-free software one. Fig. 7 gives a high quality scalogram as a subjective measure, supported by the objective measures in Table I. The three scalogram quality measures in Table I show that a close match is achieved between the hardware result and the reference software result: the NMSE is very low, the NAD is close to zero and the SC is close to 1. Therefore, we conclude that the hardware result is reliable and accurate. These results confirm the accuracy of the methodology followed and lead to the conclusion that the quantization errors, the re-scaling of the word length by the FFT core and the truncation applied after the multiplications have no major effect on the FPGA scalogram.

ACKNOWLEDGMENT

This work was supported by the Ministry of Higher Education and Scientific Research of Iraq.

REFERENCES

[1] A. Muñoz, R. Ertlé, and M. Unser, "Continuous wavelet transform with arbitrary scales and O(N) complexity," Signal Processing, vol. 82, pp. 749-757, 2002.
[2] V. Sakkalis, T. Oikonomou, E. Pachou, I. Tollis, S. Micheloyannis, and M. Zervakis, "Time-significant wavelet coherence for the evaluation of schizophrenic brain activity using a graph theory approach," in Engineering in Medicine and Biology Society, 2006. EMBS '06. 28th Annual International Conference of the IEEE, 2006, pp. 4265-4268.
[3] A. Varnavas and M. Petrou, "Human's performance prediction with wavelet analysis of EEG data," in Digital Signal Processing, 2007 15th International Conference of the IEEE, 2007, pp. 167-170.
[4] A. W. Keizer, R. S. Verment, and B. Hommel, "Enhancing cognitive control through neurofeedback: A role of gamma-band activity in managing episodic retrieval," Neuroimage, vol. 49, pp. 3404-3413, 2010.
[5] C. Torrence and G. P. Compo, "A practical guide to wavelet analysis," Bulletin of the American Meteorological Society, vol. 79, pp. 61-78, 1998.
[6] A. Grinsted, J. C. Moore, and S. Jevrejeva, "Application of the cross wavelet transform and wavelet coherence to geophysical time series," Nonlinear Processes in Geophysics, pp. 561-566, 2004.
[7] T. Rondik and J. Ciniburk, "Comparison of various approaches for P3 component detection using basic methods for signal processing," in Biomedical Engineering and Informatics (BMEI), 2011 4th International Conference on, 2011, pp. 698-702.
[8] J. G. Proakis and D. G. Manolakis, "Digital signal processing: principles, algorithms and applications," 3rd Ed., Prentice-Hall, Upper Saddle River, New Jersey, 1996.
[9] J. E. Stine and O. M. Duverne, "Variations on truncated multiplications," Proceedings of the Euromicro Symposium on Digital System Design (DSD'03), 2003, pp. 112-119.
[10] Xilinx, Inc., "FPGA vs ASIC," accessed July 25, 2012. http://www.xilinx.com/fpga/asic.htm
[11] Y. Qassim, T. Cutmore, D. James and D. Rowlands, "FPGA implementation of Morlet continuous wavelet transform for EEG analysis," 4th International Conference on Computer & Communication Engineering (ICCCE), 2012, pp. 59-64.
[12] The Xilinx Fast Fourier Transform core data sheet v7.1, accessed Dec. 20, 2011. www.xilinx.com/support/documentation/ip.../xfft_ds260.pdf
[13] Xilinx, Inc., "Spartan-3AN FPGA Family Data Sheet," accessed Feb. 6, 2012. http://www.xilinx.com/support/documentation/spartan3an_data_sheets.htm
[14] Altium Limited, "Nanoboard-3000," accessed Feb. 6, 2012. http://nb3000.altium.com/intro.html
[15] A. M. Eskicioglu and P. S. Fisher, "Image quality measures and their performance," Communications, IEEE Transactions on, vol. 43, pp. 2959-2965, 1995.
[16] N. Bourbakis, "Detecting similarities and differences in images using the PFF and LGG approaches," Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'02), 2002, pp. 355-362.
[17] S. Poobal and G. Ravindran, "The performance of fractal image compression on different imaging using objective quality measures," International Journal of Engineering Science, vol. 3, pp. 525-530, 2011.