Speeding-up Fractal Audio Compression Using

0 downloads 0 Views 738KB Size Report
Keywords: Fractal audio compression, Moment descriptor, Selective search, .... We should aim to select descriptors leading to large between class distance and.
Journal of Law and Society M anagement 5 (1) © Nabu Research Academy, 2018

Speeding-up Fractal Audio Compression Using Selective Search Based on Proposed Affain Invariant Descriptors Zahraa A. Hasan and Loay E. George College of Science, Computer Science Department, Baghdad University, Baghdad, Iraq

Abstract: Fractal coding has been used for years in the field of compression. Many researchers and programmers considered this algorithm as a promising base of their work, but they noticed that it suffers from the long "encoding time", that is due to the exhaustive search performed to encode each range block, in which the whole blocks of domain pool is searched with a purpose to find the closest domain block. The aim of this research is to reduce the long encoding process by utilizing a selective based search method, so instead of making exhaustive search on the entire domain pool, only a subset (class) of this pool will be searched to encode each range block. The proposed method is based on an indexing scheme, this scheme relies upon the usage of moment descriptors as indexing parameters for the classes obtained; this approach does not affect the performance of the affain transform equation during the matching process . In the last few years, the moment concept was used by some researchers to achieve enhanced compression performance. In this article, a selective scheme that based on new set of weakly correlated moments is introduced. It retains as much as possible of their class discriminatory information that leads to find the best approximations for the range blocks during the matching stage, Moreover it leads to a high reduction in the encoding time compared to traditional method without making significant degradation in the fidelity of compressed audio file. Keywords: Fractal audio compression, Moment descriptor, Selective search, Indexing scheme

1. Introduction Audio compression technique in our modern life is addressed in several technologies that is include sound of data, radios, TV and telephones etc. [5]. Over the last two decades several audio compression algorithms were developed due to the need of establishing a coding methodology that satisfies the conflicting demands of high compression ratios and transparent quality for high fidelity audio signals [2]. Fractal audio compression is considered one the most important audio compression techniques which are of current interest [1]. Since the evolution of PIFS by Barsnly [3], followed by Arnode Jacquine [4], fractal image coding has been exceedingly studied and several schemes have been introduced and implemented. Because of fractal image coding's various shortcomings, researchers still cannot deliver a practical efficient fractal image coding scheme in terms of coding time. Thus, fractal coding is still under developments on images, and it is rarely applied on other types of data. Recently, fractal coding has been performed on audio data. Our interest is to explore this possibility. However, the problem that fractal compression suffers from is the long "encoding time", that is due to the exhaustive search performed in the encoding process even for small audio files. Different solutions have been used to handle this hard problem. A single indexing scheme that based totally on the usage of a single descriptor is developed; this could bring excellent reduction in the encoding time by the help of the moment descriptors.

95

2. Literature Review In 2005, Henry Xiao explored the overall performance of utilizing fractal coding on audio data in his thesis. Additionally, he had studied a few traditional fractal coding problems to provide a top -level view about the issue. He determined that fractal coding is not particular sufficient to handle all audio compression demands. Moreover, he determined that the compression ratio from utilizing fractal audio coding can be in a range from 3 to 6 [8]. In 2006, George, explored a new fractal image coding scheme based totally on IFStransform for zero mean range-area blocks. He had made some upgrades on the IFS-matching level through using moments primarily based indexing as a way to pick the set of area blocks which are suitable to suit every variety block individually. additionally, he used a stopping search condition based totally on tracking the minimal matching error, as a reason to reduce the long fractal coding time (round 3.5 times) [9]. In 2011, George and Eman brought a fast image compression method; they make use of moment functions extracted from the zero-mean range blocks and a new symmetry prediction criterion is used to decrease the number of isometric trails from 8 to 1 trail. Moments had been utilized to speed up the iterated function systems (IFS) matching level. These functions had been used to find out the block descriptor "moment ratio index", which is applied to classify the image blocks in both domain and range pools all through the encoding level. The block moment ratio descriptor of every range blocks is used to scan the area blocks and pick only the blocks whose descriptor is suitable to be IFS matched with the examined range block [11]. In 2013, L. E. George and Azhar Khanjar brought a fast fractal audio compression technique by using a selective based search strategy instead of traditional search approach can speed up the encoding process. The propos ed technique is based on using block indexing method in order to categories the domain pool blocks. The block indexing method group the domain pool into classes each of them has a specific classifier by using the moment descriptors as indexing parameters. Thus, only a subset of these blocks is tested instead of making the exhaustive search through the whole domain blocks [10],

3.The Proposed Search Approach A single descriptor approach is utilized in the speeding up task, it is considered as "one level block indexing method”. According to this approach the domain blocks are classified into classes using one identifier (descriptor) of good discrimination properties. The introduced mechanism will improve the efficiency of filtering and lead to reduction in the nominated domain blocks.

3.1 Trigonometric Weighted Moment Descriptor The distribution of material from a reference point or an axis is presented by a set of parameters called moments. The using of moments in the construction of the audio feature vec tor is one of the most commonly approach to differentiate blocks. The process of selecting the moments is very crucial, because if moments with little discrimination power are selected then the subsequent design of a classifier would lead to poor performa nce. On the other hand, if we selected the information-rich moments, the classifier’s design can be greatly simplified in a more quantative description. Moments have several orders each reflects different information for the same audio [6]. In this research work, the descriptors moments based are determined for each block belong to the domain and range pools using a set of sinusoidal weighted moments. The used moments are three (m1 , m2 , & m3 ) are of a good discrimination power. For an audio block, I(), the moments are computed using the following weights representations: 𝜋𝑖 cos ( ) 𝑛 (1) 𝑤1 (𝑖) = 𝜋𝑗 𝑛 −1 ∑𝑗=0 |cos ( )| 𝑛 2𝜋𝑖 ) cos ( 𝑛 (2) 𝑤2 (𝑖) = 2𝜋𝑖 −1 | ∑𝑛𝑗 =0 )| cos ( 𝑛 2𝜋𝑖 ) sin ( 𝑛 (3) 𝑤3 (𝑖) = 2𝜋𝑖 −1 | ∑𝑛𝑗 =0 ( )| sin 𝑛

96

Where, iϵ{0.1,2,…n-1}; n is the block size. The denominators in the above three equations are the normalization factors. The main step in moment selection is to look at each of the generated mome nts independently and test their discriminatory capabilities for the problem to handle. The main step in the selection of moments is to look Then, the three sinusoidal weighted moments are determined using the follow-ing: 𝑛−1

𝑚1 = ∑ 𝑤1 (𝑖) 𝑊𝑎𝑣 (𝑖)

(4)

𝑖 =0

𝑛−1

𝑚2 = ∑ 𝑤2 (𝑖) 𝑊𝑎𝑣(𝑖)

(5)

𝑖 =0 𝑖 −1

𝑚3 = ∑ 𝑤3 (𝑖 ) 𝑊𝑎𝑣 (𝑖)

(6)

𝑖 =0

Where, Wav(i) is the

ith

audio sample of the block.

The three types of considered affine transform invariants descriptors of audio blocks listed in the established domain code book: 𝐷1 = {

𝑁𝑏𝑖𝑛𝑠 |𝑚1 /𝑚2 | 𝑖𝑓 |𝑚1 | ≤ |𝑚2 | (7) 𝑁𝑏𝑖𝑛𝑠 |𝑚2 /𝑚1 | |𝑚2 | ≤ |𝑚1 |

|𝑚 /𝑚 | 𝑖𝑓 |𝑚1 | ≤ |𝑚3 | 𝑁 𝐷2 = { 𝑏𝑖𝑛𝑠 1 3 (8) 𝑁𝑏𝑖𝑛𝑠 |𝑚3 /𝑚1 | 𝑖𝑓 |𝑚3 | ≤ |𝑚1 | |𝑚 /𝑚 | 𝑖𝑓 |𝑚2 | ≤ | 𝑚3 | 𝑁 𝐷3 = { 𝑏𝑖𝑛𝑠 2 3 (9) 𝑁𝑏𝑖𝑛𝑠 |𝑚3 /𝑚2 | 𝑖𝑓 |𝑚3 | ≤ | 𝑚2 |

This work uses mean values as a reference to discard descriptors based on moment identifiers, but what if the corresponding mean values are closely located then it not guarantees good discrimination properties of a descriptor passing the test. We should aim to select descriptors leading to large between class distance and small within class variance. Table (1) lists, the mean values of these descriptors, and the mean absolute divergence of them, and the mutual covariance of each pair of these descriptors. Table (2) shows the cross correlation coefficients of mutual descriptors. Table (1) The 1 st Order Statistics of Descriptor Disc

Mean

AbsDef

StdDev

1

41.277

23.687

27.664

2 3

74.017 42.303

0.056 23.809

0.908 27.793

Table (2) The Cross Correlation Coefficients of Mutual Descriptors

Disc_A/Disc_B

CCC Value

1/2 1/3 2/3

0.007 0.879 0.001

The developed single-descriptor based approach have taken into consideration the both types of descriptors (𝐷1 and 𝐷3 ) which have good discrimination properties that leads to a better domain pool classification, and have been used individually to implement this approach. The single descriptor-based approach is named the first descriptor-based approach, and the other one is named second descriptor-based approach.

97

4.Encoding process steps Fractal audio compression model is divided into two categories Encoding and Decoding units. Fractal encoding is the process of finding close mappings, or transformation s if affine is required, for each range block from the domain blocks. After encoding the domain location and the coefficients of each transform are stored. So quite a lot of bits can then be saved over the original data. The range and domain pools must be partitioned into subspaces to determine the process of finding the mappings from t he domain block to range blocks. The steps of the encoding process of the selective based search strategy is as follows: 1. Select the wave audio file and convert it to binary then load its audio data in a one-dimensional array. 2. Construct the range pool, then construct the domain pool by down sampling the range pool. 3. For every domain block in the domain pool a. Calculate the moments (𝑚1 , 𝑚2 and 𝑚3 ) for the selected domain block using the equations (4), (5) and (6). b. Calculate the two descriptors (𝐷1 ) and (𝐷3 ) using equations (7) and (9) respectively. c. Calculate the moment index value for (𝐷1 ) and 𝐷3 ) using equation (10) to speed up the access to a desired set of domain blocks. The application of this concept requires building a data structure to guarantee faster access to the data. This data structure may contain the original data record with an index number for that record. To determine the index values for each range and domain blocks, the determined descriptor values (𝐷1 and 𝐷3 ) are mapped into integer values using the following equations: Index=round(|D| * NoBins)

(10)

Where, D represents either 𝐷1 or 𝐷3 (depending on the adopted type of the two proposed descriptors. NoBins represents the number of indexing bins, or in other words represents the range of the indexing values (i.e. [0, NoBins]). d. Calculate the parameters (𝑑̅ 𝑎𝑛𝑑 𝜎𝑑2 ) of the selected domain block and store the values in an array of records, this step reduces the computational redundancy; So the values are computed, stored and ready to use later in the matching process. The following equations are used to compute (𝑑̅ ) 𝑎𝑛𝑑 (𝜎𝑑2 ) 1

𝑑̅ =

𝑛

𝑛 −1

∑ 𝑑𝑖

1

𝜎𝑑2 =

(11)

𝑖=0 𝑛−1

𝑛

∑ (𝑑 𝑖 − 𝑑̅ )2

(12)

𝑖 =0

Where, {d i | i=0,…, n-1} is a domain block, 𝑑̅ is the mean value of domain blocks, and (n) is the number of domain blocks in the domain pool, 𝜎𝑑2 is the variance for domain blocks. e. Sort the above array of records in ascending order according to the moment index values. f. Construct an array of records (P), this array is a set of pointers (start and end) to determine the boundaries of each class of blocks that have the exact descriptor index values. 4. For every range block listed in the range pool: a. Calculate the moment descriptor value then determine its moment index value by using the same previous equations adopted in the domain pool. b. Calculate the parameters of the range block by the following equations: 𝑟̅ =

1 𝑛

𝜎𝑟2 =

𝑛 −1

∑ 𝑟𝑖

(13)

𝑖=0 𝑛−1

1

𝑛

∑ (𝑟𝑖 − 𝑟̅) 2

(14)

𝑖=0

Where, {ri | i=0,..., n-1} is a range block; 𝑟̅ is the mean value of range blocks, 𝜎𝑟2 is the variance for the range blocks, and (n) is the number of range blocks in the range pool. c. Determine the boundaries of each class of domain blocks by using the following array(P). d. In each matching instance calculate the scale (S) and error (𝑥 2 ) using the following equations [7]: 𝑟̃𝑖 = 𝑠 (𝑑 𝑖 − 𝑑̅ ) + 𝑟̅ −1 ∑𝑛𝑖 =0 𝑟𝑖 𝑑 𝑖

(15)

− 𝑛𝑟̅ 𝑑̅

(16) 𝑛 ̅ 𝜒 2 = 𝑛 ( 𝜎𝑟2 + 𝑠 𝜎𝑑2 ) − 2𝑠 ( ∑𝑛−1 𝑖=0 𝑟𝑖 𝑑 𝑖 − 𝑛𝑟̅ 𝑑 ) 𝑠=

𝜎𝑑2 2

(17)

Where; 𝑟̃𝑖 is the approximate range block generated by applying the affain ‘transform; s is the scaling coefficient of the affain transform.

98

2 e. At each matching instance the computed error value (𝜒 2) is compared with the minimum error (𝜒 𝑚𝑖𝑛 ) 2 2 2 2 of the previous matching instances; such that if (𝜒 ) is smaller than (𝜒 𝑚𝑖𝑛 ) then set (𝜒 𝑚𝑖𝑛 = 𝜒 ), and then set the scale value (S) and the position (Pos) of the matched domain block as the best IFS matching parameters. 2 2 f. Compare the predefined permissible error value (𝐸𝑡ℎ𝑟 ) with (𝜒 𝑚𝑖𝑛 ) such that if (𝜒 𝑚𝑖𝑛 < 𝐸𝑡ℎ𝑟 ) then the search is stopped and the set (S, Pos, r) is considered as the output of the IFS code for the tested range block, then go to step 4g. g. If the selected class has insufficient number of blocks compared to the predefined minimum block number, then the acceptable approximate for the tested range block may not be selected. In this case classes that hold similar descriptor index values must be involved in the matching method to find near optimal code for the tested range. Store the IFS code for the tested range block in the compression stream and then return to the beginning of step 4.

5 Test Results The current section presents the results of the conducted tests in order to study the effects of several coding parameters on the performance of the adopted FAC schemes which are based on utilizing the moment descriptors. The commonly system performance measures that are used in the following tests (i.e. mean square error (MSE), encoding time (ET), peak signal to noise ratio (PSNR), and compression ratio (CR)) are used to evaluate the performance efficiency of the FAC schemes. The effects of the compression parameters have been studied through the following tests. 1. Minimum error effect. 2. Number of bins effect and Block length effect. 3. Minimum Block Threshold test Finally, it is essential to mention some information that is related to the SW & HW environment at which the program is developed and executed. The program was constructed using C# visual studio as SW development tool, and the HW environment used during our test is a single PC with Intel (R) Core™ (1.80 GHz), and 4.00 GB RAM, Moreover, the developed singledescriptor based approach have taken into consideration the both types of descriptors (D and D ) which have 1

3

been used individually to implement this approach. The single descriptor-based approach is named first descriptor-based approach, and the other one is named second descriptor-based approach. In this set of tests a wave file of size 1.6 MB is used as test materials.

5.1 Block Length and Number of Bins Tests This set of conducted tests aimed to investigate the effects of the Block length parameter and number of Bins on the compression performance parameters in both cases the first and the second descriptor approaches. In these tests the value of other coding parameters is set at: the permissible minimum error valu e= 0.0001, ScaleBit=6, OffsetBit=8 and MaxScale=3. Tables (3–6) presents the effects of Block length parameter on the compression performance parameters .

99

Table (3) Effects of Block Length parameter on the compression performance of NoBins=10 Descriptor Block Type Length

𝑫𝟏

𝑫𝟑

PSNR (db)

1656.989 2056.988 2553.467 3052.798 3550.109 4049.685 5049.488 6048.167 7047.891 1656.989 2056.989 2553.467 3052.596 3550.109 4049.586 5049.208 6047.891 7047.632

CR

ET (sec)

3.55529.589 4.44429.581 5.55529.415 6.66628.991 7.77728.948 8.88828.882 11.11128.799 13.33328.458 15.55528.333 3.55529.999 4.44430.014 5.55530.011 6.66629.987 7.77729.985 8.88829.985 11.11129.983 13.33329.799 15.55529.798

Table (4) Effects of Block Length parameter on the compression performance of NoBins=100

Descriptor Block PSNR Type Length (db)

𝑫𝟏

𝑫𝟑

1657.569 2056.989 2553.979 3053.716 3551.396 4050.579 5050.222 6048.944 7048.616 1657.569 2056.989 2553.716 3052.798 3551.396 4050.705 5050.109 6048.860 7048.461

100

CR

ET (sec)

3.55519.225 4.44419.285 5.55519.211 6.66618.995 7.77718.997 8.88818.994 11.11118.894 13.33318.892 15.55518.877 3.55519.959 4.44419.942 5.55519.941 6.66619.899 7.77719.881 8.88819.875 11.11119.871 13.33319.752 15.55519.749

Table (5) Effects of Block Length parameter on the compression performance of NoBins=250

Descriptor Block PSNR Type Length (db)

𝑫𝟏

𝑫𝟑

1657.508 2056.989 2554.559 3053.979 3552.041 4051.106 5050.457 6049.208 7048.860 1657.447 2056.989 2554.259 3053.716 3552.041 4050.969 5050.338 6049.118 7048.777

CR

ET (sec)

3.55511.969 4.44411.988 5.55511.978 6.66611.976 7.77711.967 8.88811.966 11.11111.952 13.33311.941 15.55511.932 3.55512.022 4.44412.011 5.55511.995 6.66611.985 7.77711.981 8.88811.979 11.11111.975 13.33311.965 15.55511.879

Table (6) Effects of Block Length parameter on the compression performance of NoBins=350

Descriptor Block PSNR Type Length (db)

𝑫𝟏

𝑫𝟑

1657.569 2057.542 2554.881 3054.259 3552.798 4051.396 5050.969 6049.892 7049.488 1647.569 2056.989 2554.559 3053.979 3552.798 4051.396 5050.835 6049.788 7049.299

101

CR

ET (sec)

3.5552.998 4.4442.995 5.5552.994 6.6662.991 7.7772.897 8.8882.894 11.1112.891 13.3332.797 15.5552.795 3.5553.012 4.4443.001 5.5552.999 6.6662.985 7.7772.980 8.8882.973 11.1112.969 13.3332.961 15.5552.958

The results of the conducted tests on first descriptor-based approach show that: 1. MSE is inversely proportional with NoBins parameter. 2. PSNR increases with the increase of NoBins parameter. 3. CR is not affected by the NoBins variation. 4. ET is highly affected by the NoBins variation, such that the increase in NoBins para meter value led to high reduction in the encoding time. 5. The increase of Block Length causes an increase in MSE. 6. PSNR is inversely proportional with Block Length. 7. The increase in Block Length causes an increase in CR. 8. ET is reduced with the increase of Block Length. 9. In the second descriptor-based approach the MSE is slightly increased as The Block Length increases compared to the first descriptor-based approach. 10. The CR is not affected and remains constant in the second descriptor-based approach. 11. ET and PSNR are decreased as Block Length increases in the second-descriptor based approach.

4.2 Minimum Block Threshold test This parameter was used to define the optimum domain block for a particular range block. However, if the number of tested domain blocks is less than the minimum block threshold then a new search will be done on classes of domain blocks that have close descriptor values; Then the most accurate domain block will be selected among the new tested blocks. Table (7) shows the effects of minimum block parameter on the compression performance. Table (7) Effects of MinBlk parameter on the compression performance

Descriptor Type

Min PSNR Blk (db)

157.346 57.346 57.346 57.346 30057.346 40057.346 50057.346 100058.239 500059.918 1 57.569 10 57.569 100 57.569 200 57.569 300 57.569 400 57.569 500 57.569 1000 58.239 5000 59.030 10 100 200

𝑫𝟏

𝑫𝟑

CR

ET (sec)

3.55511.969 3.555 11.969 3.555 11.969 3.55511.968 3.55511.969 3.55511.969 3.55511.969 3.55512.023 3.55512.026 3.555 12.021 3.555 12.022 3.555 12.022 3.555 12.022 3.555 12.022 3.555 12.022 3.555 12.022 3.555 13.011 3.555 13.021

The results of the conducted tests show that: 1. MSE is inversely proportional with MinBlk parameter because the most accurate do main block is selected during the search. 2. PSNR increases with the increase of MinBlk parameter. 3. CR is not affected by the MinBlk variation. 4. ET is affected slightly by the MinBlk parameter. 5. The MSE is increased in the second-descriptor based approach compared to the first, While the PSNR and the ET are decreased. Moreover, the CR remains constant in both approaches.

102

5.3 Comparison with the Results of the Traditional Approach This section is used to study the performance of both descriptors (𝐷1 and 𝐷3 ) of the developed FAC scheme on the traditional approach by comparing the values of PSNR and the ET between the three approaches. The values of the coding parameters of the developed approaches are set as: NoBins=250, MinError=0.0001. Table (8) shows the comparison between the three adopted approaches. Table (8) Comparison between the performance of the traditional approach and the developed approach Traditional First Second Number Approach Descriptor Descriptor of Tested PSNR ET PSNR ET PSNR ET Sample (db) (sec) (db) (sec) (db) (sec) 1 2 3 4

25.070140.719 42.872147.131 37.71756.642 57.569275.233

24.053 3.12523.906 41.549 8.78841.396 37.632 0.99937.569 57.508 11.96957.447

3.929 9.701 1.029 12.022

6.Conclusion The effect of the two adopted descriptors on both PSNR and ET is very sufficient such that both parameters hold good and acceptable values in comparison to the traditional approach. Such that a high reduction in the encoding time and a significant increase in the PSNR is obtained during the encoding process. But the first descriptor (𝐷1 ) led to better results than that depends on the second descriptor (𝐷3 ) on the conducted audio files.

References 1. Gunjal, S. D. and Raut, R. D. ; "Advance Source Coding Techniques for Au -dio/Speech Signal: A Survey"; Internati-onal Journal of Computer Technology and Applications; Vol. 3, No.4; Pp. 1335-1342; 2012. 2. Spanias, A., Painter, T., and Atti, V.; "Audio Signal Processing and Coding"; W iley-Interscience; 2007. 3. Barnsley, M. F.; "Fractals Everywhere"; Academic Press; 1993. 4. Jacquin, A. E.; "Image Coding Based on a Fractal Theory of Iterated Contractive Image Transformation"; IEEE image processing; Vol. 1, No. 1; pp. 18-30. 5. Jaiswal, R. C. (2009). “Audio-Video 3.41. ISBN 9788190639675

Engineering”.

Pune,

Maharashtra:

Nirali

Prakashan.

6. George, L. E. and Al-Hillo, E. A.; "Speeding Up Fractal Color Image Compression Using Moment Features Based on Symmetry Predictor"; Eighth International Conference on Information Technology: New Generations; pp. 508-513; 2011. 7. George, L. E.; "IFS Coding for Zero-Mean Image Blocks"; Iraqi Journal of Science; Vol. 47, No. 1; pp. 190-194; 2006. 8. Xiao, H. ; "Fractal Audio Coding"; M.Sc. Thesis; School of Computing; Queens University; Canada; 2005. 9. . George, L. E.; "IFS Coding for Zero-Mean Image Blocks"; Iraqi Journal of Science; Vol. 47, No. 1; pp. 190-194; 2006. 10. George, L. E. and Azhar khanjar, "Speeding Up Fractal Audio Compression Using Tiling and Complementary Moments"; 2013. 11. George, L. E. and Al-Hillo, E. A.; "Speeding Up Fractal Color Image Compression Using Moment Features Based on Symmetry Predictor"; Eighth International Conference on Information Technology: New Generations; pp. 508-513; 2011.

103