2010 6th International Conference on Emerging Technologies (ICET)
Lossless Image Compression Using Kernel Based Global Structure Transform (GST)

M. Asif Ali, Aftab Khan, M. Younus Javed and Aasia Khanum
Department of Computer Engineering, College of Electrical & Mechanical Engineering,
National University of Sciences & Technology (NUST), Islamabad, Pakistan.
e-mail: [email protected], [email protected], [email protected], [email protected]

Abstract—Lossless data compression using variants of the Burrows-Wheeler Transform (BWT) with various compression encoders has proven its effectiveness. This research provides a unique method for lossless compression of color images by improving the Global Structure Transform (GST) stage of the Burrows-Wheeler Compression Algorithm (BWCA). The proposed model applies the Move-To-Front (MTF) transform at the GST stage by selecting a 2-D block (kernel) of BWT data. This method results in a high occurrence of the same gray levels within the kernel. Moreover, the symbol map for the MTF encoder is generated only for the gray levels available in the kernel. The overall redundancy of the MTF indexes increases at the GST stage of the BWCA, which results in increased compression.
Keywords-BWT; GST; BWCA; MTF; Kernel
I. INTRODUCTION
Color images with high resolution require large memory space and large bandwidth for transmission. Lossless image compression is an efficient technique for reducing the image size and the required bandwidth while preserving image quality. The BWT, a block sorting algorithm, was initially developed for lossless compression of text data [2, 3]. Lossless image compression can also be achieved using this technique [5, 11]. The overall performance of the BWCA has improved considerably through different refinements [10, 12] since its introduction in 1994. These improvements were designed mainly for the different stages of the complete algorithm to increase the efficacy at each level. This research work focuses on the improvement of the GST stage of the BWCA for lossless compression of color images. The same scheme can also be considered for lossy image compression, where a DCT or DWT is commonly used as the preprocessing transform before the BWT stage.
II. BURROWS-WHEELER COMPRESSION ALGORITHM (BWCA)
The classical scheme of the BWCA, a four-stage algorithm, as presented by Abel [10] in 2003, is shown in Fig. 1. Each stage transforms the input data and passes its output to the next stage. The data traverses the stages from left to right, starting with the BWT, followed by the GST, then RLE-0, and finally the Entropy Coder (EC), as shown in Fig. 1. BWT sorts the input data by grouping together the repetitions of elements through a reversible transform; the number of symbols is kept constant during the transformation. GST is the second stage of the algorithm, which transforms the local context of the symbols into a global context [4, 6, 8]. A typical representative of the GST stage is the Move-To-Front (MTF) transform introduced in the original BWCA scheme [2]. The MTF transform is a List Update Algorithm (LUA), which replaces the input symbols with corresponding ranking values [1]. The MTF scheme maintains the same number of symbols as the BWT stage.
Figure 1. Typical scheme of the Burrows-Wheeler Compression Algorithm (BWCA) [1]
The third stage uses a Run Length Encoding (RLE) scheme to reduce the element runs in the data. Different algorithms have been presented for this purpose [3], with the Zero Run Transform (RLE-0) found to be an efficient one. The last stage of the algorithm is the Entropy Coding (EC) stage, which compresses the symbols into a bit stream, typically using an arithmetic coding scheme. Burrows and Wheeler [2] proposed a Huffman encoder for the EC stage, while Abel [10] used a modified arithmetic coding scheme. Various entropy coding techniques for the final stage of the BWCA have been presented [3, 7, 8]. A Canonical Huffman scheme has also been applied for improved compression at the EC stage in this research.
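The paper does not give an implementation of the BWT stage; purely for reference, the following is a minimal Python sketch of the naive forward BWT that sorts all rotations explicitly. The function name bwt_forward is ours, and practical encoders would use suffix sorting rather than this O(n^2 log n) construction.

```python
def bwt_forward(data: bytes):
    """Naive forward Burrows-Wheeler Transform: sort all rotations of the
    input and return the last column together with the index of the row
    holding the original string (needed to invert the transform)."""
    n = len(data)
    # Sort rotation start positions by comparing the fully rotated strings.
    order = sorted(range(n), key=lambda i: data[i:] + data[:i])
    # Last column of the sorted rotation matrix.
    last = bytes(data[(i - 1) % n] for i in order)
    primary = order.index(0)
    return last, primary

print(bwt_forward(b"banana"))   # (b'nnbaaa', 3)
```

As the example shows, identical symbols from similar contexts end up adjacent in the output, which is what the subsequent GST and RLE-0 stages exploit.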
III. IMPROVEMENT OF GST – PAST RESEARCH
Most GST stages use a recent-ranking scheme for the list update problem, such as the MTF algorithm used in the original BWCA scheme [2]. The MTF encoder itself does not compress the data; it is used to enhance the performance of the EC stage. Many improvements to the MTF stage have been presented [3, 7, 9, 10]. Abel [10] presented a counter based algorithm called the Incremental Frequency Count (IFC), similar to the Weighted Frequency Count (WFC) stage of Deorowicz [9].
Improved MTF stages based on delayed behavior, such as MTF-1 and MTF-2, were presented by Balkenhol and Shtarkov [7], while Fenwick [3] presented a sticky version of MTF. The Inversion Frequencies (IF) scheme, which uses the offsets between occurrences of the same symbol, was also presented in [7].
IV. KERNEL BASED GLOBAL STRUCTURE TRANSFORM
A schematic overview of the proposed kernel/mask based GST using the MTF encoder model is illustrated in Fig. 2.
Figure 2. Lossless image compression with the proposed "Kernel MTF" for BWCA
The sample image is transformed via the BWT and passed to the GST stage as a 2-D matrix. The GST stage encodes the transformed data through the 2-D MTF encoder, named the "Kernel MTF". It is followed by the Zero Run-Length Encoding (RLE-0) and EC stages. Many entropy coding techniques can be applied to compress the data.
A. Kernel MTF Encoding
BWT is a block sorting transform applied on blocks of image data reordered by a coefficient reordering scheme such as the zigzag, raster, Hilbert or snake scan, which converts the image data into a 1-D stream. In this research, the raster scan is used before the BWT stage.
Fig. 3(b) shows the BWT of the sample image "Lena" in Fig. 3(a). The BWT was applied and the transformed data was retained in 2-D form. This results in greater pixel redundancy when the MTF encoder is applied on a 2-D block or kernel of the data, because a block generally contains related data from a band of nearby gray levels, as shown in Fig. 3(c). Fig. 3(d) shows the MTF encoded block of the transformed data. Since MTF operates on a 1-D stream, the kernel data is re-ordered again using a path scanning technique; the raster scan was used in this research for this coefficient reordering.
Figure 3. (a) Original image (b) BWT of image (c) 2-D block or kernel of BWT data (d) MTF encoded block
B. Reduced Encoding Map
In practice, MTF encodes the input data stream on the basis of a transformation map, called here the symbol map. With the conventional approach, where all possible symbols are present in the symbol map, the transformation tends to assign higher index values to the data. Table I shows the MTF encoding of the string "NNNBAAAA" with a symbol map containing all the symbols.
TABLE I. ENCODING BY A SYMBOL MAP CONTAINING ALL THE SYMBOLS (COLUMN 1: INPUT/ENCODED STRING, COLUMN 2: SYMBOL MAP)
N,N,N,B,A,A,A,A   | A,B,...,N,...,Z
13,N,N,B,A,A,A,A  | N,A,B,...,Z
13,0,0,B,A,A,A,A  | N,A,B,...,Z
13,0,0,2,A,A,A,A  | B,N,A,...,Z
13,0,0,2,2,A,A,A  | A,B,N,...,Z
13,0,0,2,2,0,0,0  | A,B,N,...,Z
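As a point of reference for Table I, a minimal Python sketch of the plain MTF transform with a full symbol map is given below. The function name mtf_encode and the list-based symbol map are illustrative choices, not the authors' implementation.

```python
import string

def mtf_encode(data, symbol_map):
    """Plain Move-To-Front: replace each symbol by its current rank in the
    symbol map, then move that symbol to the front of the map."""
    symbols = list(symbol_map)                   # working copy of the map
    output = []
    for s in data:
        rank = symbols.index(s)                  # current ranking value
        output.append(rank)
        symbols.insert(0, symbols.pop(rank))     # move the symbol to the front
    return output

# Full symbol map A..Z, as in Table I.
print(mtf_encode("NNNBAAAA", string.ascii_uppercase))
# -> [13, 0, 0, 2, 2, 0, 0, 0]
```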
In this research, the data is encoded by generating the symbol map only for the symbols present in the kernel. This maps lower index values to the data compared with the previous scheme.
TABLE II. MTF ENCODING BY A SYMBOL MAP CONTAINING ONLY THE AVAILABLE SYMBOLS (COLUMN 1: INPUT/ENCODED STRING, COLUMN 2: REDUCED SYMBOL MAP)
N,N,N,B,A,A,A,A  | A,B,N
2,N,N,B,A,A,A,A  | N,A,B
2,0,0,B,A,A,A,A  | N,A,B
2,0,0,2,A,A,A,A  | B,N,A
2,0,0,2,2,A,A,A  | A,B,N
2,0,0,2,2,0,0,0  | A,B,N
Table II shows the MTF encoding of the string "NNNBAAAA" with a symbol map containing only the available symbols. Column 1 shows the encoding with lower index values, while column 2 shows the reduced symbol map. This method must retain additional information about which symbols are present in the block of data, named the "encoding symbols", which can decrease the compression ratio. However, it can produce higher compression if a suitable kernel size is selected. Fig. 4 compares the frequency of occurrence of gray levels for the MTF and the proposed Kernel MTF (KMTF) encoders. The MTF encoder produces higher index values for the gray levels in the range 150-220, which leads to low compression at the later stages. The KMTF eliminates this problem by concentrating the indexes at the lower gray levels, as visible in the gray level range 0-100.
Figure 4. Comparison of the frequency of gray level occurrences for image "Lena"
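A minimal sketch of the reduced-map idea behind the Kernel MTF is given below, reproducing the Table II example. In the proposed scheme such an encoding would be applied to each raster-scanned kernel (e.g. 64x64) of the BWT output, and the encoding symbols would have to be stored per kernel for decoding; the function name and the ascending initial ordering of the reduced map are our assumptions, not the authors' exact implementation.

```python
def reduced_map_mtf_encode(block_symbols):
    """MTF with a reduced symbol map: the map holds only the symbols that
    actually occur in the block (the "encoding symbols"), initialised here
    in ascending order. Both the encoding symbols and the indexes are
    needed to decode the block."""
    encoding_symbols = sorted(set(block_symbols))
    symbols = list(encoding_symbols)
    indexes = []
    for s in block_symbols:
        rank = symbols.index(s)               # rank within the reduced map
        indexes.append(rank)
        symbols.insert(0, symbols.pop(rank))  # move the symbol to the front
    return encoding_symbols, indexes

# Reproduces Table II: reduced map [A, B, N] for the string "NNNBAAAA".
print(reduced_map_mtf_encode("NNNBAAAA"))
# -> (['A', 'B', 'N'], [2, 0, 0, 2, 2, 0, 0, 0])
```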
V. EXPERIMENTATION AND RESULTS
The proposed model was tested on the sample JPEG images shown in Fig. 5. All sample images have a resolution of 512x512x3 (786432 bytes). The system used for the experiments had the following specifications: Intel Pentium D CPU at 3.4 GHz, 1 GB RAM, 926 MB virtual memory, running Windows 7 (32-bit).
Figure 5. Sample images used for experimentation
Compression ratio (CR) is calculated using the following formula:
CR = n1 / n2    (1)
where n1 is the original data size and n2 is the encoded data size.
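As a worked instance of (1), using the sizes reported in Table III: for image 1 under the original BWCA scheme, CR = 786432 / 650541 ≈ 1.21, and for the same image with the 64x64 Kernel MTF, CR = 786432 / 629964 ≈ 1.25.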
Table III shows the compressed sizes and compression ratios for the original BWCA scheme and the Kernel MTF scheme with different kernel sizes. A kernel size of 64x64 gives a better compression ratio than kernel sizes of 128x128 and 32x32. The reason is that when the kernel size is small, the encoding symbols occupy more space, while when the kernel size is large, fewer encoding symbols are needed but the scheme approximates the original BWCA scheme and the overall performance degrades. Table IV shows the compression and decompression times in seconds for the original MTF scheme and the proposed Kernel MTF scheme with a kernel size of 64x64. The compression time of the two schemes is almost the same, while the decompression time of the proposed scheme is lower than that of the original scheme. Table V shows the compression ratio results for different entropy coders, with the arithmetic coder producing the best results. The decompressed images were found to be of exactly the same quality as the original images.
TABLE III. COMPARISON OF BWCA AND PROPOSED METHOD (COMPRESSED SIZE IN BYTES AND COMPRESSION RATIO)
Image #  | Original BWCA (Bytes / CR) | Kernel MTF 128x128 (Bytes / CR) | Kernel MTF 64x64 (Bytes / CR) | Kernel MTF 32x32 (Bytes / CR)
1        | 650541 / 1.21 | 628843 / 1.25 | 629964 / 1.25 | 639394 / 1.23
2        | 670257 / 1.17 | 648761 / 1.21 | 649472 / 1.21 | 658126 / 1.19
3        | 549165 / 1.43 | 532155 / 1.48 | 530652 / 1.48 | 536223 / 1.47
4        | 652471 / 1.21 | 628232 / 1.25 | 627946 / 1.25 | 640615 / 1.23
5        | 723780 / 1.09 | 686451 / 1.15 | 687859 / 1.14 | 698966 / 1.13
6        | 415176 / 1.89 | 399458 / 1.97 | 395948 / 1.99 | 396467 / 1.98
7        | 641767 / 1.23 | 618310 / 1.27 | 616102 / 1.28 | 625715 / 1.26
8        | 503859 / 1.56 | 487705 / 1.61 | 478055 / 1.65 | 477430 / 1.65
9        | 623976 / 1.26 | 602723 / 1.30 | 595569 / 1.32 | 602767 / 1.30
10       | 756981 / 1.04 | 730706 / 1.08 | 734362 / 1.07 | 744759 / 1.06
11       | 559797 / 1.40 | 543862 / 1.45 | 539007 / 1.46 | 544247 / 1.44
12       | 510996 / 1.54 | 491321 / 1.60 | 485621 / 1.62 | 488294 / 1.61
13       | 509004 / 1.55 | 492574 / 1.60 | 486610 / 1.62 | 486120 / 1.62
14       | 600166 / 1.31 | 572301 / 1.37 | 570380 / 1.38 | 577870 / 1.36
15       | 520851 / 1.51 | 497954 / 1.58 | 492452 / 1.60 | 496688 / 1.58
16       | 606459 / 1.30 | 586167 / 1.34 | 580311 / 1.36 | 584344 / 1.35
AVERAGE  | 593453 / 1.36 | 571720 / 1.41 | 568769 / 1.42 | 574876 / 1.40
TABLE IV. COMPRESSION AND DECOMPRESSION TIME (IN SECONDS)
Image #  | Compression: Original MTF | Compression: Proposed KMTF | Decompression: Original MTF | Decompression: Proposed KMTF
1        | 44.61 | 44.87 | 11.98 | 11.90
2        | 43.66 | 39.39 | 12.25 | 10.46
3        | 38.23 | 37.11 | 10.94 | 9.04
4        | 39.30 | 39.39 | 12.10 | 9.94
5        | 43.42 | 39.20 | 12.45 | 10.44
6        | 44.48 | 31.67 | 8.06  | 7.11
7        | 42.48 | 39.15 | 12.26 | 10.21
8        | 42.19 | 33.60 | 8.25  | 8.41
9        | 43.41 | 35.32 | 9.05  | 9.13
10       | 40.47 | 37.05 | 9.71  | 9.80
11       | 45.76 | 34.63 | 8.68  | 8.82
12       | 39.31 | 32.25 | 7.72  | 7.82
13       | 39.35 | 37.70 | 8.46  | 9.03
14       | 41.13 | 34.62 | 8.21  | 8.57
15       | 40.68 | 33.42 | 7.28  | 7.89
16       | 38.63 | 37.24 | 8.72  | 9.63
AVERAGE  | 41.70 | 36.66 | 9.76  | 9.26
TABLE V. COMPRESSION RATIO OF DIFFERENT ENTROPY CODERS
Image #  | Original BWCA: Huffman (Bytes / CR) | Kernel MTF: Huffman (Bytes / CR) | Kernel MTF: Canonical Huffman (Bytes / CR) | Kernel MTF: Arithmetic (Bytes / CR)
1        | 650541 / 1.21 | 648637 / 1.21 | 632364 / 1.25 | 623483 / 1.26
2        | 670257 / 1.17 | 667564 / 1.18 | 651872 / 1.21 | 642759 / 1.23
3        | 549165 / 1.43 | 549140 / 1.43 | 533052 / 1.48 | 519626 / 1.52
4        | 652471 / 1.21 | 646679 / 1.22 | 630346 / 1.25 | 618935 / 1.27
5        | 723780 / 1.09 | 705581 / 1.11 | 690259 / 1.14 | 680918 / 1.16
6        | 415176 / 1.89 | 415865 / 1.89 | 398347 / 1.99 | 385414 / 2.05
7        | 641767 / 1.23 | 634912 / 1.24 | 618502 / 1.28 | 608831 / 1.30
8        | 503859 / 1.56 | 495348 / 1.59 | 480455 / 1.65 | 463578 / 1.70
9        | 623976 / 1.26 | 614568 / 1.28 | 597969 / 1.32 | 587849 / 1.34
10       | 756981 / 1.04 | 752170 / 1.05 | 736762 / 1.07 | 732222 / 1.08
11       | 559797 / 1.40 | 558398 / 1.41 | 541407 / 1.46 | 527233 / 1.50
12       | 510996 / 1.54 | 504932 / 1.56 | 488021 / 1.62 | 473721 / 1.67
13       | 509004 / 1.55 | 505955 / 1.55 | 489010 / 1.62 | 497454 / 1.59
14       | 600166 / 1.31 | 589048 / 1.34 | 572780 / 1.38 | 561993 / 1.40
15       | 520851 / 1.51 | 511274 / 1.54 | 494852 / 1.60 | 479127 / 1.65
16       | 606459 / 1.30 | 599293 / 1.31 | 582711 / 1.36 | 569816 / 1.38
AVERAGE  | 593453 / 1.36 | 587460 / 1.37 | 571169 / 1.42 | 560810 / 1.44
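The paper states (Section II) that a canonical Huffman scheme was also applied at the EC stage, and Table V compares it with plain Huffman and arithmetic coding. As a rough, generic illustration of the canonical idea only (not the authors' encoder), the sketch below derives Huffman code lengths from symbol frequencies and then assigns canonical codes from the lengths alone; all names are ours.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length} for a standard Huffman tree built from
    a {symbol: frequency} mapping."""
    if len(freqs) == 1:                           # degenerate single-symbol case
        return {next(iter(freqs)): 1}
    heap = [(f, i, (s,)) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tick = len(heap)                              # tie-breaker for equal weights
    while len(heap) > 1:
        f1, _, group1 = heapq.heappop(heap)
        f2, _, group2 = heapq.heappop(heap)
        for s in group1 + group2:                 # every merge adds one bit
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, tick, group1 + group2))
        tick += 1
    return lengths

def canonical_codes(lengths):
    """Assign canonical Huffman codes: symbols sorted by (length, symbol)
    receive consecutive codes, left-shifted whenever the length grows."""
    codes, code, prev_len = {}, 0, 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes

lengths = huffman_code_lengths(Counter("abracadabra"))
print(canonical_codes(lengths))
# -> {'a': '0', 'b': '100', 'c': '101', 'd': '110', 'r': '111'}
```

The practical appeal of the canonical form is that the decoder only needs the code lengths, not the full tree, which keeps the stored header small.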
VI. CONCLUSIONS AND FUTURE WORK
Kernel MTF, a modification of the MTF encoder stage, increases the compression ratio. The experimental results show that the Kernel MTF gives improved performance on average compared with the original BWCA scheme. Compression ratios for different kernel/block sizes have been presented in this paper. Compression and decompression times (in seconds) are almost the same for both schemes. A kernel size of 64x64 mostly gives the best compression ratio. The results show that when the kernel size is small, the encoding symbols occupy more space and the compression ratios are low, while when the kernel size is large, fewer encoding symbols are needed but the scheme approximates the original BWCA scheme and the overall performance degrades. Other methods are currently being investigated to reduce the encoding symbols and to improve the RLE and EC stages. Different variants of these schemes are also being considered to reduce the compression and decompression times.

REFERENCES
[1] J. Bentley, D. Sleator, R. Tarjan, and V. Wei, "A locally adaptive data compression scheme", Communications of the ACM, 29, 1986, 320-330.
[2] M. Burrows and D. J. Wheeler, "A block-sorting lossless data compression algorithm", SRC Research Report 124, Digital Systems Research Center, Palo Alto, 1994.
[3] P. Fenwick, "Block sorting text compression - final report", Technical Report 130, Department of Computer Science, University of Auckland, New Zealand, 1996.
[4] M. Schindler, "A fast block-sorting algorithm for lossless data compression", in Proceedings of the IEEE Data Compression Conference, Snowbird, Utah, J. A. Storer and M. Cohn, Eds., 1997, p. 469.
[5] H. Guo and C. S. Burrus, "Waveform and image compression with the Burrows Wheeler transform and the wavelet transform", in Proceedings of the IEEE International Conference on Image Processing, Oct. 1997, 65-68.
[6] B. Balkenhol and S. Kurtz, "Universal data compression based on the Burrows-Wheeler transformation: theory and practice", IEEE Transactions on Computers, 49(10), 1998, 1043-1053.
[7] B. Balkenhol and Y. Shtarkov, "One attempt of a compression algorithm using the BWT", SFB 343: Discrete Structures in Mathematics, Faculty of Mathematics, University of Bielefeld, Germany, 1999.
[8] S. Deorowicz, "Improvements to Burrows-Wheeler compression algorithm", Software-Practice and Experience, 30(13), 2000, 1465-1483.
[9] S. Deorowicz, "Second step algorithms in the Burrows-Wheeler compression algorithm", Software-Practice and Experience, 32(2), 2002, 99-111.
[10] J. Abel, "Improvements to the Burrows-Wheeler compression algorithm: after BWT stages", ACM Transactions on Computer Systems, submitted for publication, 2003.
[11] E. Syahrul, J. Dubois, V. Vajnovszki, T. Saidani and M. Atri, "Lossless image compression using Burrows Wheeler Transform (methods and techniques)", IEEE International Conference on Signal Image Technology and Internet Based Systems, 2008.
[12] M. Kufleitner, "On bijective variants of the Burrows-Wheeler Transform", unpublished.