LOW BITRATE IMAGE COMPRESSION USING SELF-ORGANIZED KOHONEN MAPS

Mehrtash T. Harandi
Dept. of Electrical and Computer Engineering, University of Tehran, Tehran, 14395, Iran

Mohammad Gharavi-Alkhansari
Department of Electrical Engineering, Tarbiat Modarres University, Tehran, Iran

ABSTRACT

In this paper, we propose a new image compression algorithm based on Kohonen self-organized maps. The compression is based on vector quantization (VQ) of the DCT coefficients of image blocks, where the VQ is implemented by a Kohonen network. At low bitrates, the proposed method performs better than an earlier compression scheme developed by Amerijckx et al. [1] and shows better subjective results than JPEG.

1. INTRODUCTION

In mammals, visual information is processed by a massively parallel interconnected network. This motivates new methods that use neural networks for image processing. In this paper, we use Kohonen self-organized maps (KSOM) [2] for image compression.

Vector quantization (VQ) [3] is a popular method for image compression. In VQ, a limited number of vectors (codewords) is used to approximate an N-dimensional space. Selecting the codewords so that they represent the space as well as possible is a major issue in VQ. A vector quantizer is said to be optimal if its distortion is minimum among all vector quantizers with the same number of codewords. Madeiro et al. have shown that KSOMs satisfy the necessary condition for optimal VQ [4]. The KSOM is a reliable and efficient way to achieve VQ and has been shown [5] to be usually faster than, and to perform better than, conventional algorithms such as LBG [6]. One useful property of the KSOM is topological ordering: if vectors in the input space are near each other, then their projections in the output space will also be close. In the proposed algorithm, we use this property to compress the output of the KSOM more efficiently.

Many authors have used the KSOM for vector quantization. In [1], Amerijckx et al. introduced a compression scheme based on the KSOM. In this paper, we present a compression algorithm based on the discrete cosine transform (DCT) [7], vector quantization using a KSOM, differential coding, and entropy coding.
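As an illustration of VQ with a Kohonen map, one training update (winner selection followed by a Gaussian-neighborhood update over the map grid) might be sketched as follows. The grid size, learning rate, and decay schedule here are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

grid = 8                  # an 8x8 map for illustration (the paper uses e.g. 40x40)
dim = 12                  # input dimension (12 retained DCT coefficients)
weights = rng.standard_normal((grid * grid, dim))
coords = np.array([(i, j) for i in range(grid) for j in range(grid)], dtype=float)

def train_step(x, weights, lr, radius):
    """One SOM update for input vector x; returns the winner index."""
    # Winner neuron: the codeword nearest to x in Euclidean distance.
    winner = np.argmin(np.sum((weights - x) ** 2, axis=1))
    # Gaussian neighborhood centered on the winner's grid position.
    d2 = np.sum((coords - coords[winner]) ** 2, axis=1)
    h = np.exp(-d2 / (2.0 * radius ** 2))
    weights += lr * h[:, None] * (x - weights)
    return winner

# Radius (and, here, the learning rate) decay exponentially during training.
radius0, lr0, steps = float(grid), 0.5, 200
for t, x in enumerate(rng.standard_normal((steps, dim))):
    frac = t / steps
    train_step(x, weights, lr0 * np.exp(-3 * frac), radius0 * np.exp(-3 * frac))
```

Because neighboring neurons are pulled toward the same inputs, nearby map indices end up representing nearby input vectors, which is the topological ordering exploited later by the entropy coder.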
Through simulations, we compare the performance of the proposed algorithm with those of [1] and the JPEG standard [8]. We will refer to our proposed system as the high compression rate system (HCRS).

(M. Gharavi-Alkhansari is now with the Department of Communication Systems, University of Duisburg-Essen, Bismarckstr. 81, D-47057 Duisburg, Germany; e-mail: [email protected].)

2. METHODOLOGY

The proposed HCRS encoding/decoding system is shown in Fig. 1. The input image is partitioned into 8 × 8 blocks, and the DCT

transform of each block is computed. We rearrange the DCT coefficients by scanning them in a zigzag fashion, beginning with the DC coefficient. This produces a column vector of 64 coefficients. Coefficients near the top of the vector correspond to the low-frequency components of the block, and coefficients near the bottom correspond to the high-frequency components. Hence, most of the block energy is concentrated in the upper part of the vector.

Because the KSOM does not converge when the dimension of the input signal is large, this dimension needs to be reduced. In our algorithm, this is achieved by keeping the top portion of the column vector and discarding the rest, which is equivalent to lowpass filtering the image block. The number of low-frequency coefficients kept in our algorithm is 12, a value determined by simulations.

The next step is VQ, which is a lossy compression method. For each image block, the output of the KSOM is the index of the winner neuron. Therefore, the input image is transformed into an index image in which each pixel corresponds to an 8 × 8 block of the input image.

The second part of the encoder is a lossless entropy coder. In the entropy coder, the elements of the index image are first rearranged by snake scanning, as shown in Fig. 2. Due to the topological ordering principle described earlier, and noting that most parts of the input image are smooth, we expect the index image to be smooth as well. Snake scanning rearranges the indices to achieve a better compression bitrate: if a differential coder were applied row by row, there would be a large differential magnitude between the last index of each row and the first index of the following row, and the snake scan reduces this effect.
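The lossy front end just described (8 × 8 DCT, zigzag scan, and keeping the first 12 coefficients) can be sketched as follows. This is a minimal illustration: the orthonormal DCT-II matrix construction and the zigzag order are standard, and the sample block is arbitrary.

```python
import numpy as np

N = 8
k = np.arange(N)
# Orthonormal DCT-II basis matrix: C[i, j] = c_i * cos(pi * (2j + 1) * i / 16).
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
C[0, :] = np.sqrt(1.0 / N)     # DC row scaling for orthonormality

def dct2(block):
    """Orthonormal 2-D DCT-II of an 8x8 block."""
    return C @ block @ C.T

def zigzag_order(n=8):
    """Indices of an n x n block in zigzag order, starting at the DC term."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def encode_block(block, keep=12):
    """DCT, zigzag scan, and lowpass truncation to `keep` coefficients."""
    coeffs = dct2(block.astype(float))
    zz = np.array([coeffs[i, j] for i, j in zigzag_order()])
    return zz[:keep]           # discard the high-frequency tail

block = np.arange(64, dtype=float).reshape(8, 8)
vec = encode_block(block)      # 12-dimensional KSOM input vector
```

Each such 12-dimensional vector is then quantized to the index of the KSOM winner neuron.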
The next block in the entropy coder is a one-dimensional differential coder. The output of the differential coder is simply the difference between its current and previous inputs, as shown in Fig. 3. To remove the redundancy remaining in the output of the differential coder, we use an arithmetic coder [9].

The reverse procedure is applied in the decoder to obtain the decoded image from the encoded data, as shown in Fig. 1. The encoder can achieve different bitrates by varying the number of neurons in the KSOM.

3. SIMULATION RESULTS

For the simulations, the KSOM is trained with a symmetric two-dimensional Gaussian neighborhood function whose initial radius is equal to the KSOM size. The radius of the Gaussian function is decreased exponentially during the training phase. The training phase is completed when no significant changes occur in the decoded image quality for a given sample image. For training, the peak signal-to-noise ratio

[Fig. 1 (block diagram): the lossy encoder (8 × 8 DCT, zigzag scan, lowpass filter, vector quantization by the KSOM with its codebook lookup table) is followed by the entropy encoder (snake scan, differential coder, arithmetic coder); the decoder applies the inverse stages in reverse order (arithmetic decoder, inverse differential decoder, snake scan, codebook lookup, inverse zigzag scan, inverse 8 × 8 DCT).]

Fig. 1. The HCRS Encoder/Decoder.
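The entropy-encoder path of Fig. 1 (snake scan followed by differential coding) can be sketched as follows; the arithmetic coder that follows is omitted, and the small index image is a made-up example.

```python
import numpy as np

def snake_scan(index_image):
    """Flatten a 2-D index image row by row, reversing every other row,
    so consecutive scan positions are always spatial neighbors."""
    rows = [row if r % 2 == 0 else row[::-1]
            for r, row in enumerate(index_image)]
    return np.concatenate(rows)

def differential_encode(seq):
    """e(k) = x(k) - x(k-1); the first input is emitted unchanged."""
    seq = np.asarray(seq, dtype=int)
    return np.concatenate(([seq[0]], np.diff(seq)))

def differential_decode(diffs):
    """Inverse of differential_encode: running sum of the differences."""
    return np.cumsum(diffs)

idx = np.array([[5, 6, 7],
                [9, 8, 7],
                [7, 7, 6]])
scanned = snake_scan(idx)            # [5, 6, 7, 7, 8, 9, 7, 7, 6]
diffs = differential_encode(scanned)
```

Because a smooth index image yields mostly small differences, the resulting sequence has a highly peaked distribution that the arithmetic coder compresses efficiently.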

(PSNR) is used as the measure of image quality, which, for a 256 gray-level image, is defined as

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\frac{1}{NM}\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}\bigl(I(n,m)-\hat{I}(n,m)\bigr)^2}$$

where I(n, m) and Î(n, m) denote the intensity of the original image and the decoded image, respectively, at the pixel located in row m and column n of the image.

One of the first steps in designing the compression scheme is to determine the best lowpass filter for removing the high-frequency content from the DCT vector of image blocks. To resolve this issue, we trained the KSOM with different numbers of DCT coefficients. The results are listed in Table 1 for KSOMs of sizes 40 × 40 and 90 × 90. From Table 1, it can be seen that the highest PSNR is achieved when 12 DCT coefficients are kept.

By decreasing the number of KSOM neurons, higher compression ratios are achieved. Simulation results show that the smallest KSOM that performs well with the designated lowpass filter is the 40 × 40 KSOM. In Table 2, PSNRs and bits per pixel (BPP) are listed for KSOMs of different sizes and for several decoded standard images.

Fig. 2. Snake scanner.

Fig. 3. Differential coder.

Table 1. PSNR (in dB) of the decoded image Lena for different numbers of DCT coefficients.

Number of DCT Coefficients   KSOM 40×40   KSOM 90×90
10                           26.71        29.40
11                           26.83        29.46
12                           26.90        29.60
13                           26.52        29.10
14                           26.34        28.64

Table 2. PSNR (in dB) and BPP of decoded standard images for KSOMs of different sizes.

          KSOM 40×40      KSOM 60×60      KSOM 90×90
Image     PSNR    BPP     PSNR    BPP     PSNR    BPP
Lena      26.90   0.12    27.11   0.15    29.59   0.16
Barbara   22.83   0.14    22.23   0.15    23.39   0.17
Baboon    20.76   0.14    21.01   0.17    21.93   0.18
Peppers   25.81   0.12    26.06   0.14    27.53   0.17
Boat      25.02   0.12    25.25   0.16    26.36   0.17
Bridge    21.95   0.14    22.29   0.14    22.65   0.17
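The PSNR quality measure used throughout the results can be computed with a short sketch like the following (for 8-bit, 256 gray-level images; the sample arrays are arbitrary).

```python
import numpy as np

def psnr(original, decoded):
    """PSNR in dB for 8-bit images: 10*log10(255^2 / MSE)."""
    original = np.asarray(original, dtype=float)
    decoded = np.asarray(decoded, dtype=float)
    mse = np.mean((original - decoded) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

a = np.zeros((8, 8))
b = np.full((8, 8), 5.0)    # uniform error of 5 gray levels
# MSE = 25, so PSNR = 10*log10(255^2/25) = 20*log10(51) dB
value = psnr(a, b)
```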

3.1. KSOM Robustness against Training Data

A major issue in VQ is its sensitivity to the initial codebook. In other words, in the training phase, the initial codebook and the training data should be chosen with care; otherwise, the VQ may not give an efficient representation of the input space. To determine the sensitivity of the KSOM to the training data and the initial codebook, we trained our 90 × 90 KSOM with two clearly different sets of images, shown in Fig. 4. In Fig. 5, the PSNR of the decoded Lena image is plotted versus the number of iterations used to train the KSOM. It can be seen from this figure that the two KSOMs result in very close PSNRs after training. Since the Lena image is in set 1, the KSOM trained on set 1 gives a somewhat better PSNR (by approximately 1.7 dB) for this image than the KSOM trained on set 2.

3.2. Comparison with JPEG and the Method of [1]

In this section, we compare the results of HCRS with those of the JPEG standard [8] and the algorithm of [1]. (The JPEG results are obtained with the ACDSee ver. 2.04 software under the Windows 2000 platform.)

Fig. 4. Two different training sets.

In Fig. 6, the PSNR versus BPP is shown for compression of the Lena image using the above three algorithms. This figure shows that the proposed algorithm gives higher PSNR (by approximately 4.5 dB) than the algorithm of [1]. The JPEG coder has slightly higher PSNR than the proposed algorithm except at BPP = 0.164.

For a subjective comparison between the proposed method and the JPEG algorithm, Fig. 7 shows the decoded images, after compression with these two methods, for the images "Lena," "Baboon," and "Barbara." In Fig. 7, we also zoom in on some parts of each image to make it easier to compare edges and textures in the two compression algorithms. By comparing the edges of the decoded images from the proposed scheme with those of the JPEG standard, it can be seen that HCRS gives better subjective performance. This is because, in low-bitrate compression, the JPEG algorithm can keep only a very small number of DCT coefficients. HCRS, on the other hand, compresses each 8 × 8 block by estimating the 12 high-energy DCT coefficients and therefore preserves edges much better than JPEG.

4. CONCLUSIONS

In this paper, a new image compression algorithm based on Kohonen self-organized maps is proposed. The KSOM is used for vector quantization of the DCT coefficients of image blocks. At low bitrates, the proposed method performs better than an earlier compression scheme developed by Amerijckx et al. [1] and shows better subjective results compared to JPEG.

5. REFERENCES

[1] C. Amerijckx, M. Verleysen, P. Thissen, and J.-D. Legat, "Image compression by self-organized Kohonen map," IEEE Trans. Neural Networks, vol. 9, pp. 503–507, May 1998.
[2] T. Kohonen, Self-Organizing Maps. Springer, 3rd ed., 1997.

Fig. 5. PSNR (in dB) versus the number of iterations in training of the 90 × 90 KSOM for the two training sets 1 and 2.

[3] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Kluwer, 1992.
[4] F. Madeiro, R. M. Vilar, and B. G. A. Neto, "A self-organizing algorithm for image compression," in Proc. IEEE Vth Brazilian Symposium on Neural Networks, pp. 146–150, Dec. 9–11, 1998.
[5] J. A. Corral and M. Guerrero, "Image compression via optimal vector quantization: A comparison between SOM, LBG and k-means algorithms," in Proc. IEEE International Conference on Neural Networks, vol. 6, pp. 4113–4118, June 27–July 2, 1994.
[6] Y. Linde, A. Buzo, and R. M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Commun., vol. COM-28, pp. 84–95, Jan. 1980.
[7] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Trans. Computers, vol. C-23, pp. 88–93, Jan. 1974.
[8] G. K. Wallace, "The JPEG still picture compression standard," Communications of the ACM, vol. 34, pp. 30–44, Apr. 1991.

Fig. 6. PSNR (in dB) versus BPP for compression using HCRS, JPEG, and the algorithm of [1].

[9] I. H. Witten, R. M. Neal, and J. G. Cleary, “Arithmetic coding for data compression,” Communications of the ACM, vol. 30, pp. 530–540, June 1987.

Fig. 7. (a) Lena image compressed by HCRS, PSNR=29.59, BPP=0.164. (b) Lena image compressed by JPEG, PSNR=29.82, BPP=0.172. (c) Lena image compressed by HCRS, PSNR=27.11, BPP=0.149. (d) Lena image compressed by JPEG, PSNR=29.12, BPP=0.143. (e) Lena image compressed by HCRS, zoomed, BPP=0.164. (f) Lena image compressed by JPEG, zoomed, BPP=0.172. (g) Lena image compressed by HCRS, zoomed, BPP=0.149. (h) Lena image compressed by JPEG, zoomed, BPP=0.143. (i) Baboon image compressed by HCRS, PSNR=21.93, BPP=0.177. (j) Baboon image compressed by JPEG, PSNR=21.39, BPP=0.204. (k) Barbara image compressed by HCRS, PSNR=23.07, BPP=0.155. (l) Barbara image compressed by JPEG, PSNR=22.90, BPP=0.148. (m) Baboon image compressed by HCRS, zoomed, BPP=0.177. (n) Baboon image compressed by JPEG, zoomed, BPP=0.204. (o) Barbara image compressed by HCRS, zoomed, BPP=0.155. (p) Barbara image compressed by JPEG, zoomed, BPP=0.148.