Adaptive decimation algorithm for image compression

ANISSA ZERGAINOH, JEAN-PIERRE ASTRUC
L2TI, Institut Galilée, Université de Paris Nord, Avenue J. B. Clément, 93430 Villetaneuse, FRANCE

Abstract: - This paper proposes a new lossy image compression method based on two mixed transformation techniques. The first employs an adaptive decimation algorithm that adapts the local sampling rate so as to minimize the error variance of the reconstructed image. To select the pixels to retain from the original image, the adaptive decimation process exploits the high spatial correlation between pixels so as to reduce the redundancy of the information. The second transformation efficiently encodes the gray-level value and the spatial location of each pixel retained in the new sub-sampled image. Experimental results on various images show that the proposed compression method provides competitive compression results in terms of PSNR.
Key-Words: - Compression, Still image, Image modeling, Statistical interpolation, Spatial correlation, Kriging, Subsampling, Decimation, Run Length Encoding algorithm, JPEG.

1 Introduction

As media communication grows, image data compression receives increasing interest. Various techniques (transforms, algorithms, …) and standards (JPEG, MPEG-4, …) have been proposed to reduce the size of stored data or to speed up transmission, under the constraint of preserving good fidelity in the decoded image. A digital image is represented as a two-dimensional matrix produced by the uniform sampling of an image acquisition system. A sampling rate above the Nyquist rate is required to avoid aliasing and to obtain a high-resolution image. This restriction introduces considerable redundancy in uniformly sampled images. For example, a natural image contains both homogeneous regions and non-homogeneous detail regions. Regular sampling places the same number of samples over the entire image, causing either loss of information due to aliasing or redundancy of information. This can be avoided by an efficient image representation, and irregular sampling seems to be a promising tool for this purpose. In a transmission application, the transmitter judiciously encodes the sub-sampled image, and the receiver must reconstruct the original image from these scattered pixels with minimal loss of information. Numerous reconstruction methods are available, such as methods based on the theory of regularization [1, 2], polynomial methods [4-9], and reconstruction methods for band-limited signals [3].

We distinguish two important groups of reconstruction algorithms. In the first, the sub-sampling process is independent of the reconstruction method [16]. In the second, sub-sampling depends on the reconstruction method, i.e. the positions of the remaining pixels are chosen so as to reduce the Mean Square Error (MSE) of the reconstructed image [8, 9, 10, 19]. This paper proposes a lossy image compression method based on an adaptive decimation algorithm combined with an efficient coding of the transmitted pixels. The paper is organized as follows. Section 2 describes the encoder, where the strategy of the adaptive decimation algorithm is explained, followed by the coding technique applied to the sub-sampled image. Section 3 presents the decoder, where the reconstruction method is introduced. Section 4 provides experimental results. Concluding remarks and perspectives are given in Section 5.

2 Optimal encoder

Our encoder is composed of two stages. The first is the adaptive decimation procedure: it retains only the non-redundant, decorrelated samples and constitutes the optimal distribution of pixels on the image grid. The second stage is an efficient coding of the pixel positions and gray-level values. The encoder block diagram is given in Figure 1. Boxes containing the integer factor M, in the decimation and interpolation processes, indicate a variable down-sampling and up-sampling operation, respectively.

Fig.1 General block diagram of the encoder.

2.1 Adaptive decimation algorithm for an efficient image representation

The adaptive decimation algorithm aims to provide an optimal sampling set of pixels minimizing the Mean Square Error (MSE) of the reconstructed image. It is performed according to the reconstruction method. The global process is illustrated in Fig.2. In practice, the algorithm operates as follows. Let S0 be the size of the original image I0 (S0 = N*N) and Î0 the image reconstructed from the sub-sampled image IS. x(i, j) is the gray-level value of the pixel at spatial position (i, j) in one of the three images (I0, Î0, IS). For the initialization step, the adaptive decimation algorithm needs a reasonable starting point in order to perform well. We choose S1 ≤ 0.01*S0 starting values uniformly distributed on the original image grid (regular decimation). The reconstruction method, described in Section 3, is used to recover the original image from its down-sampled version, and the differences between original and reconstructed pixel values are calculated. At each iteration, one pixel is inserted into the sub-sampled image at the position where the reconstruction error

e(i, j) = Max( I0 − Î0 )

is maximal. The process is iterated until a fixed fraction of pixels ( SS < S0 ) to transmit is reached. Because of the initial uniform decimation, we must undertake a supplementary optimization step (optimization 2) refining the first optimization described above. Optimization 2 consists in replacing each pixel x(i, j) belonging to the sub-sampled image IS by one of its eight neighbors { x(i+1, j), x(i−1, j), x(i, j−1), x(i, j+1), x(i−1, j−1), x(i+1, j+1), x(i−1, j+1), x(i+1, j−1) } in the original image. For each permutation, the reconstruction method is applied and a local MSE is computed. Among these eight permutations, we keep the one providing the minimum local MSE. The process is iterated several times on the sub-sampled image until the algorithm converges to a constant global MSE [19].

Fig.2 Adaptive decimation algorithm.
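The greedy insertion loop (optimization 1 of Fig.2) can be sketched as follows. This is a minimal sketch, not the authors' implementation: `nearest_fill` is a hypothetical stand-in for the kriging reconstructor of Section 3, included only so the loop is runnable; the 1% initialization and 4.9% target fraction follow the values quoted in the paper.

```python
import numpy as np

def nearest_fill(image, mask):
    """Stand-in reconstructor: nearest-neighbour fill. The paper uses the
    kriging interpolator of Section 3 instead; this brute-force substitute
    only illustrates the interface the decimation loop expects."""
    ys, xs = np.nonzero(mask)
    out = np.empty(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            k = np.argmin((ys - i) ** 2 + (xs - j) ** 2)
            out[i, j] = image[ys[k], xs[k]]
    return out

def adaptive_decimation(image, reconstruct, target_fraction=0.049,
                        init_fraction=0.01):
    """Greedy adaptive decimation (optimization 1): start from a sparse
    regular grid of about init_fraction * S0 pixels, then repeatedly insert
    the pixel where the reconstruction error e(i, j) is maximal, until the
    fixed fraction S_S of pixels is reached."""
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    step = int(round(1.0 / np.sqrt(init_fraction)))  # regular initial grid
    mask[::step, ::step] = True
    target = int(target_fraction * h * w)
    while mask.sum() < target:
        error = np.abs(image.astype(float) - reconstruct(image, mask))
        error[mask] = -1.0       # never re-insert an already retained pixel
        i, j = np.unravel_index(np.argmax(error), error.shape)
        mask[i, j] = True        # insert the worst-reconstructed pixel
    return mask
```

The returned boolean mask is the sub-sampled image support; optimization 2 (the eight-neighbor swap) would then be run on this mask.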

2.2 Efficient coding

The aim of this section is to minimize the number of bits required to represent the value and the spatial position of the pixels constituting the sub-sampled image. The codification of the spatial positions must be negligible compared to the codification of the pixel values. The previous section showed that the adaptive decimation algorithm reduces the memory size of the original image, since the set of pixels composing the new sub-sampled image has decreased. To increase the compression ratio while preserving a good visual quality of the decoded image, we would like to encode the pixel values with a lossless compression method. Note, however, that the proposed decimation algorithm has decorrelated the components of the original image by keeping only the fraction of pixels carrying the useful information; it is therefore difficult to find redundancies in the sub-sampled image. For this reason, the pixel values are encoded using a uniform quantizer.

To encode the pixel positions, a coordinate map is created. It is a monochrome grid of the same size as the original image: a single bit (one or zero) at each position indicates the presence or absence of a retained pixel in the original image. In order to limit the overhead of the position codification, the bits are gathered eight to a byte, and the Run Length Encoding (RLE) algorithm is applied to this assemblage. The algorithm finds a symbol whose number of consecutive appearances exceeds a fixed threshold and replaces the sequence by two pieces of information: the repetition count and the symbol to repeat. The replacement sequence is smaller, in terms of memory size, than the initial one.
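The bitmap packing and run-length step can be sketched as below. The escape-marker scheme is an assumption of this sketch; the paper only states that long runs are replaced by a (count, value) pair above a fixed threshold.

```python
import numpy as np

def pack_position_map(mask):
    """Pack the binary coordinate map into bytes, 8 pixels per byte,
    as described in Section 2.2."""
    return np.packbits(mask.astype(np.uint8).ravel()).tobytes()

ESC = 0xFF  # hypothetical escape marker introducing a (count, value) pair

def rle_encode(data, threshold=3):
    """Runs longer than `threshold` become ESC + count + value;
    shorter runs are copied verbatim (literal ESC bytes are always escaped)."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        if run > threshold or data[i] == ESC:
            out += bytes([ESC, run, data[i]])
        else:
            out += bytes([data[i]]) * run
        i += run
    return bytes(out)

def rle_decode(data):
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESC:
            out += bytes([data[i + 2]]) * data[i + 1]  # expand (count, value)
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```

Since most of the map is zeros (only a small fraction of pixels is retained), the packed bytes contain long zero runs and compress well under this scheme.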

3 Decoder based on an image parameter model

The decoder receives the information described in Section 2 and analyzes it in order to construct the sub-sampled image. To obtain the original high-resolution image, the reverse of the decimation procedure must be performed on the down-sampled version, by applying the reconstruction method explained below.

3.1 Reconstruction method

The decoder employs a method known as kriging, first proposed by Krige [12]. It was developed in the field of mining and is one of the efficient tools of geostatistics; its mathematical foundations were established by Matheron [13], and it has found many applications in hydrology and meteorology. The image surface is modeled as the realization of a random function (stochastic process). The random process is assumed to be second-order stationary: the first two moments are invariant by translation, so the process has a constant mean and covariance. Let y(xᵢ) be the gray level of the pixel at spatial location xᵢ on the image grid. The unknown gray-level value ŷ(x), located at spatial position x, is estimated by a weighted linear combination of the n available neighboring pixels:

ŷ(x) = ∑ᵢ₌₁ⁿ λᵢ y(xᵢ)   (1)

where λᵢ is the weight associated with the pixel y(xᵢ) located at spatial position xᵢ. It is a local reconstruction method, and the second-order stationarity constraints are supposed to be satisfied. Under the stationarity hypothesis, the covariance is defined by:

Cov(xᵢ, xⱼ) = σ² Corr(xᵢ, xⱼ)   (2)

where σ² represents the variance of the process and Corr(xᵢ, xⱼ) corresponds to the spatial correlation between the two pixels i and j located at the respective spatial positions xᵢ and xⱼ. To avoid a systematic error, the mean of the prediction error must, under the stationarity hypothesis, be equal to zero:

E{ŷ(x) − y(x)} = ∑ᵢ λᵢ E{y(xᵢ)} − E{y(x)} = m (∑ᵢ λᵢ − 1) = 0   (3)

where m is the mean of y(x). This yields the following unbiasedness condition:

∑ᵢ λᵢ − 1 = 0   (4)

The best linear unbiased predictor is obtained by determining the λᵢ which minimize the MSE:

MSE{ŷ(x)} = E{ ( ∑ᵢ λᵢ y(xᵢ) − y(x) )² }   (5)

The minimization of the MSE is treated by the Lagrange formalism. As a result, the weights λᵢ are obtained by solving the following system:

∑ᵢ λᵢ Cov(xᵢ, xⱼ) + μ = Cov(xⱼ, x),  ∀j
∑ᵢ λᵢ = 1   (6)

where μ is a Lagrange multiplier and Cov(xᵢ, xⱼ) is the spatial covariance between the two pixels i and j located at xᵢ and xⱼ on the image grid. To solve the system, supplementary information on the covariance function is necessary.
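The linear system of Eq. (6) together with the constraint of Eq. (4) is an (n+1)×(n+1) bordered system. A minimal sketch of solving it, assuming a user-supplied covariance function of the inter-pixel distance h (the function names here are illustrative, not the authors'):

```python
import numpy as np

def kriging_weights(neighbors, target, cov):
    """Solve the ordinary-kriging system of Eq. (6):
        sum_i lambda_i Cov(x_i, x_j) + mu = Cov(x_j, x) for all j,
        sum_i lambda_i = 1.
    `neighbors`: (n, 2) array of known pixel positions; `target`: position
    to predict; `cov`: covariance as a function of the distance h."""
    n = len(neighbors)
    A = np.ones((n + 1, n + 1))
    A[n, n] = 0.0                       # Lagrange-multiplier corner
    for i in range(n):
        for j in range(n):
            A[i, j] = cov(np.linalg.norm(neighbors[i] - neighbors[j]))
    b = np.ones(n + 1)
    b[:n] = [cov(np.linalg.norm(p - target)) for p in neighbors]
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n]              # weights lambda_i and multiplier mu

def kriging_predict(neighbors, values, target, cov):
    """Eq. (1): weighted linear combination of the neighboring gray levels."""
    lam, _ = kriging_weights(neighbors, target, cov)
    return float(lam @ values)
```

With a valid covariance model the predictor interpolates exactly: at a retained pixel position it returns that pixel's own gray level.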

3.2 Selection of a covariance model

As in geostatistics, a structural analysis is performed in the kriging method [12]. It consists in establishing, from the experimental data, a theoretical model that suitably approximates the experimental covariance function:

Cov(h) = 1 − (1/N(h)) ∑ᵢ≤ⱼ ( y(xᵢ) − y(xⱼ) )²   (7)

where N(h) is the number of pairs separated by a distance h = |xᵢ − xⱼ|.

4 Simulation results and discussion

To illustrate the performance of the proposed compression method, simulations have been run on a personal computer under Matlab. Lena, 256*256 with 8 bits per pixel, is the test image. The quality of the reconstructed image is measured by the Peak Signal to Noise Ratio (PSNR), with reference to the initial image:

PSNR = 10 log₁₀ ( Nmax² / MSE )   (8)

where Nmax is the maximum quantized gray level.
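Eq. (8) in code form, a straightforward sketch for 8-bit images (Nmax = 255):

```python
import numpy as np

def psnr(original, reconstructed, n_max=255.0):
    """PSNR of Eq. (8): 10 log10(Nmax^2 / MSE), with Nmax the maximum
    quantized gray level (255 for 8-bit images)."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10.0 * np.log10(n_max ** 2 / mse)
```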

The structural analysis is a very important step, since the choice of the covariance model determines the visual quality of the reconstructed image. The experimental covariance model has been estimated on several sub-pictures of Lena according to equation 7. It was found that two or three pixels are sufficient for the reconstruction of an unknown pixel: increasing the number of neighbors does not improve the reconstruction quality but increases the computational time. Moreover, these two or three available pixels are located within a radius of less than seven, even at high compression ratios. This is why, during the structural analysis, we have concentrated on small distances, less than seven. In all considered cases, a linear theoretical model could successfully approximate the shape of the experimental covariance structure:

Cov(h) = a (1 − h/b)   (9)

where a and b are constants whose values are determined experimentally.
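The structural analysis can be sketched as follows: estimate the empirical covariance at small distances, then fit the linear model of Eq. (9) by least squares. Two assumptions of this sketch are flagged in the comments: only horizontally separated pairs are used (the paper averages over all pairs at distance h), and the squared differences are halved, the standard semivariance normalization for unit-variance data, which Eq. (7) as printed omits.

```python
import numpy as np

def empirical_covariance(image, max_h=7):
    """Empirical covariance in the spirit of Eq. (7), restricted to
    horizontally separated pairs (a simplification of this sketch), on an
    image normalized to zero mean and unit variance. The /2 is the usual
    semivariance normalization (assumption; Eq. (7) omits it)."""
    y = (image - image.mean()) / image.std()
    cov = []
    for h in range(1, max_h + 1):
        d = y[:, h:] - y[:, :-h]            # all horizontal pairs at distance h
        cov.append(1.0 - np.mean(d ** 2) / 2.0)
    return np.array(cov)

def fit_linear_model(cov, hs):
    """Least-squares fit of Eq. (9), Cov(h) = a(1 - h/b): a straight line
    with intercept a and slope -a/b."""
    slope, intercept = np.polyfit(hs, cov, 1)
    return intercept, -intercept / slope    # a, b
```

The fitted (a, b) pair then parameterizes the covariance function handed to the kriging system of Section 3.1.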

The adaptive decimation operation is performed on the original image in order to retain only 4.9% of the pixels in the sub-sampled image. Figure 4 shows the improvement of the PSNR versus the number of iterations: after five iterations, the PSNR becomes constant ( ≈ 30 dB ). The graphs displayed in Figure 5 present the distortion (PSNR) versus the number of bits per pixel (bpp). The analysis of the experimental results shows that the performance of the compression method is not degraded by the additional coding of the pixel positions. The proposed compression method outperforms traditional JPEG coding: for example, at 35 dB (see Fig.5), the bit rate of the proposed method is reduced by over 0.4 bpp compared to JPEG coding.

Fig.3 Experimental covariance model between pixels on a part of the « Lena » picture (left: part of the Lena image; right: experimental covariances versus distance).

Fig.4 PSNR versus number of iterations.

Fig.5 Distortion (dB) versus rate (bpp): (o) JPEG algorithm, (*) compression method without coding the map of pixel positions, (-) compression method with codification of the map.

Bit rate (bpp) | PSNR (dB)
4              | 47.14
2              | 39.70
0.88           | 33.91
0.39           | 30.10

Table 1 Results of the proposed method.

Table 1 provides simulation results including the optimization steps. For several fractions of pixels retained on the original image (corresponding to the bit rates of the first column), the measured PSNR is given in the second column.

Figures 6, 7, 8 and 9 show the sub-sampled images and the images reconstructed from them, for fractions of 50%, 25%, 11% and 4.9% of the samples retained from the original image, respectively. The sub-sampled images show that the decimation algorithm assigns more pixels to the important regions (edges) of the image and fewer to areas where the image transitions are slow.

5 Conclusion and perspectives

In this paper, we have proposed an image compression method based on two mixed transformation techniques. The first uses an adaptive decimation algorithm for an efficient representation of the original image: it extracts a strategically chosen set of pixels, and the extraction is linked to the reconstruction method, a two-dimensional statistical interpolation depending on the spatial covariance model of the original image. The spatial positions of the remaining pixels in the original image have been encoded efficiently without increasing the codification cost of the pixel values. Experimental results show that the proposed compression method provides competitive compression results in terms of PSNR. To improve the performance of our compression method, we suggest, as future work, decomposing the original image into several components, such as edges and texture, and adapting an efficient reconstruction algorithm to each kind of information.

Fig.6 First image corresponds to the original picture (Lena 256*256), second one to the sub-sampled image (50% of pixels kept) and last one to the reconstructed image (4 bpp).

Fig.7 First image corresponds to the original picture (Lena 256*256), second one to the sub-sampled image (25% of pixels kept) and last one to the reconstructed image (2 bpp).

Fig.8 First image corresponds to the original picture (Lena 256*256), second one to the sub-sampled image (11% of pixels kept) and last one to the reconstructed image (0.88 bpp).

Fig.9 First image corresponds to the original picture (Lena 256*256), second one to the sub-sampled image (4.9% of pixels kept) and last one to the reconstructed image (0.39 bpp).

References:
[1] L. L. Schumaker, Fitting surfaces to scattered data, in G. G. Lorentz, C. K. Chui and L. L. Schumaker (eds.), Approximation Theory II, 1976, pp. 203-268.
[2] T. Poggio and F. Girosi, Networks for approximation and learning, Proceedings of the IEEE, Vol. 78, No. 9, September 1990, pp. 1481-1496.
[3] H. G. Feichtinger, W. Kozek and T. Strohmer, Reconstruction of signals from irregular samples of its short-time Fourier transform, Proceedings SPIE 95, San Diego, July 1995.
[4] R. Franke and G. Nielson, Smooth interpolation of large sets of scattered data, International Journal for Numerical Methods in Engineering, Vol. 15, January 1980, pp. 1691-1704.
[5] C. H. Lee, Image surface approximation with irregular samples, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 2, February 1989, pp. 206-212.
[6] M. Rauth, Application of 2D methods for scattered data approximation to geophysical data sets, Proceedings SampTA-95, Riga, Latvia, 1995.
[7] D. Shepard, A two-dimensional interpolation function for irregularly spaced data, Proceedings 23rd National Conference ACM, 1968, pp. 517-523.
[8] C. H. Lee, Image surface approximation with irregular samples, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 2, February 1989, pp. 206-212.
[9] H. Le Floch and C. Labit, Irregular image sub-sampling and reconstruction by adaptive sampling, Proceedings ICIP-96, Vol. 3, pp. 379-382.
[10] K. W. Chun, J. Byeungwoo and M. J. Jae, Irregular triangular mesh representation based on adaptive control point removal, Proceedings VCIP-96, pp. 844-853.
[11] P. Chauvet, Aide-mémoire de géostatistique linéaire, École des Mines de Paris, 1994.
[12] D. G. Krige, A statistical approach to some basic mine valuation problems on the Witwatersrand, Journal of the Chemical, Metallurgical and Mining Society of South Africa, Vol. 52, 1951, pp. 119-139.
[13] G. Matheron, Principles of geostatistics, Economic Geology, Vol. 58, 1963, pp. 1246-1266.
[14] J. Lefevre, H. Roussel, E. Walter, D. Lecointe and Tabbara, Prediction from wrong models: the kriging approach, IEEE Antennas and Propagation Magazine, Vol. 38, No. 4, August 1996, pp. 35-45.
[15] J. Sacks, W. J. Welch, T. J. Mitchell and H. P. Wynn, Design and analysis of computer experiments, Statistical Science, Vol. 4, No. 4, 1989, pp. 409-435.
[16] A. Zergaïnoh, S. Hadjihassan and J.-P. Astruc, Image data reconstruction using spatial correlations, International Conference on Multimedia Technology, ICOMT'98, Hungary, pp. 25-30.
[17] N. Herodotou, A. N. Venetsanopoulos and L. Onural, Image interpolation using a simple Gibbs random field model, Proceedings of the 16th IEEE Instrumentation and Measurement Technology Conference, Vol. 1, 1999, pp. 156-159.
[18] J. P. Hoffbeck and D. A. Landgrebe, Covariance estimation for classifying high dimensional data, Geoscience and Remote Sensing Symposium 1995: Quantitative Remote Sensing for Science and Applications, Vol. 2, 1995, pp. 1023-1025.
[19] A. Zergaïnoh, S. Hadjihassan and J.-P. Astruc, Interpolation statistique à partir d'un échantillonnage irrégulier pour la reconstruction d'images, GRETSI 99, September 1999, Vannes, France, pp. 411-414.
[20] A. Zergaïnoh and J.-P. Astruc, Hybrid method for still image coding, European Signal Processing Conference (EUSIPCO'2000), 5-8 September 2000, Tampere, Finland.