COMPRESSION ALGORITHMS FOR CLASSIFICATION OF REMOTELY SENSED IMAGES

Frank Tintrup, Francesco De Natale, Daniele Giusto
Department of Electrical and Electronic Engineering, University of Cagliari
Piazza d'Armi, 09123 Cagliari, Italy
Tel: +39 - 70 - 675 5862
E-mail: [email protected]

ABSTRACT

The paper presents a comparison of the principal lossy compression algorithms, Vector Quantization (VQ), JPEG and Wavelets (WV), applied after a Karhunen Loeve Transform (KLT) to multispectral remotely sensed images and evaluated by means of a classification algorithm. The main goal of compressing remotely sensed images is to reduce the huge downlink and storage requirements. The Karhunen Loeve Transform first removes the interband correlation and produces the principal components of the image, which are then compressed by the three algorithms. Quality was evaluated by a supervised classification with the K-NN algorithm, well known in remote sensing applications, and by the MSE for the visual aspect. The results of this detailed analysis of the current compression techniques are quite surprising when compared to other recent works.

INTRODUCTION

Multispectral remotely sensed (R-S) images nowadays have huge storage requirements at high resolutions and occupy a large bandwidth during downlinking. Moreover, the wide use of these images for viewing, analysis over many spectral bands, classification and storage requires efficient methods to reduce the redundant information. Data compression therefore plays an important role in the analysis and classification of remotely sensed imagery, reducing transmission time, bandwidth and storage requirements. R-S imaging applications include change detection, where images of the same ground area are acquired and stored over long periods, earth monitoring and terrain classification, while automated or semi-automated image processing tools are used to identify and classify agricultural areas, urban areas, forests, etc. These images, which usually require about 150 Mbytes, e.g. Thematic Mapper (TM), are acquired by satellite- or aircraft-mounted sensors and are in general stored or transmitted to ground stations without any compression, thus requiring a very large bandwidth. Some interesting preliminary results have been achieved in the field of lossless compression techniques [6], also extending the JPEG standard to R-S images [9, 3], while good results have been reported for VQ techniques. For the particular application considered here, automatic classification of compressed multispectral remotely sensed images, there are still no results available in the literature, which makes this work of high interest to the reader.

METHODS

Karhunen Loeve Transform

In remotely sensed multispectral images there is usually a large amount of interband correlation due to the co-located sensors and the spectral overlap of the bands, as is the case in Landsat TM images. The most effective technique to exploit this correlation is the application of the Karhunen Loeve Transform (KLT), which produces the principal components of the image. The KLT is an orthogonal transformation that provides the minimum mean square error (MSE) when the high-index coefficients are discarded in the transformed space, and it concentrates the maximum energy in the fewest number of coefficients. The highest energy is concentrated in the transformed bands corresponding to the largest eigenvalues.
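As an illustration of this step, the following minimal sketch (in Python with NumPy, a choice made here for illustration and not something stated in the paper) shows one common way to compute the KLT of a multispectral image by eigendecomposition of the interband covariance matrix; all function and variable names are ours.

import numpy as np

def klt_forward(bands):
    """KLT (principal components) of a multispectral image.

    bands: array of shape (n_bands, rows, cols).
    Returns the transformed planes, the eigenvector matrix and the
    per-band means, which are needed for the inverse transform.
    """
    n_bands, rows, cols = bands.shape
    X = bands.reshape(n_bands, -1).astype(np.float64)   # one row per band
    means = X.mean(axis=1, keepdims=True)
    Xc = X - means                                       # remove band means
    cov = Xc @ Xc.T / Xc.shape[1]                        # interband covariance
    eigvals, eigvecs = np.linalg.eigh(cov)               # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                    # largest eigenvalue first
    eigvecs = eigvecs[:, order]
    planes = (eigvecs.T @ Xc).reshape(n_bands, rows, cols)
    return planes, eigvecs, means

def klt_inverse(planes, eigvecs, means):
    """Reconstruct the bands from (possibly degraded) KLT planes."""
    n_bands, rows, cols = planes.shape
    Y = planes.reshape(n_bands, -1)
    return (eigvecs @ Y + means).reshape(n_bands, rows, cols)

In the experiments described later, the forward transform is applied before coding and the inverse transform after decoding, so that any coding degradation is mapped back onto the original bands.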
The JPEG Algorithm

As this paper deals only with single-component (grayscale) images, we consider just the DCT-based mode of operation, essentially the compression of a stream of 8x8 pixel blocks. For detailed information on this algorithm see [13].

Vector Quantization

A k-dimensional memoryless vector quantizer (VQ) consists of an encoder α, which assigns to each input vector x = (x0, x1, ..., xk-1) a channel symbol α(x) in a specified channel symbol set M, called the codebook, and a decoder β, which assigns to each channel symbol z in M an output value x' = (x'0, x'1, ..., x'k-1) in a reproduction alphabet A [5]. The codebook is generated by the well-known LBG algorithm [8]. In the particular application of image coding, the VQ operates on small two-dimensional rectangular blocks of usually 3x3 or 4x4 image pixels. The decoded image quality depends mainly on the block and codebook sizes. Typical characteristics of decoded images are the poorly reproduced edges and the well-known "blocky" effect due to codeword edges, for which particular solutions, e.g. the construction of segmented codes and separate codebooks for edge and texture information, have been studied [4, 11].

Wavelet Transform

A more recent technique for image data compression is based on adaptive vector quantization of wavelet coefficients. This technique promises high compression rates at good image quality and usually performs better than JPEG and VQ, both in quantitative (MSE) and qualitative terms, showing none of the blocking distortion known from VQ and JPEG. Several methods for wavelet-based image compression are presented in the literature, and a number of approaches propose vector quantization of the wavelet-transformed coefficients in the different subbands [14, 1]. The wavelet representation of an image is composed of an approximation of the signal at low resolution and a set of details at several resolutions. The image at low resolution is a low-pass version of the original one, while the details contain the information at high frequencies. The signal of each subband is found through an iterative algorithm which decomposes the signal at each step into four subbands, each containing the information regarding a particular frequency band and orientation (see Figure 1). The reconstruction algorithm is strongly related to the decomposition technique: the complete signal is recovered through a pyramidal algorithm, taking into account the low-pass signal and the set of details.
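The sketch below illustrates this pyramidal decomposition into LL, LH, HL and HH subbands over two resolutions, as depicted in Figure 1. It is only illustrative: it uses the PyWavelets library and a biorthogonal spline wavelet as a stand-in, since the Lemarié-Battle cubic-spline wavelets actually used in this work (see below) are not assumed to be available under that name; the mapping of the (horizontal, vertical, diagonal) detail triples onto the LH/HL/HH labels follows the usual convention.

import numpy as np
import pywt  # PyWavelets

def wavelet_subbands(plane, wavelet="bior2.2", levels=2):
    """Two-level pyramidal decomposition of one KLT image plane.

    Returns a dict mapping subband names such as 'LL', 'LH1' or 'HH2'
    to coefficient arrays, mirroring the layout of Figure 1.
    """
    coeffs = pywt.wavedec2(np.asarray(plane, dtype=float), wavelet, level=levels)
    subbands = {"LL": coeffs[0]}                 # low-pass approximation
    for i, (lh, hl, hh) in enumerate(coeffs[1:]):
        res = levels - i                         # coarsest details come first
        subbands["LH%d" % res] = lh
        subbands["HL%d" % res] = hl
        subbands["HH%d" % res] = hh
    return subbands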
Figure 1: Wavelet representation of the first KLT sample (subbands LL, LH1, HL1, HH1 at resolution 1 and LH2, HL2, HH2 at resolution 2)

The wavelet representation used here was studied by Lemarié and Battle [7, 2] and corresponds to a multiresolution approximation constructed from cubic splines. Particular attention has to be paid to the encoding of the low-pass version (LL subband), as errors introduced there propagate in the reconstruction phase, resulting in a worse image quality. The encoding is therefore done by the lossless bidimensional DPCM technique defined in [10], which takes advantage of the correlation between adjacent pixels in all directions (horizontal, vertical and diagonal). The technique initially applies a 1-D DPCM to the first row and first column of the image. Then each pixel value is predicted by a linear combination of its three nearest pixel values, and the prediction error is coded:

x̂_{i,j} = 0.75 x_{i-1,j} - 0.50 x_{i-1,j-1} + 0.75 x_{i,j-1}

ε_{i,j} = x_{i,j} - x̂_{i,j}
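A minimal sketch of this 2-D DPCM predictor is given below; the handling of the first row and column as simple previous-pixel differences is our assumption, since the paper only states that a 1-D DPCM is applied there.

import numpy as np

def dpcm_residuals(ll):
    """Lossless 2-D DPCM of the LL subband.

    The first row and first column use a 1-D predictor (previous pixel);
    every other pixel is predicted as
        0.75*x[i-1,j] - 0.50*x[i-1,j-1] + 0.75*x[i,j-1]
    and the prediction error is what gets entropy coded.
    """
    x = ll.astype(np.float64)
    err = np.zeros_like(x)
    err[0, 0] = x[0, 0]                        # first pixel sent as-is
    err[0, 1:] = x[0, 1:] - x[0, :-1]          # 1-D DPCM along first row
    err[1:, 0] = x[1:, 0] - x[:-1, 0]          # 1-D DPCM along first column
    pred = 0.75 * x[:-1, 1:] - 0.50 * x[:-1, :-1] + 0.75 * x[1:, :-1]
    err[1:, 1:] = x[1:, 1:] - pred             # 2-D prediction error
    return err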
The statistical distribution of the wavelet coefficients at a fixed resolution and orientation is symmetric, with nearly zero mean and small variance. It is often modeled as a Laplacian distribution, even though it falls off more rapidly and is therefore better approximated by the Generalized Gaussian Distribution [15, 12]:

p_{2^j}^k(x) = a_{2^j}^k exp[ -( b_{2^j}^k |x| )^{r_{2^j}^k} ]

with

a_{2^j}^k = ( b_{2^j}^k r_{2^j}^k ) / ( 2 Γ(1 / r_{2^j}^k) )

and

b_{2^j}^k = (1 / σ_{2^j}^k) [ Γ(3 / r_{2^j}^k) / Γ(1 / r_{2^j}^k) ]^{1/2}

where σ_{2^j}^k is the standard deviation of the subband distribution at orientation k and resolution 2^j, and Γ(⋅) is the Gamma function. This equation contains the Gaussian and the Laplacian PDF as special cases:
• for r_{2^j}^k = 2 it is the Gaussian PDF;
• for r_{2^j}^k = 1 it is the Laplacian PDF.
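For reference, a small sketch of this density is given below, using SciPy's Gamma function; the parameter names sigma and r map onto the symbols σ and r above and are ours.

import numpy as np
from scipy.special import gamma

def generalized_gaussian_pdf(x, sigma, r):
    """Generalized Gaussian PDF with standard deviation sigma and shape r.

    r = 2 gives the Gaussian PDF, r = 1 the Laplacian PDF; the shape
    parameter can be chosen per subband to fit its coefficient histogram.
    """
    b = (1.0 / sigma) * np.sqrt(gamma(3.0 / r) / gamma(1.0 / r))
    a = b * r / (2.0 * gamma(1.0 / r))
    return a * np.exp(-(b * np.abs(x)) ** r)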
This property of the nearly Gaussian distribution allows the distinction, in each subband, of zones characterized by greater energetic and informative content, called active zones, where an accurate coding process improves the reconstructed image quality. An extensive empirical analysis of different image types has led to a heuristic algorithm that identifies the active zones: each subband is analyzed by applying a threshold process to the histogram of its quantized wavelet coefficients. Once the process is terminated, a binary "mask" is extracted for each subband, containing the position and shape of its active zone; the coefficients belonging to the non-active zones are no longer considered due to their very low informative content, as they constitute the background of the subband. Each mask is then scaled down and logically summed with the one obtained at the adjacent lower resolution and the same orientation. We thus obtain three masks, each containing the information regarding the active zones of the subbands at one orientation, LH, HL and HH. Finally, these masks are logically summed to obtain a unique mask, which is encoded by optimized run-length coding. The active-zone coefficients contain the main part of the energy and information of their subband and therefore have to be encoded accurately. In our work we used an adaptive vector quantization whose parameters were chosen on the basis of energetic considerations. Generally, the HH subbands contain less information than LH and HL, while a subband at low resolution contains more energy than subbands at the same orientation and higher resolutions, which is the reason for a more accurate quantization of the subbands at lower resolution. Moreover, the dimensions of codebooks and codevectors were chosen according to the variation of the MSE during the LBG algorithm [12]. In any case, we generated specific codebooks for each subband at each resolution and orientation from the active zones of the corresponding subbands of the images belonging to the training set.
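The exact threshold rule is not specified in detail above, so the following sketch only approximates the idea: per subband it keeps the coefficients carrying most of the energy as the active zone, and it combines masks across resolutions by down-scaling and a logical OR. The function names and the energy-fraction criterion are our assumptions, not the paper's algorithm.

import numpy as np

def active_zone_mask(subband, keep_fraction=0.9):
    """Binary mask of the 'active zone' of one subband (stand-in rule)."""
    mag = np.abs(subband).ravel()
    order = np.argsort(mag)[::-1]                  # largest magnitude first
    energy = np.cumsum(mag[order] ** 2)
    cutoff = np.searchsorted(energy, keep_fraction * energy[-1])
    threshold = mag[order[min(cutoff, mag.size - 1)]]
    return np.abs(subband) >= threshold

def combine_masks(mask_low, mask_high):
    """Scale the finer-resolution mask down and OR it with the coarser one."""
    m = mask_high[::2, ::2]                        # crude 2:1 downscaling
    rows = min(mask_low.shape[0], m.shape[0])
    cols = min(mask_low.shape[1], m.shape[1])
    out = mask_low.copy()
    out[:rows, :cols] |= m[:rows, :cols]
    return out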
Classification Algorithm K-NN

For the quality evaluation of the compression algorithms we used a supervised classification by a well-known nonparametric Pattern Recognition (PR) technique, the K-NN (K-Nearest Neighbor) rule. The k-nearest-neighbor rule can be expressed as follows: classify x by assigning it the label most frequently represented among its k nearest samples, measured by the Euclidean distance. The value of the parameter k (21 in our case) was chosen by the empirical rule k = √Nc, where Nc is the number of feature samples used for the training set. This rule does not give the optimal value of k, but empirically it is known to be approximately optimal.
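A minimal sketch of this classification step is shown below; it relies on scikit-learn's K-NN implementation with the Euclidean distance and k = 21, which is an assumption of convenience rather than the software used in the paper.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_classify(train_pixels, train_labels, test_pixels, k=21):
    """Supervised K-NN classification of multispectral pixel vectors.

    train_pixels / test_pixels: arrays of shape (n_samples, n_bands),
    e.g. the six TM band values of each pixel; Euclidean distance is
    the default metric.
    """
    clf = KNeighborsClassifier(n_neighbors=k)
    clf.fit(train_pixels, train_labels)
    return clf.predict(test_pixels)

def correct_classification_rate(predicted, reference):
    """Percentage of correctly classified samples (CC)."""
    return 100.0 * np.mean(predicted == reference)

The CC rate returned by the second helper is the figure compared, in the following, between the original and the decompressed images.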
EXPERIMENTS CARRIED OUT

The images used to compare the discussed compression algorithms are the six visible bands of an agricultural area in the UK, acquired by TM sensors mounted on Landsat satellites and planes. Each used portion of the original remotely sensed bands consists of 250 x 350 pixels at 8 bpp, i.e. 87,500 bytes per band. The KLT was applied to the six image samples; it removes the interband correlation and produces the principal components of the original samples. Each of the six principal components was then compressed by the lossy JPEG algorithm at different compression rates. We used the standard quantization tables, empirically defined by JPEG and appropriate for most applications. Due to the prior application of the KLT, most quantized DCT coefficients of the less significant image planes are zero, which improves the run-length coding of the JPEG algorithm; these image planes are therefore compressed at higher rates than the more significant ones without decreasing their quality. Moreover, the VQ has been applied to the principal components. For each of the decorrelated images a separate codebook with 256 vectors was first computed from images of the same type and with similar characteristics. For different compression rates the block size was varied from 2x2 to 5x5 pixels. Lastly, the more recent adaptive compression of the wavelet coefficients was applied, in which the redundant information of the decorrelated image planes is reduced by coding the wavelet coefficients: the coefficient histograms of each subband of the image planes were analyzed and only the active zones were quantized; the LL subband was losslessly DPCM coded, while the wavelet coefficients of all other subbands were coded by an adaptive VQ. Finally, the inverse KLT was applied to all the decompressed image planes to reconstruct the image portions, which are part of the initial remotely sensed images. The reconstructed images were then fed to a K-NN classifier, where a supervised classification was carried out and compared to the results obtained on the original images. The obtained rates of correct classification are generally high for this type of classifier, as the training set comes from the original image portions, which is not suitable for remote sensing applications by the end user. In our case this was accepted, as the main goal was the evaluation of the compression algorithms through the decrease of the correct classification rate, while the absolute correct classification itself was not of high interest. To compare our results also with other well-known evaluation techniques, we additionally computed the Mean Square Error (MSE), a visually relevant measure. In Figure 3 we show a simple and schematic overview of the work carried out.

Figure 3: Schematic overview of the work carried out (TM image samples → KLT → image planes → JPEG/VQ/WV encoding and decoding → reconstructed image planes → inverse KLT → reconstructed image samples → K-NN classification and MSE evaluation)
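As a sketch of the VQ step described above, the code below trains a 256-vector codebook per decorrelated image plane from non-overlapping blocks and assigns each block the index of its nearest codevector. SciPy's k-means routine is used here as a stand-in for the LBG algorithm, and the 4x4 block size is just one of the sizes mentioned above; all names are illustrative.

import numpy as np
from scipy.cluster.vq import kmeans2, vq

def extract_blocks(plane, bs=4):
    """Split an image plane into non-overlapping bs x bs blocks (as rows)."""
    rows, cols = plane.shape
    plane = plane[: rows - rows % bs, : cols - cols % bs]
    blocks = plane.reshape(rows // bs, bs, -1, bs).swapaxes(1, 2)
    return blocks.reshape(-1, bs * bs).astype(np.float64)

def train_codebook(training_planes, bs=4, size=256):
    """Codebook of `size` codevectors; k-means stands in for LBG here."""
    data = np.vstack([extract_blocks(p, bs) for p in training_planes])
    codebook, _ = kmeans2(data, size, minit="++", seed=0)
    return codebook

def vq_encode(plane, codebook, bs=4):
    """Index of the nearest codevector for every block of the plane."""
    indices, _ = vq(extract_blocks(plane, bs), codebook)
    return indices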
RESULTS

Results are shown graphically in Figures 4 and 5. The compression rates (CR) for the JPEG and wavelet algorithms are mean values over the six single image planes, as these algorithms compress the less significant KLT-transformed image planes at much higher rates than the others. The percentages of correct classification (CC) are also mean values over the five known classes, as in our case the detailed per-class results are not of primary interest. The total CC of the original samples was 92.26 %, and the application of our KLT algorithm did not alter this result. The KLT-JPEG algorithm decreases this value to 87.89 % at CR 40 and MSE 7.329, which is in our opinion acceptable for remote sensing applications. Good percentages are obtained up to CR 30, with CC 89 % and MSE 3.26. Visually, these decoded image portions are almost indistinguishable from the originals, while at CR 40 the typical "block effect" of the JPEG algorithm, due to the 8x8 DCT, can already be noticed. The KLT-VQ algorithm has its compression limit at CR 25 due to the block size, and at this CR the CC decreases to 88.28 % with an MSE of 7.318. Good CC results with this algorithm are obtained only up to CR 9, with 89.42 % and MSE 2.84. The block effect known also from JPEG is strongly present at CR 25. The compression of the KLT wavelet coefficients, instead, gives good results up to CR 10, with CC 89.24 % and MSE 5.219, while acceptable results of CC 88 % are obtained at CR 28 and MSE 8.967.
Figure 4: Graphical representation of Correct Classification (CC in % versus CR for K-JPEG, K-VQ and K-Wavelet)
Figure 5: Graphical representation of MSE (MSE versus CR for K-JPEG, K-VQ and K-Wavelet)

Note that the decoded "wavelet" images, shown in Figure 6, have a visually somewhat better quality than the images from KLT-JPEG and KLT-VQ at the same CR, even though they are classified less accurately and have a higher MSE. This is because of the absent or very weak block effect, thanks to the DPCM compression of the LL subband. Instead, some blurring is noticeable in certain areas of these images. Note that these areas in particular contain little significant information, which has therefore been partially lost by coding only the active image zones. In any case, we see that the KLT-JPEG algorithm obtained the best CC and MSE compared to the other techniques at the same CR. The CC results for the images compressed by the KLT-Wavelet algorithm are very similar to those of the KLT-VQ technique, due to the application of VQ to the wavelet coefficients.
Figure 6: Reconstructed TM image planes; upper images at CR 25, from left KLT-JPEG, KLT-Wavelet, KLT-VQ; lower images at CR 50, from left KLT-JPEG, KLT-Wavelet

Only at some compression rates does the KLT-Wavelet technique give better results, although it permits much higher compression rates than the KLT-VQ algorithm. Generally, the K-NN classifier is less sensitive than the MSE for all the algorithms used, which is also notable in the visual degradation accompanying a still quite good CC, due to k = 21. The classifier uses the 21 nearest pixels to the unknown sample, and their values may be changed by the compression algorithms; this explains the increased MSE and the visually lower quality, but the classifier may still classify quite accurately, as the 21 nearest samples may mostly contain the right classes.

CONCLUSIONS

We have seen that, of the tested techniques, KLT-JPEG best fulfills the compression requirement while minimizing the degradation of the CC obtained by K-NN on multispectral remotely sensed TM images. Moreover, the other two algorithms need an update of their codebooks depending on the characteristics of the input image samples, e.g. agricultural areas, urban areas and forests. The results obtained for the KLT-Wavelet algorithm are quite surprising, as this approach is usually more efficient than the other analyzed techniques in many applications. The main reason for the unexpectedly low efficiency of this recent technique is that the algorithm is not yet well optimized for automatic classification of remotely sensed images, while it performs well for browsing applications and where the visual aspect is the main goal [1]. Usually, in these kinds of applications the images are photos of objects or persons and therefore contain much significant and detailed information, as for example in "Lena". Analyzing such image types, the WV algorithm finds many active zones which are then coded, obtaining good visual quality even at very high compression rates. Moreover, the JPEG algorithm has already become a standard based on many years of international research in image processing, including well-established techniques like the DCT and Huffman coding, while the recent wavelet compression technique is not yet as well established. To take advantage of the capacity of WV coding for this application, we propose an adaptive quantization and coding of the KL- and then wavelet-transformed image samples. This optimization may depend on the characteristics of the remotely sensed images. In any case, an important role is played by the content of the KL-transformed image planes, where the first transformed plane contains the most information while the last transformed plane is the "poorest" one. The main possibilities to obtain this optimization with the proposed wavelet algorithm are to vary the quantization factor in the pre-coding phase and, additionally, to adapt the block size of the VQ depending on the analyzed image plane. The simple VQ, by contrast, is limited to CR 25, determined by the pixel block size.
REFERENCES

[1] M. Antonini, M. Barlaud, I. Daubechies, P. Mathieu, "Image coding using vector quantization in the wavelet transform domain", Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, Albuquerque, NM, pp. 2297-2300, 1990.
[2] G. Battle, "A block spin construction of ondelettes - Part I: Lemarié functions", Communications in Mathematical Physics, vol. 110, pp. 601-615, 1987.
[3] M. Datcu, G. Schwarz, K. Schmidt, C. Reck, "Quality evaluation of compressed optical SAR images - JPEG vs. wavelets", Proc. IGARSS, vol. 3, pp. 1687-1689, 1995.
[4] A. Gersho and B. Ramamurthi, "Image coding using vector quantization", Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 428-431, 1982.
[5] R.M. Gray, "Vector Quantization", in Vector Quantization, IEEE Press, New York, USA, 1990.
[6] J. Wang, K. Zhang, "An algorithm for constrained optimal band ordering of multispectral remote sensing images in lossless data compression", Proc. IGARSS, vol. 3, pp. 1684-1686, 1995.
[7] P.G. Lemarié, "Ondelettes à localisation exponentielle", Journal de Mathématiques Pures et Appliquées, vol. 67, pp. 227-236, 1988.
[8] Y. Linde, A. Buzo and R.M. Gray, "An algorithm for vector quantizer design", IEEE Transactions on Communications, vol. COM-28, pp. 84-95, 1980.
[9] A.J. Maeder, "Lossless JPEG compression of remote sensing imagery", Proc. SPIE - The International Society for Optical Engineering, vol. 2606, pp. 122-132, 1995.
[10] A.N. Netravali, B.G. Haskell, "Digital Pictures: Representation, Compression, and Standards", Plenum Press, 1995.
[11] B. Ramamurthi and A. Gersho, "Image coding using segmented codebooks", Proc. International Picture Coding Symposium, 1983.
[12] M. Vetterli, J. Kovacevic, "Wavelets and Subband Coding", Prentice Hall, USA, 1995.
[13] G.K. Wallace, "The JPEG Still Picture Compression Standard", IEEE Transactions on Consumer Electronics, vol. 38, no. 1, pp. xviii-xxxiv, 1992.
[14] P.H. Westerink, J. Biemond, D.E. Boekee, J.W. Woods, "Subband coding of images using vector quantization", IEEE Transactions on Communications, vol. COM-36, no. 6, pp. 713-719, 1988.
[15] P.H. Westerink, "Subband coding of images", Ph.D. Thesis, Delft University of Technology, 1989.