A Comparison of Multispectral Satellite Sequence Compression Approaches Lionel Gueguen∗‡ , Maria Trocan∗, B´eatrice Pesquet-Popescu∗, Alain Giros‡ and Mihai Datcu† ∗ GET-T´el´ecom
Paris, 46 rue Barrault, 75634 Paris, France Email:
[email protected] † German Aerospace Center DLR, Remote Sensing Technology Institute IMF, Oberpfaffenhofen, D-82234 Wessling, Germany ‡ Centre National d’Etudes Spatiales, 18 avenue Edouard Belin, 31401 Toulouse, France Abstract— Several techniques have been proposed in the last few years for the compression of hyperspectral and multispectral images. All of them try to exploit, in various degrees, the particularities of these images, especially of the intraband and interband redundancies they exhibit. In this paper, we consider the compression of multitemporal and multispectral images and we make a comparison between still image and video approaches for the compression of these images acquired by SPOT 1, 2 and 4. The still image compression is realised using the stateof-the-art JPEG2000 codec, while the video one uses an EZBC wavelet-based codec. Both spectral and temporal decorrelation approaches are discussed.
I. I NTRODUCTION Nowadays, optical sensors such as SPOT cover a localized area several times a year. Then, it is essential to provide tools that are able to compress multispectral and multitemporal sequences in order to store large amount of data. This problem started to be adressed since the remote sensing data is significantly increasing. As an example, one multispectral image taken by the Thematic Mapper (TM) sensor from the Landsat V satellite, has the a size greater than 200 MB. New sensors have been built in order to achieve higher radiometric precision and also better spatial and spectral resolutions. Images composed by tens or even hundreds of spectral bands are very important due to the information they provide on the nature of the ground, but also they are very hard to handle, because of the huge storage resources they require. These problems can be significantly reduced by using some form of data compaction or data compression [1]. Lossless coding is a revertible process in which the original data can always be recovered from the encoded data without any loss of information. This type of coding takes advantage of the statistical redundancy of data only and as result, the achieved compression ratio rarely exceeds 3 : 1, which can not solve the data storage issues. A higher compaction is assured by lossy coding, which very often reaches compression ratios of 10 : 1 and more. However, in this case, we have to accept some loss of information in order to take advantage of the increased efficiency: the compression ratio is inversely proportional with the image quality. Compression of hyperspectral and multispectral images have been extensively studied using several techniques, such
as wavelet coding [2], adaptive coding [3], DCT coding [4], DPCM [5], [6], or statistical models [7]. In order to achieve data compression, coding techniques applied to multispectral data take advantage of the presence of two redundancy sources: spatial correlation among neighboring pixels in the same spectral band, and spectral correlation among different bands at the same spatial location. Several strategies can be adopted to separately or jointly exploit these sources of redundancy, but a cascade of spectral and spatial decorrelation is usually preferred, as in the case of three dimensional transform coding [8], which is based on the combination of two transforms that successively exploit inter-band and intra-band correlation. The Karhunen-Loeve Transform (KLT) is usually preferred as a first step, due to its signal decorrelation capability, while for the second step, the Discrete Wavelet Transform and Discrete Cosine Transform (DCT) have been widely studied, also on account of their important role in international image coding standards. In particular, the state-of-the-art JPEG2000 standard can be successfully applied to intra-band coding, since it can heavily compress the less significant image planes generated by the KLT (i.e., those associated to smaller eigenvalues) with a negligible visual distortion. These planes are in fact characterized by very low dynamics, and their number usually increases with the number of bands, thus making the use of multidimensional transform coding attractive in high spectral-resolution imagery. However, when a multitemporal sequence of satellite images is available, the correlation in the temporal direction also needs to be exploited. The subject of this paper is to evaluate the performance on SPOT sequences of still image (JPEG2000) and video (t+2D) compression techniques based on wavelet tools. For the video approach, our framework supposes t+2D wavelet decomposition in the tempo-spatial domain and Karhunen-Loeve decorrelation in the spectral domain. For our experiments, we have used the 5/3 biorthogonal filter bank and DCT for the temporal filtering, due to their good compression efficiency. This paper is organised as follows: Section II describes the compression approaches for multispectral and multitemporal satellite images. The results obtained for our sequence are presented in Section III. Finally, conclusions and future work are drawn in Section IV.
Fig. 1. First component (highest energy) of the Karhunen-Loeve transform.
Fig. 2.
Second component of the Karhunen-Loeve transform.
II. M ULTISPECTRAL AND M ULTITEMPORAL I MAGE C OMPRESSION A. Spectral Decorrelation Each multispectral image is composed of several bands that are correlated. The SPOT images exhibit some correlation among spectral bands. Instead of processing the bands separately, the interband correlation should be exploited to improve the compression. In our scheme, we propose the KLT for the spectral decorrelation of multispectral images. The KLT is applied among the spectral components taking each pixel of the scene as N dimensional vector (N being the number of spectral bands). Let X be the vector containing the N components for a given pixel and U the mean vector U = E[X]. The covariance matrix Cx is defined as: Cx = E[(X − U )(X − U )t ]
(1)
The Karhunen-Loeve Transform (T ) is defined as the one that diagonalizes Cx in the following way: Cy = T C x T t = Λ
(2)
Cy being the covariance of the transformed vector (Y ) and Λ the diagonal matrix representing eigenvalues of Cx . Y can then be obtained by the equation: Y = T (X − U )
(3)
Since the transform diagonalizes the covariance matrix between spectral bands, the spectral correlation of the transformed components is removed. The images in the transformed domain are sorted by decreasing order of importance (value of the eigenvalues). This energy compaction along the spectral axis is suitable for the selection of the main spectral components for analysis as well as for image compression. Figs. 1, 2 and 3 present one example of components obtained after applying the KLT. B. Temporal Decorrelation We consider the sequence of multispectral images that are supposed perfectly registered. The registration [9] is done in order to have precise and constant location for each pixel at any moment. Therefore, we can consider that no displacement between successive images remains in the sequence. In this approach, we treat the spectral bands independently, so there
Fig. 3.
Third component of the Karhunen-Loeve transform.
will be as many sequences as the number of spectral bands. Since after registration there is no displacement between successive input images, we may also consider that the sequences form a hyperspectral image [10], [11]. In the following, the use of a t+2D wavelet decomposition is proposed for each spectral band sequence, in order to exploit both temporal and spatial redundancies, as illustrated in Fig. 4. Due to the previous registration of the satellite sequences, the temporal transform is applied without performing motion estimation. It is already known [12]that long bidirectional filters, like the 5/3 wavelet transform, perform better, in terms of compression efficiency, than the shorter ones (ex. Haar filters). Therefore, we shall use in the following a temporal decomposition based on the 5/3 biorthogonal wavelets. We denote by x2t−1 , x2t , x2t+1 , x2t+2 the frames positioned in time at 2t − 1, 2t, 2t + 1 and respectively 2t + 2 moments. In the 5/3 filtering, the detail frame ht is predicted from its two neighbours, i.e. x2t and x2t+2 ; in a similar way the approximation frame lt is obtained from the frames ht−1 , ht by the so-called update step. For one temporal level, the 5/3 temporal filtering has the following form: 1 ht [n, m] = x2t+1 [n, m] − (x2t [n, m] + x2t+2 [n, m]) 2 lt [n, m] = x2t [n, m] + 1 (ht−1 [n, m] + ht [n, m]) 4 (4) Once the temporal subbands are obtained in the above described manner, they are further spatially decomposed with the 9/7 biorthogonal filter bank (Fig. 4) and coded using an
Rate (bpp) Image 10 Image 15 Image 24 Image 34
no transform 12.10 10.32 10.30 11.75
KLT 11.32 9.99 9.94 11.32
TABLE I JPEG2000: COMPARISON OF COMPONENT DECORRELATION
13 12.5 12
rate(bpp)
11.5 11 10.5 10 9.5 9 8.5 0
KLT no spectral decorrelation 2
4
6
8
10
frames
12
14
16
18
Fig. 5. Rate comparison for spectral decorrelation transform over the multitemporal sequence.
Fig. 4. Spatio-temporal wavelet transform applied on a spectral band sequence.
embedded zero-tree algorithm [13]. C. Joint Spectral and Temporal Decorrelation In the previous paragraphs, separate spectral and temporal decorrelation have been highlighted. However, it is essential to jointly take in consideration both types of correlation. Hence, the idea is to first transform each group of three spectral images at a given temporal moment with the KLT and then the temporal decorrelation is achieved via WT or DCT on the sequences formed by these decorrelated spectral components. The results of this methods are shown in Tab. V.
one used in image compression, and we define it as follows: 10242 P SN R = 10 log10 , MSE where MSE is the mean square quantization error. We have performed the comparisons based on the compression results given by the state-of-the-art still image codec, JPEG200 and the Motion-Compensated Embedded Zero-Tree Block (MC-EZBC) Video Codec. Note that in the last one, even though no motion is estimated, the coding algorithm exploits the 3D nature of the decomposition, both in the bitrate allocation and during the entropy coding step. After the spectral decorrelation, each of the three components of the multiband sequence is processed for compression using JPEG2000, based on EBCOT coding scheme [14], or using the EZBC algorithm.
III. E XPERIMENTAL R ESULTS The data to be compressed are sequences of SPOT images acquired by SPOT 1, 2 and 4. The data have been provided by CNES‡ . Each image is composed of 3 spectral bands (the 4th band of SPOT4 images is not treated here). The sequence has been extracted from images of size 3000 × 2000. We reduced the size of images in order to consider that the signal is stationnary for transforms such as the KLT. The size of images is 300 × 200 and there are 35 images in the considered sequence. A radiometric correction, a sensing intercalibration and a registration have been applied to the sequence. The sequence represents the reflectivity of a geographical location (the countryside at south-east of Bucharest) at several times irregularly sampled over one year. The reflectivity is between 0 and 1, and coded on 10 bits. Consequently, the distortion measure used for the presented results is slightly different from the usual
Size 100 × 100 500 × 500 1000 × 1000
no transform 11.52 10.17 10.14
KLT 10.56 9.75 9.74
gain 0.96 0.42 0.40
TABLE II S IZE INFLUENCE ON KLT AND RATE ( BPP )
Rate (bpp) PSNR (dB) Sequence Band1 Sequence Band2 Sequence Band3 Total
JPEG2000 Lossless 3.25 3.60 5.27 12.12
Intra Video Coding > 60 2.44 2.84 4.65 9.93
TABLE III S EQUENCE COMPRESSION WITHOUT TIME DECORRELATION (JPEG2000 LOSSLESS VS . EZBC).
Rate (bpp) PSNR (dB) Sequence Band1 Sequence Band2 Sequence Band3 Total
JPEG2000 + WT Lossless 3.22 3.55 4.82 11.59
JPEG2000 + DCT > 70 3.78 4.07 4.42 12.27
t+2D WT (EZBC) > 60 2.46 2.84 4.42 9.72
TABLE IV S EQUENCE COMPRESSION WITH TIME DECORRELATION .
160
Gray level of band 2
140 120 100 80 60 40 20 50
Fig. 6.
100
150
200
250
Gray level of band 3
300
350
Joint histogram between the second and third componants
In Tab. I, one can see that the rate of compression is 1:3 by using JPEG2000 in a lossless mode. The KL decorrelation decreases this rate by 0.7 bpp, that is however not significant enough. In Fig. 5, this gain in terms of rate is depicted for 16 images taken from the 35 frames sequence. A possible explanation of the reduced influence of the KLT is that satellite images are highly non-stationnary. Indeed in Tab. II, while the size of image decreases, the gain in terms of rate increases. This is an indicator of the non-stationarity which can not be exploited by a global KLT. The non-stationarity effects are noticeable in the spectral joint histogram between the second and the third band (Fig. 6), in which one can see three “modes” of correlation. In Tab. III, each band is processed separately and no time decorrelation is applied. One can see that EZBC provides better results than EBCOT (2bpp gain), for our images. However, JPEG2000 performs a lossless coding while EZBC does a nearly-lossles coding. In Tab. IV, the t+2D wavelet decomposition is applied. The t+2D wavelet coding procedure based on the EZBC codec has here one decomposition level for the 5/3 filter and the temporal subbands are spatially decomposed over five levels with the biorthogonal 9/7 wavelets. In the first column, the lossless decorrelating transform proposed in JPEG2000 is applied along the temporal direction. Compared to the results Rate (bpp) PSNR (dB) Sequence Band1 Sequence Band2 Sequence Band3 Total
KLT > 70 4.43 3.37 3.66 11.46
KLT and DCT > 70 3.71 2.65 3.47 9.83
KLT and WT > 70 4.68 3.61 4.19 12.48
TABLE V S EQUENCE COMPRESSION WITH TIME AND SPECTRAL DECORRELATION
without any time decorrelation, the gain in terms of rate is higher for EBCOT. Nevertheless, the DCT increases the rate unlike it was expected. Last, the results of JPEG2000 coding of temporaly and spectraly decorrelated sequence are displayed in Tab.V. The use of temporal DCT and spectral KLT provides the best results in terms of bitrate reduction. One reason can be that the kernel of the temporal wavelet transform that was used is shorter than the support of the DCT, not allowing a good energy compaction. IV. C ONCLUSION AND F UTURE W ORK In this paper we have performed a comparison between still-image and video approaches for the compression of the images acquired by SPOT 1, 2 and 4. As it results from our experiments, EZBC intra-coding outperforms EBCOT in terms of rate gain in the case of nearly lossless compression. In addition, it has been highlighted that temporal and spectral decorrelations can be exploited separetly or jointly to improve the compression ratio. As expected, the combination of spectral and temporal decorrelation provides the best results in terms of bitrate reduction. In the future, spatially adaptive methods will be investigated to further exploit the non-stationnarity of such images. R EFERENCES [1] A. N. Netravali and B. G. Haskell, “Digital pictures, representation and compression,” New York: Plenum, 1988. [2] L. Alparone B. Aiazzi and S. Baronti, “Near-Lossless Compression of 3D Optical Data,” IEEE Transctions on Geoscience and Remote Sensing, vol. 39, no. 11, pp. 2547–2558, Nov. 2001. [3] J.-C. Pesquet A. Benazza-Benyahia and M.H. Gharbia, “Adapted Vector-Lifting Schemes for Multiband Textured Image Coding,” in IGARSS, Toulouse, July 2003. [4] M.W. Marcellin G.P. Abousleman and B.R. Hunt, “Compression of hyperspectral imagery using the 3d dct and hybrid dpcm/dct,” IEEE Transactions on Geoscience and Remot Sensing, vol. 33, no. 1, pp. 26– 35, Jan. 1995. [5] D. Coltuc, M. Datcu, and K. Seidel, “Multichannel Compression for Sequential Image Retrieval,” in Operational Remote Sensing for Sustainable Development, 18th EARSeL Symposium 1998, Enschede, Netherlands, G.J.A. Nieuwenhuis, R.A. Vaughan, and M. Molenaar, Eds. 1999, pp. 345–351, A. A. Balkema Rotterdam/Brookfield. [6] F. Rizzo and al, “Low-complexity lossless compression of hyperspectral imagery via linear prediction,” IEEE Signal Processing Letters, vol. 12, pp. 138–141, Feb. 2005. [7] Q. Du and C.-I Chang, “Linear mixture analysis-based compression for hyperspectral image analysis,” IEEE Transactions on Geoscience and Remote Sensing, vol. 42, no. 4, pp. 875–891, April 2004. [8] A. G. Tescher J. A. Saghri and J. T. Reagan, “Practical transform coding of multispectral imagery,” IEEE Signal Processing Magazine, pp. 32–43, January 1995. [9] J. Flusser B. Zitova, “Image registration methods: a survey,” Image and Vision Computing, vol. 21, pp. 977–1000, 2003. [10] C. Serele S.E. Qian, M. Bergeron and A. Hollinger, “Evaluation and Comparison of JPEG 2000 and Vector Quantization Based Onboard Data Compression Algorithm for Hyperspectral Imagery,” in IGARSS, Toulouse, July 2003. [11] D. Coltuc, D. Luca, K. Seidel, and M. Datcu, “3-Dimensional Signal Analysis of Multichannel Spectrometric Imagery,” in Progress in Environmental Research and Applications, 15th EARSeL Symposium 1995, Basel, Switzerland, E. Parlow, Ed., Sept 1996, pp. 155–162. [12] Y. Zhan, M. Picard, B. Pesquet-Popescu, and H. Heijmans, “Long temporal filters in lifting schemes for scalable video coding,” doc. m8680, Klagenfurt MPEG meeting, July 2002. [13] Y. Wu and J. W. Woods, “Recent improvements in mc-ezbc video coder,” Tech. Rep. JTC1/SC29/WG11/M10396, MPEG (Moving Pictures Experts Group), Hawaii, USA, 2003. [14] D. Taubman, “High Scalable Image Compression with EBCOT,” IEEE Transactions on Image Processing, vol. 9, no. 7, pp. 1158–1170, July 2000.