Review of Coding Techniques Applied to Remote Sensing
Joan Serra-Sagrista, Francesc Auli, Fernando Garcia, Jorge Gonzalez and Pere Guitart
Computer Science Department, ETSE, Universitat Autonoma Barcelona, E-08193 Cerdanyola del Valles, SPAIN
Abstract. With the aim of obtaining a valid compression method for remote sensing and geographic information systems, and because comparisons among the available techniques are not always performed in a sufficiently fair manner, we are currently developing a framework for evaluating several still image coding techniques. In addition to letting us choose the most suitable technique according to compression factor and quality of recovery, this setting is expected to let us introduce the particular functionalities required by this kind of application.
This work has been supported in part by the Spanish Government and FEDER through MCYT Grant TIC2003-08604-C04-01, and by the Catalan Government DURSI Grant 2001SGR 00219.
1 Introduction
High resolution images are becoming a natural source of data for many different applications. On the one hand, for Remote Sensing (RS) applications, multi-spectral and hyperspectral images have been successfully used for image classification and segmentation. Since 1925, when metrical cameras appeared, aerial photographs have become an appreciated source of data for territory management; and since 1970, aerospace research, telecommunications and sensor design have made it easier to obtain periodic images of the Earth's surface. Today, to name a few examples: in agriculture, RS is critical for the observation and detection of pests, or for assessing the effectiveness of pesticides; in cartography, it has contributed to the preparation of maps and to feature extraction; and RS is also used for environmental monitoring and for meteorological prediction.
On the other hand, Geographic Information Systems (GIS) applications are especially designed to deal with the great amount of information that RS supplies. New GIS applications appear every day, and they have become essential tools for scientists, technicians and other people working in this field. Nowadays RS and GIS are the most important tools used to control and monitor the territory. The main source of the information handled by GIS applications is the imagery captured by RS sensors, whose resolution keeps increasing. The QuickBird 2 satellite, for example, launched in October 2001, is capable of taking images with a spatial resolution of 60 cm; CASI and AVIRIS sensors are capable of rendering over 200 bands
of spectral resolution; 24 bits per pixel is becoming a more usual pixel resolution than the classical 8 bpp; and, to make the figures even larger, the surface covered by the captured images typically spans tens of kilometers.
A brief example illustrates the huge size of the captured images: consider a sensor capturing a 10 km × 10 km surface with a 10 m spatial resolution and 10 bands per pixel, each band sample stored with 8 bits, which amounts to a total of 10 MB; if 100 bands are used, or if the spatial resolution is set to 1 m, the final size increases to 100 MB or 1 GB, respectively. This example shows the need for compression of hyperspectral images in both storage and transmission scenarios.
However, in the particular case of remote sensing and geographic information systems applications, the final users are interested not only in achieving large compression ratios, but also in some other functionalities that must be preserved by the encoding process. To this end, we are currently working towards the development of an image compression format, both lossy and lossless, that fulfills, among others, the following requirements: 1) compression of both mono-band and multi-band (either multispectral or hyperspectral) images; 2) high-speed data recovery (from the encoded bit stream) in all image regions, considering also embedded transmission; 3) zoom and lateral shift capability; 4) preservation of no-data or meta-data regions, which should be maintained at any compression ratio; 5) in the case of lossy compression, lossless encoding of some physical parameters such as temperature, radiance, elevation, etc.; 6) high compression ratios while maintaining the image quality.
Therefore, in order to develop an image compression format for these applications, we are now investigating the suitability of some well-known image coding techniques based on the wavelet transform.
Since these techniques were originally presented by their authors in different settings, and their performance is not always evaluated in a similar scenario, we are developing a unified framework that will allow us not only to compare such techniques in a fair manner, but also to incorporate the particular features required by RS and GIS applications. This framework is being implemented in the Java language to permit its dissemination across various platforms.
The paper is organized as follows: Section 2 briefly presents the different parts of most still image coding schemes. Section 3 gives a comparison of the experimental results achieved by each wavelet-based coding method when applied to several corpora of images. Finally, Section 4 summarizes the work done and proposes future research.
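The storage figures quoted in the brief example of the Introduction can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and its parameters are hypothetical, it assumes a square scene, 8 bits per band sample, and decimal units (1 MB = 10^6 bytes).

```python
def raw_image_bytes(side_m, resolution_m, bands, bits_per_sample):
    """Uncompressed size, in bytes, of a square multi-band scene."""
    pixels_per_side = side_m // resolution_m
    total_bits = pixels_per_side ** 2 * bands * bits_per_sample
    return total_bits // 8


# 10 km x 10 km scene, 10 m resolution, 10 bands, 8 bits per band sample
base = raw_image_bytes(10_000, 10, 10, 8)
# Same scene with 100 bands
more_bands = raw_image_bytes(10_000, 10, 100, 8)
# Same scene with 10 bands but 1 m resolution
finer = raw_image_bytes(10_000, 1, 10, 8)

for label, size in [("base", base), ("100 bands", more_bands), ("1 m", finer)]:
    print(f"{label}: {size / 1e6:.0f} MB")
```

Under these assumptions the three cases give 10 MB, 100 MB and 1000 MB (1 GB), in agreement with the figures in the text.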
2 Image Coding Systems
An encoder in a typical lossy compression system consists mainly of four basic stages. In addition, most systems include a pre-processing stage where, if needed, a color model conversion or a dimension reduction is performed. The four basic stages are: first, a transform is applied to the input data in order to obtain de-correlated coefficients and to compact the energy into a few coefficients; second, a quantization stage removes information considered unnecessary for user purposes; third, a bit plane encoding is applied to account for the significance of the quantized coefficients; fourth and last, an entropy coding scheme is used to reduce the number of bits needed to send the significant quantized coefficients through the transmission channel.
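As a rough illustration of the first two stages, the following sketch applies one level of a one-dimensional Haar transform followed by uniform scalar quantization, and then inverts both steps. It is a minimal sketch under stated assumptions, not the framework's implementation: the Haar filter stands in for the wavelet filters actually used by the reviewed coders, the signal is a made-up row of eight samples, and all function names and the step size are hypothetical.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def quantize(coeffs, step):
    """Uniform scalar quantization: real coefficients to integer indices."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(indices, step):
    """Map integer indices back to reconstruction values."""
    return [q * step for q in indices]


signal = [100, 102, 98, 97, 50, 52, 51, 49]   # hypothetical image row
step = 4                                      # hypothetical quantizer step

approx, detail = haar_forward(signal)
q_approx, q_detail = quantize(approx, step), quantize(detail, step)
recovered = haar_inverse(dequantize(q_approx, step), dequantize(q_detail, step))

print("quantized approximation:", q_approx)
print("quantized detail:", q_detail)
print("max abs error:", max(abs(x - y) for x, y in zip(signal, recovered)))
```

For this smooth row most of the energy ends up in the approximation coefficients, so the detail coefficients quantize to zero; the transform itself is lossless, and the reconstruction error comes only from the quantizer.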
At the receiver side, the decoder performs the inverse operations in reverse order. The overall goal is to produce a recovered image as close as possible to the original image while keeping the bit rate needed to transmit the compressed image as low as possible. For a deeper description of all the stages of an image coding system, particularly of the different bit plane encoding techniques, see [10].
2.1 Transform
The aim of the transform step is to de-correlate the source, eliminating the redundancy among consecutive samples, and to obtain a more compact representation of the signal. Given the nature of the sources considered (natural, synthetic, multi-spectral and hyperspectral images), the discrete wavelet transform (DWT) [1] is preferred over the discrete cosine transform (DCT), used by the classical JPEG, because, among other properties, it does not produce blocking artifacts and achieves a higher de-correlation (see the pioneering work of Daubechies [3] and Mallat [5]).
2.2 Quantization
Since most transforms produce real-valued coefficients, which are difficult for computers to deal with, a quantization step is then performed. The oldest and simplest example of quantization is rounding off, a particular case of scalar quantization in which the resulting quantized coefficients are all integers.
2.3 Bit Plane Encoding
After the transformed coefficients have been quantized, a bit plane encoding step is performed to account for the significance of each coefficient or for the significance of a group of coefficients (mostly grouped in a tree structure natural to the hierarchical wavelet transform). A coefficient is said to be significant if its absolute magnitude is equal to or higher than a given threshold. The goal of the methods that implement this step is to generate a stream of symbols which allows a progressive transmission (techniques having this property are called embedded coding methods).
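The significance test and the embedded property can be illustrated with a deliberately simplified sketch. It assumes integer quantized coefficients, a power-of-two starting threshold that is halved after every pass, and no tree structure at all, so it is not EZW, IC, SPIHT or any other of the reviewed coders; the function names and the sample coefficients are hypothetical.

```python
def bit_plane_passes(coeffs):
    """Yield (threshold, newly_significant, refinement_bits) per bit plane.

    newly_significant is a list of (index, sign) pairs; refinement_bits maps
    the index of each previously significant coefficient to its next bit.
    """
    magnitudes = [abs(c) for c in coeffs]
    top = max(magnitudes)
    if top == 0:
        return
    threshold = 1 << (top.bit_length() - 1)   # largest power of two <= top
    significant = set()
    while threshold >= 1:
        # Refine coefficients found significant in earlier passes.
        refinement = {i: int((magnitudes[i] & threshold) != 0)
                      for i in sorted(significant)}
        # Significance test: |c| >= threshold, not yet significant.
        newly = [(i, -1 if coeffs[i] < 0 else 1)
                 for i, m in enumerate(magnitudes)
                 if i not in significant and m >= threshold]
        significant.update(i for i, _ in newly)
        yield threshold, newly, refinement
        threshold //= 2                       # move to the next bit plane


def reconstruct(passes, count):
    """Decode any prefix of the passes into approximate coefficients."""
    values, signs = [0] * count, [1] * count
    for threshold, newly, refinement in passes:
        for i, sign in newly:
            signs[i], values[i] = sign, threshold
        for i, bit in refinement.items():
            values[i] += bit * threshold
    return [s * v for s, v in zip(signs, values)]


coeffs = [34, -20, 5, 0, -9, 3]               # hypothetical quantized data
passes = list(bit_plane_passes(coeffs))
for k in range(1, len(passes) + 1):
    print(f"after {k} pass(es):", reconstruct(passes[:k], len(coeffs)))
```

Truncating the list of passes at any point still yields a usable approximation, which is the progressive behavior the text refers to; decoding all passes recovers the quantized coefficients exactly.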
In order to benefit from the good performance of the bit plane encoding philosophy, a successive approximation approach is taken: after a given bit plane has been scanned and the corresponding refinement bits have been sent, the threshold is halved and a new bit plane scan begins. The embedded lossy coding techniques employing a bit plane encoding stage that are reviewed in this paper are: EZW [11], IC [12], SPIHT [7], CCSDS [2] and JPEG2000 [4].
2.4 Entropy Coding
An entropy coder aims to reduce the number of bits needed to represent the output stream of the bit plane encoder. This coder considers the different symbols in the output stream and their (expected) probabilities. The reduction in bits is due to the fact that symbols with higher probability are coded with fewer bits. This step is a perfectly reversible process.
The two most popular entropy coders used in image compression are Huffman and arithmetic coding. Despite being computationally more expensive, arithmetic coding has become the more useful of the two, due to the nature of the symbols to be coded. In general, the data obtained from the bit plane encoder is composed of just a few symbols with clearly differentiated probabilities, a feature that is a good premise for obtaining near-optimal results with arithmetic coding. Moreover, an adaptive version of Huffman coding turns out to be more complex and more expensive in time. Apart from CCSDS, which includes its own entropy coding, the entropy coder used in our Java application is arithmetic coding. In some cases, for instance for the IC approach, this entropy coder is required, whereas in other cases, for instance SPIHT, this step may be omitted without significantly degrading performance.
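The remark about a few symbols with clearly differentiated probabilities can be made concrete by comparing the Shannon entropy of such a source with the average length of a Huffman code built for it. The sketch below is an illustration only: the four-symbol alphabet and its probabilities are invented for the example and are not measured from any of the reviewed coders, and the helper names are hypothetical.

```python
import heapq
import math


def entropy_bits(probabilities):
    """Shannon entropy of the source, in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


def huffman_code_lengths(probabilities):
    """Code length, in bits, that a Huffman code assigns to each symbol."""
    # Heap entries: (probability, tie_breaker, {symbol: depth_so_far}).
    heap = [(p, n, {sym: 0}) for n, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)
        p2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level down.
        merged = {sym: depth + 1 for sym, depth in {**a, **b}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]


# Hypothetical, strongly skewed symbol statistics.
probabilities = {"zero": 0.90, "pos": 0.05, "neg": 0.04, "root": 0.01}

lengths = huffman_code_lengths(probabilities)
average = sum(probabilities[s] * lengths[s] for s in probabilities)

print("Huffman lengths:", lengths)
print(f"Huffman average: {average:.3f} bits/symbol")
print(f"entropy:         {entropy_bits(probabilities):.3f} bits/symbol")
```

The most probable symbol receives the shortest codeword, but a Huffman code can never spend less than one bit per symbol, so on a skewed source its average length stays well above the entropy. Arithmetic coding is not bound to a whole number of bits per symbol and can approach the entropy, which is consistent with the preference stated in the text.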
3 Experimental Results
In this section we evaluate the experimental performance of some well-known embedded techniques for lossy compression on different corpora of images. In a sense, this work extends the results presented in [8]. The EZW, IC, SPIHT and CCSDS image coding techniques have been implemented in the Java language and incorporated into the unified framework that we are developing (see [9]). In order to assess the validity of the results obtained with these four methods, the results produced by the new JPEG2000 standard of the Joint Photographic Experts Group (JasPer JPEG-2000 Encoder, version 1.700.2, compliant with ISO/IEC 15444-1, i.e., JPEG-2000 Part 1, 25 December 2001) are also provided.
3.1 Considered Corpora of Images
ISO/CCITT Corpus
The ISO/CCITT Corpus is the corpus used by the Joint Photographic Experts Group to evaluate the performance of the JPEG encoding method on still images. It consists of 9 images of size 720×576; we have chosen the following eight for performing the experiments: Barbara1, Barbara2, Board, Boats, Girl, Goldhill, Hotel and Zelda. The original images have been cropped to images of size 512×512 (centered in the original image).
CASI Image
A large multispectral image was provided by the Center for Ecological Research and Forestry Applications (CREAF), a public institute affiliated to the Universitat Autònoma de Barcelona. This image has a size of 710×4558 pixels, each pixel with a spatial resolution of 3.5 meters, and covers a large area including vegetation, cultivated areas and urban areas. The image was acquired through a CASI (Compact Airborne Spectrographic Imager) sensor using a general configuration of 14 bands, that is, each pixel is represented by a vector of 14 gray scale values of 8 bpp each. (The CASI sensor allows 220 bands, but for this particular application only 14 bands were considered.) Each of the original 14 bands has been cropped to an image of size 128×2048 pixels. Compression experiments have been carried out on all 14 bands.
Multispectral Images
Two large multispectral images were provided by the Catalan Institute of Cartography (ICC), a public institute depending on the Catalan Government. The first image has a size of 7088×4845 pixels, and the second image has a size of 7109×4864, each pixel with a spatial resolution of 3.5 meters. The images cover a huge area including vegetation, cultivated land and urban areas. Both images have been cropped to images of different sizes: 512×512, 1024×1024, 2048×2048 and 4096×4096. Compression experiments have been carried out on all these images.
AVIRIS Image
The last experiment was performed with an AVIRIS spectrometer image, the Indian Pines 92 scene from Northern Indiana, taken on June 12, 1992, on a NASA ER2 flight at high altitude, with a ground pixel size of 17 m. In fact, the image used is the ground truth data available for this scene, which can be obtained from Landgrebe. The ground truth data image contains elements of 16 classes and has a size of 145×145 pixels. The original image has been cropped to an image of size 128×128 (centered in the original image).
3.2 Coding Performance
The different coding techniques are evaluated on the trade-off between the bit rate (given in bits per pixel, bpp, or equivalently as a compression ratio) and the quality (given as PSNR). The Peak Signal to Noise Ratio (PSNR) is a measure of the similarity between the original image $I$ and the recovered image $I^*$, given in dB; for images with a resolution of $B$ bpp,
$$\mathrm{PSNR} = 10 \log_{10} \frac{(2^B - 1)^2}{\mathrm{MSE}},$$
where the Mean Square Error (MSE) is given by
$$\mathrm{MSE} = \frac{1}{N_x N_y} \sum_{i=1}^{N_x} \sum_{j=1}^{N_y} \left(I_{ij} - I^*_{ij}\right)^2.$$
Table 1 presents the coding results obtained with the analyzed techniques on some images of the considered corpora.
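The two measures used in the evaluation can be written down directly from their definitions. The following is a minimal sketch, assuming images stored as equally sized lists of rows of integer samples; the function names and the tiny 2×2 test image are hypothetical, and the bit rate helper simply divides the original bits per pixel by the compression ratio.

```python
import math


def mse(original, recovered):
    """Mean square error between two images given as lists of rows."""
    total, count = 0, 0
    for row_o, row_r in zip(original, recovered):
        for a, b in zip(row_o, row_r):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(original, recovered, bits=8):
    """Peak signal to noise ratio in dB for images with `bits` bpp."""
    error = mse(original, recovered)
    if error == 0:
        return math.inf                      # identical images
    peak = (1 << bits) - 1                   # 2^B - 1
    return 10 * math.log10(peak ** 2 / error)


def bit_rate(bits_per_pixel, compression_ratio):
    """Bit rate in bpp after compressing by compression_ratio : 1."""
    return bits_per_pixel / compression_ratio


original = [[10, 20], [30, 40]]              # hypothetical 2x2 image
recovered = [[11, 19], [30, 42]]

print(f"MSE  = {mse(original, recovered)}")
print(f"PSNR = {psnr(original, recovered):.2f} dB")
print(f"1024:1 on 8 bpp -> {bit_rate(8, 1024)} bpp")
```

For an 8 bpp image, a compression ratio of 8:1 corresponds to 1 bpp and a ratio of 1024:1 to 0.0078125 bpp.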
Table 1. Assessment of coding performance of several coding techniques (PSNR, in dB)
Image          Size       Comp.Ratio  EZW   IC    SPIHT  CCSDS  JP2
Boats          512×512    32:1        30.5  31.4  31.4   30.2   31.6
                          128:1       25.3  26.0  26.2   23.9   25.8
                          320:1       23.0  23.4  23.6    7.1   22.8
                          1024:1      20.7  20.5  21.3    5.9   17.0
CASI           128×2048   8:1         39.4  39.9  40.0   39.6   39.4
                          32:1        32.7  33.4  33.5   33.0   33.2
                          128:1       28.9  29.3  29.5   28.1   28.9
                          512:1       26.4  26.5  26.7    4.3   25.6
Multispectral  1024×1024  8:1         22.7  23.6  23.3   23.0   23.9
                          32:1        18.6  19.2  19.0   18.5   19.0
                          128:1       16.8  17.3  17.2   14.1   17.2
                          256:1       16.4  16.6  16.6    8.4   16.5
AVIRIS         128×128    8:1         34.5  34.6  35.4   35.0   37.2
                          32:1        23.4  23.2  24.7   23.1   21.5
                          64:1        21.1  19.9  22.2   19.2   15.2
                          128:1       19.0  16.7  19.9    8.9   11.1
4 Conclusions and Future Research
High resolution images are a growing source of data for applied technologies involving scientists from a broad range of disciplines. Because of the increasing use of these applications, and the huge size of the images they manage, such images have to be compressed before they are transmitted or stored. Lossy coding is preferred over lossless coding because it achieves a higher compression ratio, while still ensuring high-quality image recovery. Remote sensing and geographic information systems are examples of such applications, since they use hyperspectral images of huge size and high bits-per-pixel, spatial, and spectral resolution. These applications have some specific demands that are not addressed by the most common still image coding techniques, so that new paradigms have to be devised. In addition, the published results on lossy image coding techniques are not always based on a similar framework, so that comparisons among them are rarely fair enough.
This paper provides an experimental comparison of some lossy image compression algorithms based on the wavelet transform. Four of these techniques, EZW, IC, SPIHT and CCSDS-ILDC, have been implemented in the Java application that we are currently developing; a fifth technique, JPEG2000, is expected to be added in the near future.
Wavelet or subband image coding is an efficient method of image compression, because subbands of the same level have little interband correlation. However, some spatially varying interband energy dependence is often visible in an image subband decomposition across the levels (or scales) of the wavelet pyramid. All reviewed methods are motivated by such significant statistical dependence and all yield an embedded encoder.
Although the analyzed wavelet-based methods present important structural differences, experimental results carried out on several corpora of images with different characteristics show that all the techniques produce approximately the same performance from very low bit rate (0.0078125 bpp, compression ratio of 1024:1) to low bit rate (1 bpp, compression ratio of 8:1).
In regard to future research, and concerning the reviewed techniques, it is of interest to note that: a) JPEG2000, in particular Part 9 of the proposed standard, seems to fit very nicely with the particular requirements of remote sensing and geographic information systems applications; b) SPIHT is the only technique that produces competitive results without the expensive arithmetic coding step; and c) CCSDS-ILDC could be a good choice if an extension of the method, able to deal with wavelet transforms of more levels, were devised. Notice also that we have addressed here neither the performance of the currently recognized state-of-the-art still image coding technique [13] nor the impressive results of the recent work in [6].
Acknowledgements
The authors would like to thank Gilles Moury, from CNES Toulouse, France, and Pen-Shu Yeh, from Goddard Space Flight Center, NASA, USA, for providing the CCSDS Recommendation reports; and the Center for Ecological Research and Forestry Applications (CREAF, UAB, Spain) (http://www.creaf.uab.es) and the Catalan Institute of Cartography (ICC, Spain) (http://www.icc.es) for providing the multispectral images.
References
1. M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies. Image coding using wavelet transform. IEEE Transactions on Image Processing, 1(2):205–220, April 1992.
2. CCSDS. Image lossy data compression. Technical report, Consultative Committee for Space Data Systems, Toulouse, France, September 2002. White Book, Draft Issue 2b.
3. I. Daubechies. Orthonormal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 41:909–996, 1988.
4. ISO/IEC. JPEG2000 image coding system. Technical Report ISO/IEC 15444-1, International Organization for Standardization / International Electrotechnical Commission, 2000.
5. S. Mallat. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7), July 1989.
6. S.E. Qian, M. Bergeron, C. Serele, I. Cunningham, and A. Hollinger. Evaluation and comparison of JPEG2000 and vector quantization based onboard data compression algorithm for hyperspectral imagery. In Int. Geoscience and Remote Sensing Symposium, volume III of Proceedings of IEEE, IGARSS, pages 1820–1822, Toulouse, France, 2003.
7. A. Said and W.A. Pearlman. A new, fast, and efficient image codec based on set partitioning in hierarchical trees. IEEE Transactions on Circuits and Systems for Video Technology, 6:243–250, June 1996.
8. J. Serra-Sagrista. Hyperspectral image coding for remote sensing applications. SPIE Electronic Imaging Newsletters, 13(1):4–8, January 2003.
9. J. Serra-Sagrista, F. Auli, C. Fernandez, and F. Garcia. A JAVA framework for evaluating still image coders applied to remote sensing applications. In International Geoscience and Remote Sensing Symposium, volume VI of Proceedings of IEEE, IGARSS, pages 3595–3597, Toulouse, France, 2003.
10. J. Serra-Sagrista, C. Fernandez, F. Auli, and F. Garcia. Exploring image coding techniques for remote sensing and geographic information systems. In Touradj Ebrahimi and Thomas Sikora, editors, Visual Communications and Image Processing, volume 5150 of Proceedings of SPIE, VCIP, pages 1470–1480, Lugano, Switzerland, 2003.
11. J.M. Shapiro. Embedded image coding using zerotrees of wavelet coefficients. IEEE Transactions on Signal Processing, 41:3445–3462, 1993.
12. J. Tian and R.O. Wells. A lossy image codec based on index coding. In Proceedings of IEEE Data Compression Conference, Snowbird, Utah, USA, 1996.
13. Z. Xiong, K. Ramchandran, and M.T. Orchard. Wavelet packet image coding using space-frequency quantization. IEEE Transactions on Image Processing, 7(6):892–898, June 1998.