IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 6, NO. 4, OCTOBER 2009
Superresolution Reconstruction of Multispectral Data for Improved Image Classification

Feng Li, Student Member, IEEE, Xiuping Jia, and Donald Fraser, Senior Member, IEEE
Abstract—In this letter, the application of superresolution (SR) techniques to multispectral image clustering and classification is investigated and tested using satellite data. A set of multispectral images with better spatial resolution is obtained after an SR technique is applied to several data sets recorded within a short period over a study area. Improved clustering and classification performance is demonstrated visually and quantitatively by comparison with the original low-resolution data and with images enlarged by a conventional interpolation method. This letter illustrates the possibility and feasibility of using SR reconstruction for the classification of remote sensing data, which is encouraging as a means of breaking through current satellite detectors' resolution limits.

Index Terms—Clustering and classification, discrete wavelet transform, maximum a posteriori (MAP), Moderate Resolution Imaging Spectroradiometer (MODIS), superresolution (SR).
Manuscript received January 19, 2009; revised April 2, 2009. First published July 10, 2009; current version published October 14, 2009.

The authors are with the School of Information Technology and Electrical Engineering, The University of New South Wales at The Australian Defence Force Academy, Canberra, ACT 2600, Australia (e-mail: [email protected]; [email protected]; [email protected]).

Digital Object Identifier 10.1109/LGRS.2009.2023604

I. INTRODUCTION

FOR satellite imagery, spatial resolution is one crucial measure of quality. High spatial resolution for multispectral images is required to provide detailed information for land-use and land-cover mapping. Several factors limit the spatial resolution of multispectral sensors, such as orbit altitude, instantaneous field of view, detector technology, platform speed, and revisit cycle. In general, high spatial resolution is associated with low spectral resolution, and vice versa. That is why hyperspectral data, whose spectral resolution can be as fine as 10 nm, have relatively low spatial resolution. Similarly, a panchromatic band covering a wide spectral range of over 300 nm has higher spatial resolution than individual multispectral bands [1]. Since the late 1960s, when the first multispectral sensor was used on Apollo-9, spatial resolution has improved progressively, from the 79-m resolution of the multispectral scanner (MSS) of Landsat-1 to the 20-m resolution of EO-1 Hyperion. The study of spatial-resolution improvement by postprocessing has also been pursued actively as a significant and economical alternative, since once a satellite is launched, it is difficult to upgrade its imaging devices.

Apart from image-fusion and image-sharpening techniques for spatial-resolution improvement, superresolution (SR) mapping (SRM), which superresolves soft-classification results, has been developed in recent years. SRM can be achieved based
on Markov random-field theory [2] or via a pixel-swapping optimization algorithm [3] in order to generate a realistic spatial structure on the refined thematic map. Alternatively, a geostatistical indicator cokriging has been introduced for a noniterative SRM procedure [4]. The weakness of SRM is that it relies heavily on the soft-classification performance of a single low-resolution data set.

A different approach from SRM is to use several data sets captured at different dates to reconstruct a set of high-spatial-resolution images before a conventional hard classification is performed. We refer to this as the SR reconstruction technique in this letter. It can increase the number of informative pixels by making use of the (slight) ground-footprint mismatch from image to image. Classification can then be carried out on these reconstructed higher spatial resolution images without any limitation on the classification techniques adopted.

SR was first introduced for multispectral data processing by Tsai and Huang [5] in the Fourier domain in 1984. Since then, further developments have been made, including SR in the wavelet domain, iterative back-projection approaches [6], and stochastic reconstruction approaches such as maximum a posteriori (MAP) [7]–[9]. Recently, a new SR method, MAP based on a universal hidden Markov tree model (MAP-uHMT) [10], was proposed that uses HMT theory [11], [12] in the wavelet domain to set up a prior model to regularize the ill-conditioned problem. This method can cope with local geometric distortions, radiometric distortion, and noise contamination and, therefore, performs well on panchromatic images [10]. SR reconstruction has been tested for the multispectral sensor Compact High-Resolution Imaging Spectrometer onboard the project for onboard autonomy (CHRIS/Proba) [13], the Moderate Resolution Imaging Spectroradiometer (MODIS) [9], and SPOT4 onboard sensors [14].
However, the impact of the reconstructed data on image classification has not been addressed. In this letter, data reconstruction using the SR technique MAP-uHMT is investigated and tested as a preprocessing stage for multispectral image classification. We show that, by combining several images of multispectral bands captured on different dates, improved spatial resolution is achieved, leading to better classification and clustering results both visually and numerically.

The rest of this letter is organized as follows. In the following section, the SR method MAP-uHMT is briefly reviewed, and its application to MODIS data is introduced. Then, we demonstrate the quality of the clustering and classification results after superresolving multispectral images channel by channel in Section III. Finally, conclusions are provided in Section IV.
II. METHODOLOGY

A. SR Method: MAP-uHMT

First, let us consider the imaging procedure for satellite imagery. The satellite images obtained are likely to suffer from geometric distortion, blurring, and added noise. To simulate these effects, we have

g_i = D H_i M_i z + n_i,    i = 1, ..., K    (1)

where z can be regarded as an ideal multispectral data set obtained by sampling a continuous scene at high resolution (HR); g_i is the ith sensor-recorded low-resolution (LR) multispectral image; K is the number of LR multispectral images; M_i is a warping matrix representing geometric distortion caused by vibration of the satellite platform and air turbulence; H_i is a blur matrix representing the imaging system's modulation transfer function and radiometric distortion, such as that due to atmospheric scattering and absorption; D is a subsampling matrix reflecting the difference between the resolution expected and the resolution provided by the sensor; and n_i represents additive Gaussian noise with zero mean.

The aim of an SR algorithm, then, is to reconstruct the higher resolution image z, which is not achievable by the given sensor, from the set of LR images g_i recorded at different times or by different sensors. The process is performed by image postprocessing for each spectral band separately.

In practice, the warping matrix M_i must be estimated from the LR images, between g_i and a reference selected from among them, so it is impossible to calculate the exact warping matrix M_i; only an approximation M̂_i is available. Similarly, only an approximation Ĥ_i of the blur matrix is available. Then, (1) can be rewritten as

g_i = D Ĥ_i M̂_i z + n̂_i,    i = 1, ..., K    (2)
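The degradation model in (1) and (2) can be illustrated with a toy simulation. This is not the authors' code: the sub-pixel shift stands in for the general warp M_i, a Gaussian blur stands in for H_i, and decimation stands in for D.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, shift

def observe(z, dx, dy, blur_sigma, noise_sigma, factor=2, rng=None):
    """Simulate one LR observation g_i = D H_i M_i z + n_i.

    M_i : sub-pixel shift (a simple stand-in for the general warp),
    H_i : Gaussian blur (a stand-in for the sensor MTF),
    D   : decimation by `factor`,
    n_i : zero-mean Gaussian noise.
    """
    rng = rng or np.random.default_rng(0)
    warped = shift(z, (dy, dx), order=3, mode="nearest")   # M_i z
    blurred = gaussian_filter(warped, blur_sigma)          # H_i M_i z
    decimated = blurred[::factor, ::factor]                # D H_i M_i z
    return decimated + rng.normal(0.0, noise_sigma, decimated.shape)

# Generate K = 5 slightly shifted LR frames from one 128x128 toy HR scene.
z = np.kron(np.eye(8), np.ones((16, 16)))
frames = [observe(z, 0.3 * i, 0.2 * i, 0.8, 0.01) for i in range(5)]
```

Each simulated frame is 64 × 64, i.e., half the HR resolution, mimicking the ground-footprint mismatch between acquisitions that SR reconstruction exploits.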
where n̂_i follows a zero-mean Gaussian distribution. Because the imaging procedure is not invertible, (1) and (2) represent a typically ill-conditioned problem. This implies that an exact solution does not exist. A well-posed problem is preferred, and an approximation to the original SR image z is desired.

The SR method MAP-uHMT is based on the uHMT theory (described in detail in [10]) in the wavelet domain. The reason that we choose to work in the wavelet domain, rather than in the spatial or frequency domains, is that the probability density function (pdf) of the wavelet coefficients can be well characterized by a mixed Gaussian pdf, and the dependences between "parent" coarse-scale wavelet coefficients and their "children" at finer scales can be well described by uHMT. Therefore, using the MAP estimator in the wavelet domain, z̃* is obtained by

z̃* = arg min_z̃ L(z̃)    (3)

where "~" denotes the relevant expression of vectors or matrices in the wavelet domain, and

L(z̃) = α Σ_{i=1}^{K} ‖g̃_i − D̃ H̃_i M̃_i z̃‖² + z̃^T Q z̃    (4)
where Q is a prior model set up as a diagonal matrix by using uHMT. The parameter α describes the variance of the noise as well as the error caused by the inexact motion and blur matrices; it also balances the contribution of the high-frequency information from the other LR images against the prior model. Steepest descent is adopted to calculate the minimum of (4), and the prior model Q is updated for each new estimate of z̃*. After a few iterations, a stable approximate estimate z̃* is available for the ill-conditioned problem [10].

B. MAP-uHMT on Multispectral Images

Classification is a critical process in converting multispectral images into user-required information. Research on classification methods for multispectral images is outside the scope of this letter. What we emphasize here is that SR methods can be used to improve the performance of clustering and classification: with low spatial resolution, there are more mixed pixels, and some objects may not be recognized. For this reason, we investigate the use of our SR method as a preprocessing step for remote sensing image classification.

The precondition for SR is that several groups of multispectral images be captured within a short period of time over the same scene. As more remote sensing platforms become available, images of the Earth's surface will be collected regularly (perhaps every day or every week). They present an opportunity for improving spatial resolution by applying SR techniques. For example, satellites such as Terra and Aqua revisit the entire Earth's surface every one to two days. Because the revisit cycle is so frequent, we can assume that the ground truth has not changed during such a short period.
Even for some long-revisit-cycle satellites such as Landsat-7, as long as the regions of interest are urban areas, concrete buildings, etc., SR methods can still be used for multispectral classification, because these areas will not change much during that period. To eliminate illumination changes during the period, data calibration or normalization can be applied.

III. EXPERIMENTAL RESULTS

In this letter, we test our method using MODIS data sets from the satellite Terra. Terra's orbit around the Earth is timed so that it passes from north to south across the equator in the morning. Although Terra is a Sun-synchronous satellite with a 16-day repeat cycle, its global-observation cycle is one to two days due to its wide swath. The changes in the track of Terra allow the sensor to capture the same ground area from slightly different viewpoints during subsequent passes, which is important for the SR method.

We selected five different Terra Level-1B MODIS multispectral data groups (within the Canberra area of Australia), captured on August 24 and September 2, 9, 11, and 27, 2008, as test images. While the global-observation cycle of Terra is one to two days, clouds appeared frequently between August and September 2008 within the Canberra region, which made the data from some dates less useful. The five data sets selected were
Fig. 1. One Band 02 MODIS data set and its relevant results, including PSNR, within Canberra, Australia. (a) Actual HR image of Band 02 with spatial resolution of 250 m, captured on September 2, 2008. (b) LR image of Band 02 captured at the same time by MODIS but with coarser spatial resolution of 500 m. (c) 2 × 2 enlargement of (b) by bilinear interpolation. (d) SR result by MAP-uHMT. The PSNR is defined as PSNR = 10 log₁₀(256² / ((1/N²) Σ |z − z*|²)), where z is the actual image, z* is the reconstructed image, and N² is the number of pixels.
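The PSNR used in Fig. 1 is a direct transcription of the caption's definition (note that the caption uses a peak value of 256 rather than the more common 255); a minimal sketch:

```python
import numpy as np

def psnr(z, z_star, peak=256.0):
    """PSNR per the Fig. 1 caption:
    10 * log10(peak^2 / ((1/N^2) * sum |z - z*|^2))."""
    z = np.asarray(z, dtype=float)
    z_star = np.asarray(z_star, dtype=float)
    mse = np.mean((z - z_star) ** 2)   # (1/N^2) * sum |z - z*|^2
    return 10.0 * np.log10(peak ** 2 / mse)
```

For example, a uniform error of 16 intensity levels on an 8-bit image gives 10 log₁₀(256²/256) = 10 log₁₀ 256 ≈ 24.1 dB.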
the closest available. Changes of ground cover over this time interval are ignored. The testing site is centered at longitude 149.20 and latitude −35.40. Initially, 2-D correlations are used to compute shifts so that the source images can be coarsely aligned (only integer-pixel shifts are applied). This procedure helps to select the same area in the series of LR images without damaging any detail.

The size of the testing site is 64 km × 64 km. There are 64 × 64 pixels in the coarsest resolution (1-km) multispectral images of the first seven bands (620–670, 841–876, 459–479, 545–565, 1230–1250, 1628–1652, and 2105–2155 nm). Within the same site, we also selected subimages of size 128 × 128 from the 500-m-resolution seven bands of the five data groups. In a similar way, subimages of size 256 × 256 at the 250-m resolution of the five data groups were selected; only two bands are available at that resolution (620–670 and 841–876 nm). The blocks of 256 × 256 pixels and 128 × 128 pixels of Band 02 at resolutions of 250 and 500 m, respectively, are shown in Fig. 1(a) and (b); both were captured on September 2, 2008. The data group of September 2, 2008 is regarded as the reference for the SR method; therefore, its clustering and classification results are regarded as "ground truth."

We conduct two different tests with the subimages of the five data groups. In the first test, we apply the SR method MAP-uHMT to the 1-km resolution of the five MODIS data groups channel by channel. The recovered SR images are found for the following parameter set: expansion factor of 2, Gaussian blur kernel of size 5 × 5 with standard deviation 0.8, and balancing parameter α = 0.1; all image intensities are normalized to [0, 1]. Because the LR 1-km-resolution multispectral images are 64 × 64 pixels, the recovered seven-band SR images are 128 × 128 pixels with a new resolution of 500 m.
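The coarse integer-pixel alignment described above can be sketched with an FFT-based 2-D cross-correlation. The letter does not specify how the correlation is computed, so this implementation is an assumption:

```python
import numpy as np

def integer_shift(ref, img):
    """Coarse alignment: find the whole-pixel shift of `img` relative
    to `ref` at the peak of the 2-D circular cross-correlation (via FFT).
    Returns (dy, dx) such that np.roll(img, (dy, dx), axis=(0, 1))
    re-aligns img with ref."""
    corr = np.fft.ifft2(np.fft.fft2(ref) * np.conj(np.fft.fft2(img))).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices to signed shifts (circular wrap-around).
    if dy > ref.shape[0] // 2:
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return int(dy), int(dx)
```

Applying only the recovered integer shift (no interpolation) selects the same area in each LR frame without altering any pixel values, leaving the sub-pixel misalignment for the SR reconstruction itself to exploit.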
In the second experiment, we apply the SR method MAP-uHMT to the 500-m resolution of Bands 01 and 02 of the five MODIS data groups. The superresolved SR images are based on the same parameter set used in the first test. Again selecting Band 02 as an example, the bilinear interpolation of Fig. 1(b) is shown in Fig. 1(c), and Fig. 1(d) is the reconstructed image of Band 02 with a spatial resolution of 250 m using MAP-uHMT. Peak signal-to-noise ratio (PSNR) is selected as an objective numerical evaluation method
for this single-channel reconstruction. As shown in Fig. 1(c) and (d), an improvement of more than 2 dB is achieved by MAP-uHMT as compared with bilinear interpolation. It can also be seen that the MAP-uHMT superresolved image includes more detail than Fig. 1(c).

A critical step in the MAP-uHMT method is to calculate M̂_i (i.e., image registration). We have adopted an efficient elastic image registration algorithm [15] in lieu of other methods. For the M̃_i matrix, we can achieve this registration by warping the image with matrix M̂_i before transforming back to the wavelet domain, and so avoid the need to set up the exact matrices. Note that a very important point is to calculate M̂_i for a single channel and apply this M̂_i to all other channels. Therefore, all reconstructed higher resolution multispectral channels remain aligned, which is very important for classification. We adopt the Band 02 image as the reference for the following two tests.

A. Classification of Superresolved Images

In this section, we make comparisons using seven bands of MODIS multispectral data. We test the following data sets: 1) the original set of 500-m resolution captured on September 2, 2008 (the classification result using this data set is treated as "ground truth"); 2) the data set of 1-km resolution; 3) a 2 × 2 expansion by bilinear interpolation of the 1-km-resolution set; and 4) a 2 × 2 SR expansion of the 1-km-resolution set by our MAP-uHMT method. Thus, three of the multispectral images, 1), 3), and 4), have a resolution of 500 m and a size of 128 × 128 pixels before classification, while 2) remains at 64 × 64 pixels.

A maximum-likelihood supervised method is used to classify the data into six classes: Dry lake, Open farmland, Eucalypt forest, Lake, Shadow, and Urban and small farms, from the seven bands of these test images. Training data are selected from data set 1) and then transferred to the other data sets.
For each image classification, the statistics are updated based on the new training data from the corresponding training fields. The classification results for the four data sets are given in Fig. 2(a)–(d). We can see that Fig. 2(d) provides the classification result closest to 1). In particular, the shape of Lake Burley Griffin in Canberra in the upper central area (class "Lake" here) more closely matches the shape in the ground-truth result than do the other classification results.
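A Gaussian maximum-likelihood classifier of the kind used here can be sketched as follows. This is a generic illustration, not the MultiSpec implementation; the class means and covariances would be estimated from the training fields of data set 1):

```python
import numpy as np

def ml_classify(pixels, means, covs):
    """Gaussian maximum-likelihood classification: assign each pixel
    (a vector of band values) to the class with the highest Gaussian
    log-likelihood. The constant term is omitted because it is common
    to all classes of equal dimensionality."""
    scores = []
    for mu, cov in zip(means, covs):
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        d = pixels - mu
        # Mahalanobis distance d^T * inv(cov) * d for every pixel at once.
        maha = np.einsum("ij,jk,ik->i", d, inv, d)
        scores.append(-0.5 * (logdet + maha))
    return np.argmax(scores, axis=0)

# Two hypothetical classes in a 2-band feature space.
means = [np.zeros(2), np.full(2, 10.0)]
covs = [np.eye(2), np.eye(2)]
labels = ml_classify(np.array([[0.1, 0.0], [9.8, 10.2]]), means, covs)
```

With seven MODIS bands, each pixel is a 7-vector and each class contributes a 7 × 7 covariance matrix estimated from its training pixels.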
Fig. 2. Supervised-classification results of the first seven bands of MODIS. (a) Actual HR. (b) LR. (c) Bilinear. (d) MAP-uHMT.

TABLE I. Comparison of Supervised-Classification Results in Number of Pixels and Accuracy.
Fig. 3. Unsupervised-classification results of MODIS Bands 01 and 02. (a) Actual HR images. (b) LR images. (c) Interpolated images. (d) MAP-uHMT superresolved images.
Furthermore, numerical comparisons support the visual comparisons. In Table I, the numbers in the first row are regarded as the ground truth. The numbers listed in the second, third, and fourth rows are the correctly classified pixel counts in the relevant classes using data sets 2)–4). From this table, we can also see that the overall classification accuracy for the images superresolved by MAP-uHMT is the best, at 69.2%, as compared with 57.7% for the bilinear interpolated images and 51.6% for the LR images. Similar results are achieved for the Kappa coefficient [1] in evaluating the mapping accuracy, as can be seen in the last column of Table I. Note that, although the seven-band LR multispectral image is 64 × 64 pixels, its classification result was enlarged by 2 × 2 using nearest neighbor interpolation for a better numerical comparison.

B. Clustering on Superresolved Images

In this section, we make comparisons using two bands of MODIS multispectral data. We test the following data sets: 1) the actual HR set of 250-m resolution captured on September 2, 2008 (the clustering result using this data set is treated as "ground truth"); 2) the data set of 500-m resolution; 3) a 2 × 2 expansion by bilinear interpolation of the 500-m
resolution set; and 4) a 2 × 2 SR expansion of the 500-m-resolution set by our MAP-uHMT method. Thus, three of the multispectral images, 1), 3), and 4), have a resolution of 250 m and a size of 256 × 256 pixels before clustering, while 2) remains at 128 × 128 pixels.

An iterative self-organizing data-analysis unsupervised method is adopted. We classify all images into three clusters with a minimum cluster size of 30. Clustering results using data sets 1)–4) are shown in Fig. 3. Visually, the clustering result in Fig. 3(d) provides more detailed information, similar to that of Fig. 3(a), showing the superior performance provided by the SR data set.

Furthermore, we assess the unsupervised clustering results by comparing them numerically with the reference shown in Fig. 3(a), as listed in Table II. In order to perform pixel-by-pixel checking, the result from the LR data (128 × 128) was enlarged by a factor of two using nearest neighbor interpolation. As shown in Table II, an overall accuracy of 84.9% is achieved for the unsupervised-classification results on the superresolved images, as compared with 76.1% and 75.0% for the bilinear interpolated images and the LR images, respectively. Moreover, the Kappa coefficient is improved to 0.77, from 0.62 (using the bilinear interpolated data) and 0.64 (using the LR data).
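The overall accuracy and Kappa coefficient reported above can both be derived from a confusion matrix between the reference labels and the labels of each test data set; a minimal sketch:

```python
import numpy as np

def accuracy_and_kappa(truth, pred, n_classes):
    """Overall accuracy and Kappa coefficient from a confusion matrix."""
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(truth, pred):
        cm[t, p] += 1
    n = cm.sum()
    po = np.trace(cm) / n                   # observed agreement (overall accuracy)
    pe = (cm.sum(axis=0) @ cm.sum(axis=1)) / n ** 2   # chance agreement
    kappa = (po - pe) / (1.0 - pe)
    return po, kappa
```

Kappa discounts the agreement expected by chance, which is why it can rank two maps differently from overall accuracy alone.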
TABLE II. Comparison of Clustering Results.
Therefore, again, our superresolved multispectral images achieve better classification both visually and numerically.

IV. CONCLUSION

For the first time, SR reconstruction has been investigated and tested as a preprocessing stage for multispectral image classification. By combining several images of multispectral bands captured on different dates, improved spatial resolution is achieved by an SR reconstruction technique, which leads to better classification and clustering results both visually and numerically, as shown in our experiments. As remote sensing data are collected more and more frequently, SR techniques can be further tested and applied more widely in the near future.

ACKNOWLEDGMENT

This work was carried out with the support of a University College Postgraduate Research Scholarship (UCPRS) of The University of New South Wales at The Australian Defence Force Academy (UNSW@ADFA). All MODIS data were downloaded from the Australian Centre for Remote Sensing, courtesy of Geoscience Australia. Classification and clustering were performed with MultiSpec, courtesy of Purdue University.
REFERENCES

[1] J. Richards and X. Jia, Remote Sensing Digital Image Analysis. New York: Springer-Verlag, 1986.
[2] T. Kasetkasem, M. Arora, and P. Varshney, "Super-resolution land cover mapping using a Markov random field based approach," Remote Sens. Environ., vol. 96, no. 3/4, pp. 302–314, Jun. 2005.
[3] P. Atkinson, "Super-resolution target mapping from soft classified remotely sensed imagery," Photogramm. Eng. Remote Sens., vol. 71, no. 7, pp. 839–846, 2005.
[4] A. Boucher and P. Kyriakidis, "Super-resolution land cover mapping with indicator geostatistics," Remote Sens. Environ., vol. 104, no. 3, pp. 264–282, Oct. 2006.
[5] R. Tsai and T. Huang, "Multi-frame image restoration and registration," Adv. Comput. Vis. Image Process., vol. 1, pp. 317–339, 1984.
[6] M. Irani and S. Peleg, "Improving resolution by image registration," CVGIP, Graph. Models Image Process., vol. 53, no. 3, pp. 231–239, May 1991.
[7] R. R. Schultz and R. L. Stevenson, "A Bayesian approach to image expansion for improved definition," IEEE Trans. Image Process., vol. 3, no. 3, pp. 233–242, May 1994.
[8] R. R. Schultz and R. L. Stevenson, "Extraction of high-resolution frames from video sequences," IEEE Trans. Image Process., vol. 5, no. 6, pp. 996–1011, Jun. 1996.
[9] H. Shen, M. Ng, P. Li, and L. Zhang, "Super-resolution reconstruction algorithm to MODIS remote sensing images," Comput. J., vol. 52, no. 1, pp. 90–100, Jan. 2009.
[10] F. Li, X. Jia, and D. Fraser, "Universal HMT based super resolution for remote sensing images," in Proc. ICIP, Oct. 2008, pp. 333–336.
[11] M. S. Crouse, R. D. Nowak, and R. G. Baraniuk, "Wavelet-based statistical signal processing using hidden Markov models," IEEE Trans. Signal Process., vol. 46, no. 4, pp. 886–902, Apr. 1998.
[12] J. K. Romberg, H. Choi, and R. G. Baraniuk, "Bayesian tree-structured image modeling using wavelet-domain hidden Markov models," IEEE Trans. Image Process., vol. 10, no. 7, pp. 1056–1068, Jul. 2001.
[13] J. Chan, J. Ma, and F. Canters, "A comparison of superresolution reconstruction methods for multi-angle CHRIS/Proba images," in Proc. SPIE—Image and Signal Processing for Remote Sensing XIV, 2008, vol. 7109, p. 710904.
[14] F. Laporterie-Dejean, G. Flouzat, and E. Lopez-Ornelas, "Multitemporal and multiresolution fusion of wide field of view and high spatial resolution images through morphological pyramid," Proc. SPIE, vol. 5573, pp. 52–63, 2004.
[15] F. Li, D. Fraser, X. Jia, and A. Lambert, "Improved elastic image registration method for SR in remote sensing images," presented at the Signal Recovery and Synthesis, Vancouver, BC, Canada, 2007, Paper SMA5.