Can. J. Remote Sensing, Vol. 34, No. 3, pp. 271–286, 2008
Noise estimation in a noise-adjusted principal component transformation and hyperspectral image restoration
Bing Xu and Peng Gong
Abstract. We apply a noise-adjusted principal component transformation (NAPCT) to an Earth Observing 1 (EO-1) Hyperion image whose noise structure is typically unknown. In this paper, we propose to simulate and estimate the noise covariance structure from either a body of water, such as an ocean or lake, or a horizontal piece-wise delineation along a spatially homogeneous area. The effect is compared to that of the near-neighbor difference method used in some of the literature. A strategy is proposed for efficiently and accurately locating the noisy bands, particularly the striping bands and the striping columns. It automates the task of manually examining each band and is particularly useful for hyperspectral data. We illustrate algorithmically that NAPCT can be implemented by applying the procedure used in linear discriminant analysis (LDA). The resultant images of NAPCT are compared to those from the standard principal component transformation (PCT). By using the first 10 NAPCT bands (almost striping and noise free), which explain 99.8% of total data variability, we can reproject the NAPCT image back onto the original spectral space for visualization and image enhancement. The quality of the restored hyperspectral image is greatly improved.
Introduction
The Earth Observing 1 (EO-1) Hyperion is the only hyperspectral sensor operating in space (NASA, 1996). A hyperspectral sensor has contiguous narrow wavelength bands (about 10 nm each) that are able to capture more subtle spectral details of the objects on the ground than a multispectral sensor (about 100 nm each). However, the quality of some of the bands, particularly water vapor absorption bands, may be degraded due to insufficient energy within the narrow wavelength band captured by the sensor or the disturbance of water absorption.
Striping also appears in some of the bands. Here, we generally refer to these factors as “noise.” Such noise is clearly undesirable in image classification and biophysical parameter extraction with Hyperion data (e.g., Gong et al., 2003). Problems are encountered with direct use of either a hyperspectral image or feature-reduced images. Statistical estimation from noisy bands can be unreliable. Reducing bands, a common practice in dealing with high-dimensional data, can be done for the purpose of either image representation or image classification (Fukunaga, 1990). However, the high variability of the noise, especially the striping behavior, is
Received 30 November 2007. Accepted 23 May 2008. Published on the Canadian Journal of Remote Sensing Web site at http://pubs.nrc-cnrc.gc.ca/cjrs on 29 August 2008.
B. Xu. Department of Geography, University of Utah, Salt Lake City, UT 84112, USA; and Department of Environmental Science and Engineering, Tsinghua University, Beijing, China.
P. Gong.1 Department of Environmental Science, Policy and Management, 137 Mulford Hall, University of California, Berkeley, CA 94720-3114, USA; and State Key Laboratory of Remote Sensing Science, Institute of Remote Sensing Applications of the Chinese Academy of Sciences and Beijing Normal University, Beijing 100101, China.
1 Corresponding author (e-mail: [email protected]).
© 2008 CASI
Vol. 34, No. 3, June/juin 2008
mistakenly treated as variability of the signals and, under the assumption that the noise has typically equal and low variability, preserved within the principal components (PCs). In certain PC bands, artifacts such as vertical stripes and systematic variations in brightness from one portion of the image to another across the lines can be visually observed (Liew et al., 2002). Similar phenomena are found in a transformed discriminant space (Xu and Gong, 2007). Green et al. (1988) proposed a transformation called maximum noise fraction (MNF), later called noise-adjusted principal component transformation (NAPCT) (Lee et al., 1990), that suppresses the effect of noise. This transformation produces component images ordered by the signal-to-noise variance ratio rather than total variance and involves a noise-whitening process and a principal component transformation based on the noise-whitened image (Roger, 1994; Chang and Du, 1999). This two-stage process can be implemented through simultaneous diagonalization of the two covariance matrices involved in these two PCTs (Roger, 1994). Chang and Du (1999) proposed an interference noise adjusted PCT that suppresses the effects of uninteresting signals and noise to detect targets of interest. However, the NAPCT requires that the noise covariance structure be known or accurately estimated. It is suggested that the sensor dark references that record the systematic noise structure be used, but these may be unavailable for most sensors, particularly the Hyperion sensor. It is the random noise present in the data whose covariance structure needs to be estimated, and this paper presents a method for doing so. The near-neighbor difference method has been applied to estimate the noise covariance in GER and HYDICE data (Green et al., 1988; Lee et al., 1990; Chang and Du, 1999).
The residual-scaled principal component transform was developed and tested using airborne visible/infrared imaging spectrometer (AVIRIS) data by Roger (1996) and Roger and Arnold (1996) and compared with near-neighbor difference estimates (Chang and Du, 1999). The idea of the near-neighbor difference is that the spatial autocovariance of adjacent pixels is supposed to be null if noise is not present, so any nonzero value is treated as noise. This measurement is generally applicable to spatially homogeneous areas. However, for spatially heterogeneous areas, the variation of signals will be mistakenly included as noise, especially along densely distributed surface boundaries among various land categories. Therefore, applying the near-neighbor difference method to the whole image to search for noise may not be appropriate. In this paper, we develop a strategy for noise estimation and compare it with the near-neighbor difference method.
A number of spectral parameters are derived at or near the water or oxygen absorption bands. Gong et al. (1997; 2001; 2002) measured in situ hyperspectral data to establish a spectral library, recognize different species of conifers, and further extract ecological parameters by data transformation and selection of biophysically sensitive bands. These sensitive bands are important but can be noisy if they fall within the range of absorption by water or oxygen. As the hyperspectral data contain detailed spectral information, we do not want to
simply ignore these striping and noisy bands but to preserve and restore as much of the spectral signal as possible. We then ask the following question: Is it possible to restore the Hyperion data to their original spectral space with most of the noise removed? We address this question, compare the restored band with the original band, and compare the correlation matrix of the restored image with that of the original image.
Study scene and data description
Four different portions of land surface cover are chosen from a Hyperion scene (256 pixels wide, 6478 lines long, 30 m spatial resolution) and organized into one mosaic image of 256 pixels by 520 lines. A false color display of the mosaic Hyperion image is shown in Figure 1. The image was taken on
Figure 1. False color display of the study scene, a mosaic Hyperion image taken on 17 January 2001. Salt evaporators in different colors with different salt concentrations visibly exhibit a smooth texture. Golf courses and reservation parks are scattered, represented in a homogeneously bright red color. The airport and industrial areas appear bright because of high reflectance of the concrete or sand covers which occupy a considerable amount of space in these areas. The bay shore freeway 101 runs horizontally across the middle of the scene. Forests and shadows combine into a unique texture. Ocean appears homogeneously black.
Canadian Journal of Remote Sensing / Journal canadien de télédétection
Figure 2. Band correlation matrix of (a) the original Hyperion data and (b) the restored Hyperion data presented in image form.
17 January 2001. The uppermost portion is the southern city of Fremont, California, containing an industrial area and salt evaporators at the southern end of San Francisco Bay. The second portion is the city of Mountain View across San Francisco Bay to the south and includes a naval air station at Moffett Field, golf courses, reservation parks, and urban residential areas. The third portion runs through a small range of the Santa Cruz Mountains. The lowest portion displays a slice of the Pacific Ocean. The diverse landscapes in this scene range from colored salt evaporators and San Francisco Bay in the north, to airport and the urban area in the middle, and to the mountainous regions and the Pacific Ocean. The Hyperion data have 242 bands (22 bands with overlapped wavelengths) with 12 bit quantization and a spectral range covering 400–2500 nm in wavelength. The spatial resolution is 30 m. Empty bands (38 in number) have been removed, so 204 bands are used for analysis. Figure 2a shows the band correlation matrix of the Hyperion data. The diagonal line indicates the highest correlation (i.e., 1), which is represented in white. The darker gray tone corresponds to lower absolute values of the correlation. We can see that the contiguous bands along the diagonal line appear “in white blocks” showing a high correlation among them. The black stripes across the whole range of the bands are generally noisy or striping bands caused mainly by water vapor absorption.
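A band correlation matrix of the kind shown in Figure 2a can be computed directly from the image cube. The following is a minimal sketch in Python with NumPy, assuming the image is held in memory as a (lines, pixels, bands) array. The function name `band_correlation` and the small synthetic cube are illustrative assumptions and not part of the original processing chain.

```python
import numpy as np

def band_correlation(cube):
    """Return the band-by-band correlation matrix of a (lines, pixels, bands) cube."""
    lines, pixels, bands = cube.shape
    spectra = cube.reshape(lines * pixels, bands).astype(np.float64)
    # rowvar=False: each column (band) is a variable, each row (pixel) an observation.
    return np.corrcoef(spectra, rowvar=False)

# Synthetic stand-in for an image subset (hypothetical sizes, not the paper's data).
rng = np.random.default_rng(0)
cube = rng.normal(size=(20, 16, 6))
# Make band 1 nearly a copy of band 0 so the two are highly correlated.
cube[..., 1] = cube[..., 0] + 0.1 * rng.normal(size=(20, 16))

corr = band_correlation(cube)
display = np.abs(corr)  # absolute correlation, suitable for a gray-tone rendering
```

Rendering `display` as an image with white for 1 and black for 0 would give the kind of picture described for Figure 2a, where blocks of highly correlated contiguous bands appear bright along the diagonal.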
Strategy of noise estimation
Target selection as noise
We need to know where noise is in the image and how to find it before we can remove its effects. In this study, we look for suitable targets in the image itself to accurately estimate the noise covariance structure. The noise is usually supposed to have a zero mean and unit variance and to be uncorrelated
between bands. However, here we relax these requirements. Objects spatially occupying a relatively large area (to avoid spectrally mixed pixels) are ideal targets. A nonzero mean is not a problem because here we only consider the covariance structure. Oceans without breaking waves, calm bays, or deep lakes are good choices. Other spatially homogeneous objects such as grassland, rangeland, cleared land, or open concrete sites are also good targets if water is not available in the image. When choosing sites to use among the suitable targets, we bear in mind that we want to remove the effects of both striping and low-variance noise. Noise with low variance can be uniformly distributed over the whole image. However, falling-off and striping may occur only in certain columns of certain bands of the image. If we select sample sites where the striping does not exist but where it appears in some other locations, we will not be able to capture the striping behavior of that particular band, and consequently it will not be included in the estimation of the noise covariance matrix. We assume that falling-off and striping happen in a vertical fashion and run through all rows in the image. We made this assumption after examining the striping behavior of the Hyperion image and finding it to be reasonable. One simple way to deal with this problem is to select suitable targets piece-wise along the horizontal direction of the image so that striping columns will not be missed. However, this method of selection may not be practical, especially when there is a lack of suitable targets in the image scene. Therefore, the question of where to select among the suitable targets becomes how to locate the falling-off and striping columns with high variances that pass through the suitable targets. The first step is to locate the striping bands. The logarithm of the total variance for each band is shown in Figure 3.
The abnormally high variance bands are the striping and falling-off bands, which we confirmed by checking them individually. The thresholds for high and low variance are
Figure 3. Logarithm of the total variance for each band. High- and low-variance bands are labeled and identified and are striping and noisy bands, respectively.
visually set to be 5.8 and 2.7, respectively. High variance occurs at bands 54, 73, 78, 95, 147, 148, 169, 179, 180, and 182 with central wavelengths of 892, 1084, 1134, 1306, 1830, 1840, 2052, 2153, 2163, and 2183 nm, respectively. The low-variance bands generally contain little information, and the noise is usually spatially uniform. For later comparison, we identified those noisy bands from the variance plot at bands 100–105, 145, 146, 149–160, 164, and 165, with central wavelengths of 1356–1406, 1810, 1820, 1850–1961, 2002, and 2012 nm. These are mainly water absorption bands. The second step is to locate all the striping columns in the image scene from these striping bands. We simply add the gray values of these striping bands and then take the average so that we get a new image scene with all striping columns displayed in one band only (Figure 4). This serves as a guideline for selection of suitable noise targets. The final step is to piecewise delineate the noise targets so that they cover all the striping columns in the scene. In this way we will not lose the striping information for a particular band, and this information will be included in the noise structure estimation.
Noise reduction as proposed in the original NAPCT was for suppression of noise caused by the sensor. If that is the case, noise structure determination needs to be done only once with an ideal Hyperion image as long as the sensor properties do not change with time. This would require analyzing many Hyperion images and selecting the one that best represents the statistical structure of the noise produced by the sensor. Hyperion images of homogeneous surfaces are usually the best choice for doing so, but in this research the noise structure inevitably includes the contribution of noise that is
not from the sensor itself. In this sense, the method applied in this research is somewhat scene dependent. It would permit us to reduce the sensor-caused noise and possibly noise caused by the atmosphere and other unknown sources during the data processing.
Methods of noise estimation
Within-site covariance matrix
We selected five salt evaporator sites, including three sites (s1, s2, and s3) in the right half of the image scene and two sites (s4 and s5) in the left half. Two ocean surface sites (o1 and o2) spanning the horizontal direction, one in the left half and the other in the right half of the study scene, are also delineated as areas for noise estimation. The locations of five water sites are shown in Figure 4 as differently coloured polygons. The base-10 logarithm of the variance for each site is plotted against the band number (Figures 5a–5c). The mean logarithm of the variance of all seven sites is displayed in Figure 5d. We can see from the plot that the sites located in the left half and right half of the image have different striping behaviors, indicating that either half of the scene captures the striping information only incompletely. The mean logarithm variance curve is almost the same as that for the ocean, as the sample sizes of the two ocean sites are much larger than those of the five salt evaporator sites. Thus, these two sites are given higher weights in contributing to the mean variance (Table 1). We also visualize the mean within-site covariance matrix and the covariance matrix obtained from each selected site to determine which water site better describes noise and also for later comparison (Figure 6). If one suitable noise target such as ocean or lake is well stretched horizontally, the best choice is to simply use this site to simulate the noise. We end up using the mean covariance matrix because it resembles the scaled sum of the two ocean sites and seems to capture all abnormal behavior of the individual bands. In case water is not available in the scene, we need to piecewise choose spatially homogeneous sites as suitable targets.
Let us denote the digital number of a pixel as $dn_{i,n}$ for band $i$ and pixel $n$ ($i = 1, \ldots, I$ and $n = 1, \ldots, N$, where $I$ and $N$ are the total number of bands and pixels, respectively, in an image). The vector representation for pixel $n$ is $\mathbf{dn}_n = (dn_{1,n}, \ldots, dn_{I,n})^T$. The estimates of the mean vector and within-site covariance matrix from the noise samples are denoted by $\hat{\mu}(c)$ and $\Sigma_W$ ($c = 1, \ldots, C$, where $C$ is the total number of sites). Here,

$$\Sigma_W = \sum_{c=1}^{C} \pi_c \frac{1}{N_c} \sum_{n=1}^{N_c} [\mathbf{dn}_n - \hat{\mu}(c)][\mathbf{dn}_n - \hat{\mu}(c)]^T \qquad (1)$$

where $\pi_c$ is the proportional sample size at site $c$ relative to the total number of samples, and $N_c$ is the sample size of site $c$. The formulation is exactly the same as that for the within-class covariance matrix except that we focus on site rather than class, hoping to capture the spatial variability of the noise along the horizontal direction and enable this information to be included in the spectral covariance matrix. Figure 7 depicts the logarithm of the variance along the bands of each site and the mean variance weighted by all selected sites. These sites are shadow, forest, lawn, park, airport, ocean, and salt evaporators. We observe that the variance plot of the water site is different from that of the other sites. The variance for water is generally lower than for other sites, especially in certain visible and near-infrared bands. We believe that the variance among selected homogeneous sites is due not only to noise but also to the heterogeneity in the nature of the signals. This is addressed in more detail in the section entitled Hyperspectral image restoration.
Figure 4. Striping columns are displayed and serve as a site guideline for the selection of noise targets. The coloured polygons are selected water sites for noise estimation.
Spatial autocovariance matrix
The near-neighbor difference method captures the spatial autocovariance of any given pair of bands. Let us denote Cov($i$, $j$) as the autocovariance of the selected sample area between band $i$ and band $j$:

$$\mathrm{Cov}(i, j) = \frac{1}{N-1} \sum_{n=1}^{N-1} (dn_{i,n} - dn_{i,n+1})(dn_{j,n} - dn_{j,n+1}) \qquad (2)$$

The same objects should have zero autocovariance. However, the same object may have a nonzero covariance if either noise or signal is present. As discussed in the Introduction, the standard strategy of using the whole image scene to calculate the near-neighbor difference may not be reasonable, particularly along the boundaries of different surface covers or within the same cover type when a complicated structure exists. Figures 8a and 8c illustrate the spatial autovariance of the whole image and the ocean site, respectively; and Figures 8b and 8d display the spatial autocovariance matrix of the whole image and the ocean site, respectively. We observe higher variance and covariance when the whole image scene is used. Again, this may be due not only to noise but also to signals.
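Equations (1) and (2) can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' implementation: the function names, the (pixels, bands) array layout, and the synthetic sites are assumptions, and the near-neighbor version assumes that consecutive rows of the input are spatially adjacent pixels (for example, pixels along one image line).

```python
import numpy as np

def within_site_covariance(sites):
    """Equation (1): sample-size-weighted mean of the per-site covariance matrices.

    sites: list of (N_c, I) arrays, one per site; each row is a pixel spectrum.
    """
    total = sum(s.shape[0] for s in sites)
    n_bands = sites[0].shape[1]
    sigma_w = np.zeros((n_bands, n_bands))
    for s in sites:
        s = s.astype(np.float64)
        centered = s - s.mean(axis=0)                # dn_n minus the site mean
        cov_c = centered.T @ centered / s.shape[0]   # (1/N_c) * sum of outer products
        sigma_w += (s.shape[0] / total) * cov_c      # weight pi_c = N_c / total
    return sigma_w

def near_neighbor_covariance(pixels):
    """Equation (2): covariance of differences between consecutive pixels.

    pixels: (N, I) array whose consecutive rows are assumed to be neighbors.
    """
    d = np.diff(pixels.astype(np.float64), axis=0)   # neighbor differences, N-1 rows
    return d.T @ d / d.shape[0]                      # divide by N - 1

# Synthetic homogeneous sites: constant signal plus white noise (hypothetical values).
rng = np.random.default_rng(1)
noise_sd = np.array([1.0, 2.0, 0.5])
site_a = 10.0 + rng.normal(size=(4000, 3)) * noise_sd
site_b = 50.0 + rng.normal(size=(6000, 3)) * noise_sd

sigma_w = within_site_covariance([site_a, site_b])
sigma_nn = near_neighbor_covariance(site_a)
```

In this toy example the noise is white, so differencing neighboring pixels doubles the variance and `sigma_nn` comes out at roughly twice `sigma_w`. A constant scale factor of this kind rescales the NAPCT eigenvalues but does not change the order of the components.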
Figure 5. Logarithm of variance of selected water samples for each band.

Table 1. Training sample sizes of water.

| Site | Training sample size (pixels) |
|---|---|
| Salt 1 (s1) | 956 |
| Salt 2 (s2) | 1 284 |
| Salt 3 (s3) | 722 |
| Salt 4 (s4) | 417 |
| Salt 5 (s5) | 418 |
| Ocean 1 (o1) | 11 346 |
| Ocean 2 (o2) | 10 502 |

NAPCT and its efficiency
NAPCT
The NAPCT involves two steps of transformation (Green et al., 1988; Lee et al., 1990; Roger, 1994; Chang and Du, 1999; Geng and Zhao, 2007). The observation model is as follows:

$$z = s + n \qquad (3)$$

where $z$ is the observed data vector, $s$ is the signal vector, and $n$ is the noise vector. The covariance structure $\Sigma_z$ is estimated from the entire scene, and $\Sigma_n$ is estimated from the noise. The first step is to look for a transformation matrix $F$ that whitens the noise, i.e., turns the covariance matrix of the noise into the identity matrix. The whitening concept has been widely introduced and explained (Fukunaga, 1990; Duda et al., 2001):

$$F^T \Sigma_n F = I \qquad (4)$$

When we diagonalize the estimated noise covariance matrix, we get $E^T \Sigma_n E = \Delta_n$ and $E^T E = I$, where $E$ is the eigenvector matrix and $\Delta_n$ is the diagonal matrix of the eigenvalues of $\Sigma_n$, so $F = E \Delta_n^{-1/2}$. Then we find the covariance matrix of the noise-whitened image to be $\Sigma_z^{NA} = F^T \Sigma_z F$.
The second step of NAPCT is to look for another transformation matrix $G$ that orders the variance of the noise-whitened image or, in other words, that applies the standard principal component transformation:

$$G^T \Sigma_z^{NA} G = \Lambda_z^{NAPCT}, \quad G^T G = I \qquad (5)$$

The final transformation matrix will be $H = FG$. Projecting the original image by $H$, the variability of the noise will be suppressed, as $H^T \Sigma_n H = I$. The signals will be ordered according to their variability. A fast implementation of NAPCT is proposed by Roger (1994).
From the perspective of image representation, NAPCT maximizes the signal-to-noise ratio (Lee et al., 1990). It looks for a projection matrix $H^{NAPCT} = (h_1, \ldots, h_I)$, whose $j$th column vector $h_j$ produces the $j$th component image. To maximize the ratio $g^{NAPCT} = \sigma_z^2 / \sigma_n^2$, where $\sigma_z^2 = h^T \Sigma_z h$ and $\sigma_n^2 = h^T \Sigma_n h$ by definition, we differentiate $g^{NAPCT}$ with respect to $h$ and set it to 0, yielding

$$\Sigma_n^{-1} \Sigma_z h = g^{NAPCT} h \qquad (6)$$

where $h$ are the eigenvectors corresponding to the nonzero eigenvalues of $\Sigma_n^{-1} \Sigma_z$. Such unconstrained maximization is equivalent to maximizing $h^T \Sigma_z h$ subject to $h^T \Sigma_n h = 1$. NAPCT resembles the formula for linear discriminant analysis (LDA). From the image classification point of view, LDA searches for successive linear combinations of the data such that the class means are spread out as much as possible relative to the within-class variation (Yu et al., 1999). From the image representation point of view, NAPCT looks for the linear
Figure 6. Mean within-site covariance matrix and the covariance matrix for each of the seven water sample sites presented in image form.
Figure 7. Logarithm of variance of samples collected from homogeneous sites for each band.
combinations of the data such that the total variation (in the sense of signal variation) is stretched out as much as possible relative to the noise variation. The way of solving NAPCT does not differ from that of implementing an LDA. The intrinsic dimension of the covariance matrix of the entire scene $\Sigma_z$, corresponding to the number of nonzero eigenvalues, will be much higher than that of the between-class covariance $\Sigma_B$ in LDA.
The procedure for implementing LDA can be applied to the NAPCT except that we replace the between-class covariance with total covariance and replace the within-class covariance with the within-site covariance as the noise structure. Within-site covariance can be obtained by selecting suitable targets as noise in the same way as that for within-class covariance. Similar to $\sigma_W^2$ in LDA, $\sigma_n^2$ in NAPCT is also in the
Figure 8. (a, c) Spatial autovariance plot of (a) the whole image and (c) the ocean site. (b, d) Spatial autocovariance matrix presented in image form of (b) the whole image and (d) the ocean site.
denominator, which is the factor that needs to be suppressed and minimized. LDA is widely available as a standard function, so any existing software package with an LDA function can be used to implement NAPCT.
Efficiency of NAPCT
Because most of the hyperspectral space is redundant, the useful information can be kept in a low-dimensional space. The intrinsic dimensionality is generally less than 10 (Harsanyi and Chang, 1994). Nine features extracted from the hyperspectral data gave the best accuracy in an urban land-use classification experiment (Xu and Gong, 2007). In this study, we use the first 10 features for visualization. For comparison, we display the first 10 component images obtained by direct use of the PCT (Figure 9). Both serious stripes and noise are evident in the resultant first 10 images and are also found in most of the other bands. The resultant images are evidently forced along the direction of the noise because of its high variability. Therefore, noise removal is necessary.
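The two-step NAPCT of Equations (4) and (5) can be sketched as follows. This is a minimal NumPy illustration, assuming that the noise covariance matrix is symmetric positive definite so that it can be whitened. The function name `napct` and the random test matrices are hypothetical and serve only to show that the resulting projection matrix satisfies the stated properties.

```python
import numpy as np

def napct(sigma_z, sigma_n):
    """Return (H, lam) with H^T Sigma_n H = I and H^T Sigma_z H = diag(lam)."""
    # Step 1: noise whitening, F = E Delta_n^{-1/2} (Equation 4).
    delta_n, E = np.linalg.eigh(sigma_n)
    F = E / np.sqrt(delta_n)            # divides column k of E by sqrt(delta_n[k])
    # Step 2: standard PCT of the noise-whitened covariance (Equation 5).
    sigma_na = F.T @ sigma_z @ F
    lam, G = np.linalg.eigh(sigma_na)
    order = np.argsort(lam)[::-1]       # largest variance-to-noise ratio first
    lam, G = lam[order], G[:, order]
    return F @ G, lam                   # H = F G

# Random symmetric positive definite stand-ins for the two covariance matrices.
rng = np.random.default_rng(2)
A = rng.normal(size=(5, 5))
B = rng.normal(size=(5, 5))
sigma_n = A @ A.T + 0.1 * np.eye(5)
sigma_z = sigma_n + B @ B.T             # total covariance = noise + signal

H, lam = napct(sigma_z, sigma_n)
```

The columns of `H` also satisfy Equation (6), so the same result can be obtained from a generalized symmetric eigenproblem solver such as `scipy.linalg.eigh(sigma_z, sigma_n)`, which returns the eigenvalues in ascending order.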
We test the two methods of noise estimation. For each method we test the NAPCT on two scenarios that select different sites as noise samples (Table 2). As discussed in the section entitled Methods of noise estimation, we select water sites including five salt evaporators and two ocean areas to estimate the within-site covariance as the noise structure (scenario 1). Figure 10 shows the first 10 NAPCT resultant component images. They greatly improve upon the image quality obtained by direct PCT application. How much data variability do these first 10 features explain with the noise whitened? In other words, what is the efficiency of this transformation? Recalling Equation (5), we get $H^T \Sigma_z H = \Lambda_z^{NAPCT}$. Each diagonal element of $\Lambda_z^{NAPCT}$ divided by the sum of all the diagonal elements represents the percentage of signal variance that the transformed component image explains. We notice that the variance is the total variance with noise normalized, as seen in the following equation, and we regard it as signal variance:

$$H^T H = (FG)^T FG = G^T \Delta_n^{-1} G = \Delta_n^{-1} \qquad (7)$$
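The percentage of variation explained by the leading components follows directly from the diagonal elements of the eigenvalue matrix. A small sketch, with hypothetical eigenvalues chosen only for illustration (the function names are likewise assumptions):

```python
import numpy as np

def cumulative_variation(eigenvalues):
    """Cumulative percentage of variation explained by the ordered components."""
    lam = np.sort(np.asarray(eigenvalues, dtype=np.float64))[::-1]  # largest first
    return 100.0 * np.cumsum(lam) / lam.sum()

def components_needed(eigenvalues, percent):
    """Smallest number of leading components whose cumulative share reaches percent."""
    cum = cumulative_variation(eigenvalues)
    # searchsorted gives the first index with cum[index] >= percent.
    return int(min(np.searchsorted(cum, percent) + 1, cum.size))

lam = np.array([50.0, 30.0, 15.0, 4.0, 1.0])   # hypothetical eigenvalues
cum = cumulative_variation(lam)                # 50, 80, 95, 99, 100 percent
k = components_needed(lam, 90.0)               # three components reach 90 percent
```

Plotting `cum` against the component number gives a cumulative variation curve; the earlier it reaches a plateau, the fewer components are needed.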
Figure 9. Display of the first 10 PCT resultant images.
Table 2. Two methods of noise estimation.

| Scenario | Method | Site | Variation (%) |
|---|---|---|---|
| 1 | Within-site covariance | Water | 99.8 |
| 2 | Within-site covariance | Homogeneous | 74.0 |
| 3 | Spatial autocovariance | Ocean | 99.1 |
| 4 | Spatial autocovariance | Whole | 44.5 |

Note: Each method tests two scenarios and reports the percentage of the variation that the first 10 NAPCT component images explain.
From Figure 11a, we can see that the variability percentage curve has a turning point at about component 10. The fewer components needed to reach a turning point or a plateau, the more efficient the newly transformed component images are. Therefore, we preserve the first 10 components. Figure 11a shows that under scenario 1 the first 10 NAPCT component images (cumulative variance) are able to capture 99.8% of the total signal variability. To get the within-site covariance matrix for the estimation of noise, scenario 2 delineates spatially homogeneous sites. Figure 12 shows the first 10 image components for scenario 2. We notice a gray tone imbalance from the left edge to the right edge of the eighth component and a noisy surface on the last two component images. The variability of the data explained by the first 10 components is 74% (Figure 11b; Table 2). From an efficiency point of view, scenario 1 is more efficient than scenario 2, as the same number of components captures 25.8 percentage points more data variability. The near-neighbor difference method is applied to the ocean surface (scenario 3) and to the whole scene of the image
(scenario 4) to extract the spatial autocovariance, which will be used to simulate the noise structure. Figure 13 displays the first 10 NAPCT resultant images from scenario 3. A gray tone imbalance problem exists on the last two component bands. These 10 component images explain 99.1% of data variability (Figure 11c; Table 2). When we use the whole image scene to get the spatial autocovariance (scenario 4), we get the resultant component bands shown in Figure 14. We observe both gray tone imbalance phenomena in components 6 and 10 and a noise problem in components 9 and 10. Only 44.5% of data variation has been captured in these 10 bands (Figure 11d; Table 2). The NAPCT performs more efficiently on the water sites than on other homogeneous sites. Applied to water sites, the transformation captures most of the data variability with limited components, as the variation curve quickly reaches its plateau. Using the whole scene to do the estimation may not be workable because we want to suppress the effect of high noise variance. However, by choosing spatially heterogeneous sites, we have whitened out not only the noise but also the signal. As a result, part of the signal information will be lost and not included in the transformed bands. For this reason, noise estimation has a direct effect on the efficiency of NAPCT performance.
Hyperspectral image restoration
Formulation
The second question we posed in the Introduction was as follows: Is it possible to restore the Hyperion data to their
Figure 10. The first 10 NAPCT resultant images obtained using the within-site covariance of water (scenario 1).
original spectral space with noise removed? Recall that the projection matrix $H$ transforms the observed image data $z$ into $z^{\mathrm{NAPCT}}$ in the NAPCT space, i.e., $z^{\mathrm{NAPCT}} = H^{T} z$. With $z^{\mathrm{NAPCT}}$, we can remove the low-variance bands that are supposed to consist mainly of noise. Then we can transform the remaining signals back into the original spectral space by simply inverting the projection matrix $H$. By this process, the signals will be restored. It is formulated as follows:

$$\tilde{z} = H^{-T}\,\tilde{z}^{\mathrm{NAPCT}} \tag{8}$$

where $H^{-T} = (H^{T})^{-1}$, $\tilde{z}^{\mathrm{NAPCT}} = (z_{1}^{\mathrm{NAPCT}}, z_{2}^{\mathrm{NAPCT}}, \ldots, z_{p}^{\mathrm{NAPCT}}, 0, \ldots, 0)^{T}$, and $p$ is the number of the first set of components selected for the image restoration. For easier implementation we have

$$\tilde{z}^{\mathrm{NAPCT}} = \tilde{I}\, z^{\mathrm{NAPCT}} \tag{9}$$

where

$$\tilde{I} = \begin{pmatrix} I_{p} & & 0 \\ & \ddots & \vdots \\ 0 & \cdots & 0 \end{pmatrix}$$

is an $I \times I$ symmetric matrix, with $I$ being the total number of bands, and $I_{p}$ is a $p$-dimensional identity matrix. The covariance matrix of the NAPCT image can be obtained by

$$\Sigma_{\tilde{z}}^{\mathrm{NAPCT}} = \tilde{I}^{T}\, \Sigma_{z}^{\mathrm{NAPCT}}\, \tilde{I} = \tilde{I}^{T} H^{T} \Sigma_{z} H \tilde{I} \tag{10}$$

We can calculate it either by using the diagonal matrix $\Delta_{z}^{\mathrm{NAPCT}}$ or by using the covariance matrix of the original observed data $\Sigma_{z}$:

$$\Sigma_{\tilde{z}}^{\mathrm{NAPCT}} = \tilde{I}^{T}\, \Delta_{z}^{\mathrm{NAPCT}}\, \tilde{I} = \tilde{H}^{T} \Sigma_{z} \tilde{H} \tag{11}$$

where $\tilde{H} = H\tilde{I}$. Now we are able to derive the covariance matrix of the restored image using the NAPCT projection matrix $H$ and the covariance structure of the original data $\Sigma_{z}$. In this way, we do not have to recalculate the covariance matrix from the newly projected imagery but can derive it directly:

$$\Sigma_{\tilde{z}} = H^{-T}\, \Sigma_{\tilde{z}}^{\mathrm{NAPCT}}\, H^{-1} = H^{-T} \tilde{I}^{T} H^{T} \Sigma_{z} H \tilde{I} H^{-1} \tag{12}$$
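Equations (8) to (12) can be sketched in a few lines of NumPy. The sketch below stores pixels as rows, so the column-vector formulas appear in transposed form. The function names are hypothetical, and the matrix `H` in the example is a random invertible placeholder rather than a real NAPCT projection matrix, so this is only a check of the algebra under those assumptions.

```python
import numpy as np

def truncation_matrix(n_bands, p):
    """I-tilde of Equation (9): identity in the first p diagonal entries, zero elsewhere."""
    I_tilde = np.zeros((n_bands, n_bands))
    I_tilde[:p, :p] = np.eye(p)
    return I_tilde

def restore_pixels(z, H, p):
    """Equations (8) and (9) for pixels stored as rows of z (n_pixels x n_bands)."""
    I_tilde = truncation_matrix(H.shape[0], p)
    z_napct = z @ H                          # row form of z_NAPCT = H^T z
    z_napct_trunc = z_napct @ I_tilde        # keep the first p components only
    return z_napct_trunc @ np.linalg.inv(H)  # row form of z~ = H^{-T} z~_NAPCT

def restored_covariance(sigma_z, H, p):
    """Equation (12): covariance of the restored image from sigma_z and H alone."""
    I_tilde = truncation_matrix(H.shape[0], p)
    H_inv = np.linalg.inv(H)
    sigma_napct_trunc = I_tilde.T @ H.T @ sigma_z @ H @ I_tilde   # Equation (10)
    return H_inv.T @ sigma_napct_trunc @ H_inv

# Made-up data and a placeholder invertible H (assumption, not a fitted NAPCT matrix).
rng = np.random.default_rng(0)
n_pixels, n_bands, p = 500, 6, 3
z = rng.normal(size=(n_pixels, n_bands))
H = rng.normal(size=(n_bands, n_bands)) + 3.0 * np.eye(n_bands)

sigma_z = np.cov(z, rowvar=False)
z_restored = restore_pixels(z, H, p)
sigma_direct = restored_covariance(sigma_z, H, p)    # derived without reprojection
sigma_sample = np.cov(z_restored, rowvar=False)      # recomputed from restored pixels
print(np.allclose(sigma_direct, sigma_sample))
```

The final comparison mirrors the point of Equation (12): the covariance derived from `sigma_z` and `H` agrees with the one recomputed from the restored pixels, so the reprojected image does not need to be revisited.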
Comparison to the original image

With the noise estimated from the mean within-site covariance of the water, the NAPCT achieves the highest efficiency among the four noise estimation scenarios. We are able to reconstruct the image according to Equations (8) and (9) using the first 10 features, which account for 99.8% of the signal variability. As an example, we display bands 103 and 95, with central wavelengths of 1387 and 1306 nm, respectively, both before and after image restoration (Figure 15). From
Canadian Journal of Remote Sensing / Journal canadien de télédétection
Figure 11. Cumulative variation curve in percentage for NAPCT component images for (a) scenario 1, (b) scenario 2, (c) scenario 3, and (d) scenario 4. The turning point of the curve highlighted by a vertical line indicates the variation accounted for by the first 10 NAPCT components.
Figure 3 we observe that band 103 is a low-variability water vapor absorption band and band 95 has high variability due to striping. The image quality has been greatly improved by restoration. We directly calculate the covariance matrix of the restored image by Equation (12). We then compare the correlation matrix before (Figure 2a) and after (Figure 2b) the image restoration. The restored correlation matrix looks much cleaner. Most of the dark, straight lines and stripes that previously ran through the whole range of bands have disappeared, illustrating efficient removal of the noisy bands. At the same time, other shades have been well preserved. The restored spectra of selected samples, including airport, forest, salt evaporator, and bush, are plotted and compared with the original spectra in Figure 16. In the restored spectral curves we notice lower gray values in the visible bands and smoother spectra in the bands that originally had abnormally low gray values. The restoration appears to have an effect similar to that of atmospheric correction. However, further evaluation is necessary. We also calculate the mean water site covariance matrix and the covariance matrix for each of the seven restored water sites by Equation (12). These are visually represented in Figure 17.
Comparing the original covariance matrix (Figure 6) with the covariance structure of each restored water site, we see that the high covariance shown as white lines is reduced and the structure is well preserved. The water sites are then treated as noise, and the variability of the water sites is whitened out to extract and arrange the signals. The base-10 logarithm of the variance of the restored water sites is plotted in Figure 18. Again, the restored variance curve is smoother and the high-variability spikes are suppressed.
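The two summaries used in this comparison, a band-to-band correlation matrix and the base-10 logarithm of the per-band variance, can both be read off a covariance matrix. The short NumPy sketch below shows one way to do so. The function names and the 2 × 2 example matrix are made up for illustration, and the sketch assumes every band has nonzero variance.

```python
import numpy as np

def correlation_from_covariance(sigma):
    """Scale a covariance matrix to a correlation matrix.
    Assumes all diagonal entries (band variances) are strictly positive."""
    std = np.sqrt(np.diag(sigma))
    return sigma / np.outer(std, std)

def log10_band_variance(sigma):
    """Base-10 logarithm of the per-band variance (the diagonal of sigma)."""
    return np.log10(np.diag(sigma))

# Made-up two-band covariance matrix, for illustration only.
sigma_example = np.array([[4.0, 2.0],
                          [2.0, 9.0]])
corr_example = correlation_from_covariance(sigma_example)
logvar_example = log10_band_variance(sigma_example)
print(corr_example)
print(logvar_example)
```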
Summary and conclusions

We reported and compared different ways to accurately estimate the noise structure for successful and efficient application of noise-adjusted principal component transformation (NAPCT) to Earth Observing 1 (EO-1) Hyperion data. It is possible to improve the quality of the original Hyperion data. Based on our exploratory analysis and experimentation, we summarize and conclude as follows.

(1) A semi-automatic way of locating noise is introduced. Accurate noise estimation is important, as it directly influences the performance of NAPCT.
Figure 12. The first 10 NAPCT resultant images obtained using the within-site covariance of some spatially homogeneous sites (scenario 2).
Figure 13. The first 10 NAPCT resultant images obtained using the spatial autocovariance of the ocean (scenario 3).
Figure 14. The first 10 NAPCT resultant images obtained using the spatial autocovariance of the whole image (scenario 4).
Figure 15. (a, c) Original spectral image of (a) band 103 with a central wavelength of 1387 nm and (c) band 95 with a central wavelength of 1306 nm. (b, d) Corresponding restored image of (b) band 103 and (d) band 95.
(2) We may estimate noise by generating a within-site covariance matrix. Noise estimation utilizing the water sites is more efficient than estimation utilizing other homogeneous sites in terms of explaining the signal variation using the same number of NAPCT components.
(3) For generating a spatial autocovariance matrix, noise estimation utilizing the ocean is more efficient than estimation utilizing the whole image scene.

(4) It is best to select water sites for noise estimation if water bodies are available in the image scene.
Figure 16. Spectrum before and after hyperspectral image restoration for selected sites.
Figure 17. Mean within-site covariance matrix and the covariance matrix for each of the seven water sample sites in the restored image.
(5) If only the sensor noise covariance structure is to be used to suppress the sensor noise, then it is not image dependent. A strategy of using any image of the most homogeneous areas is therefore suggested for deriving the noise covariance structure of the sensor.
(6) The NAPCT resultant images improve upon the direct principal component transformation (PCT) resultant images.

(7) The restored images improve upon the quality of the original images, particularly the quality of the water vapor absorption bands.
Figure 18. Logarithm of variance of selected samples for each band at the restored water sites.
(8) The covariance structure of the restored images can be derived directly from the covariance matrix of the original data, making recalculation from the newly transformed image unnecessary.
Acknowledgements

We are grateful for a major grant from the National Natural Science Foundation of China (grant 30590370).
References Chang, C.-I., and Du, Q.1999. Interference and noise-adjusted principal components analysis. IEEE Transactions on Geoscience and Remote Sensing, Vol. 37, No. 5, Part 2, pp. 2387–2396. Duda, R.O., Hart, P.E., and Stork, D.G. 2001. Pattern classification. 2nd ed. John Wiley & Sons, Inc., New York. 654 pp. Fukunaga, K. 1990. Introduction to statistical pattern recognition. 2nd ed. Academic Press, San Diego, Calif. 591 pp.
Gong, P., Pu, R., and Heald, R.C. 2002. Analysis of in situ hyperspectral data for nutrient estimation of giant sequoia. International Journal of Remote Sensing, Vol. 23, No. 9, pp. 1827–1850. Gong, P., Pu, R., Biging, G.S., and Larrieu, M. 2003. Estimation of forest leaf area index using vegetation indices derived from Hyperion hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing, Vol. 41, No. 6, pp. 1355–1362. Green, A.A., Berman, M., Switzer, P., and Craig, M.D. 1988. A transformation for ordering multispectral data in terms of image quality with implications for noise removal. IEEE Transactions on Geoscience and Remote Sensing, Vol. 26, No. 1, pp. 65–74. Harsanyi, J.C., and Chang, C.-I. 1994. Hyperspectral image classification and dimensionality reduction: an orthogonal subspace projection approach. IEEE Transactions on Geoscience and Remote Sensing, Vol. 32, No. 4, pp. 779–785. Lee, J.B., Woodyatt, A.S., and Berman, M. 1990. Enhancement of high spectral resolution remote sensing data by a noise-adjusted principal components transform. IEEE Transactions on Geoscience and Remote Sensing, Vol. 28, No. 3, pp. 295–304.
Geng, X.R., and Zhao, Y.C. 2007. Principle of small target detection for hyperspectral imagery. Science in China Series D, Earth Sciences, Vol. 50, No. 8, pp. 1225–1231.
Liew, S.C., Chang, C.W., and Lim, K.H. 2002. Hyperspectral land cover classification of EO-1 Hyperion data by principal component analysis and pixel unmixing. In IGARSS’02, Proceedings of the International Geoscience and Remote Sensing Symposium, 24–28 June 2002, Toronto, Ont. IEEE, Piscatawy, N.J. pp. 3111–3113.
Gong, P., Pu, R., and Yu, B. 1997. Conifer species recognition: an exploratory analysis of in situ hyperspectral data. Remote Sensing of Environment, Vol. 62, No. 2, pp. 189–200.
NASA. 1996. NASA New Millennium Program (NMP). Goddard Space Flight Center (GSFC), National Aeronautics and Space Administration (NASA), Beltsville, Md. Available from eo1.gsfc.nasa.gov.
Gong, P., Pu, R., and Yu, B. 2001. Conifer species recognition: effects of data transformation. International Journal of Remote Sensing, Vol. 22, No. 17, pp. 3471–3481.
Roger, R.E. 1994. A fast way to compute the noise-adjusted principal components transform matrix. IEEE Transactions on Geoscience and Remote Sensing, Vol. 32, No. 6, pp. 1194–1196.
© 2008 CASI
285
Vol. 34, No. 3, June/juin 2008 Roger, R.E. 1996. Principal components transform with simple, automatic noise adjustment. International Journal of Remote Sensing, Vol. 17, No. 14, pp. 2719–2727. Roger, R.E., and Arnold, J.F. 1996. Reliably estimating the noise in AVIRIS hyperspectral images. International Journal of Remote Sensing, Vol. 17, No. 10, pp. 1951–1962. Xu, B., and Gong, P. 2007. Land use/cover classification with multispectral and hyperspectral EO-1 data: a comparison. Photogrammetric Engineering & Remote Sensing, Vol. 73, No. 8, pp. 955–965. Yu, B., Ostland, M., Gong, P., and Pu, R. 1999. Penalized linear discriminant analysis for conifer species recognition. IEEE Transactions on Geoscience and Remote Sensing, Vol. 37, No. 5, pp. 2569–2577.
286
© 2008 CASI