
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 54, NO. 5, MAY 2016

An Iterative Haze Optimized Transformation for Automatic Cloud/Haze Detection of Landsat Imagery

Shuli Chen, Xuehong Chen, Jin Chen, Pengfei Jia, Xin Cao, and Canyou Liu

Abstract—Most previous haze/cloud detection methods for Landsat imagery, e.g., haze optimized transformation (HOT), cannot adequately suppress land surface information and, in particular, often overestimate haze thickness over bright surfaces. This paper proposes an iterative HOT (IHOT) for improving haze detection with the help of a corresponding clear image. With an iterative procedure of regressions among HOT, the reflectance difference at the top of atmosphere (TOA) between hazy and clear images, and TOA reflectances of hazy and clear images, the land surface information can be removed, and the IHOT result is derived to spatially characterize the haze contamination in the Landsat images. A group of Landsat images that were acquired in different landscapes and seasons were used to test IHOT. Visual comparisons indicate that IHOT performed better than previous haze detection methods for images that were acquired in diverse landscapes and also performed robustly for hazy images that were acquired at different seasons when using the same reference clear image. Additionally, two indirect quantitative validations were used to illustrate that IHOT can provide the best transformation for accurately determining haze information. Therefore, it is expected that the proposed IHOT method will be used for automatic cloud/haze detection for large numbers of Landsat images if data sets of clear Landsat imagery are available.

Index Terms—Haze detection, haze optimized transformation (HOT), haze thickness, iterative HOT (IHOT), Landsat imagery.

Manuscript received May 19, 2015; revised September 18, 2015; accepted November 4, 2015. Date of publication December 18, 2015; date of current version March 25, 2016. This work was supported in part by the National Natural Science Foundation of China under Grant 41301352 and in part by the 863 Project under Grant 2013AA122802.
S. Chen, X. Chen, J. Chen, and X. Cao are with the State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing 100875, China (e-mail: [email protected]).
P. Jia is with the College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China.
C. Liu is with the State Key Laboratory of Astronautic Dynamics, Xi’an Satellite Control Center, Xi’an 710043, China.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2015.2504369

I. INTRODUCTION

Landsat data have made large contributions to diverse investigations of the Earth’s surface, including urbanization, disaster monitoring, deforestation, and other natural/human-induced land cover changes [1]–[5]. However, many Landsat images are inevitably contaminated by clouds, aerosol layers, and other haze, which presents a large obstacle for the rapid selection and full use of Landsat imagery [6]–[9]. For example, the brightening effect of clouds causes many problems, including inaccurate atmospheric corrections, biased estimations of normalized difference vegetation index values, mistaken land cover classifications, and false detections of land cover changes [9]. Therefore, effective detection and isolation of spatially varying haze is critically needed for data selection, atmospheric correction, and the further analysis of Landsat imagery [10]–[14].

Over the years, much effort has been devoted to spatially mapping clouds and haze in Landsat imagery. In general, prior methods can be categorized into two types, namely, binary and quantitative mapping, based on different objectives. Binary mapping is used to produce cloud masks and overall assessments of scenes. For example, the automated cloud cover assessment system [8], [11] is used for routine cloud estimation of Landsat data. This method screens a cloud distribution based on a group of empirical spectral transformations and thresholds and works well for estimating the overall percentage of clouds. Considering possible confusion between clouds and snow/ice, Choi and Bindschadler [15] suggested a cloud detection method over ice sheets by using a shadow matching technique and an automatic normalized difference snow index threshold. By combining earlier empirical methods, Zhu and Woodcock [9] proposed another object-based cloud and cloud shadow mask approach that performed better than previous methods. However, the binary mapping methods did not estimate the thickness of the haze and, consequently, cannot be used for correction or compensation of thin haze contamination.

The quantitative mapping methods provide more complete information of the haze and support thin haze removal. The haze optimized transformation (HOT) [16] is a typical quantitative method that has been widely used because of its robustness and simplicity [17]–[21]. HOT only uses Landsat’s blue and red bands and works effectively over vegetated areas, but the method fails over snow/ice, water, and bright surfaces [16]. HOT was further improved by Liu et al.
[22], who integrated more spectral bands and introduced a more complicated supervised procedure. However, the issue of spectral confusion between bright surfaces and haze/clouds is a challenge that must still be addressed.

The reflectance difference between hazy and clear Landsat images could be helpful for improving the performance of haze detection and, in particular, could be used to avoid confusion between bright surfaces and haze/clouds. However, it has received little attention in previous studies. Considering that more database collections of clear Landsat images have been built [23]–[25], this solution becomes more promising. In this paper, we therefore propose an iterative HOT (IHOT) for quantitatively mapping the haze thickness of Landsat imagery by using both hazy and corresponding clear images. It is expected to work well in various landscapes and to overcome the spectral confusion between bright surfaces and haze/clouds.

0196-2892 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

CHEN et al.: IHOT FOR AUTOMATIC CLOUD/HAZE DETECTION OF LANDSAT IMAGERY

II. METHODOLOGY

A. HOT

HOT [16] assumes that the spectral responses of the red and blue bands to diverse surface cover classes are highly correlated under clear atmospheric conditions. Then, a corresponding linear spectral transformation can be found to reflect the haze signal and simultaneously suppress land surface information. Based on that assumption, the haze-free areas (or clear-sky image) are first selected, and a “clear line” in the blue–red spectral space is defined by a linear regression fitted to data from the haze-free areas (or clear-sky image if available). The hazy pixels tend to deviate from the clear line; thus, the distance to the clear line is taken as the HOT value, which increases with haze thickness (see Fig. 1). HOT is thus expressed as

$$\mathrm{HOT} = \sin\theta \cdot R_1 - \cos\theta \cdot R_3 \tag{1}$$

where $R_1$ and $R_3$ are reflectances at the top of atmosphere (TOA) of the blue and red bands, which correspond to bands 1 and 3 for Landsat TM/ETM+ and to bands 2 and 4 for Landsat OLI, respectively; and $\theta$ is the angle of the clear line. This index was designed to detect haze signals and suppress all of the land cover information under the haze. However, it has been found that haze thickness over impervious surfaces, snow cover, and water bodies is usually overestimated, whereas that over bare soil is easily underestimated because their spectra in the blue–red band space deviate from the ideal clear line.

Fig. 1. Illustration of HOT.

B. THOT

Considering that an increasing number of clear-sky Landsat images are available [25], it is possible to employ a clear-sky image as secondary data to help detect clouds/haze. The TOA reflectance difference between hazy and clear-sky images contains the haze information, which could help to accurately detect haze over land cover types whose spectra violate HOT’s assumption. Therefore, we define temporal HOT (THOT) as a linear combination of the reflectance differences of all spectral bands, i.e.,

$$\mathrm{THOT} = \sum_{i=1}^{n} k_i \Delta R_i + c = \sum_{i=1}^{n} k_i (R_{hi} - R_{ci}) + c \tag{2}$$

where $R_{hi}$ and $R_{ci}$ denote the TOA reflectances of band $i$ for a certain pixel at the hazy time and the clear-sky time, respectively; $\Delta R_i$ denotes the difference between $R_{hi}$ and $R_{ci}$; and $n$ is the number of bands of the Landsat sensor at 30-m resolution, which is equal to 6 for Landsat TM/ETM+ and 7 for Landsat OLI. The cirrus band of Landsat 8 OLI is excluded here because it contains little land surface information [26], and no further adjustment is necessary. $k_i$ is the coefficient of $\Delta R_i$, and $c$ is a constant. Apart from haze information, the TOA reflectance difference $\Delta R$ also contains other information from land cover changes or phenological differences. A set of suitable coefficients, i.e., $k_i$, should help to suppress this unwanted information. To achieve this objective, the coefficients ($k_i$ and $c$) are determined by a multivariate regression between HOT and $\Delta R_i$, i.e.,

$$\mathrm{HOT} = \sum_{i=1}^{n} k_i \Delta R_i + c + \varepsilon \tag{3}$$

where $\varepsilon$ is the regression residual. By using the coefficients determined by this regression model, THOT defined by (2) corresponds to the modeled value of HOT in (3). Because the information of land cover change and phenological difference in $\Delta R$ can be regarded as independent of HOT, this unwanted information will be removed from the regression model. Similarly, as the error of HOT can be regarded as independent of $\Delta R$, it will also be partly removed from the regression model. Therefore, THOT, as the modeled value of HOT, will reflect only the haze information (a detailed mathematical justification is presented in the Appendix). For example, HOT tends to overestimate the haze magnitude over bright surfaces because they have HOT values that are similar to those of haze pixels. THOT can address this issue effectively because the spectral confusion between cloud and bright surfaces does not exist in $\Delta R$ (cloud has a high $\Delta R$, whereas bright surfaces without cloud have a low $\Delta R$). However, the performance of THOT relies on the performance of its input (HOT), and THOT still inevitably contains some land cover change or phenological difference information.

C. Improved HOT

To avoid interference from land cover change or phenological differences, we further propose a transformation of the TOA reflectance $R$, rather than a transformation of the TOA reflectance difference $\Delta R$, to reflect the haze information. An improved HOT (iHOT) is defined as a linear combination of the TOA reflectances of a hazy pixel, i.e.,

$$\mathrm{iHOT} = \sum_{i=1}^{n} k'_i R_i + c' \tag{4}$$

where $R_i$ denotes the TOA reflectance of band $i$ for a Landsat pixel, $k'_i$ is the coefficient of $R_i$, and $c'$ is a constant. iHOT should be well correlated with THOT for the hazy image and should be constant for the clear image. Therefore, the coefficients ($k'_i$ and $c'$) are determined by the multivariate regression between THOT and the TOA reflectance by using the hazy and clear-sky image data, i.e.,

$$\mathrm{THOT} = \sum_{i=1}^{n} k'_i R_i + c' + \varepsilon' \tag{5}$$

where $\varepsilon'$ is the regression residual. Notice that THOT for the clear-sky image data is equal to the constant $c$ based on (2) because $\Delta R$ is equal to zero. Thus, iHOT defined by (4) corresponds to the modeled value of THOT in (5). As the THOT error is mainly induced by land cover change or phenological difference, it is independent of the TOA reflectance at a single time ($R$). Consequently, the error in THOT will be removed from the regression model, and iHOT could better reflect the haze information (a detailed mathematical justification is presented in the Appendix).

D. IHOT

To reduce the dependence on the initial HOT input and improve the results, an iterative procedure is proposed. As shown in Fig. 2, the procedure includes the following steps.

1) Set iHOT(t=0) as HOT.
2) Calculate THOT(t) using (2) with the coefficients determined by the multivariate regression between iHOT(t) and $\Delta R_i$ [see (3)].
3) Calculate iHOT(t+1) using (4) with the coefficients determined by the multivariate regression between THOT(t) and $R_i$ [see (5)].
4) Repeat steps 2 and 3 until the correlation between THOT and iHOT stops increasing (the change is less than 0.001).

When THOT and iHOT more closely approach each other, it is reasonable to consider that their error is also reduced. IHOT is thus defined as the final iterative value of iHOT. After the iterations, IHOT is not strongly dependent on the performance of the initial HOT. Additionally, IHOT is more applicable for different landscapes than HOT because the coefficients are adjusted based on all the spectral bands.

Fig. 2. IHOT flowchart.

III. EXPERIMENTS

A. IHOT Process

To preliminarily test the effectiveness of the proposed algorithm and demonstrate the IHOT process, a subset image (400 × 400) of Landsat 8 hazy imagery [see Fig. 3(a)] and a corresponding clear image [see Fig. 3(b)], which were acquired on April 22, 2014 and March 21, 2014, respectively, over Shandong Province, China, were chosen (see Fig. 3). First, the initial haziness index, i.e., HOT [see Fig. 3(c)], was applied to the hazy image [see Fig. 3(a)] after the clear line was defined from the clear image [see Fig. 3(b)]. Next, THOT [see Fig. 3(d)] and iHOT were generated successively using multivariate regressions. Finally, IHOT was generated after the iterations described in Section II-D. As shown in Fig. 3(c), HOT failed to suppress the land surface information: it overestimated the haze thickness over the urban area and underestimated it over bare cropland. THOT [see Fig. 3(d)] greatly improved the result because of its improved suppression of the background land surface information. However, THOT still produced some excessively low outliers, particularly in the area that is marked by the red rectangle. As shown in Fig. 3(e), IHOT further improved on THOT and produced the best haze detection result, which contained very little background information. Fig. 3(f) shows that the correlation between iHOT and THOT increased with the iterations, which indicates that the iterations work effectively in reducing the error of iHOT and THOT.
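The HOT, THOT, and iHOT steps used in this process can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' IDL program: the function names (`fit_linear`, `hot_index`, `ihot`), the flattening of each image to an `(n_pixels, n_bands)` array, the default band positions (`blue=0`, `red=2`, as for a six-band TM/ETM+ stack), the `max_iter` cap, and the choice to evaluate the THOT/iHOT correlation on the hazy pixels are all assumptions made for the example.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y ~ X @ k + c; returns (k, c)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def hot_index(blue, red, clear_blue, clear_red):
    """HOT of (1): theta is the angle of the clear line fitted to clear pixels."""
    slope, _ = np.polyfit(clear_blue, clear_red, 1)  # red as a function of blue
    theta = np.arctan(slope)
    return np.sin(theta) * blue - np.cos(theta) * red

def ihot(hazy, clear, blue=0, red=2, tol=1e-3, max_iter=50):
    """Iterate the regressions (3) and (5) starting from HOT.

    hazy, clear: co-registered TOA reflectance arrays, shape (n_pixels, n_bands).
    Returns the final iHOT of the hazy pixels and its coefficients (k', c').
    """
    d_r = hazy - clear                       # reflectance difference, Delta R
    stacked = np.vstack([hazy, clear])       # regression (5) uses both images
    i_hot = hot_index(hazy[:, blue], hazy[:, red],
                      clear[:, blue], clear[:, red])   # step 1: iHOT(0) = HOT
    prev_corr = -np.inf
    for _ in range(max_iter):
        k, c = fit_linear(d_r, i_hot)        # step 2: regression (3)
        t_hot = d_r @ k + c                  # THOT from (2)
        # THOT of the clear image equals c because Delta R is zero there.
        target = np.concatenate([t_hot, np.full(len(clear), c)])
        k2, c2 = fit_linear(stacked, target) # step 3: regression (5)
        i_hot = hazy @ k2 + c2               # iHOT from (4)
        corr = np.corrcoef(t_hot, i_hot)[0, 1]
        if corr - prev_corr < tol:           # step 4: correlation stopped rising
            break
        prev_corr = corr
    return i_hot, k2, c2
```

Applying the returned coefficients to the clear image (`clear @ k2 + c2`) should give a nearly constant value, which is a quick sanity check that land surface information has been suppressed.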


Fig. 3. IHOT process. (a) Haze image. (b) Clear image. (c) HOT image. (d) THOT image. (e) IHOT image. (f) Change in the correlations between iHOT and THOT by iteration. The area marked by the red rectangle suggests that IHOT detected the haze signal and suppressed the background noise better than either HOT or THOT.

TABLE I: SUMMARIES OF THE LANDSAT SCENES FOR THE FOUR DIFFERENT LANDSCAPES

B. Performance for Different Landscapes

To test IHOT and demonstrate its robustness and effectiveness, Landsat images captured over diverse landscapes, including cropland, urban, glacier/snow, and desert, were chosen. Table I shows the Landsat image information that was used in the experiments. HOT and the “background suppressed haze thickness index” (BSHTI) [22] were also calculated for comparison.

The first pair of Landsat OLI images (122/34) were captured over a cropland-dominated area located in Shandong Province, which is a typical major agricultural province in China. On the whole, IHOT performed better than HOT and BSHTI [see Fig. 4(a)]. The HOT results generally failed to suppress the background noise. BSHTI inhibited the cropland information well; however, it overestimated the haze over bright surfaces. Fig. 4(b) and (c) shows detailed comparisons of the cloud detection methods. HOT obviously overestimated the haze thickness over impervious surfaces and underestimated the haze thickness over bare soil and turbid water, which indicates that the noise caused by the impervious surfaces, bare soil, and turbid water could not be suppressed by HOT. BSHTI performed better at suppressing the majority of the background noise; however, it failed to distinguish the haze and bright surfaces. BSHTI, in particular, estimated very high values over the blue bright surface. In comparison, IHOT recognized the haze signal well and suppressed all of the land surface information. There was very little overestimation or underestimation of the thickness of the haze over any of the land cover types, including the cropland, bare soil, turbid water, and impervious surfaces.

Fig. 4. Haze detection of Landsat image (122/34) captured over cropland. (a) Whole image. (b) Enlarged images of the subset area in the blue rectangle. (c) Enlarged images of the subset area in the red rectangle.

The second pair of Landsat TM images (122/44) were captured over Guangzhou, China. The hazy image was captured in 2004, and the clear image was obtained in 2011. In this developing area, the land cover greatly changed over the long time interval. Such pairs of images can be used to test the adaptability of the new method in the case of large land cover changes between the hazy and clear images. Additionally, the blue band tended to saturate over the thick clouds in the hazy image. To address the saturation issue, it was decided beforehand that the pixels with a digital number equal to 255 in the blue band recorded the heaviest haze. As shown in Fig. 5(a), BSHTI performed better than HOT, and of all of the methods, IHOT performed best. As shown in the detailed results [see Fig. 5(b) and (c)], HOT largely overestimated the haze thickness over the impervious surfaces and water bodies. BSHTI underestimated the haze thickness over the relatively thin clouds. However, in the case of the urban landscape, IHOT adapted well and recognized the haze/cloud information despite the large changes in the land cover between the acquisition times of the cloudy and clear-sky images.

Fig. 5. Haze detection of Landsat image (122/44) captured over an urban area. (a) Whole image. (b) Enlarged images of the subset area in the blue rectangle. (c) Enlarged images of the subset area in the red rectangle.

The third pair of Landsat OLI images (140/41) were captured over the Himalayas. The surfaces were mainly covered with glacier and snow, which are often mistakenly detected as cloud/haze by the previously discussed cloud detection methods [9]. Although there are a number of methods that are designed to distinguish clouds and glaciers/snow [15], [27]–[29], they are only suitable for specific cases because the parameters or thresholds are determined using a priori knowledge. Hence, it is necessary to test the proposed method under such conditions and ensure that it can distinguish between clouds and glaciers/snow. As shown in Fig. 6(a), IHOT distinguished the glaciers/snow and clouds and can accurately reflect the cloud thickness, whereas HOT and BSHTI produced very poor results. Detailed comparisons are shown in Fig. 6(b) and (c). In the case of HOT, the glaciers/snow were confused with clouds, and the vegetated areas with clouds were unexpectedly identified as cloudless. BSHTI performed even more poorly than HOT in this case and estimated a greater haze thickness for the glacier/snow pixels than for the cloud pixels. However, the proposed IHOT adaptively extracted the cloud information and eliminated the interference from the confusion between clouds and glaciers/snow. Although the effects of shadows on the topography were not completely eliminated, the topographic variance was small compared with the cloud information.

The last pair of Landsat OLI images (143/31) were captured over a desert because deserts are among the most typical landscapes throughout the world and occupy almost 21% of the world’s land surface. HOT and BSHTI performed better with the desert images than with the other landscape images because desert land cover is relatively homogeneous [see Fig. 7(a)]. Nevertheless, both methods overestimated the thickness of the haze over turbid water bodies that had high reflectance in the blue band [see Fig. 7(b)]. On the other hand, IHOT had the best performance and suppressed all of the land surface information, including the turbid water.

Fig. 8 shows the IHOT coefficients for the Landsat images of the four typical landscapes (cropland, urban areas, glaciers/snow, and desert). The IHOT coefficients for the four different landscapes are quite different, which suggests that IHOT can adaptively find suitable transformations to detect clouds/haze over diverse landscapes.


Fig. 6. Haze detection of Landsat image (140/41) captured over glaciers and snow. (a) Whole image. (b) Enlarged images of the subset area in the blue rectangle. (c) Enlarged images of the subset area in the red rectangle.

Fig. 7. Haze detection of Landsat image (143/31) captured over a desert. (a) Whole image. (b) Enlarged images of the subset area in the blue rectangle.

Fig. 8. Comparison of the IHOT coefficients for different landscapes.

C. Performance of IHOT on Images With Different Phenology

Generally, limited amounts of clear-sky data are available; more often, data are contaminated by clouds or haze. It is desirable that a single clear-sky image can be used to detect haze in multiple hazy images that were acquired in different seasons. Therefore, it is necessary to test whether IHOT can adapt to the case where there are phenology differences between the clear-sky and hazy images. Four hazy images that were acquired in different seasons were selected for the experiments (see Table II). The single ancillary clear-sky image captured in March 2014 was used for the IHOT calculation, and large phenological differences can be observed between the clear-sky image and the hazy images. In this experiment, another multitemporal method, i.e., multitemporal cloud detection (MTCD) [30], and an operational method used in the Landsat surface reflectance product, i.e., Fmask [9], [31], were employed for the comparison, although they are binary cloud detection methods.

TABLE II: SUMMARY OF THE LANDSAT SCENES THAT WERE ACQUIRED AT DIFFERENT SEASONS (PATH/ROW: 122/34; SIZE: 7551 × 7411)

As shown in Fig. 9, IHOT was able to distinguish the cloud/haze signal and effectively suppress land surface information for all four images acquired in different seasons. In contrast, MTCD failed to recognize most of the thin clouds. Fmask performed much better than MTCD, although some thin clouds are still missing. However, as shown in Fig. 9(d), Fmask mistakenly marked much turbid water as clouds. Therefore, it can be concluded that IHOT performs better than MTCD and Fmask.

However, on careful inspection of the IHOT images in Fig. 9(a), (b), and (d), some striping noise can be observed. It was probably caused by sensor defects (http://landsat.usgs.gov/calibration_notices.php) because the stripes have the same position and shape in different images. As shown in Fig. 10, the absolute values of the coefficients of the short-blue and blue bands for the hazy images in Fig. 9(a), (b), and (d) are large, and the values have opposite signs. IHOT can remove the land surface information well because the spectral responses of the various land cover types in the short-blue and blue bands are usually highly correlated. However, the method can also enlarge the effects of Landsat 8 OLI sensor hardware defects. In general, such striping noise is not very serious, and IHOT performed robustly with the images that were acquired in different seasons.

D. Indirect Quantitative Evaluation

Apart from the visual comparisons among the results of the three cloud/haze detection methods that were described earlier, quantitative validations were also required to confirm the accuracy of the proposed approach. However, because it is impossible to know the true haze thickness, a direct validation was not possible for the cloud detection methods. Therefore, two indirect methods were used to quantitatively evaluate the accuracy of the cloud detection results.

First, the relationship between the cloud/haze detection results and the cirrus band (OLI band 9) [32] was evaluated. We selected a Landsat 8 OLI subset image (143/31; see Fig. 7), which was contaminated only by cirrus clouds, for the validation. This ensured that a correlation between the haze detection results and the cirrus band could reasonably be expected. Fig. 11(a) shows the cirrus band, and Fig. 11(b)–(d) show the HOT, BSHTI, and IHOT images, respectively. The turbid water body is clearly observed in the HOT and BSHTI results, whereas IHOT effectively removed the land surface information. Fig. 11(e)–(g) show scatterplots of the cirrus band with the HOT, BSHTI, and IHOT results, respectively. The cirrus band shared the highest correlation (0.9753) with the IHOT results, which was much higher than the correlations from the HOT and BSHTI results.
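The cirrus-band check above amounts to a pixelwise correlation between a haze index image and the cirrus band. A minimal sketch is given below; it assumes a Pearson correlation over co-registered 2-D arrays, and the function name `index_vs_cirrus_correlation` and the optional `valid` mask are hypothetical conveniences, not part of the original method description.

```python
import numpy as np

def index_vs_cirrus_correlation(index_img, cirrus_img, valid=None):
    """Pearson correlation between a haze index image and the cirrus band.

    index_img, cirrus_img: 2-D arrays of the same shape.
    valid: optional boolean mask of pixels to include (assumed helper).
    """
    x = np.asarray(index_img, dtype=float).ravel()
    y = np.asarray(cirrus_img, dtype=float).ravel()
    keep = np.isfinite(x) & np.isfinite(y)   # drop fill values stored as NaN
    if valid is not None:
        keep &= np.asarray(valid, dtype=bool).ravel()
    return np.corrcoef(x[keep], y[keep])[0, 1]
```

Calling the function once per index (HOT, BSHTI, IHOT) against the same cirrus subset would give comparable coefficients in the spirit of Fig. 11(e)–(g).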


Fig. 9. Comparison of the cloud detection results of IHOT, MTCD, and Fmask for the images acquired at different seasons. (a) April 22, 2014. (b) June 6, 2013. (c) August 25, 2013. (d) December 15, 2013.

Fig. 10. Comparison of the IHOT coefficients for different phenology.

Second, cloud removal experiments were conducted using dark-object subtraction (DOS) [16] after the HOT, BSHTI, and IHOT haze detections, and the restored images were compared with a reference image acquired in a similar season. The Landsat 8 OLI subset image (122/34) that was acquired on April 22, 2014 was selected for the DOS correction. A corresponding clear-sky image acquired on May 6, 2014 was used as the reference data for the validation of the restored images. This validation assumes that the phenology difference between the two images is negligible. As shown in Fig. 12, the clouds/haze in the restored images from HOT and BSHTI were not completely removed, and the images have inaccurate visual colors. In contrast, the IHOT restored image most closely matched the reflectance and textures of the reference image.
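The text does not spell out the DOS variant that was used, so the following is only one simple interpretation, given for orientation: pixels are grouped into haze-index levels, the dark-object value of each level is taken as a low percentile of the band, and the excess over the clearest level is subtracted. The function name `dos_dehaze` and the parameters `n_levels` and `dark_pct` are assumptions of this sketch rather than settings reported in the paper.

```python
import numpy as np

def dos_dehaze(band, haze_index, n_levels=20, dark_pct=1.0):
    """Level-wise dark-object subtraction for one band (illustrative only).

    band, haze_index: arrays of the same shape (reflectance, haze index).
    """
    band = np.asarray(band, dtype=float)
    haze_index = np.asarray(haze_index, dtype=float)
    # Assign every pixel to one of n_levels equal-width haze-index bins.
    edges = np.linspace(haze_index.min(), haze_index.max(), n_levels + 1)
    level = np.clip(np.digitize(haze_index, edges) - 1, 0, n_levels - 1)
    # Dark-object value per level: a low percentile of the band in that bin.
    dark = np.full(n_levels, np.nan)
    for j in range(n_levels):
        vals = band[level == j]
        if vals.size:
            dark[j] = np.percentile(vals, dark_pct)
    # Reference is the lowest occupied level, assumed to be haze free.
    ref = dark[np.isfinite(dark)][0]
    offset = np.where(np.isfinite(dark), dark - ref, 0.0)
    return band - offset[level]
```

The same function would be applied band by band, once with each haze index (HOT, BSHTI, IHOT), to obtain restored images that can be compared with the reference.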


Fig. 11. Quantitative validations between the cloud/haze detection results and the cirrus band. (a) Subset image in the cirrus band. (b)–(d) HOT, BSHTI, and IHOT results, respectively, for the tested area. (e)–(g) Scatterplots of the band 9 cirrus with the HOT, BSHTI, and IHOT results, respectively.

Fig. 12. Comparisons between the restored images and the corresponding clear-sky image that were acquired in the same season. (a) Haze image and (b) clear-sky data that were acquired on May 6, 2014. (c)–(e) DOS restored images using HOT, BSHTI, and IHOT, respectively.

Table III shows the quantitative accuracies in terms of root-mean-square error (rmse). The IHOT restored image has the lowest rmse with the reference clear-sky data in most spectral bands, except band 6. The least improvement was in the infrared bands because the atmospheric effect is weak in the long-wavelength bands.
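For reference, a per-band rmse of the kind reported in Table III can be computed as below. The sketch assumes both images are stored as `(rows, cols, bands)` arrays on the same grid, with NaN marking pixels to ignore; the function name `per_band_rmse` is hypothetical.

```python
import numpy as np

def per_band_rmse(restored, reference):
    """Root-mean-square error of each band between two images.

    restored, reference: arrays of shape (rows, cols, bands).
    Returns a 1-D array with one rmse value per band.
    """
    diff = np.asarray(restored, dtype=float) - np.asarray(reference, dtype=float)
    # Average the squared differences over the two spatial axes, ignoring NaN.
    return np.sqrt(np.nanmean(diff ** 2, axis=(0, 1)))
```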


TABLE III: RMSE BETWEEN THE REFERENCE IMAGE AND THE RESTORED IMAGES USING HOT, BSHTI, AND IHOT

IV. DISCUSSION AND CONCLUSION

This paper has proposed an automated self-adapting method, i.e., IHOT, for estimating the thickness of haze in Landsat imagery based on information in a corresponding clear-sky image that was acquired at a different time. The experiments that were conducted for different areas showed that IHOT performed much better than other general cloud/haze detection methods such as HOT and BSHTI. The overestimation of the thickness of haze in the previous methods, which is caused by the spectral confusion between bright surfaces and haze/clouds, can be considerably lessened by using the reflectance difference between hazy and clear images. A potential problem that accompanies the use of reflectance differences is the presence of noise artifacts that are caused by phenology differences and land cover changes occurring between the capture of the hazy and clear image data. This noise is also effectively suppressed in IHOT by using the regressions among HOT, THOT, and the multispectral data. Moreover, to reduce the dependence on the initial HOT input, the algorithm uses an iterative procedure to search for the best coefficients for the IHOT transformation based on the data rather than an empirically fixed band selection. The experiments confirmed that IHOT worked well in diverse landscapes, including cropland, urban areas, glaciers/snow, and desert.

In addition, the computation cost of IHOT is low. It takes less than 3 min to process one Landsat scene on our experimental platform. Our experimental program was coded in Interactive Data Language 8.0 and runs on a desktop computer with an Intel Core i5-4590 central processing unit and 32-GB random access memory. Given this performance and efficiency, it is expected that IHOT will be used for automatic cloud/haze detection for a large number of Landsat images if data sets of clear Landsat images are available.

At present, IHOT has some limitations. First, the effects of hill shade were not adequately eliminated [see Fig. 6(a)] because the hill-shade areas in the hazy and clear-sky images were different, which was caused by the differences in solar altitude at the different times. This issue might be addressed by terrain correction preprocessing. Second, some of the experimental results contained striping noise (see Fig. 9) when the absolute values of the short-blue and blue band coefficients were large and the values had opposite signs. Because the spectral responses of land cover in the short-blue and blue bands are highly correlated, IHOT can remove the land surface information and, consequently, enlarge Landsat 8 OLI sensor hardware defects. This problem may be avoided given future improvements in sensor hardware and image preprocessing technology. Third, IHOT may not work well if there are large land cover changes between the clear-sky and hazy images. However, the changed areas usually account for a small proportion of the whole image, which can be tolerated by IHOT. Fourth, IHOT only estimates the haze thickness and does not remove haze. The cloud removal experiment in Section III-D shows that the DOS correction method can work well for thin cloud. However, more advanced cloud removal algorithms for thick cloud should be further developed.

In conclusion, the proposed IHOT method can self-adaptively find the best transformation to reflect the haze signal in Landsat images acquired over diverse landscapes. As an increasing number of clear-sky images become available, IHOT can potentially be used for routinely preprocessing large amounts of Landsat data. Similarly, IHOT methods for other remote sensors with similar spectral bands could probably be developed in the future.

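APPENDIX

A. THOT More Closely Approximates Haze Information Than HOT

The TOA reflectance difference $\Delta R_i$ of a given pixel in band $i$ between the hazy time and the clear-sky time can be assumed to consist of two parts: the part induced by haze contamination (denoted by $\Delta H_i$) and the part induced by phenological difference or land cover change (denoted by $\Delta D_i$). It is expressed as

$$\Delta R_i = \Delta H_i + \Delta D_i. \tag{A1}$$

Assuming that there is an ideal haze index (IHI), HOT can be expressed as

$$\mathrm{HOT} = \mathrm{IHI} + \varepsilon_{\mathrm{HOT}} \tag{A2}$$

where $\varepsilon_{\mathrm{HOT}}$ is the error of HOT. Thus, the regression model of (3) can be written as

$$\mathrm{IHI} + \varepsilon_{\mathrm{HOT}} = \sum_{i=1}^{n} k_i (\Delta H_i + \Delta D_i) + c + \varepsilon. \tag{A3}$$

The error of HOT is

$$\varepsilon_{\mathrm{HOT}} = \sum_{i=1}^{n} k_i (\Delta H_i + \Delta D_i) + c - \mathrm{IHI} + \varepsilon. \tag{A4}$$

In addition, the error of THOT, $\varepsilon_{\mathrm{THOT}}$, is

$$\varepsilon_{\mathrm{THOT}} = \mathrm{THOT} - \mathrm{IHI} = \sum_{i=1}^{n} k_i (\Delta H_i + \Delta D_i) + c - \mathrm{IHI}. \tag{A5}$$

Comparing (A4) with (A5) gives $\varepsilon_{\mathrm{HOT}} = \varepsilon_{\mathrm{THOT}} + \varepsilon$. Because $\varepsilon$ is the regression residual, which is independent of the other terms, $\varepsilon_{\mathrm{HOT}}$ will be larger than $\varepsilon_{\mathrm{THOT}}$. This indicates that THOT approximates IHI more closely than HOT does.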
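The argument of (A1)–(A5) can be checked on synthetic data. The following is a minimal sketch, not the implementation used in this paper. It assumes a known IHI, three bands whose haze response is proportional to IHI, small independent surface differences $\Delta D_i$, and a HOT error that is independent of the reflectance differences. All numerical values and the function names `fit_linear` and `simulate_appendix_a` are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (k, c)."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]


def rmse(err):
    """Root-mean-square of an error vector."""
    return float(np.sqrt(np.mean(err ** 2)))


def simulate_appendix_a(n_pixels=5000, seed=0):
    rng = np.random.default_rng(seed)
    ihi = rng.uniform(0.0, 1.0, n_pixels)            # assumed ideal haze index
    gain = np.array([0.30, 0.25, 0.20])              # assumed haze response per band
    delta_h = ihi[:, None] * gain                    # haze-induced part of (A1)
    delta_d = rng.normal(0.0, 0.01, (n_pixels, 3))   # surface-induced part of (A1)
    delta_r = delta_h + delta_d                      # (A1)
    hot = ihi + rng.normal(0.0, 0.2, n_pixels)       # (A2) with an assumed HOT error
    k, c = fit_linear(delta_r, hot)                  # regression of HOT on delta_r, (A3)
    thot = delta_r @ k + c                           # fitted values, i.e., THOT
    eps = hot - thot                                 # regression residual
    return hot - ihi, thot - ihi, eps                # eps_HOT (A4), eps_THOT (A5), eps


eps_hot, eps_thot, eps = simulate_appendix_a()
print("RMSE of HOT  vs. IHI:", rmse(eps_hot))
print("RMSE of THOT vs. IHI:", rmse(eps_thot))
```

In this synthetic setup, the identity $\varepsilon_{\mathrm{HOT}} = \varepsilon_{\mathrm{THOT}} + \varepsilon$ holds by construction, and THOT has the smaller error. Whether the same holds for real imagery depends on how well the independence assumption is satisfied.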


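B. iHOT More Closely Approximates Haze Information Than THOT

The regression model of (5) can be written as

$$\mathrm{IHI} + \varepsilon_{\mathrm{THOT}} = \sum_{i=1}^{n} k_i R_i + c + \varepsilon. \tag{A6}$$

Thus, the error of THOT, $\varepsilon_{\mathrm{THOT}}$, is

$$\varepsilon_{\mathrm{THOT}} = \sum_{i=1}^{n} k_i R_i + c + \varepsilon - \mathrm{IHI}. \tag{A7}$$

In addition, the error of iHOT, $\varepsilon_{\mathrm{iHOT}}$, is

$$\varepsilon_{\mathrm{iHOT}} = \mathrm{iHOT} - \mathrm{IHI} = \sum_{i=1}^{n} k_i R_i + c - \mathrm{IHI}. \tag{A8}$$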

As $\varepsilon$ is the regression residual, which is independent of the other terms, $\varepsilon_{\mathrm{THOT}}$ will be larger than $\varepsilon_{\mathrm{iHOT}}$. This indicates that iHOT approximates IHI more closely than THOT does.

Shuli Chen received the B.S. degree in geographical information science from Sun Yat-sen University, Guangzhou, China, in 2014. She is currently working toward the Master’s degree in the State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing, China. Her research interests include cloud detection and removal for satellite images.

Xuehong Chen received the B.S. degree in physics and the M.S. degree in civil engineering from Beijing Normal University, Beijing, China, in 2006 and 2009, respectively, and the Ph.D. degree in earth and environmental science from Nagoya University, Nagoya, Japan, in 2012. He is currently an Associate Professor with the State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University. His research interests include cloud removal of satellite images and land cover mapping by remote sensing.


Jin Chen received the B.S. and M.S. degrees in geography from Beijing Normal University, Beijing, China, in 1989 and 1992, respectively, and the Ph.D. degree in civil engineering from Kyushu University, Fukuoka, Japan, in 2000. From 2000 to 2001, he was a Postdoctoral Fellow with the University of California, Berkeley, CA, USA, and from 2001 to 2004, he was with the National Institute of Environmental Studies, Tsukuba, Japan. He is currently a Professor with the State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University. His research interests include remote sensing modeling and vegetation parameter retrieval through the inversion of remote sensing models.

Pengfei Jia received the B.S. and M.S. degrees in mathematics from Beijing Normal University, Beijing, China, in 2010 and 2013, respectively. He is currently working toward the Ph.D. degree in the College of Global Change and Earth System Science, Beijing Normal University. His research interests include health-related temporal and spatial analyses and mathematical modeling of public health problems, such as mosquito population dynamics and dengue fever transmission, in the context of climate change.

Xin Cao received the B.S. and M.S. degrees in geography from Beijing Normal University, Beijing, China, in 2002 and 2005, respectively, and the Ph.D. degree in environmental engineering from Nagoya University, Nagoya, Japan, in 2008. In 2008–2009, he was a Researcher with Nagoya University. He is currently an Associate Professor with the State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University. His research interests include urban remote sensing and remote sensing modeling in grassland.

Canyou Liu received the B.E., M.E., and D.E. degrees from PLA Information Engineering University, Zhengzhou, China, in 2007, 2010, and 2013, respectively, all in geographic information system (GIS). He is currently an Engineer with Xi’an Satellite Control Centre, Xi’an, China. His research interests include cloud computing; image processing; GIS; and spacecraft tracking, telemetering, and control.